[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 09 Mar 2019 06:43:52 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. IMPALA-6802 (part 1): Clean up authorization tests The first patch of this patch is to introduce a new mechanism of testing authorization that tests authorization at every hierarchy. This patch rewrites the authorization tests for select statements. Testing: - Added new authorization tests - Ran all front-end tests Cherry-picks: not for 2.x Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Reviewed-on: http://gerrit.cloudera.org:8080/10135 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/12708 Reviewed-by: Fredy Wijaya --- A fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java 2 files changed, 659 insertions(+), 3 deletions(-) Approvals: Fredy Wijaya: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: merged Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: Code-Review+2 > Patch Set 1: > > Hi Fredy, I'm going to pick your patch of COMMENT ON COLUMN to branch 2.x. > However, the tests in it are in > fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java. This > file doesn't exist in branch 2.x. It's introduce by your 6 patches of > IMPALA-6802. So I'd like to cherry-pick these 6 patches first. Do you have > any concerns? Nope, I think it's definitely a good idea to cherry-pick those 6 patches. -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 09 Mar 2019 05:40:10 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2399/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 09 Mar 2019 03:29:02 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: Hi Fredy, I'm going to pick your patch of COMMENT ON COLUMN to branch 2.x. However, the tests in it are in fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java. This file doesn't exist in branch 2.x. It's introduce by your 6 patches of IMPALA-6802. So I'd like to cherry-pick these 6 patches first. Do you have any concerns? -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 09 Mar 2019 02:53:54 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3896/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 09 Mar 2019 02:55:12 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12708 ) Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12708/1/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/12708/1/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@126 PS1, Line 126: public Role addRoleGrantGroup(String roleName, String groupName) throws CatalogException { line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 09 Mar 2019 02:50:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR](2.x) IMPALA-6802 (part 1): Clean up authorization tests
Hello Alex Behm, Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/12708 to review the following change. Change subject: IMPALA-6802 (part 1): Clean up authorization tests .. IMPALA-6802 (part 1): Clean up authorization tests The first patch of this patch is to introduce a new mechanism of testing authorization that tests authorization at every hierarchy. This patch rewrites the authorization tests for select statements. Testing: - Added new authorization tests - Ran all front-end tests Cherry-picks: not for 2.x Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Reviewed-on: http://gerrit.cloudera.org:8080/10135 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins --- A fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java 2 files changed, 659 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/12708/1 -- To view, visit http://gerrit.cloudera.org:8080/12708 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: I9cd5713607c423f644451af5ebb3494d3a728e3b Gerrit-Change-Number: 12708 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. When PrettyPrinter prints a time measured in seconds, it multiplies the time by 1000, and prints that value as milliseconds. Because PrettyPrinter is implemented using templates, the type of the value depends on the type of the value parameter. For typical values of seconds the type is an int32_t. When this is multiplied by 1000 it can therefore overflow, which triggers a DHECK failure in DEBUG builds. Fix this by instead multiplying the value parameter by the existing constant 'THOUSAND' which is declared as a int64_t. This produces a result which is also a int64_t, and which does not overflow so easily. TESTING: Add more test cases to pretty-printer-test.cc including cases that previously caused overflows. Expand the coverage to include cases printing NanoSeocnds, MillisSeconds and MicroSeconds. These cases are not supposed to show that PrettyPrinter always behaves consistently, but to help maintainers avoid regressions when changing PrettyPrinter. Ran all end-to-end tests. Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Reviewed-on: http://gerrit.cloudera.org:8080/12702 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/util/pretty-printer-test.cc M be/src/util/pretty-printer.h 2 files changed, 65 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Sat, 09 Mar 2019 01:21:04 + Gerrit-HasComments: No
[Impala-ASF-CR] [stress] factor out MemBroker into its own file.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/12687 ) Change subject: [stress] factor out MemBroker into its own file. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12687 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I475d2e166bbf940e89135ea04be7b3003c37141b Gerrit-Change-Number: 12687 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 08 Mar 2019 22:08:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 08 Mar 2019 21:09:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 ) Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Gerrit-Change-Number: 12680 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 08 Mar 2019 21:18:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12680 ) Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. .. IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. IMPALA-6658 changed RleEncoder to have the ability to use run lengths other than 8. It seemed that a slightly more complex RleEncoder could save a small amount of disk space by using the longer run lengths, in particular for bit width of 1. We now see a performance regression on a simple ETL query. Overall it seems that the costs of IMPALA-6658 exceed the benefits. This change removes IMPALA-6658. The strategy for this was that the change to rle-encoding.h, which contains the code, was undone using 'git revert'. I removed the test changes in rle-test.cc that rely on different encoding lengths. This allows us to keep some useful new tests that were written as part of IMPALA-6658 TESTING: Ran all end-to-end tests. Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Reviewed-on: http://gerrit.cloudera.org:8080/12680 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/parquet/parquet-bool-decoder-test.cc M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 3 files changed, 141 insertions(+), 383 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/12680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Gerrit-Change-Number: 12680 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-8014: Revise FK/PK cardinality estimation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12535 ) Change subject: IMPALA-8014: Revise FK/PK cardinality estimation .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2398/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da Gerrit-Change-Number: 12535 Gerrit-PatchSet: 5 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 21:16:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8284. KuduTableSink spends too much CPU in KuduSchema::Column()
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12692 ) Change subject: IMPALA-8284. KuduTableSink spends too much CPU in KuduSchema::Column() .. IMPALA-8284. KuduTableSink spends too much CPU in KuduSchema::Column() The KuduSchema::Column() accessor actually returns a copy of the KuduColumnSchema object, which is not lightweight. We were inadvertently calling this function once for every null cell seen during an insertion. This caused a performance bottleneck for datasets with large numbers of NULL cells. This improves the situation by caching the nullability of the Kudu columns in our own vector. The vector lookups should be inlined and much faster than copying a KuduColumnSchema. No new tests included as this is a perf fix. Change-Id: I1b4d14d20252bdb190f50ebaaf6179a46eafb932 Reviewed-on: http://gerrit.cloudera.org:8080/12692 Reviewed-by: Will Berkeley Reviewed-by: Thomas Marshall Tested-by: Impala Public Jenkins --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h 2 files changed, 16 insertions(+), 4 deletions(-) Approvals: Will Berkeley: Looks good to me, but someone else must approve Thomas Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12692 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I1b4d14d20252bdb190f50ebaaf6179a46eafb932 Gerrit-Change-Number: 12692 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 08 Mar 2019 21:09:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8283. Order of Kudu PRIMARY KEYs can be silently ignored
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/12694 ) Change subject: IMPALA-8283. Order of Kudu PRIMARY KEYs can be silently ignored .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/12694/1/fe/src/main/java/org/apache/impala/common/PrintUtils.java File fe/src/main/java/org/apache/impala/common/PrintUtils.java: http://gerrit.cloudera.org:8080/#/c/12694/1/fe/src/main/java/org/apache/impala/common/PrintUtils.java@205 PS1, Line 205: public static String joinQuoted(Iterable objs) { brief comment -- To view, visit http://gerrit.cloudera.org:8080/12694 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0499cee7c532db19cddac3906198d965b27ea604 Gerrit-Change-Number: 12694 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 08 Mar 2019 21:10:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3895/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 08 Mar 2019 21:09:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 2: (5 comments) http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/rpc/rpc-mgr.h@29 PS2, Line 29: using kudu::MonoDelta; Don't use 'using' in header files, just fully qualify it everywhere below. http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/rpc/rpc-mgr.h@185 PS2, Line 185: F&& rpc_call Rather than passing in a lambda that performs the call, how would you feel about passing in the proxy object and the function to call, i.e. the way ClientCache::DoRpc() currently works? The motivation for this is that I've written an equivalent DoAsyncRpcWithRetry() function (see https://gerrit.cloudera.org/#/c/12297/), and for the async case we can't take a lambda that calls the rpc because we need to wrap the callback that gets passed to the rpc function in order to simulate recv errors, and it would be nice for the async and sync cases to work the same. Two downsides that I see to this: - It makes the arg list very long, but this is alleviated somewhat by my patch which should prevent callers from having to specify 'error_msg' or 'debug_action' - It prevents patterns like what you did in query-state.cc. That doesn't matter for now though (see my comments in query-state.cc) and I think we can come up with a solution if needed in the future. http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/rpc/rpc-mgr.h@194 PS2, Line 194: typename L There's really only one signature we should be accepting for the log function (i.e. no args and returns void) so I think it might be cleaner just specify the type, something like: typedef boost::function RpcLogFn; http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/runtime/query-state.cc@400 PS2, Line 400: rpc_status = RpcMgr::DoRpcWithRetry(report_exec_status, log_failure, query_ctx_, We don't actually want to retry this rpc like this. Its low impact if the rpc doesn't succeed, because we'll send the missed info with the next report, so we prefer to minimize the number of report rpcs being sent by not retrying the same report. http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/12672/2/be/src/service/control-service.cc@171 PS2, Line 171: DebugActionNoFail(impala_server->default_query_options_, "RPC_CANCELQUERYFINSTANCES"); I think its probably easier if you leave this alone in this patch and let me handle it in my debug action patch. For example, because this use of 'default_query_options_' is unfortunate. I think the right solution is to add a new flag, say 'krpc_debug_action', and then define a KrpcDebugActionNoFail() and a few other related things which is all stuff that I'll be touching in my patch anyways. -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 08 Mar 2019 21:07:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8014: Revise FK/PK cardinality estimation
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12535 ) Change subject: IMPALA-8014: Revise FK/PK cardinality estimation .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@350 PS4, Line 350: // The code has decided that the RHS is a dimension (detail, FK) table, and : // that the LHS is a fact (master, FK) table > isn't it the opposite? Done http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@416 PS4, Line 416: fkCard = Math.max(fkCard, largestKeyCard); > Sorry, I meant you are multiplying all lhsNdv() for each slot and comparing Right. Very confusing. This code is only for the compound key case. We are dealing with the lack of an NDV for the key as a whole. (This is where we can very much use Anurag's FK/PK enhancement.) Note the step above in which we constrain the fk cardinality to be no greater than the master table cardinality. This can occur, as in our tests, when the fk consists of (id, int_col), so NDV(compound key) = NDV(id) * NDV(int_col) > master tale cardinality. So, we constrain the foreign key to be no greater than the master table (based on our M:1 assumption.) But, then we have to deal with the case that our guess is wrong: the FK cardinality really is greater. Even if the two columns are perfectly correlated, their joint NDV has to be at least as large as the largest individual NDV. To handle the reality of messy schemas and stats, we: * Compute the "raw" joint NDV * Constrain it by the detail table size (can't have more keys than rows) * Constrain it by the master table size (can't have more FKs than PKs, ideally) * Constrain it by largest key NDV size (can't have fewer compound keys than column NDV, even if that is greater than master table size) I *think* this is right. The above messy adjustments are the result of inspecting tests that changed. But please do think it through to be certain. -- To view, visit http://gerrit.cloudera.org:8080/12535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da Gerrit-Change-Number: 12535 Gerrit-PatchSet: 4 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 20:31:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8014: Revise FK/PK cardinality estimation
Hello Bharath Vissapragada, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12535 to look at the new patch set (#5). Change subject: IMPALA-8014: Revise FK/PK cardinality estimation .. IMPALA-8014: Revise FK/PK cardinality estimation Impala uses two techniques to estimate join cardinality: one for FK/PK, another for the "generic" case. (IMPALA-8018 suggests we combine the two. But, for the purposes of this fix, we leave them separate.) The code for the FK/PK estimation is mostly right for a key that consists of a singe column. But, if a key is compound (contains more than one column), the calculations can produce incorrect results. This patch modifies the code to use the standard join calculation: |join| = |left| * |right| / max(|left key|, |right key|). The math for compound keys had a number of errors. Since no test tables represent a proper compound key, all tests use unrealistic keys such as a unique Id along with a second, superfluous column. To make the math come out OK for these, the code included adjustments that misfired on real-world keys. This patch fixes those errors. A separate patch introduces new tables with the required schema. Tests: Many planner tests were modified to reflect the new join cardinality estimate. Some plans changed, including some TPC-H plans. Inspected each change to verify that the numbers are correct using the formula cited above. Turns out that the change caused a spill test to fail. The new plan is more efficient and didn't require as much memory. Shrank the memory given to the query to force the newer, more efficient plan to also spill. Testing was also done using a production query that misfired without this fix, but that produced correct results with this fix. Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da --- M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test M testdata/workloads/functional-planner/queries/PlannerTest/card-multi-join.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test M testdata/workloads/functional-planner/queries/PlannerTest/order.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test M testdata/workloads/functional-query/queries/QueryTest/spilling.test 17 files changed, 1,920 insertions(+), 1,829 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12535/5 -- To view, visit http://gerrit.cloudera.org:8080/12535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da Gerrit-Change-Number: 12535 Gerrit-PatchSet: 5 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2397/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 19:34:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2396/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 08 Mar 2019 19:18:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 6: Code-Review+1 (1 comment) Carry Csaba's +1. Tim, can you take a look at it for the +2? http://gerrit.cloudera.org:8080/#/c/12682/5/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12682/5/be/src/service/query-options.cc@504 PS5, Line 504: query_options->__ > nit: could fit in one less lines Done -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 19:02:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Fredy Wijaya has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. IMPALA-8271: Refactor the use of Thrift enums for query options This patch refactors the use of Thrift enums with GetThriftEnum helper function that can automatically validate and convert the query option value into the corresponding Thrift enum value. The validation error message has also been improved to list all possible valid query option values. Testing: - Added missing test cases in both BE and E2E - Ran query-options-test.cc - Ran metadata/test_set.py - Ran query_test/test_nested_types.py - Ran query_test/test_scanners.py Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M testdata/workloads/functional-query/queries/QueryTest/parquet-ambiguous-list-modern.test M testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test M testdata/workloads/functional-query/queries/QueryTest/set.test 5 files changed, 125 insertions(+), 160 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/12682/6 -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 6 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 5: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/12682/5/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12682/5/be/src/service/query-options.cc@504 PS5, Line 504: &enum_type)); nit: could fit in one less lines -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 18:54:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. Patch Set 8: Code-Review+1 (3 comments) Carry Bharath's +1. Paul, can you take another look for the +2? http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@76 PS7, Line 76: > nit: \n Done http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java File fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java@52 PS7, Line 52: uses Sentry for Catalogd. > I think this is not super clear. Mind elaborating this a little bit? Done http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/util/ClassUtil.java File fe/src/main/java/org/apache/impala/util/ClassUtil.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/util/ClassUtil.java@25 PS7, Line 25: StackTraceElement stackTrace = Thread.currentThread().getStackTrace()[2]; > Do we need to worry about the stack length? No, first element is always getStackTrace(), second element is always getMethodName(). Third is the actual method that invokes this method. -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 08 Mar 2019 18:36:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Fredy Wijaya has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. IMPALA-7917 (Part 3): Decouple Sentry from Impala The third part of this patch introduces an interface called AuthorizationManager to perform authorization management related functions, such as granting/revoking a privilege, etc. Some existing authorization management methods have been refactored to reduce the need for if/else conditions to perform certain actions. This patch moves code related to Sentry authorization management code to: - SentryCatalogdAuthorizationManager - SentryImpaladAuthorizationManager The split makes it easier to differentiate between authorization DDL operations performed in Catalogd vs Impalad. This patch does not implement AuthorizationManager for Ranger. This patch has no functionality change. Testing: - Ran all FE tests - Ran all E2E authorization tests Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 --- M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java A fe/src/main/java/org/apache/impala/authorization/AuthorizationManager.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthProvider.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java A fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java A fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/main/java/org/apache/impala/util/ClassUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/testutil/PlannerTestCaseLoader.java 23 files changed, 1,056 insertions(+), 425 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/12684/8 -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2395/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 18:32:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12535 ) Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@416 PS4, Line 416: fkCard = Math.max(fkCard, largestKeyCard); > Pardon my ignorance, isn't fkCard always >= largestKeyCard here? You are ju Sorry, I meant you are multiplying all lhsNdv() for each slot and comparing it to the max among them. Generally I think the else {} block here could use an example. -- To view, visit http://gerrit.cloudera.org:8080/12535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da Gerrit-Change-Number: 12535 Gerrit-PatchSet: 4 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 18:27:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12535 ) Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation .. Patch Set 4: (2 comments) My understanding of the math here is limited but I think the patch generally makes sense. I have a couple of nits. http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@350 PS4, Line 350: // The code has decided that the RHS is a dimension (detail, FK) table, and : // that the LHS is a fact (master, FK) table isn't it the opposite? http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@416 PS4, Line 416: fkCard = Math.max(fkCard, largestKeyCard); Pardon my ignorance, isn't fkCard always >= largestKeyCard here? You are just multiplying. Can you please give an example where it isn't? -- To view, visit http://gerrit.cloudera.org:8080/12535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da Gerrit-Change-Number: 12535 Gerrit-PatchSet: 4 Gerrit-Owner: Paul Rogers Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 18:25:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc@343 PS2, Line 343: {0, "CACHE_LOCAL"}, > I see. There are some other query options where not all values are actually Done. We can actually just pass our own map in GetThriftEnum for the list of valid enum values. http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc@365 PS2, Line 365: > Specifying the template does not seem necessary here, as it can be deduced Done -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 18:10:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12702 ) Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2394/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 08 Mar 2019 18:09:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Fredy Wijaya has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. IMPALA-8271: Refactor the use of Thrift enums for query options This patch refactors the use of Thrift enums with GetThriftEnum helper function that can automatically validate and convert the query option value into the corresponding Thrift enum value. The validation error message has also been improved to list all possible valid query option values. Testing: - Added missing test cases in both BE and E2E - Ran query-options-test.cc - Ran metadata/test_set.py - Ran query_test/test_nested_types.py - Ran query_test/test_scanners.py Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M testdata/workloads/functional-query/queries/QueryTest/parquet-ambiguous-list-modern.test M testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test M testdata/workloads/functional-query/queries/QueryTest/set.test 5 files changed, 126 insertions(+), 160 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/12682/5 -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2393/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 08 Mar 2019 18:03:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. Patch Set 7: Code-Review+1 (4 comments) I'll let Paul take another pass as the class structure changed since he +1ed it the last time. http://gerrit.cloudera.org:8080/#/c/12684/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java File fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java: http://gerrit.cloudera.org:8080/#/c/12684/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java@62 PS5, Line 62:*/ : AuthorizationManager newAuthorizationManager(FeCatalogManager catalog, : AuthorizationChecker authzChecker); : : /** :* Creates a new instance of {@link AuthorizationManager}. :*/ > I like the idea of splitting between SentryCatalogdAuthzManager and SentryI I think that makes sense. http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@76 PS7, Line 76: } nit: \n http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java File fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java@52 PS7, Line 52: uses Sentry for Catalogd. I think this is not super clear. Mind elaborating this a little bit? http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/util/ClassUtil.java File fe/src/main/java/org/apache/impala/util/ClassUtil.java: http://gerrit.cloudera.org:8080/#/c/12684/7/fe/src/main/java/org/apache/impala/util/ClassUtil.java@25 PS7, Line 25: StackTraceElement stackTrace = Thread.currentThread().getStackTrace()[2]; Do we need to worry about the stack length? -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 08 Mar 2019 17:56:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12702 Change subject: IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. .. IMPALA-8288: avoid overflow causing DCHECK in PrettyPrint. When PrettyPrinter prints a time measured in seconds, it multiplies the time by 1000, and prints that value as milliseconds. Because PrettyPrinter is implemented using templates, the type of the value depends on the type of the value parameter. For typical values of seconds the type is an int32_t. When this is multiplied by 1000 it can therefore overflow, which triggers a DHECK failure in DEBUG builds. Fix this by instead multiplying the value parameter by the existing constant 'THOUSAND' which is declared as a int64_t. This produces a result which is also a int64_t, and which does not overflow so easily. TESTING: Add more test cases to pretty-printer-test.cc including cases that previously caused overflows. Expand the coverage to include cases printing NanoSeocnds, MillisSeconds and MicroSeconds. These cases are not supposed to show that PrettyPrinter always behaves consistently, but to help maintainers avoid regressions when changing PrettyPrinter. Ran all end-to-end tests. Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 --- M be/src/util/pretty-printer-test.cc M be/src/util/pretty-printer.h 2 files changed, 65 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/12702/1 -- To view, visit http://gerrit.cloudera.org:8080/12702 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I00d9769cf34e2ccd796ec1cf88797c8f8250f718 Gerrit-Change-Number: 12702 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. Allow callers to specify a logging lambda that is called for rpc failures. Using this feature, change QueryState to use DoRpcWithRetry for reporting execution status. Reviewers should note that the following changes to retry parameters were made, somewhat arbitrarily. I welcome suggestions as to the correct values. Cancel: add a 3 second sleep if service is busy. RemoteShutdown: add a 3 second sleep if service is busy. ReportExecStatus: Try up to 2 times to do the rpc, add a 3 second sleep if service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by passing lambdas that simulate failures. Change the service-side fault injection mechanism of CancelQueryFInstances and RemoteShutdown to use DebugAction. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 326 insertions(+), 91 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/2 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8271: Refactor the use of Thrift enums for query options
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12682 ) Change subject: IMPALA-8271: Refactor the use of Thrift enums for query options .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc@343 PS2, Line 343: uery_options->__set_replica_preference(T > Yeah this is an odd one because we don't remove the unused options, .i.e. C I see. There are some other query options where not all values are actually supported, e.g. I am sure that parquet_compression_codec will only work with a subset of codecs. default_file_format is also questionable, as Impala can create all kinds of tables, but cannot insert into some of them, so I am not sure if it makes sense to allow those as defaults. I am ok with the current state of the patch, but it may make sense to filter unsupported enums, e.g. by adding an optional argument to GetThriftEnum with the list of valid values. http://gerrit.cloudera.org:8080/#/c/12682/2/be/src/service/query-options.cc@365 PS2, Line 365: Specifying the template does not seem necessary here, as it can be deduced from argument 'enum_type'. -- To view, visit http://gerrit.cloudera.org:8080/12682 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6d747aae2c689765be72e117ce030ce4e3ce4641 Gerrit-Change-Number: 12682 Gerrit-PatchSet: 3 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 08 Mar 2019 17:17:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 ) Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3894/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Gerrit-Change-Number: 12680 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 08 Mar 2019 16:53:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 ) Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Gerrit-Change-Number: 12680 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 08 Mar 2019 16:53:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 ) Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40 Gerrit-Change-Number: 12680 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 08 Mar 2019 16:52:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6503: Support reading complex types from ORC
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/12168 ) Change subject: IMPALA-6503: Support reading complex types from ORC .. Patch Set 25: (15 comments) > Patch Set 25: > > (33 comments) > > Ah, I missed that this was almost committed. I had started reviewing but > hadn't finished so hadn't posted the comments yet. I'll post my comments here > for now. Todd, thanks for your helpful comments! I should notify you before I run GVO... What if I can get this several hours earlier! > Quanlong, do you mind if I build a patch on this to take care of my suggested > cleanups? I feel like we still need to polish the codes (especially some comments) to improve readablity. Most of the reasons and examples are given in the Google Docs and slides (https://docs.google.com/presentation/d/1uj8m7y69o47MhpqCc0SJ03GDTtPDrg4m04eAFVmq34A). We may need to write them down in comments. It'll be great if Cloudera folks can join the development of the ORC support. Todd, feel free to create a follow-up JIRA for this. I'll reply more comments in details later. http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/hdfs-orc-scanner.h File be/src/exec/hdfs-orc-scanner.h: http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/hdfs-orc-scanner.h@182 PS18, Line 182: tch batch updated > I noticed that the ORC docs are not very consistent on terminology here. Th Yes. "Type" and "node" in the ORC relative codes all means "column". http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/hdfs-orc-scanner.h@235 PS18, Line 235: /// Advances 'stripe_idx_' to the next non-empty stripe and initializes > can you update the doc to explain the 'column_reader' parameter here? The d Sorry for my misleading comments. The "orc_root_reader_" keeps reference of the "orc_root_batch_". Each child of "orc_root_reader_" keeps the reference of the corresponding sub batch of "orc_root_batch_" and so on recursively. What I want to explain is that the "column_reader" will read values of the orc batch it refers and then materialize tuples of "dst_batch". http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/hdfs-orc-scanner.h@243 PS18, Line 243: > 'selected_indices' doesn't seem to exist here. Should it be "selected nodes Yes. Fixed in a later version of the patch. http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/hdfs-orc-scanner.h@244 PS18, Line 244: /// Materialize collection(list/map) tuples belong to the 'row_idx'-th row of > I'm having trouble understanding this documentation. Maybe worth giving an Yes. There're some examples here: https://docs.google.com/presentation/d/1uj8m7y69o47MhpqCc0SJ03GDTtPDrg4m04eAFVmq34A http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.h File be/src/exec/orc-column-readers.h: http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.h@72 PS18, Line 72: OrcColumnReader(const orc::Type* node, const SlotDescriptor* slot_desc, : HdfsOrcScanner* scanner); > instead of having both a public constructor and a public factory function a Yes! We can make it protected. http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.h@79 PS18, Line 79: virtual bool MaterializeTuple() const { return true; } > Can we rename this to sound more like a boolean? As is, it sounds like a fu Sure. Actually, I struggled to think of a name. Maybe better to be "shouldMaterializeTuples". This function is mainly used in complex type readers. http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.h@88 PS18, Line 88: orc::ColumnVectorBatch > can this be const? I tried to do this before but failed because orc::ColumnVectorBatch::notNull is a "orc::DataBuffer". We need to use "notNull[row_idx]" but the operator[] of "orc::DataBuffer" is lake of a const version: https://github.com/apache/orc/blob/fa0a33e4a7331dd21485bf4ccc0a93edf4b3ae16/c++/include/orc/MemoryPool.hh#L72 http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.h@122 PS18, Line 122: class OrcBoolColumnReader : public OrcColumnReader { > do we need to define all the subtypes of readers in the .h file or could th I think they sould be here since HdfsOrcScanner reference them as friend classes and hdfs-orc-scanner.h only includes this .h file. http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.cc File be/src/exec/orc-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/12168/18/be/src/exec/orc-column-readers.cc@46 PS18, Line 46: switch (slot_desc->type().type) { > I think it's worth a comment explaining why we switched above on node->getK Yes. There's a reason why we switch on Impala slot type here. Will sumarize it later. We've verify the compatibility before we create the column readers. The verification is in OrcSchemaResolver::ResolveColumn. http://gerrit.cloudera.org:8080/#/
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2392/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 08 Mar 2019 10:16:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8284. KuduTableSink spends too much CPU in KuduSchema::Column()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12692 ) Change subject: IMPALA-8284. KuduTableSink spends too much CPU in KuduSchema::Column() .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12692 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b4d14d20252bdb190f50ebaaf6179a46eafb932 Gerrit-Change-Number: 12692 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 08 Mar 2019 10:03:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12684/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java File fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java: http://gerrit.cloudera.org:8080/#/c/12684/5/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java@62 PS5, Line 62:*/ : AuthorizationManager newAuthorizationManager(FeCatalogManager catalog, : AuthorizationChecker authzChecker); : : /** :* Creates a new instance of {@link AuthorizationManager}. :*/ > I have a feeling that this AuthzMgr should be tied to a Catalog and not whe I like the idea of splitting between SentryCatalogdAuthzManager and SentryImpaladAuthzManager. However I don't think taking Catalog is going to be good enough. FeCatalog is another type of Catalog that doesn't actually implement Catalog. Taking FeCatalog is also not a bad idea since depending on the implementation (for example the CatalogdImpl), the catalog instance can change, so we need to use the FeCatalogManager instead. So, I'm thinking of having two methods, one that takes FeCatalogManager and the other one that takes CatalogServiceCatalog. This way the intention is clear one is for impalad and the other one is for catalogd. Let me know what you think. -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Comment-Date: Fri, 08 Mar 2019 09:31:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala
Fredy Wijaya has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/12684 ) Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala .. IMPALA-7917 (Part 3): Decouple Sentry from Impala The third part of this patch introduces an interface called AuthorizationManager to perform authorization management related functions, such as granting/revoking a privilege, etc. Some existing authorization management methods have been refactored to reduce the need for if/else conditions to perform certain actions. This patch moves code related to Sentry authorization management code to: - SentryCatalogdAuthorizationManager - SentryImpaladAuthorizationManager The split makes it easier to differentiate between authorization DDL operations performed in Catalogd vs Impalad. This patch does not implement AuthorizationManager for Ranger. This patch has no functionality change. Testing: - Ran all FE tests - Ran all E2E authorization tests Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 --- M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java A fe/src/main/java/org/apache/impala/authorization/AuthorizationManager.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/authorization/NoneAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthProvider.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java A fe/src/main/java/org/apache/impala/authorization/sentry/SentryCatalogdAuthorizationManager.java A fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/FeCatalogManager.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java A fe/src/main/java/org/apache/impala/util/ClassUtil.java M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/testutil/PlannerTestCaseLoader.java 23 files changed, 1,044 insertions(+), 425 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/12684/7 -- To view, visit http://gerrit.cloudera.org:8080/12684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0 Gerrit-Change-Number: 12684 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers