[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. IMPALA-8891: Fix non-standard null handling in concat_ws() This patch fixes the non-standard null handling logic for function 'concat_ws', while maintaining the original null handling for function 'concat' Existing statuses: For function concat_ws, any null string element in array argument 'strs' will result in null result, just like below: select concat_ws('-','foo',null,'bar') as expr1; +---+ | expr1 | +---+ | NULL | +---+ New Statuses: In this implementation, the function conforms to hive standard: 1.will join all the non-null string objects as the result 2.if all string objects are null, return empty string 3.if separator is null, return null below is a example: - select concat_ws('-','foo',null,'bar') as expr1; +--+ | expr1 | +--+ | foo-bar | +--+ Key changes: * Reimplement function StringFunctions::ConcatWs by filtering the null value and only process the valid string values, based on original code structure. * StringFunctions::Concat was also reimplemented, as it used to call ConcatWs but should keep the original NULL handling. Testing: * Ran exaustive tests. Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Reviewed-on: http://gerrit.cloudera.org:8080/14885 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc 2 files changed, 77 insertions(+), 21 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 13 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 12 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 07:11:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. IMPALA-4192: Move static state from ExecNode into a PlanNode This patch adds a new class called PlanNode which contains a subset of the static state of their corresponding ExecNode, of which there is one instance per fragment. ExecNode contains the runtime state and there can be up to MT_DOP instances of it per fragment. It also adds a similar class called AggregatorConfig which contains static state initialized from the thrift aggregator struct and is passed as an input to the Aggregator class's constructor. Eventually all static state including codegened function pointers would be moved to the PlanNodes. Testing: Ran exhaustive tests successfully. Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Reviewed-on: http://gerrit.cloudera.org:8080/14764 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins --- M be/src/exec/aggregation-node-base.cc M be/src/exec/aggregation-node-base.h M be/src/exec/aggregation-node.cc M be/src/exec/aggregation-node.h M be/src/exec/aggregator.cc M be/src/exec/aggregator.h M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/cardinality-check-node.cc M be/src/exec/cardinality-check-node.h M be/src/exec/data-source-scan-node.cc M be/src/exec/data-source-scan-node.h M be/src/exec/empty-set-node.cc M be/src/exec/empty-set-node.h M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node-mt.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node-mt.cc M be/src/exec/kudu-scan-node-mt.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/nested-loop-join-node.cc M be/src/exec/nested-loop-join-node.h M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/partial-sort-node.cc M be/src/exec/partial-sort-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/singular-row-src-node.cc M be/src/exec/singular-row-src-node.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/exec/streaming-aggregation-node.cc M be/src/exec/streaming-aggregation-node.h M be/src/exec/subplan-node.cc M be/src/exec/subplan-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exec/unnest-node.cc M be/src/exec/unnest-node.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h 64 files changed, 1,338 insertions(+), 700 deletions(-) Approvals: Bikramjeet Vig: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 14 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 05:41:47 + Gerrit-HasComments: No
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14705 ) Change subject: Added support of Kudu DATE type to Impala .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5299/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 4 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin Gerrit-Comment-Date: Tue, 17 Dec 2019 05:20:59 + Gerrit-HasComments: No
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14705 ) Change subject: Added support of Kudu DATE type to Impala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14705/4/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: http://gerrit.cloudera.org:8080/#/c/14705/4/tests/query_test/test_kudu.py@437 PS4, Line 437: , flake8: E501 line too long (99 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 4 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin Gerrit-Comment-Date: Tue, 17 Dec 2019 04:59:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Volodymyr Verovkin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14705 ) Change subject: Added support of Kudu DATE type to Impala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14705/1/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test: http://gerrit.cloudera.org:8080/#/c/14705/1/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test@155 PS1, Line 155: date > Do you plan to add that new test in this patchset or in a separate one? Done -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 4 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin Gerrit-Comment-Date: Tue, 17 Dec 2019 04:59:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Hello Thomas Tauber-Marshall, Alexey Serbin, Adar Dembo, Grant Henke, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14705 to look at the new patch set (#4). Change subject: Added support of Kudu DATE type to Impala .. Added support of Kudu DATE type to Impala Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 --- M be/src/exec/kudu-util.cc M fe/src/main/java/org/apache/impala/util/KuduUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test M tests/query_test/test_kudu.py 10 files changed, 649 insertions(+), 570 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/14705/4 -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 4 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Fix handling of 'Connection' header in squeasel This patch bumps Impala's version of squeasel to 030ccce87359d892e22fb368c5fc5b75d9a2a5f7 in order to pull in a fix for an issue where squeasel was improperly closing connections when the 'Connection: Upgrade' header was included in the request, which Apache Knox will include in certain situations. See https://github.com/cloudera/squeasel/pull/16 Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Reviewed-on: http://gerrit.cloudera.org:8080/14915 Reviewed-by: Adar Dembo Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/thirdparty/squeasel/squeasel.c 1 file changed, 21 insertions(+), 10 deletions(-) Approvals: Adar Dembo: Looks good to me, approved Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 04:34:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14894 ) Change subject: IMPALA-9009: Core support for Ranger column masking .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5298/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61 Gerrit-Change-Number: 14894 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 17 Dec 2019 04:06:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14916 ) Change subject: IMPALA-9257: Last event id should be advanced if all events are skipped .. Patch Set 1: (1 comment) Nice find! http://gerrit.cloudera.org:8080/#/c/14916/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/14916/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@593 PS1, Line 593: if (events.isEmpty()) return; Should we still update EVENTS_RECEIVED_METRIC in this case? -- To view, visit http://gerrit.cloudera.org:8080/14916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32 Gerrit-Change-Number: 14916 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 17 Dec 2019 03:45:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14894 ) Change subject: IMPALA-9009: Core support for Ranger column masking .. Patch Set 7: Fix issues for CreateView and AlterView. Also add test for Union statements. -- To view, visit http://gerrit.cloudera.org:8080/14894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61 Gerrit-Change-Number: 14894 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 17 Dec 2019 03:36:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking
Hello Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14894 to look at the new patch set (#7). Change subject: IMPALA-9009: Core support for Ranger column masking .. IMPALA-9009: Core support for Ranger column masking Ranger provides column masking policies about how to show masked values to specific users when reading specific columns. This patch add support to rewrite the query AST based on column masking policies. We perform the column masking policies by replacing the TableRef with a subquery doing the masking. For instance, the following query select c_id, c_name from customer c join orders on c_id = o_cid will be transfomed into select c_id, c_name from ( select mask1(c_id) as c_id, mask2(c_name) as c_name from customer ) c join orders on c_id = o_cid The transfomation is done in AST resolution. Just like view resolution, if the table needs masking we replace it with a subquery(InlineViewRef) containing the masking expressions. This patch only adds support for mask types that don't require builtin mask functions. So currently supported masking types are MASK_NULL and CUSTOM. Current Limitations: - Users are required to have privileges on all columns of a masked table(IMPALA-9223), since the table mask subquery contains all the columns. Tests: - Add e2e tests for masked results - Run core tests Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61 --- M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java A fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M tests/authorization/test_ranger.py M tests/common/impala_test_suite.py 23 files changed, 756 insertions(+), 75 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/14894/7 -- To view, visit http://gerrit.cloudera.org:8080/14894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61 Gerrit-Change-Number: 14894 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 12 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 02:38:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5352/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 12 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 02:38:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 11: Code-Review+2 (1 comment) LGTM. Thank Ji! http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc@918 PS11, Line 918: DCHECK_GT(valid_num_children, 0); > This should be moved to before the if-clause since we used to check it in t Sorry, that I miss-read this as DCHECK_GE. Then it's right here since "valid_num_children == 0" is covered by "valid_start_index < 0" -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 11 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 02:37:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc@918 PS11, Line 918: DCHECK_GT(valid_num_children, 0); This should be moved to before the if-clause since we used to check it in the if-clause. You can put it right after the for-loop. -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 11 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 02:27:30 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-9236 Make llvm for aarch64 platform.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14900 ) Change subject: IMPALA-9236 Make llvm for aarch64 platform. .. Patch Set 3: Verified+1 Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14900 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104 Gerrit-Change-Number: 14900 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 01:48:25 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9236 Make llvm for aarch64 platform.
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14900 ) Change subject: IMPALA-9236 Make llvm for aarch64 platform. .. IMPALA-9236 Make llvm for aarch64 platform. Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104 Reviewed-on: http://gerrit.cloudera.org:8080/14900 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M source/llvm/build-3.3.sh M source/llvm/build-source-tarball.sh 2 files changed, 4 insertions(+), 0 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14900 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104 Gerrit-Change-Number: 14900 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5297/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 11 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen Gerrit-Comment-Date: Tue, 17 Dec 2019 01:37:10 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9246 Add patch for crcutils to support aarch64
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14901 to look at the new patch set (#4). Change subject: IMPALA-9246 Add patch for crcutils to support aarch64 .. IMPALA-9246 Add patch for crcutils to support aarch64 This adds a patch for crcutils to make it works on aarch64 platform. Change-Id: Ia3c38563a549ab9c522e23a19e9951f5ab0ada37 --- M buildall.sh A source/crcutil/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-patches/0002-Build-crcutil-440ba7b-on-aarch64.patch 2 files changed, 119 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/01/14901/4 -- To view, visit http://gerrit.cloudera.org:8080/14901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3c38563a549ab9c522e23a19e9951f5ab0ada37 Gerrit-Change-Number: 14901 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14916 ) Change subject: IMPALA-9257: Last event id should be advanced if all events are skipped .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5296/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32 Gerrit-Change-Number: 14916 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 17 Dec 2019 01:28:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14538 ) Change subject: IMPALA-4400: aggregate runtime filters locally .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5295/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabeeab5eec869ff2197250ad41c1eb5551704acc Gerrit-Change-Number: 14538 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 01:24:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()
jichen has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/14885 ) Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws() .. IMPALA-8891: Fix non-standard null handling in concat_ws() This patch fixes the non-standard null handling logic for function 'concat_ws', while maintaining the original null handling for function 'concat' Existing statuses: For function concat_ws, any null string element in array argument 'strs' will result in null result, just like below: select concat_ws('-','foo',null,'bar') as expr1; +---+ | expr1 | +---+ | NULL | +---+ New Statuses: In this implementation, the function conforms to hive standard: 1.will join all the non-null string objects as the result 2.if all string objects are null, return empty string 3.if separator is null, return null below is a example: - select concat_ws('-','foo',null,'bar') as expr1; +--+ | expr1 | +--+ | foo-bar | +--+ Key changes: * Reimplement function StringFunctions::ConcatWs by filtering the null value and only process the valid string values, based on original code structure. * StringFunctions::Concat was also reimplemented, as it used to call ConcatWs but should keep the original NULL handling. Testing: * Ran exaustive tests. Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b --- M be/src/exprs/expr-test.cc M be/src/exprs/string-functions-ir.cc 2 files changed, 77 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/14885/11 -- To view, visit http://gerrit.cloudera.org:8080/14885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b Gerrit-Change-Number: 14885 Gerrit-PatchSet: 11 Gerrit-Owner: jichen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: jichen
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. Patch Set 13: Code-Review+2 carrying over +2 -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 01:15:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5351/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 01:15:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14916 Change subject: IMPALA-9257: Last event id should be advanced if all events are skipped .. IMPALA-9257: Last event id should be advanced if all events are skipped Events processor implements a filtering method which skips certain unnecessary events (eg. events on blacklisted dbs and tables). However, if the received batch has all the events which are filtered out, it fails to update its lastSyncedEventId. This causes unnecessary logs being printed in catalog and the same event batch being fetched repeatedly. Testing: Modified existing test to compare event id after events on blacklisted dbs and tables. Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32 --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java M tests/custom_cluster/test_event_processing.py 2 files changed, 7 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/14916/1 -- To view, visit http://gerrit.cloudera.org:8080/14916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32 Gerrit-Change-Number: 14916 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Quanlong Huang
[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally
Tim Armstrong has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/14538 ) Change subject: IMPALA-4400: aggregate runtime filters locally .. IMPALA-4400: aggregate runtime filters locally Move RuntimeFilterBank to QueryState(). Implement fine-grained locking for each filter to mitigate any increased lock contention from the change. Make RuntimeFilterBank handle multiple producers of the same filter, e.g. multiple instances of a partitioned join. It computes the expected number of filters upfront then sends the filter to the coordinator once all the local instances have been merged together. The merging can done in parallel locally to improve latency of filter propagation. Add Or() methods to MinMaxFilter and BloomFilter, since we now need to merge those, not just the thrift versions. Update coordinator filter routing to expect only one instance of a filter from each producer backend and to only send one instance to each consumer backend (instead of sending one per fragment). Update memory reservations and estimates to be lower to account for sharing of filters between fragment instances. mt_dop plans are modified to show these shared and non-shared resources separately. Enable waiting for runtime filters for kudu scanner with mt_dop. Made min/max filters const-correct. Testing * Added unit tests for Or() methods. * Added some additional e2e test coverage for mt_dop queries * Updated planner tests with new estimates and reservation. * Ran a single node 3-impalad stress test with TPC-H kudu and TPC-DS parquet. * Ran exhaustive tests. * Ran core tests with ASAN. Perf * Did a single-node perf run on TPC-H with default settings. No perf change. * Single-node perf run with mt_dop=8 showed significant speedups: +--+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +--+---+-++++ | TPCH(30) | parquet / none / none | 10.14 | -7.29% | 5.05 | -11.68%| +--+---+-++++ +--+--+---++-++---++---++-+-+ | Workload | Query| File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval| +--+--+---++-++---++---++-+-+ | TPCH(30) | TPCH-Q7 | parquet / none / none | 38.87 | 38.44 | +1.13% | 7.17% | * 10.92% * | 20| +0.72% | 0.72| 0.39| | TPCH(30) | TPCH-Q1 | parquet / none / none | 4.28 | 4.26| +0.50% | 1.92% | 1.09%| 20| +0.03% | 0.31| 1.01| | TPCH(30) | TPCH-Q22 | parquet / none / none | 2.32 | 2.32| +0.05% | 2.01% | 1.89%| 20| -0.03% | -0.36 | 0.08| | TPCH(30) | TPCH-Q15 | parquet / none / none | 3.73 | 3.75| -0.42% | 0.84% | 1.05%| 20| -0.25% | -0.77 | -1.40 | | TPCH(30) | TPCH-Q13 | parquet / none / none | 9.80 | 9.83| -0.38% | 0.51% | 0.80%| 20| -0.32% | -1.30 | -1.81 | | TPCH(30) | TPCH-Q2 | parquet / none / none | 1.98 | 2.00| -1.32% | 1.74% | 2.81%| 20| -0.64% | -1.71 | -1.79 | | TPCH(30) | TPCH-Q6 | parquet / none / none | 1.22 | 1.25| -2.14% | 2.66% | 4.15%| 20| -0.96% | -2.00 | -1.95 | | TPCH(30) | TPCH-Q19 | parquet / none / none | 5.13 | 5.22| -1.65% | 1.20% | 1.40%| 20| -1.76% | -3.34 | -4.02 | | TPCH(30) | TPCH-Q16 | parquet / none / none | 2.46 | 2.56| -4.13% | 2.49% | 1.99%| 20| -4.31% | -4.04 | -5.94 | | TPCH(30) | TPCH-Q9 | parquet / none / none | 81.63 | 85.07 | -4.05% | 4.94% | 3.06%| 20| -5.46% | -3.28 | -3.21 | | TPCH(30) | TPCH-Q10 | parquet / none / none | 5.07 | 5.50| I -7.92% | 0.96% | 1.33%| 20| I -8.51% | -5.27 | -22.14 | | TPCH(30) | TPCH-Q21 | parquet / none / none | 24.00 | 26.24 | I -8.57% | 0.46% | 0.38%| 20| I -9.34% | -5.27 | -67.47 | | TPCH(30) | TPCH-Q18 | parquet / none / none | 8.66 | 9.50| I -8.86% | 0.62% | 0.44%| 20| I -9.75% | -5.27 | -55.17 | | TPCH(30) | TPCH-Q3 | parquet / none / none | 6.01 | 6.70| I -10.19% | 1.01% | 0.90%| 20| I -11.25% | -5.27 | -35.76 | | TPCH(30
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5350/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 00:10:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14894 ) Change subject: IMPALA-9009: Core support for Ranger column masking .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/14894/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14894/6//COMMIT_MSG@32 PS6, Line 32: - CreateView and AlterView statements on masked tables/views are not :dealt with in this patch, since the rewritten AST may not be able to :generate sql without syntax errors (e.g. missing alias). Just realize that we don't need to rewrite statements for table masking for CreateView and AlterView. The table masking will happen when the view is used. Will update the patch for this. -- To view, visit http://gerrit.cloudera.org:8080/14894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61 Gerrit-Change-Number: 14894 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 17 Dec 2019 00:08:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Dec 2019 00:07:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5294/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 23:42:21 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5293/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 23:27:14 + Gerrit-HasComments: No
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14915 ) Change subject: Fix handling of 'Connection' header in squeasel .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 23:26:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/14764 ) Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. Patch Set 13: Seems like an unrelated failure (Auth test), not sure why it failed since I had earlier ran exhaustive tests and core tests on ASAN build successfully. Rebased and testing again. -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 23:12:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode
Hello Michael Ho, Daniel Becker, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14764 to look at the new patch set (#13). Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode .. IMPALA-4192: Move static state from ExecNode into a PlanNode This patch adds a new class called PlanNode which contains a subset of the static state of their corresponding ExecNode, of which there is one instance per fragment. ExecNode contains the runtime state and there can be up to MT_DOP instances of it per fragment. It also adds a similar class called AggregatorConfig which contains static state initialized from the thrift aggregator struct and is passed as an input to the Aggregator class's constructor. Eventually all static state including codegened function pointers would be moved to the PlanNodes. Testing: Ran exhaustive tests successfully. Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 --- M be/src/exec/aggregation-node-base.cc M be/src/exec/aggregation-node-base.h M be/src/exec/aggregation-node.cc M be/src/exec/aggregation-node.h M be/src/exec/aggregator.cc M be/src/exec/aggregator.h M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/cardinality-check-node.cc M be/src/exec/cardinality-check-node.h M be/src/exec/data-source-scan-node.cc M be/src/exec/data-source-scan-node.h M be/src/exec/empty-set-node.cc M be/src/exec/empty-set-node.h M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hbase-scan-node.cc M be/src/exec/hbase-scan-node.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node-mt.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node-base.h M be/src/exec/kudu-scan-node-mt.cc M be/src/exec/kudu-scan-node-mt.h M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scan-node.h M be/src/exec/nested-loop-join-node.cc M be/src/exec/nested-loop-join-node.h M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/partial-sort-node.cc M be/src/exec/partial-sort-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/scan-node.cc M be/src/exec/scan-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/singular-row-src-node.cc M be/src/exec/singular-row-src-node.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/exec/streaming-aggregation-node.cc M be/src/exec/streaming-aggregation-node.h M be/src/exec/subplan-node.cc M be/src/exec/subplan-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exec/unnest-node.cc M be/src/exec/unnest-node.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h 64 files changed, 1,338 insertions(+), 700 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/14764/13 -- To view, visit http://gerrit.cloudera.org:8080/14764 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472 Gerrit-Change-Number: 14764 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14915 Change subject: Fix handling of 'Connection' header in squeasel .. Fix handling of 'Connection' header in squeasel This patch bumps Impala's version of squeasel to 030ccce87359d892e22fb368c5fc5b75d9a2a5f7 in order to pull in a fix for an issue where squeasel was improperly closing connections when the 'Connection: Upgrade' header was included in the request, which Apache Knox will include in certain situations. See https://github.com/cloudera/squeasel/pull/16 Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 --- M be/src/thirdparty/squeasel/squeasel.c 1 file changed, 21 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/14915/1 -- To view, visit http://gerrit.cloudera.org:8080/14915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417 Gerrit-Change-Number: 14915 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9248: fix test krpc rpcz on older kernels
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14913 ) Change subject: IMPALA-9248: fix test_krpc_rpcz on older kernels .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5292/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14913 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563 Gerrit-Change-Number: 14913 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Mon, 16 Dec 2019 22:06:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9248: fix test krpc rpcz on older kernels
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14913 Change subject: IMPALA-9248: fix test_krpc_rpcz on older kernels .. IMPALA-9248: fix test_krpc_rpcz on older kernels Switched the test to check a different counter that is present in the 2.6.32 kernel (see https://kernel.googlesource.com/pub/scm/linux/kernel/git/vapier/lemote/+/linux-2.6.32-stable/include/linux/tcp.h) Testing: Ran the affected test on a CentOS 6 system. Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563 --- M tests/webserver/test_web_pages.py 1 file changed, 5 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/14913/1 -- To view, visit http://gerrit.cloudera.org:8080/14913 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563 Gerrit-Change-Number: 14913 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14677 ) Change subject: IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails .. Patch Set 12: > Depends on the particular RPC. When we blacklist for a failed > Exec() rpc, it will never have been retried. For TransmitData(), it > will only be retried if the error that came back is "server too > busy" How about connection / connectivity errors? If the connection is broken, do we make any attempt to retry connection establishment or do we just fail? > If auto-scaling takes into account blacklisting, I would probably > say that's the wrong approach and we should think about changing it > to only consider the statestore view of cluster membership. Well if a node is blacklisted, no queries get scheduled on that node right? So functionally, the node isn't doing any useful work, so it might make sense for the auto-scaler to think it is "dead" and no longer part of the executor group. Perhaps a better approach would be to only consider a node as "dead" if it has been blacklisted multiple times. I think this is a larger discussion though, we haven't spent enough time thinking through node blacklisting + auto-scaler integration. I think we should, but probably as a separate item. > Definitely agree that we need to put some work into realistic fault > testing, and not just rely on our intuition about blacklisting/retry > policy. Filed IMPALA-9253 for this. > I'm definitely interested in approaches to retries that don't > require marking statuses retriable (obviously I've had a lot of > concerns with that approach). I could see some problems with this > approach too, such as that we may have situations in the future > where we want to retry queries without doing any blacklisting (eg. > if we think there was stale metadata and replanning the query will > let it succeed) Yeah, its possible in the future that we might want to retry queries even if a node hasn't been blacklisted, but I don't see a strong use case for that type of pattern right now. In the interest of not over-engineering the initial solution, I'm fine with just making retries blacklist / cluster membership driven. > or probably worse - that you could have a query > that fails due to a failed rpc and reports that the node should be > blacklisted, and then also hits another error that isn't retry-able > but we retry it anyways. Not sure how common a situation like that > would be, though. Definitely possible, but a bit of an edge case. A query would have to fail for two separate reasons at approximately the same time: an error that triggers a blacklist, and an error that doesn't. Whenever the Coordinator gets an error, it cancels the rest of the query, so the non-blacklist error has to occur before the cancellation is processed. Its something we should fix, but I think we can defer out of the initial implementation. Filed IMPALA-9124 as a follow up. The fix would probably be to ensure that all failed fragments trigger a blacklist or failed because they were cancelled. -- To view, visit http://gerrit.cloudera.org:8080/14677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c Gerrit-Change-Number: 14677 Gerrit-PatchSet: 12 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 16 Dec 2019 21:30:07 + Gerrit-HasComments: No
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14705 ) Change subject: Added support of Kudu DATE type to Impala .. Patch Set 3: One thing I think is missing here - adding support for DATE to our min-max runtime filters. I think its fine to leave that for a followup patch, though please file a JIRA for it, but you'll probably also need to make sure in this patch that we aren't trying to create DATE min-max filters, which would probably result in a crash. If you're not sure how to write a query that will generate a min-max filter in order to test this, feel free to reach out to me. See the DECIMAL patch for how they temporarily avoided generated DECIMAL min-max filters: https://gerrit.cloudera.org/#/c/9368/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 3 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin Gerrit-Comment-Date: Mon, 16 Dec 2019 19:53:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14538 ) Change subject: IMPALA-4400: aggregate runtime filters locally .. Patch Set 15: (15 comments) http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG@18 PS15, Line 18: can done typo http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc File be/src/exec/kudu-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc@110 PS15, Line 110: if (filter_ctxs_.size() > 0) WaitForRuntimeFilters(); Maybe outside the scope of this patch, but it looks like the hdfs scanners don't call this until GetNext(), which is nice since it means that all of the setup can happen before we wait. Might want to do the same thing for Kudu. http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@136 PS15, Line 136: bool HasFragmentIdx(int fragment_idx); const http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@140 PS15, Line 140: bool HasFragmentIdx(const std::unordered_set& fragment_idxs); const http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h@161 PS15, Line 161: RuntimeFilterBank* filter_bank() { return filter_bank_.get(); } const http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc@a679 PS15, Line 679: Looks like maybe this was the only use of 'fragment_map_' anywhere and its unnecessary now? http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h File be/src/runtime/runtime-filter-bank.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@68 PS15, Line 68: AllocateScratchBloomFilter() Not your change, but this is wrong, since the sentence refers to both bloom and min-max filters. Maybe change it to say "AllocateScratch*Filter()" http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@79 PS15, Line 79: PublishGlobalFilte : /// () typo http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@184 PS15, Line 184: locked separately typo http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h@63 PS15, Line 63: /// A collection of items that are part of the global state of a query and shared across Not your change, but this comment seems wrong/confusing to me. RuntimeState isn't shared across a query, its per-fragment instance, right? http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc@246 PS15, Line 246: for (const FInstanceExecParams& instance_params: instance_exec_params) { Looks like you're relying on the fact that all instances for a particular host will be contiguous within instance_exec_params. Might want to document that assumption where instance_exec_params is declared, or even make it a map from host to finstances. http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc File be/src/util/bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc@251 PS15, Line 251: if (other.AlwaysFalse()) return; Doesn't look like you're handling the always_true case, maybe add a DCHECK? http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h@74 PS15, Line 74: virtual void SetAlwaysTrue() = 0; protected? http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc@324 PS15, Line 324: // Sets 'always_true_' to true and clears the values of 'min_', 'max_', 'min_buffer_', Seems pretty clear from the code, so probably unnecessary, or maybe leave it in the header. http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto File common/protobuf/data_stream_service.proto: http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto@136 PS15, Line 136: // optional int32 dst_fragment_idx = 3; Any reason not to just completely remove this and renumber? Its completely in
[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14898 ) Change subject: IMPALA-9244 Bump up boost to work on aarch64 .. Patch Set 3: This built OK on all the supported platforms. Thanks for the contribution! -- To view, visit http://gerrit.cloudera.org:8080/14898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24 Gerrit-Change-Number: 14898 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 19:16:51 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14898 ) Change subject: IMPALA-9244 Bump up boost to work on aarch64 .. IMPALA-9244 Bump up boost to work on aarch64 Boost support ARM64 platform since version 1.58.0 https://github.com/boostorg/context/blob/boost-1.58.0/build/Jamfile.v2#L224 and Kudu-84086fe uses 1.61.0, so let's bump up boost to 1.61.0. Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24 Reviewed-on: http://gerrit.cloudera.org:8080/14898 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M buildall.sh 1 file changed, 8 insertions(+), 5 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24 Gerrit-Change-Number: 14898 Gerrit-PatchSet: 4 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14898 ) Change subject: IMPALA-9244 Bump up boost to work on aarch64 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24 Gerrit-Change-Number: 14898 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 19:16:39 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Clarification on HADOOP-15720 in impala known issues.xml
Alex Rodoni has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14906 ) Change subject: [DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml .. [DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a Reviewed-on: http://gerrit.cloudera.org:8080/14906 Tested-by: Impala Public Jenkins Reviewed-by: Robbie Zhang Reviewed-by: Joe McDonnell --- M docs/topics/impala_known_issues.xml 1 file changed, 7 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Verified Robbie Zhang: Looks good to me, but someone else must approve Joe McDonnell: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/14906 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a Gerrit-Change-Number: 14906 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Robbie Zhang
[Impala-ASF-CR] [DOCS] Clarification on HADOOP-15720 in impala known issues.xml
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14906 ) Change subject: [DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14906 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a Gerrit-Change-Number: 14906 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Robbie Zhang Gerrit-Comment-Date: Mon, 16 Dec 2019 18:53:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8547: get json object fails to get value for numeric key
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14905 ) Change subject: IMPALA-8547: get_json_object fails to get value for numeric key .. Patch Set 1: (1 comment) Thanks for the contribution Euguene! A few comments: It would be great if you could include some more information in the commit message. Take a look at https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala for an example of an Impala commit message. The commit message should include information about how to reproduce the issue, how this patch fixes the issue, and how this patch was tested. For the patch itself, the behavior here seems to be a bit ambiguous between SQL engines. While Hive (and Spark SQL) can execute the query you pasted in the JIRA, the Hive docs (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) actually state that: Extracts json object from a json string based on json path specified, and returns json string of the extracted json object. It will return null if the input json string is invalid. NOTE: The json path can only have the characters [0-9a-z_], i.e., no upper-case or special characters. Also, the keys *cannot start with numbers.* This is due to restrictions on Hive column names. I checked MySQL as well, and it actually cannot run the query from the JIRA description (JSON_EXTRACT is the equivalent in MySQL). It fails with the error: "Query Error: Error: ER_INVALID_JSON_PATH: Invalid JSON path expression. The error is around character position 3." However, if you put quotations around the path identifier, it works: select JSON_EXTRACT('{"1": 5}', '$."1"'); https://stackoverflow.com/questions/49557144/how-to-extract-values-from-a-numeric-keyed-nested-json-field-in-mysql has more info. Oddly enough, MS SQL Server and Oracle follows the MySQL behavior, but Postgres follows the Hive behavior. In general, Impala tries to follow MySQL behavior. Given that MS SQL and Oracle follow the MySQL behavior as well, I think Impala should too. http://gerrit.cloudera.org:8080/#/c/14905/1/be/src/util/string-util.cc File be/src/util/string-util.cc: http://gerrit.cloudera.org:8080/#/c/14905/1/be/src/util/string-util.cc@76 PS1, Line 76: FindEndOfIdentifier need to update the docs of this method in string-util.h. The docs currently say that this method is to find C style identifiers. A C style identifier cannot start with a digit (https://docs.microsoft.com/en-us/cpp/c-language/c-identifiers?view=vs-2019). -- To view, visit http://gerrit.cloudera.org:8080/14905 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7df037ccf2c79da0ba86a46df1dd28ab0e9a45f4 Gerrit-Change-Number: 14905 Gerrit-PatchSet: 1 Gerrit-Owner: Eugene Zimichev Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 16 Dec 2019 18:38:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14677 ) Change subject: IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails .. Patch Set 12: > Right, but that RPC has already been retried multiple times, right? > I should probably validate that for these specific error codes, the > RPCs are actually retried. Depends on the particular RPC. When we blacklist for a failed Exec() rpc, it will never have been retried. For TransmitData(), it will only be retried if the error that came back is "server too busy" > So depending on an external auto-scaling policy, its possible that > blacklisting a node (even for 12 seconds) makes the executor group > unhealthy, and causes the auto-scaler to create a new executor > group (a potentially expensive operation). If auto-scaling takes into account blacklisting, I would probably say that's the wrong approach and we should think about changing it to only consider the statestore view of cluster membership. > In general, I think we can be pretty test driven here. The best way > to figure out what additional errors codes to add would be to run > some network fault injection tests and see what error codes Impala > fails with. I'll open a follow up JIRA for that. Definitely agree that we need to put some work into realistic fault testing, and not just rely on our intuition about blacklisting/retry policy. > I generally wanted to avoid doing this. I think it would be better > for only the Coordinator to make the ultimate decision about > whether a node is blacklisted. KrpcDataStreamSender just provides > enough information for the Coordinator to make the blacklisting > decision. Otherwise the logic for determining if a node should be > blacklisted will be spread across multiple classes, making it hard > to reason about blacklisting decisions. I don't think it would be so bad to add a utility function like IsRetriableRpcError(kudu::Status) or whatever to keep the logic consistent, but I don't feel strongly about how exactly we arrange this, as long as we're getting the correct interaction between blacklisting and retrying. > However, I think you do bring up a good. We don't want a failed RPC > to trigger a query retry unless we are actually blacklisting the > target node. After thinking through it some more, maybe marking > queries as "retryable" isn't necessary at all. Instead, whenever a > query fails + blacklists a node, it is retried. So query retries > are blacklisting driven. This would mean abandoning the patch for > IMPALA-9138 (Classify certain errors as retryable), which I'm fine > with doing. Let me know if you feel differently. I'm definitely interested in approaches to retries that don't require marking statuses retriable (obviously I've had a lot of concerns with that approach). I could see some problems with this approach too, such as that we may have situations in the future where we want to retry queries without doing any blacklisting (eg. if we think there was stale metadata and replanning the query will let it succeed), or probably worse - that you could have a query that fails due to a failed rpc and reports that the node should be blacklisted, and then also hits another error that isn't retry-able but we retry it anyways. Not sure how common a situation like that would be, though. > > but there's still the problem that this patch will only blacklist > > one node per query. > > Yeah, maybe this is too conservative. Assuming transparent query > retries support "x" number of retries, Impala should be able to > tolerate "x" faults across the duration of the query (and its > retries). Maybe that is okay, maybe not. Don't have strong feelings > here. I think its probably fine to leave as is for now, until we have more confidence that blacklisting is really working how we want it to. -- To view, visit http://gerrit.cloudera.org:8080/14677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c Gerrit-Change-Number: 14677 Gerrit-PatchSet: 12 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 16 Dec 2019 18:37:11 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14898 ) Change subject: IMPALA-9244 Bump up boost to work on aarch64 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24 Gerrit-Change-Number: 14898 Gerrit-PatchSet: 3 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 16 Dec 2019 17:04:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14867 ) Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5291/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826 Gerrit-Change-Number: 14867 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 16 Dec 2019 16:37:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14867 ) Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14867/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/14867/5/fe/src/main/java/org/apache/impala/service/Frontend.java@776 PS5, Line 776: if (authzFactory_.getAuthorizationConfig().isEnabled() : && !userHasDbLevelAccess(user, dbName)) { : Privilege requiredPrivilege = Privilege.ANY; : > nit, may be move this back into the if block since this is unnecessary othe I kept them outside, as moving them inside + using them in an if would increase the nesting of the while loop further. -- To view, visit http://gerrit.cloudera.org:8080/14867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826 Gerrit-Change-Number: 14867 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 16 Dec 2019 16:10:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14867 to look at the new patch set (#6). Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server .. IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server Currently we always do the auth check for tables/dbs individually. If the user has privileges higher in the hierarchy then it is not necessary to do these checks as they will all succeed. This change adds a higher level check before the individual checks. The optimization is only enabled for Sentry, as Ranger has deny policies, so server/db level access does not guarantee db/table level access. Testing: - the existing auth related test coverage seems enough - there are no tests for deny policies yet - adding them seems a bigger task so I created a follow up jira: IMPALA-9252 Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826 --- M fe/src/main/java/org/apache/impala/service/Frontend.java 1 file changed, 59 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/14867/6 -- To view, visit http://gerrit.cloudera.org:8080/14867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826 Gerrit-Change-Number: 14867 Gerrit-PatchSet: 6 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] Added support of Kudu DATE type to Impala
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14705 ) Change subject: Added support of Kudu DATE type to Impala .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/14705/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14705/3//COMMIT_MSG@7 PS3, Line 7: Added Can you create an Impala Jira and add it to the commit message? http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc@136 PS3, Line 136: dv->ToString() This will always return empty string if dv is invalid (and ToDaysSinceEpoch() would be true otherwise). I would simply return "Invalid DateValue.". http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc@193 PS3, Line 193: case TYPE_DATE: The changes in this file all seem write related - was there any change on the read side? TimestampValue has some special logic, e.g. https://github.com/apache/impala/blob/f2f348c0f93208a0f34c33b6a4dc82f4d9d4b290/be/src/exec/kudu-scanner.cc#L336 DateValue's in memory representation is the some as in Kudu, so it is possible that less logic is needed than for Timestamps, but it is weird that nothing changed. -- To view, visit http://gerrit.cloudera.org:8080/14705 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32 Gerrit-Change-Number: 14705 Gerrit-PatchSet: 3 Gerrit-Owner: Volodymyr Verovkin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Volodymyr Verovkin Gerrit-Comment-Date: Mon, 16 Dec 2019 14:53:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14846 ) Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' .. Patch Set 7: (3 comments) The implementation looks good to me, but I am concerned about possible performance impact, see my comment in Frontend.java http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG@15 PS7, Line 15: num_check_access_threads I would prefer a name the contains "authorization" or "auth". The current name could mean many things. http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG@18 PS7, Line 18: range : of 1 to 32. The default value of 'num_check_access_threads' is 1. nit: wrap at 72 http://gerrit.cloudera.org:8080/#/c/14846/7/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/14846/7/fe/src/main/java/org/apache/impala/service/Frontend.java@859 PS7, Line 859: submit I am not familiar with ExecutorService, so I may be wrong here. What happens if SHOW DATABASES is called by several users concurrently? Does MAX_CHECK_ACCESS_POOL_SIZE maximize the number of threads that can work in parallel, or the current thread (where doGetTableNames is called) can also do some work? If no more than MAX_CHECK_ACCESS_POOL_SIZE threads can work at the same time and the default value 1 is used, then this change serializes parallel SHOW DATABASES calls which is a regression compared to the current state. I can imagine that in certain cases bumping MAX_CHECK_ACCESS_POOL_SIZE can help a lot, but we should ensure that the default 1 does not cause significant regression. related stackoverflow question: https://stackoverflow.com/questions/6581188/is-there-an-executorservice-that-uses-the-current-thread -- To view, visit http://gerrit.cloudera.org:8080/14846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 Gerrit-Change-Number: 14846 Gerrit-PatchSet: 7 Gerrit-Owner: Zhou Xu Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zhou Xu Gerrit-Comment-Date: Mon, 16 Dec 2019 13:14:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9149: part 1: Re-enabe Ranger-related FE tests
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14798 ) Change subject: IMPALA-9149: part 1: Re-enabe Ranger-related FE tests .. Patch Set 9: (3 comments) Thanks for addressing the comments! The changes for Ranger ownership are good to me. But it still looks wired to me that we accept so many different behaviors (not relative to ownership) between Sentry and Ranger... If the cause is Ranger-2.0 changes the default policies, can we fix the tests by modifying the policies after we create ranger-hive service in bin/create-test-configuration.sh? For instance, I manually edit the "all - database" policy to remove the "Allow Conditions" of CREATE to "public" group. Then normal users can't create databases. http://gerrit.cloudera.org:8080/#/c/14798/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14798/9//COMMIT_MSG@22 PS9, Line 22: creating a database is able to do so. Do you know what change(JIRA) of Ranger causes this? Just know that RANGER-2536 causes the first one. I'm confused why Ranger allows everyone to create databases. It looks unacceptable in a production cluster... http://gerrit.cloudera.org:8080/#/c/14798/9/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/14798/9/fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java@1168 PS9, Line 1168: AnalysisError(stmt, "Database does not exist: nodb"); AnalysisError is a method of FrontendTestBase (the super class of AuthorizationTestBase). I'm not quite clear whether authorization is turned on when using it... Could you point me to the codes? http://gerrit.cloudera.org:8080/#/c/14798/9/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java File fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java: http://gerrit.cloudera.org:8080/#/c/14798/9/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java@177 PS9, Line 177: user_.getName() It can be changed to sentry_user_ too. -- To view, visit http://gerrit.cloudera.org:8080/14798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I228533aae34b9ac03bdbbcd51a380770ff17c7f2 Gerrit-Change-Number: 14798 Gerrit-PatchSet: 9 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 16 Dec 2019 09:50:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14846 ) Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5290/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 Gerrit-Change-Number: 14846 Gerrit-PatchSet: 7 Gerrit-Owner: Zhou Xu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zhou Xu Gerrit-Comment-Date: Mon, 16 Dec 2019 08:31:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14846 ) Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5289/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 Gerrit-Change-Number: 14846 Gerrit-PatchSet: 6 Gerrit-Owner: Zhou Xu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zhou Xu Gerrit-Comment-Date: Mon, 16 Dec 2019 08:17:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
Zhou Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/14846 ) Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc@299 PS6, Line 299: "tables/databases. This configuration is applicable only when authorization is " > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc@300 PS6, Line 300: "enabled. A value of 1 disables multi-threaded execution for checking access. " > line too long (92 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/14846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 Gerrit-Change-Number: 14846 Gerrit-PatchSet: 7 Gerrit-Owner: Zhou Xu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zhou Xu Gerrit-Comment-Date: Mon, 16 Dec 2019 08:02:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'
Zhou Xu has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14846 ) Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' .. IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases' If Sentry authorization is enabled, users with multi group-policies will take time to get the result of 'show tables/databases'. It seems that ResourceAuthorizationProvider.hasAccess performs bad for users with complex group-policies, IMPALA-9242 will target to address this problem. This patch provides a config option 'num_check_access_threads' to accelerate 'show tables/databases' by using multithreading. This configuration is applicable only when authorization is enabled. A value of 1 disables multi-threaded execution for checking access. The value must be in the range of 1 to 32. The default value of 'num_check_access_threads' is 1. Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java 5 files changed, 115 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14846/7 -- To view, visit http://gerrit.cloudera.org:8080/14846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96 Gerrit-Change-Number: 14846 Gerrit-PatchSet: 7 Gerrit-Owner: Zhou Xu Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zhou Xu