[Impala-ASF-CR] IMPALA-4518: CopyStringVal() doesn't copy null string
Michael Ho has uploaded a new change for review. http://gerrit.cloudera.org:8080/5198 Change subject: IMPALA-4518: CopyStringVal() doesn't copy null string .. IMPALA-4518: CopyStringVal() doesn't copy null string Previously, CopyStringVal() mistakenly copies a null StringVal as an emtpy string (i.e. a non-null string with zero length). This change fixes the problem by distinguishing between these two cases in CopyStringVal() and handles them properly. Also added a test case for it. This problem only started showing up recently due to commit 51268c053ffe41dc1aa9f1b250878113d4225258 which calls CopyStringVal() in OffsetFnInit(). All other pre-existing callers of CopyStringVal() before that commit checks if 'src' is null before calling it so the problem never showed up. In that sense, this is a latent bug exposed by the aforementioned commit. Change-Id: I3a5b9349dd08556eba5cfedc8c0063cc59f5be03 --- M be/src/exprs/aggregate-functions-ir.cc M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test 2 files changed, 20 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/5198/1 -- To view, visit http://gerrit.cloudera.org:8080/5198 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3a5b9349dd08556eba5cfedc8c0063cc59f5be03 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-1788: Fold constant expressions. .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Alex Behm has posted comments on this change. Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. Patch Set 4: (9 comments) http://gerrit.cloudera.org:8080/#/c/5047/4//COMMIT_MSG Commit Message: Line 12: - Column level permissions are not supported. ... on Kudu tables (just to be very clear) Line 13: - Permissions on certain operations (e.g. only SELECT) are not Move this point to the top and rephrase to: - Only the ALL privilege level can be granted to Kudu tables. Finer-grained levels such as only SELECT or only INSERT are not supported. http://gerrit.cloudera.org:8080/#/c/5047/4/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: Line 247: // See https://issues.cloudera.org/browse/IMPALA-4000 for details. no need for the whole link, just "See IMPALA-4000 for details." http://gerrit.cloudera.org:8080/#/c/5047/4/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java File fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java: Line 287: // At this time Kudu tables only support the table level ALL privilege. slightly rephrase: We only support the ALL privilege on Kudu tables since many of the finer-grained levels (DELETE/UPDATE) are not available. Line 291: } else if (scope_ == TPrivilegeScope.COLUMN) { logic is a little simpler to me if you make this an independent "if" (and not an "else if") http://gerrit.cloudera.org:8080/#/c/5047/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java: Line 164: AnalysisError(String.format( Write a loop somewhere that goes over all privilege levels to make sure they all fail. Line 204: AnalysisError(String.format("%s SELECT (id, bool_col) ON TABLE " + use ALL here to get the other err msg http://gerrit.cloudera.org:8080/#/c/5047/4/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: Line 82: public class AuthorizationTest { I think we also need to check that a Sentry policy file with a non-ALL on a Kudu table is not valid. I'll need to look into this a little more, but you can do so as well concurrently. Line 2168: // IMPALA-4000: Only users with ALL privileges on SERVER may create external Kudu remove comment (this is a test for admin users) -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/531/ -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.
Alex Behm has uploaded a new patch set (#3). Change subject: IMPALA-1788: Fold constant expressions. .. IMPALA-1788: Fold constant expressions. Adds a new ExprRewriteRule for replacing constant expressions with their literal equivalent via BE evaluation. Applies the new rule together with the existing ones on the parse tree, after analysis. Limitations - Constant folding is applied on the unresolved expressions. As a result, it only works for expressions that are constant within a single query block, as opposed to expressions that may become constant after fully substituting inline-view exprs. - Exprs are not normalized, so some opportunities for constant folding are missed for certain expr-tree shapes. This patch includes the following interesting changes: - Introduces a timestamp literal that can only be produced by constant folding (not expressible directly via SQL). - To make sure that rewrites have no user-visible effect, the original result types and column labels of the top-level statement are restored after the rewrites are performed. - Does not fold exprs if their evaluation resulted in a warning or error, or if the resulting value is not representable by corresponding FE LiteralExpr. - Fixes an existing issue with converting strings between the FE/BE. String produced in the BE that have characters with a value > 127 are not correctly deserialized into a Java String via thrift. We detect this case during constant folding and abandon folding of such exprs. - Fixes several issues with detecting/reporting errors in NativeEvalConstExprs(). - Cleans up ExprContext::GetValue() into ExprContext::GetConstantValue() which clarifies its only use of evaluating exprs from the FE. Testing: - Modifies expr-test.cc to run all tests through the constant folding path. - Adds basic planner and rewrite rule tests. - Exhaustive test run passed Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9 --- M be/src/exprs/expr-context.cc M be/src/exprs/expr-context.h M be/src/exprs/expr-test.cc M be/src/exprs/expr.cc M be/src/exprs/literal.cc M be/src/exprs/literal.h M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc M common/thrift/Exprs.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java A fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java M fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java A fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test A testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/empty.test M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M testdata/workloads/functional-planner/queries/PlannerTest/k
[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.
Alex Behm has posted comments on this change. Change subject: IMPALA-1788: Fold constant expressions. .. Patch Set 2: (11 comments) Thanks for the speedy review, Marcel! http://gerrit.cloudera.org:8080/#/c/5109/2/be/src/exprs/literal.cc File be/src/exprs/literal.cc: Line 249 > that looks weird. yes, we were doing something very strange before Line 472: case TYPE_TIMESTAMP: > do we have end-to-end tests with timestamp literals? There are quite a few interesting tests in exprs.test. I modified test_exprs.py to run with and without constant folding to make sure we don't lose the old coverage. We also have a few end-to-end tests littered in various .test files (checked it with git grep). I extended exprs.test to give more coverage. http://gerrit.cloudera.org:8080/#/c/5109/2/be/src/service/fe-support.cc File be/src/service/fe-support.cc: Line 105: query_ctx.request.query_options.max_errors = 10; > 10? Arbitrary non-zero number. Changed to 1 and updated comment. http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: Line 423: private void getResultTypesAndLabels(StatementBase stmt, List resultTypes, > it looks like a better approach is to add a StatementBase.getResultExprs an Done Line 447: private void setResultTypesAndLabels(StatementBase stmt, List resultTypes, > move this to StatementBase? (maybe as castResultExprs() and setColLabels()? Done http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java: Line 177: System.out.println(e.getMessage()); > remove Done http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java File fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java: Line 48: for (Expr child: expr.getChildren()) { > single line Done http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/test/java/org/apache/impala/testutil/TestUtils.java File fe/src/test/java/org/apache/impala/testutil/TestUtils.java: Line 260: // Disable rewrites because some analyzer tests have non-executable constant exprs > does this mean they're disabled in the planner tests? No. The planner tests always explicitly set the query options, so they are not affected by changing the default here. I also added a comment to this function that expr rewrites are disabled by default. http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: Line 551: WRITE TO HDFS [functional.alltypes, OVERWRITE=false, PARTITION-KEYS=(2009,5)] > oh nice :) http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test: Line 9: tuple-ids=0 row-size=68B cardinality=unavailable > what happened here, did something get screwed up when loading kudu data? My Kudu setup was messed up again (needed to compute stats). Fixed it. http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: Line 1270:predicates: g.bigint_col = 1, g.bigint_col < 1000 > do you know why these get reordered? it seems like =1 should have been the The original expr was g.bigint_col = TRUE which had a CastExpr on the RHS, so it appeared more expensive. After folding we just have a literal. -- To view, visit http://gerrit.cloudera.org:8080/5109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. IMPALA-4510: Selectively filter args for metric verification tests run-tests.py is a wrapper around impala-py.test. It abstracts away the need to invoke separate runs for serial tests, parallel tests, and metric verification tests. Because it's possible for a user to specify certain test suites, or even specific tests, on the command line when calling run-tests.py, it had been necessary to override the command line args when it came time to run the metric verification tests -- otherwise those other tests/suites would be rerun. Before this patch, we had simply been stripping away all command line args. However, that blanket approach causes problems when running tests against a remote cluster, because we need to retain those command line args that pertain to the remote cluster. This patch selectively prunes unwanted command line args for the last metric verification test stage, keeping the ones that we need, and also adds extensive documentation for explaining why we have to go through this fairly odd and elaborate step. This patch was tested by running a sample test suite locally, and against a remote cluster. Previously, the metric verification stage had been failing for remote cluster tests (since they were defaulting to localhost for services that were only available remotely.) With the patch, the remote verfification tests were passing. Also, while I'm here, add a small change that exits immediately if the user calls for --help. Before this, we actually still ran the tests. Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Reviewed-on: http://gerrit.cloudera.org:8080/5135 Reviewed-by: Michael Brown Reviewed-by: Ishaan Joshi Tested-by: Internal Jenkins --- M tests/run-tests.py 1 file changed, 77 insertions(+), 14 deletions(-) Approvals: Ishaan Joshi: Looks good to me, approved Michael Brown: Looks good to me, but someone else must approve Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 6: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/532/ -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 5: Code-Review+2 Test fix. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Hello Internal Jenkins, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5151 to look at the new patch set (#5). Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. IMPALA-4283: Ensure Kudu-specific lineage and audit behavior With this commit we add support for auditing all Kudu-specific operations and we enable column lineage for INSERT and UPSERT statements on Kudu tables. No lineage output is generated for DELETE and UPDATE statements. Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d --- M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java M testdata/workloads/functional-planner/queries/PlannerTest/lineage.test 7 files changed, 592 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/5151/5 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/4893/4/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: Line 303: /// Release resources and prepare this object for destruction. let's call this ReleaseResources(), which is also going to be added to a number of other control structures, and remove comment about destruction. http://gerrit.cloudera.org:8080/#/c/4893/4/be/src/runtime/test-env.h File be/src/runtime/test-env.h: Line 72: /// Per-query states with associated block managers. key? -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has uploaded a new patch set (#4). Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. IMPALA-4000: Restricted Sentry authorization for Kudu Tables At this time, there is no comprehensive way of enforcing a Sentry authorization policy against tables stored in Kudu. The following behavior was implemented in this patch: - Column level permissions are not supported. - Permissions on certain operations (e.g. only SELECT) are not supported. - Only users with ALL privileges on SERVER may create external Kudu tables. Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java 5 files changed, 52 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/4 -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 13: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4956/13/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: Line 475: /// expressions), avoid inlining avoid inlining for the exprs exceeding this threshold. remove duplication -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/533/ -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: Sorry -- confusing typo. Correction below: "...we can't always assume that HDFS data IS already present." -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: Dimitris, I spoke with Harrison about your question, and will paraphrase his reply in case he doesn't have a chance to weigh in himself on this review. Essentially, he pointed out that we can't always assume that HDFS data isn't already present -- e.g., we don't necessarily always load test data in advance from the snapshot file. Wouldn't we possibly be introducing new bugs/regressions by replacing all of the ALTER TABLE statements with RECOVER PARTITIONS everywhere? -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables Since Kudu does not support security, we do not want to mislead anyone into thinking this is not the case by being able to set up column level privileges. Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test 6 files changed, 49 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/2 -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has uploaded a new patch set (#3). Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. IMPALA-4000: Restricted Sentry authorization for Kudu Tables At this time, there is no comprehensive way of enforcing a Sentry authorization policy against tables stored in Kudu. The following behavior was implemented in this patch: - Column level permissions are not supported. - Permissions on certain operations (e.g. only SELECT) are not supported. - Only users with ALL privileges on SERVER may create external Kudu tables. Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java 5 files changed, 54 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/3 -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: Line 244: // Today there is no comprehensive way of enforcing a Sentry authorization policy > Needs brief comment explaining why this privilege level is required, point Done http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java File fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java: Line 249: for (String columnName: columnNames_) { > brief comment and pointer to JIRA Done Line 281: throw new AnalysisException(String.format("Error setting privileges for " + > brief comment and pointer to JIRA Done Line 283: "to issue a GRANT/REVOKE statement.", tableName_.toString())); > Kudu tables only support the ALL privilege level. Done http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: Line 907: // tables. > also add a positive test case Done Line 912: // IMPALA-4000: ALL privileges on SERVER are not required to create managed tables. > isn't this already covered somewhere? I couldn't find a test like this for Kudu. (it would make sense that it would be in this file, right?) http://gerrit.cloudera.org:8080/#/c/5047/2/testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test File testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test: Line 296: QUERY > looks more like an analyzer test Done -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has uploaded a new patch set (#3). Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. IMPALA-4000: Restricted Sentry authorization for Kudu Tables At this time, there is no comprehensive way of enforcing a Sentry authorization policy against tables stored in Kudu. The following behavior was implemented in this patch: - Column level permissions are not supported. - Permissions on certain operations (e.g. only SELECT) are not supported. - Only users with ALL privileges on SERVER may create external Kudu tables. Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java 5 files changed, 54 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/3 -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow
Donghui Xu has posted comments on this change. Change subject: IMPALA-4431: Add audit event log control mechanism to prevent disk overflow .. Patch Set 11: I have tested that set max_audit_event_log_files to 1. That worked as expected. I have researched CheckAndRotateLogFiles and associated glog codes. Each severity of the log creates a soft link file always point to the latest log file with the severity. The log switch process includes creating a new log file, delete the old link file, create new link file, so involved in the old and new log files. I think that restrict the minimum log file number greater than 1(>=2) is to ensure the link files to switch correctly while deleting old log files. -- To view, visit http://gerrit.cloudera.org:8080/4971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates .. Patch Set 7: Code-Review+2 meant to give this earlier... -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Hello Internal Jenkins, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5151 to look at the new patch set (#4). Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. IMPALA-4283: Ensure Kudu-specific lineage and audit behavior With this commit we add support for auditing all Kudu-specific operations and we enable column lineage for INSERT and UPSERT statements on Kudu tables. No lineage output is generated for DELETE and UPDATE statements. Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d --- M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java M testdata/workloads/functional-planner/queries/PlannerTest/lineage.test 7 files changed, 580 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/5151/4 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 4: Code-Review+2 Rebase and fix test. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 3: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/530/ -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4956 to look at the new patch set (#13). Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. IMPALA-4397,IMPALA-3259: reduce codegen time and memory A handful of fixes to codegen memory usage: * Delete the IR module when we're done with it (it can be fairly large) * Track the compiled code size (typically not that large, but it can add up if there are many fragments). * Estimate optimisation memory requirements and track it in the memory tracker. This is very crude but much better than not tracking it. A handful of fixes to improve codegen time/cost, particularly targeted at compute stats workloads: * Avoid over-inlining when there are many aggregate functions, conjuncts, etc by adding "NoInline" attributes. * Don't codegen non-grouping merge aggregations. They will only process one row per Impala daemon, so codegen is not worth it. * Make the Hll algorithm more efficient by specialising the hash function based on decimal width. Limitations: * This doesn't tackle over-inlining of large expr trees, but a similar approach will be used there in a follow-on patch. Perf: Compute stats on functional_parquet.widetable_1000_cols goes from 1min+ of codegen to ~ 5s codegen on my machine. Local perf runs of tpc-h and targeted perf showed no regressions and some moderate improvements (1-2%). Also did an experiment to understand the perf consequences of disabling inlining. I manually set CODEGEN_INLINE_EXPRS_THRESHOLD to 0, and ran: drop stats tpch_20_parquet.lineitem compute stats tpch_20_parquet.lineitem; There was no difference in time spent in the agg node: 30.7s with inlining, 30.5s without. Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/codegen/mcjit-mem-mgr.h M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.h M be/src/runtime/lib-cache.cc M be/src/runtime/runtime-state.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test M tests/query_test/test_aggregation.py 29 files changed, 1,479 insertions(+), 145 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4956/13 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 6: Code-Review+2 Rebase. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5136 to look at the new patch set (#6). Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. IMPALA-2890: Support ALTER TABLE statements for Kudu tables With this commit, we add support for additional ALTER TABLE statements against Kudu tables. The new supported ALTER TABLE operations for Kudu are: - ADD/DROP range partitions. Syntax: ALTER TABLE ADD [IF NOT EXISTS] RANGE ALTER TABLE DROP [IF EXISTS] RANGE - ADD/DROP/RENAME column. Syntax: ALTER TABLE ADD COLUMNS (col_spec, [col_spec, ...]) ALTER TABLE DROP COLUMN ALTER TABLE CHANGE COLUMN - Rename Kudu table using the 'kudu.table_name' table property. Example: ALTER TABLE SET TBLPROPERTY ('kudu.tbl_name'=''), will change the underlying Kudu table name to . - Renaming the HMS/Catalog table entry of a Kudu table is supported using the existing ALTER TABLE RENAME TO syntax. Not supported: - ALTER TABLE REPLACE COLUMNS Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M tests/query_test/test_kudu.py 21 files changed, 977 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/6 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: Line 939: if (is_corrupt_) { > did the formatter do this? Oops, changed it back. Line 988: bytes_allocated), > odd indentation This was clang-format. I split it into two statements to avoid the weirdness. Line 1044: estimated_memory), > odd indentation This was clang-format. I split it into two statements to avoid the weirdness. http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: Line 127: // > augment with brief comment describing memory tracking. Done Line 467: /// expressions), avoid inlining the individual expression evaluation into the > could we make it "avoid inlining for the exprs exceeding this threshold"? Done Line 595: void DestroyIRModule(); > DestroyIrModule Done http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/exec/hash-join-node.cc File be/src/exec/hash-join-node.cc: Line 140: false, std::logical_or()); > odd indentation Reverted it. It may be difficult to prevent clang-format for doing this again though. http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 302: if (IsNodeCodegenDisabled()) return; > i find this counterintuitive - why call codegen() if it's disabled (ie, i'd ExecNode::Codegen() is the function that walks the children nodes to Codegen() them, so we need to call it. We could refactor it so that we have a separate function called CodegenChildren() and ExecNode::Codegen() just calls that, but it's unclear if it's worth adding more indirection. http://gerrit.cloudera.org:8080/#/c/4956/11/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: Line 824: mergeFragment.getPlanRoot().setDisableCodegen(true); > what is this for? does exchange ever do codegen? Non-merging exchanges don't currently (we could codegen some of the pointer swizzling). I just added this because in principle we know it has the same number of rows flowing through it as the aggregation. Otherwise there's an assumption baked in here that the backend for ExchangeNode doesn't support codegen. -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4956 to look at the new patch set (#12). Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. IMPALA-4397,IMPALA-3259: reduce codegen time and memory A handful of fixes to codegen memory usage: * Delete the IR module when we're done with it (it can be fairly large) * Track the compiled code size (typically not that large, but it can add up if there are many fragments). * Estimate optimisation memory requirements and track it in the memory tracker. This is very crude but much better than not tracking it. A handful of fixes to improve codegen time/cost, particularly targeted at compute stats workloads: * Avoid over-inlining when there are many aggregate functions, conjuncts, etc by adding "NoInline" attributes. * Don't codegen non-grouping merge aggregations. They will only process one row per Impala daemon, so codegen is not worth it. * Make the Hll algorithm more efficient by specialising the hash function based on decimal width. Limitations: * This doesn't tackle over-inlining of large expr trees, but a similar approach will be used there in a follow-on patch. Perf: Compute stats on functional_parquet.widetable_1000_cols goes from 1min+ of codegen to ~ 5s codegen on my machine. Local perf runs of tpc-h and targeted perf showed no regressions and some moderate improvements (1-2%). Also did an experiment to understand the perf consequences of disabling inlining. I manually set CODEGEN_INLINE_EXPRS_THRESHOLD to 0, and ran: drop stats tpch_20_parquet.lineitem compute stats tpch_20_parquet.lineitem; There was no difference in time spent in the agg node: 30.7s with inlining, 30.5s without. Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/codegen/mcjit-mem-mgr.h M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.h M be/src/runtime/lib-cache.cc M be/src/runtime/runtime-state.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test M tests/query_test/test_aggregation.py 29 files changed, 1,479 insertions(+), 145 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4956/12 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 12: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Henry Robinson has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 4: Code-Review+2 Rebase. -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Hello Sailesh Mukil, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4893 to look at the new patch set (#4). Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. IMPALA-4410: Safer tear-down of RuntimeState * Add RuntimeState::Close() which is guaranteed to release resources safely, rather than having PFE::Close() do the same piecemeal. * Fix for crash where PFE::Prepare() fails before descriptor table is created. * Remove some dead code from TestEnv, and rename some methods for clarity. Testing: Found by debug-actions, which has a reproducible test where PFE::Prepare() fails. Manually tested on master. Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 --- M be/src/exec/hash-table-test.cc M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/service/fe-support.cc A be/src/util/scope-exit-trigger.h 10 files changed, 95 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/4893/4 -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. IMPALA-3342: Add thread counters to monitor plan fragment execution This change removes the use of total_cpu_timer which incorrectly monitors the CPU time. Adding THREAD_COUNTERS to measure the user and sys time in plan fragment execution. This also accounts for the time spent in the hdfs/kudu scanner and in a blocking join. Snippet of a query plan with the newly added PlanFragment THREAD_COUNTERS: Instance 2b40b101e2626e7a:a3d8f23 - PeakMemoryUsage: 32.02 KB (32784) - PerHostPeakMemUsage: 430.52 MB (451431312) - RowsProduced: 1 (1) - TotalNetworkReceiveTime: 10s379ms - TotalNetworkSendTime: 0.000ns - TotalStorageWaitTime: 0.000ns - TotalWallClockTime: 10s577ms - SysTime: 8.000ms - UserTime: 8.000ms - VoluntaryContextSwitches: 80 (80) Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Reviewed-on: http://gerrit.cloudera.org:8080/4633 Reviewed-by: Bharath Vissapragada Tested-by: Internal Jenkins --- M be/src/exec/blocking-join-node.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/kudu-scan-node.cc M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h 7 files changed, 17 insertions(+), 25 deletions(-) Approvals: Bharath Vissapragada: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 11: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 11: (9 comments) http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: Line 939: if (is_corrupt_) { did the formatter do this? Line 988: bytes_allocated), odd indentation Line 1044: estimated_memory), odd indentation http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: Line 127: // augment with brief comment describing memory tracking. Line 467: /// expressions), avoid inlining the individual expression evaluation into the could we make it "avoid inlining for the exprs exceeding this threshold"? Line 595: void DestroyIRModule(); DestroyIrModule although above we simply refer to it as 'module', so maybe simply DestroyModule()? http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/exec/hash-join-node.cc File be/src/exec/hash-join-node.cc: Line 140: false, std::logical_or()); odd indentation http://gerrit.cloudera.org:8080/#/c/4956/11/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 302: if (IsNodeCodegenDisabled()) return; i find this counterintuitive - why call codegen() if it's disabled (ie, i'd expect those lines to be reversed). http://gerrit.cloudera.org:8080/#/c/4956/11/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: Line 824: mergeFragment.getPlanRoot().setDisableCodegen(true); what is this for? does exchange ever do codegen? -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 3: Code-Review+2 Rebase. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 2: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/527/ -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates
Michael Ho has posted comments on this change. Change subject: IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/4833/6/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 186 > what about this? We have to keep all the filters in filter_ctxs_ because when we codegen, no filters have arrived yet so we are forced to codegen for all filters. Therefore, we need to fill in the entire vector here. At runtime, filters which always return true or have low rejection rate would be disabled. Line 61: constexpr int BATCHES_PER_FILTER_SELECTIVITY_CHECK = 16; > just curious, but does this really need constexpr? Using constexpr to make sure it's a compile time constant so it gets constant folded. It's used in GetNext(). Line 63: "ROWS_PER_FILTER_SELECTIVITY_CHECK must be a power of two"); > update Done Line 677: RETURN_IF_ERROR(CodegenEvalRuntimeFilters(codegen, > single line? or move all args to new line Done http://gerrit.cloudera.org:8080/#/c/4833/6/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: Line 55: /// machines. > a brief sentence on why noexcept here is important. Done -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4833 to look at the new patch set (#7). Change subject: IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates .. IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates This change codegens HdfsParquetScanner::EvalRuntimeFilters() by unrolling its loop, codegen'ing the expression evaluation of the runtime filter and replacing some type information with constants in the hashing function of runtime filter to avoid branching at runtime. This change also fixes IMPALA-4495 by not counting a row as 'considered' in the filter stats before the filter arrives. This avoids unnecessarily marking a runtime filter as ineffective before it's even used. With this change, TPCDS-Q88 improves by 13-14%. primitive_broadcast_join_1 improves by 24%. Change-Id: I27114869840e268d17e91d6e587ef811628e3837 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hdfs-parquet-scanner-ir.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/raw-value-ir.cc M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M be/src/runtime/runtime-filter-bank.h A be/src/runtime/runtime-filter-ir.cc M be/src/runtime/runtime-filter.cc M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-filter.inline.h M be/src/runtime/types.h M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h M tests/query_test/test_tpch_queries.py 27 files changed, 518 insertions(+), 174 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/4833/7 -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries
David Knupp has posted comments on this change. Change subject: IMPALA-4450: qgen: use string concatenation operator for postgres queries .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries
Michael Brown has posted comments on this change. Change subject: IMPALA-4450: qgen: use string concatenation operator for postgres queries .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5034/2//COMMIT_MSG Commit Message: PS2, Line 22: It's difficult to fully test the effects here. I made sure that we : generate syntactically valid queries still on the PostgreSQL side. > The code looks fine to me. I'm curious about this line in the commit messag Done -- To view, visit http://gerrit.cloudera.org:8080/5034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries
Michael Brown has uploaded a new patch set (#3). Change subject: IMPALA-4450: qgen: use string concatenation operator for postgres queries .. IMPALA-4450: qgen: use string concatenation operator for postgres queries The random query generator writes a logical query Python object into Impala or PostgreSQL dialects. When the CONCAT() function is chosen, Impala's and PostgreSQL's CONCAT() implementations behave differently. However, PostgreSQL has a || operator that functions like Impala's CONCAT(). The method added here overrides the default behavior for the PostgresqlSqlWriter. It prevents CONCAT(arg1, arg2, ..., argN) from being written and instead causes the SQL to be written as 'arg1 || arg2 || ... || argN'. Testing: I made sure that we generate syntactically valid queries still on the PostgreSQL side. This includes queries that made use of string concatenation. I also re-ran some failed queries that previously produced different results. They now produce the same results. This is a very straightforward change, so unit or functional tests for this seem overkill. The full effects of using || instead of CONCAT() are hard to test. It's not clear if in my manual testing of || vs. CONCAT() that I missed some edge behavior, especially in some complicated query, nested expressions, GROUPing BY, and so on. Change-Id: I149b695889addfd7df4ca5f40dc991456da51687 --- M tests/comparison/model_translator.py 1 file changed, 6 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/5034/3 -- To view, visit http://gerrit.cloudera.org:8080/5034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp
[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests
Ishaan Joshi has posted comments on this change. Change subject: IMPALA-4510: Selectively filter args for metric verification tests .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5135 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Ishaan Joshi Gerrit-Reviewer: Michael Brown Gerrit-HasComments: No
[Impala-ASF-CR] Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Alex Behm has posted comments on this change. Change subject: Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. Patch Set 2: (7 comments) Looks good overall. Still needs some additional tests. Let me know if you want me to point out the gaps in more detail. http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java: Line 244: analyzer.registerPrivReq(new PrivilegeRequestBuilder().onServer( Needs brief comment explaining why this privilege level is required, point to JIRA http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java File fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java: Line 249: if (table instanceof KuduTable) { brief comment and pointer to JIRA Line 281: if (table instanceof KuduTable && privilegeLevel_ != TPrivilegeLevel.ALL) { brief comment and pointer to JIRA Line 283: "Kudu tables support only all or nothing permissions."); Kudu tables only support the ALL privilege level. http://gerrit.cloudera.org:8080/#/c/5047/2/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: Line 907: AuthzError("create external table tpch.test_tbl3 stored as kudu TBLPROPERTIES " + also add a positive test case Line 912: AuthzOk("create table tpch.test_tbl3 (i int, j int, primary key (i))" + isn't this already covered somewhere? http://gerrit.cloudera.org:8080/#/c/5047/2/testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test File testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test: Line 296: # Not done yet looks more like an analyzer test -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4494: Fix crash in SimpleScheduler .. Patch Set 7: Code-Review+2 (4 comments) http://gerrit.cloudera.org:8080/#/c/5127/6/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 201: if (topic != incoming_topic_deltas.end()) { while you're at it, change this to == and return http://gerrit.cloudera.org:8080/#/c/5127/7/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 266: // and tell the statestore. We also ensure that it is always registered with itself. 'registered with itself' sounds a bit odd. '.. that it is part of our backend config'? Line 269: local_backend_descriptor_.address.hostname, nullptr)); odd indentation Line 606: BackendConfig coord_only_config; make this a class member and initialize in c'tor. -- To view, visit http://gerrit.cloudera.org:8080/5127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries
David Knupp has posted comments on this change. Change subject: IMPALA-4450: qgen: use string concatenation operator for postgres queries .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5034/2//COMMIT_MSG Commit Message: PS2, Line 22: It's difficult to fully test the effects here. I made sure that we : generate syntactically valid queries still on the PostgreSQL side. The code looks fine to me. I'm curious about this line in the commit message though. Why are the effects hard to test? If it hadn't been difficult, what other tests would you have wanted to run -- other than checking for syntactic correctness? Finally, I know you added some unit tests for the qgen -- would it be appropriate/possible to add a unit test for this function, e.g., by passing in a mocked func object? If so, is it worth the effort? -- To view, visit http://gerrit.cloudera.org:8080/5034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: David Knupp Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Alex Behm has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. TODO: - Corresponding changes to DESCRIBE EXTENDED/FORMATTED. Testing: A private core/hdfs run passed. Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Reviewed-on: http://gerrit.cloudera.org:8080/5125 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test M tests/query_test/test_kudu.py 9 files changed, 213 insertions(+), 92 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3838, IMPALA-4495: Codegen EvalRuntimeFilters() and fixes filter stats updates .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/4833/6/be/src/exec/filter-context.h File be/src/exec/filter-context.h: Line 86: ExprContext* expr_ctx; thank you. http://gerrit.cloudera.org:8080/#/c/4833/6/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 186 what about this? Line 61: constexpr int BATCHES_PER_FILTER_SELECTIVITY_CHECK = 16; just curious, but does this really need constexpr? Line 63: "ROWS_PER_FILTER_SELECTIVITY_CHECK must be a power of two"); update Line 677: RETURN_IF_ERROR(CodegenEvalRuntimeFilters(codegen, single line? or move all args to new line http://gerrit.cloudera.org:8080/#/c/4833/6/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: Line 55: /// machines. a brief sentence on why noexcept here is important. -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Hello Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4893 to look at the new patch set (#3). Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. IMPALA-4410: Safer tear-down of RuntimeState * Add RuntimeState::Close() which is guaranteed to release resources safely, rather than having PFE::Close() do the same piecemeal. * Fix for crash where PFE::Prepare() fails before descriptor table is created. * Remove some dead code from TestEnv, and rename some methods for clarity. Testing: Found by debug-actions, which has a reproducible test where PFE::Prepare() fails. Manually tested on master. Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 --- M be/src/exec/hash-table-test.cc M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/service/fe-support.cc A be/src/util/scope-exit-trigger.h 10 files changed, 95 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/4893/3 -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Henry Robinson has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/4893/2/be/src/runtime/test-env.cc File be/src/runtime/test-env.cc: Line 109: for (shared_ptr& runtime_state : runtime_states_) { > The second alternative makes most sense to me since it preserves the origin Done http://gerrit.cloudera.org:8080/#/c/4893/2/be/src/runtime/test-env.h File be/src/runtime/test-env.h: PS2, Line 42: CreateRuntimeState > I can change it back; it seemed almost obfuscatory here because all it did Done -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: Thanks, I'll check with Harrison. In the meantime, so that we don't lose track of it, I added the following issue: https://issues.cloudera.org/browse/IMPALA-4520. -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has uploaded a new patch set (#5). Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. IMPALA-2890: Support ALTER TABLE statements for Kudu tables With this commit, we add support for additional ALTER TABLE statements against Kudu tables. The new supported ALTER TABLE operations for Kudu are: - ADD/DROP range partitions. Syntax: ALTER TABLE ADD [IF NOT EXISTS] RANGE ALTER TABLE DROP [IF EXISTS] RANGE - ADD/DROP/RENAME column. Syntax: ALTER TABLE ADD COLUMNS (col_spec, [col_spec, ...]) ALTER TABLE DROP COLUMN ALTER TABLE CHANGE COLUMN - Rename Kudu table using the 'kudu.table_name' table property. Example: ALTER TABLE SET TBLPROPERTY ('kudu.tbl_name'=''), will change the underlying Kudu table name to . - Renaming the HMS/Catalog table entry of a Kudu table is supported using the existing ALTER TABLE RENAME TO syntax. Not supported: - ALTER TABLE REPLACE COLUMNS Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test M tests/query_test/test_kudu.py 20 files changed, 965 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/5 -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/5136/4/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: Line 200: // If true, an error is raised if an error occurs while adding/dropping a range > If false, errors while adding/dropping a range partition are ignored. Done Line 202: 2: required bool throw_on_error > ignore_errors? Done http://gerrit.cloudera.org:8080/#/c/5136/4/fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java: Line 33: private final boolean throwOnError_; > ignoreErrors_? Done Line 91: AlterTableStmt.analyzeAddDropRangePartition(rangePartitionSpec_, (KuduTable) table, > move analyzeAddDropRangePartition() inside this class, might even just inli Done http://gerrit.cloudera.org:8080/#/c/5136/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 511: private boolean altersKuduTable(TAlterTableType type) { > brief comment, something like: Done Line 556: if (setResultSet) { > dead code block (no more alteration types that produce result sets) oops, sorry. Done http://gerrit.cloudera.org:8080/#/c/5136/4/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: Line 136: select count(*) from tbl_to_alter > let's also add a count(col) and two where-clause predicates, one that can b Done Line 184: # Insert a row that has nulls on non-nullable columns > non-nullable columns with default values Done -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries We hold the connection_to_sessions_map_lock_ while closing multiple sessions, which could map to a large number of queries, which means an even larger number of fragments. We hold this process wide lock and a series of other locks while sending cancel RPCs to all the fragments that fall under the above mentioned category. This could slow down the responsiveness to the client by the daemon. Moreover, holding the lock is unnecessary and we can do without it. This patch fixes this path by dropping the lock before we call CloseSessionInternal(). Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Reviewed-on: http://gerrit.cloudera.org:8080/5173 Reviewed-by: Sailesh Mukil Tested-by: Internal Jenkins --- M be/src/service/impala-server.cc M be/src/service/impala-server.h 2 files changed, 25 insertions(+), 17 deletions(-) Approvals: Internal Jenkins: Verified Sailesh Mukil: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-1788: Fold constant expressions. .. Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/5109/2/be/src/exprs/literal.cc File be/src/exprs/literal.cc: Line 249 that looks weird. Line 472: case TYPE_TIMESTAMP: do we have end-to-end tests with timestamp literals? http://gerrit.cloudera.org:8080/#/c/5109/2/be/src/service/fe-support.cc File be/src/service/fe-support.cc: Line 105: query_ctx.request.query_options.max_errors = 10; 10? http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: Line 423: private void getResultTypesAndLabels(StatementBase stmt, List resultTypes, it looks like a better approach is to add a StatementBase.getResultExprs and .getColLabels and override appropriately in the subclasses. Line 447: private void setResultTypesAndLabels(StatementBase stmt, List resultTypes, move this to StatementBase? (maybe as castResultExprs() and setColLabels()?) http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java: Line 177: System.out.println(e.getMessage()); remove http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java File fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java: Line 48: for (Expr child: expr.getChildren()) { single line http://gerrit.cloudera.org:8080/#/c/5109/2/fe/src/test/java/org/apache/impala/testutil/TestUtils.java File fe/src/test/java/org/apache/impala/testutil/TestUtils.java: Line 260: // Disable rewrites because some analyzer tests have non-executable constant exprs does this mean they're disabled in the planner tests? http://gerrit.cloudera.org:8080/#/c/5109/1/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: Line 121: | output: sum(2 + id), count:merge(*) > Agree. It's not super easy to fix because the agg info is computed during a get rid of which part? i agree we don't need the 1+1 here. http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/insert.test File testdata/workloads/functional-planner/queries/PlannerTest/insert.test: Line 551: WRITE TO HDFS [functional.alltypes, OVERWRITE=false, PARTITION-KEYS=(2009,5)] oh nice http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test: Line 9: tuple-ids=0 row-size=68B cardinality=unavailable what happened here, did something get screwed up when loading kudu data? http://gerrit.cloudera.org:8080/#/c/5109/2/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: Line 1270:predicates: g.bigint_col = 1, g.bigint_col < 1000 do you know why these get reordered? it seems like =1 should have been the first one to start with. -- To view, visit http://gerrit.cloudera.org:8080/5109 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4357: Fix DROP TABLE for externally deleted Kudu tables
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-4357: Fix DROP TABLE for externally deleted Kudu tables .. Patch Set 1: It was pointed out by Alex that this is a more general problem than just for Kudu, and we would really like to avoid the string matching in the current version, so I explored some other options: 1. We could propagate a TableLoadingException up to DropTableOrViewStmt. The main issue here is that Analyzer.getTable doesn't currently throw TableLoadingExceptions, so to do this we would need to catch that everywhere else getTable is called, which is about 10 places. We could also define a new version of getTable (or repurpose the existing three argument version that is only used in one place) that throws TableLoadingException, minimizing the related changes that are needed. 2. In theory, we don't need to load the table at all to do a DROP TABLE, so we could basically just skip analysis except the auth check. The problem here is that analyzing the table allows us to throw AnalysisExceptions for certain things, eg. a DROP TABLE on a view or vice vera, or a table that doesn't exist. In these cases you would end up with a CatalogException instead, which is not as nice. 3. Automatically remove tables that Kudu can't find from hms. This is somewhat strange behavior, since you could see a table disappear from Impala without having explicitly DROPed it, but this is how we handle inconsistencies in columns between Kudu and hms, so there's precendent. I started playing around with this, and its easy to delete the table from hms when Kudu fails to load it, but I'm not sure how to remove it from Impala's catalog without running INVALIDATE METADATA. This solution also isn't general and only works for Kudu. I don't have a strong opinion, but I'm not a big fan of 3. -- To view, visit http://gerrit.cloudera.org:8080/5144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6b41fc3c0e95508ab67f1d420b033b02ec75a5da Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables
Taras Bobrovytsky has uploaded a new patch set (#2). Change subject: Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables .. Preview: IMPALA-4000: Restricted Sentry authorization for Kudu Tables Since Kudu does not support security, we do not want to mislead anyone into thinking this is not the case by being able to set up column level privileges. Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c --- M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test 6 files changed, 49 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5047/2 -- To view, visit http://gerrit.cloudera.org:8080/5047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: I think we should try to be consistent in the way we handle table loading. Our infrastructure is already too messy to reason about and changing the behavior in one place makes the problem even worse. I am fine with making this change if it will unblock a specific piece of work. We just need to make sure the follow up work doesn't fall through the cracks. Maybe also check with Harrison for a second opinion. -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ while cancelling queries .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5173/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1669: sessions_to_close = it->second; Not a big deal, but we could do a move() here to avoid the copy. -- To view, visit http://gerrit.cloudera.org:8080/5173 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 11: Code-Review+1 Rebased, carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4956 to look at the new patch set (#11). Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. IMPALA-4397,IMPALA-3259: reduce codegen time and memory A handful of fixes to codegen memory usage: * Delete the IR module when we're done with it (it can be fairly large) * Track the compiled code size (typically not that large, but it can add up if there are many fragments). * Estimate optimisation memory requirements and track it in the memory tracker. This is very crude but much better than not tracking it. A handful of fixes to improve codegen time/cost, particularly targeted at compute stats workloads: * Avoid over-inlining when there are many aggregate functions, conjuncts, etc by adding "NoInline" attributes. * Don't codegen non-grouping merge aggregations. They will only process one row per Impala daemon, so codegen is not worth it. * Make the Hll algorithm more efficient by specialising the hash function based on decimal width. Limitations: * This doesn't tackle over-inlining of large expr trees, but a similar approach will be used there in a follow-on patch. Perf: Compute stats on functional_parquet.widetable_1000_cols goes from 1min+ of codegen to ~ 5s codegen on my machine. Local perf runs of tpc-h and targeted perf showed no regressions and some moderate improvements (1-2%). Also did an experiment to understand the perf consequences of disabling inlining. I manually set CODEGEN_INLINE_EXPRS_THRESHOLD to 0, and ran: drop stats tpch_20_parquet.lineitem compute stats tpch_20_parquet.lineitem; There was no difference in time spent in the agg node: 30.7s with inlining, 30.5s without. Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/codegen/mcjit-mem-mgr.h M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.h M be/src/runtime/lib-cache.cc M be/src/runtime/runtime-state.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test M tests/query_test/test_aggregation.py 28 files changed, 1,468 insertions(+), 143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4956/11 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: So, with this particular issue, we were being tripped up by this logic in generate-schema-statements.py: force_reload = options.force_reload or (partition_columns and not alter) force_reload was set to true because partition_columns had a value, with no corresponding ALTER clause(s). After scanning functional_schema_template.sql, it looks like each time we partition, we also ALTER the table, so I suspect -- though I could be wrong -- we avoid the problem there. Are you saying we should replace all of those existing ALTER table statements with RECOVER PARTITIONS anyway? From my limited perspective, I think that's probably not a bad idea. Would that be worth a second JIRA being filed though? -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. Patch Set 8: (2 comments) Sorry about that, lost track of those comments. http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: Line 1073: module_ = NULL; > What about destroying fns_to_jit_compile_ here too as it may still contain Done http://gerrit.cloudera.org:8080/#/c/4956/8/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: PS8, Line 467: if > is Done -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4956 to look at the new patch set (#10). Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory .. IMPALA-4397,IMPALA-3259: reduce codegen time and memory A handful of fixes to codegen memory usage: * Delete the IR module when we're done with it (it can be fairly large) * Track the compiled code size (typically not that large, but it can add up if there are many fragments). * Estimate optimisation memory requirements and track it in the memory tracker. This is very crude but much better than not tracking it. A handful of fixes to improve codegen time/cost, particularly targeted at compute stats workloads: * Avoid over-inlining when there are many aggregate functions, conjuncts, etc by adding "NoInline" attributes. * Don't codegen non-grouping merge aggregations. They will only process one row per Impala daemon, so codegen is not worth it. * Make the Hll algorithm more efficient by specialising the hash function based on decimal width. Limitations: * This doesn't tackle over-inlining of large expr trees, but a similar approach will be used there in a follow-on patch. Perf: Compute stats on functional_parquet.widetable_1000_cols goes from 1min+ of codegen to ~ 5s codegen on my machine. Local perf runs of tpc-h and targeted perf showed no regressions and some moderate improvements (1-2%). Also did an experiment to understand the perf consequences of disabling inlining. I manually set CODEGEN_INLINE_EXPRS_THRESHOLD to 0, and ran: drop stats tpch_20_parquet.lineitem compute stats tpch_20_parquet.lineitem; There was no difference in time spent in the agg node: 30.7s with inlining, 30.5s without. Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/codegen/mcjit-mem-mgr.h M be/src/exec/aggregation-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/sort-node.cc M be/src/exec/topn-node.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/anyval-util.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.h M be/src/runtime/lib-cache.cc M be/src/runtime/runtime-state.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test M tests/query_test/test_aggregation.py 28 files changed, 1,484 insertions(+), 167 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4956/10 -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Silvius Rus Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-3342: Add thread counters to monitor plan fragment execution .. Patch Set 11: Code-Review+2 Carrying Henry's +2 after rebase. -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Yonghyun Hwang Gerrit-Reviewer: anujphadke Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Alex Behm has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 2: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5151/1/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test File testdata/workloads/functional-planner/queries/PlannerTest/lineage.test: Line 4925: > Hm, we don't seem to have negative tests for statements that don't generate What you have here seems fine to me. -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 2: functional_schema_template.sql seems to have tables that could run into the same problems. Why not using RECOVER PARTITIONS everywhere? -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Dimitris Tsirogiannis has uploaded a new patch set (#2). Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. IMPALA-4283: Ensure Kudu-specific lineage and audit behavior With this commit we add support for auditing all Kudu-specific operations and we enable column lineage for INSERT and UPSERT statements on Kudu tables. No lineage output is generated for DELETE and UPDATE statements. Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d --- M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java M testdata/workloads/functional-planner/queries/PlannerTest/lineage.test 7 files changed, 592 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/5151/2 -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java: Line 689: if (mentionedColumns.isEmpty()) { > Can we reverse the logic as follows: Done http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: Line 795: for (int i = 0; i < resultExprs_.size(); ++i) { > We don't really need resultExprs_ right? Good point, thanks. Done http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 186: if (ctx_.isUpdateOrDelete()) return fragments; > Does this line do anything? I though we established that this is an INSERT Haha, yeah, I don't know what I was thinking here. Done http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java: Line 374: public void TestKuduStatements() throws AuthorizationException, AnalysisException { > nice! Done http://gerrit.cloudera.org:8080/#/c/5151/1/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test File testdata/workloads/functional-planner/queries/PlannerTest/lineage.test: Line 4656: # Insert into a Kudu table with a column list specified > Upsert Done Line 4661: "queryText":"insert into functional_kudu.alltypes (int_col, id) select int_col, id from\nfunctional.alltypes where id < 10", > upsert? Done Line 4925: > What happens for DELETE and UPDATE statements? Might be worth adding a test Hm, we don't seem to have negative tests for statements that don't generate lineage output. I'll see if it's easy to add one. -- To view, visit http://gerrit.cloudera.org:8080/5151 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has uploaded a new patch set (#2). Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales This patch changes the way we load tpcds.store_sales test data. Before this, we were relying on a force_reload to build the table partitions based upon the data that had been copied over to HDFS from the warehouse snapshot. This worked on the local mini-cluster, but for some reason, it was selectively duplicating data when run on a remote cluster. This patch doesn't solve the mystery of why data duplication occurs on remote clusters, but it does resolve the immediate concern of loading test data by using Impala's recover partitions feature to automatically recognize the partitions in the HDFS directories. We just needed to add an ALTER TABLE store_sales RECOVER PARTITIONS to the tpcds schema template file. Tested by dropping the tpcds table on from a remote cluster setup, reloading the table, and running the tests in test_tpcds_queries.py. Tests that had been failng before are now passing. Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 --- M testdata/datasets/tpcds/tpcds_schema_template.sql 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/5177/2 -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] add ALTER TABLE / RECOVER PARTITION to the tpcds template file
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5187 Change subject: add ALTER TABLE / RECOVER PARTITION to the tpcds template file .. add ALTER TABLE / RECOVER PARTITION to the tpcds template file Change-Id: Ib1d712c13050d38ee7cfbfb8e21b16522232aec8 --- M testdata/datasets/tpcds/tpcds_schema_template.sql 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/5187/1 -- To view, visit http://gerrit.cloudera.org:8080/5187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib1d712c13050d38ee7cfbfb8e21b16522232aec8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] separate snapshot upload as a definite function
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5181 Change subject: separate snapshot upload as a definite function .. separate snapshot upload as a definite function Change-Id: If0e8c1622b18cb71f5f750bc97db00565c04565b --- M bin/remote_data_load.py 1 file changed, 38 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/5181/1 -- To view, visit http://gerrit.cloudera.org:8080/5181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If0e8c1622b18cb71f5f750bc97db00565c04565b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] correct generate files only check in bin/load-test.py
David Knupp has abandoned this change. Change subject: correct generate_files_only check in bin/load-test.py .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I3197580924b90d92031b50be419b6648695a538a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add ALTER TABLE / RECOVER PARTITION to the tpcds template file
David Knupp has abandoned this change. Change subject: add ALTER TABLE / RECOVER PARTITION to the tpcds template file .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5187 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ib1d712c13050d38ee7cfbfb8e21b16522232aec8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] filter command line args
David Knupp has abandoned this change. Change subject: filter command line args .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I6f18ffbd65ccc505925d50e3509c5a556bc78e04 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] separate snapshot upload as a definite function
David Knupp has abandoned this change. Change subject: separate snapshot upload as a definite function .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: If0e8c1622b18cb71f5f750bc97db00565c04565b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add ability to delete snapshot only to remote data load.py
David Knupp has abandoned this change. Change subject: add ability to delete snapshot only to remote_data_load.py .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I68a645e32fa75f057623404214f2d8635299fd3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] enable run tests temporarily
David Knupp has abandoned this change. Change subject: enable run_tests temporarily .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ie074a83b999e5cf9af0e525a90fff66d2b6594c9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add option for generating SQL, but skipping actual data load
David Knupp has abandoned this change. Change subject: add option for generating SQL, but skipping actual data load .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5182 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ia8ab4620728ef54f109e00453c7cf093f370f4a2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] test Jim timezone changes
David Knupp has abandoned this change. Change subject: test Jim timezone changes .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ice5d7d7df666fa75ee96faaac9b57b0282aa7119 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add --user option to bin/load-data.py
David Knupp has abandoned this change. Change subject: add --user option to bin/load-data.py .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ic78e9a31d7fc8d59eaecceb49f42ecf38db57566 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] remove exploration strategy from remote data load.py
David Knupp has abandoned this change. Change subject: remove exploration strategy from remote_data_load.py .. Abandoned Pushed wrong branch by accident. -- To view, visit http://gerrit.cloudera.org:8080/5178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Ib74595fcafff5a1963af3d452e48161cdda82a65 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] correct generate files only check in bin/load-test.py
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5184 Change subject: correct generate_files_only check in bin/load-test.py .. correct generate_files_only check in bin/load-test.py Change-Id: I3197580924b90d92031b50be419b6648695a538a --- M bin/load-data.py 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/5184/1 -- To view, visit http://gerrit.cloudera.org:8080/5184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3197580924b90d92031b50be419b6648695a538a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add option for generating SQL, but skipping actual data load
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5182 Change subject: add option for generating SQL, but skipping actual data load .. add option for generating SQL, but skipping actual data load Change-Id: Ia8ab4620728ef54f109e00453c7cf093f370f4a2 --- M bin/load-data.py 1 file changed, 15 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/5182/1 -- To view, visit http://gerrit.cloudera.org:8080/5182 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ia8ab4620728ef54f109e00453c7cf093f370f4a2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] filter command line args
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5186 Change subject: filter command line args .. filter command line args Change-Id: I6f18ffbd65ccc505925d50e3509c5a556bc78e04 --- M tests/run-tests.py 1 file changed, 55 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/5186/1 -- To view, visit http://gerrit.cloudera.org:8080/5186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6f18ffbd65ccc505925d50e3509c5a556bc78e04 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add --user option to bin/load-data.py
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5180 Change subject: add --user option to bin/load-data.py .. add --user option to bin/load-data.py Change-Id: Ic78e9a31d7fc8d59eaecceb49f42ecf38db57566 --- M bin/load-data.py 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/5180/1 -- To view, visit http://gerrit.cloudera.org:8080/5180 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic78e9a31d7fc8d59eaecceb49f42ecf38db57566 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] add ability to delete snapshot only to remote data load.py
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5185 Change subject: add ability to delete snapshot only to remote_data_load.py .. add ability to delete snapshot only to remote_data_load.py Change-Id: I68a645e32fa75f057623404214f2d8635299fd3a --- M bin/remote_data_load.py 1 file changed, 16 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/5185/1 -- To view, visit http://gerrit.cloudera.org:8080/5185 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I68a645e32fa75f057623404214f2d8635299fd3a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] enable run tests temporarily
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5183 Change subject: enable run_tests temporarily .. enable run_tests temporarily Change-Id: Ie074a83b999e5cf9af0e525a90fff66d2b6594c9 --- M bin/remote_data_load.py 1 file changed, 7 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/5183/1 -- To view, visit http://gerrit.cloudera.org:8080/5183 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie074a83b999e5cf9af0e525a90fff66d2b6594c9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] test Jim timezone changes
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5179 Change subject: test Jim timezone changes .. test Jim timezone changes Change-Id: Ice5d7d7df666fa75ee96faaac9b57b0282aa7119 --- M testdata/src/main/java/org/apache/impala/datagenerator/TestDataGenerator.java 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/5179/1 -- To view, visit http://gerrit.cloudera.org:8080/5179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ice5d7d7df666fa75ee96faaac9b57b0282aa7119 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] remove exploration strategy from remote data load.py
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/5178 Change subject: remove exploration strategy from remote_data_load.py .. remove exploration strategy from remote_data_load.py Change-Id: Ib74595fcafff5a1963af3d452e48161cdda82a65 --- M bin/remote_data_load.py 1 file changed, 1 insertion(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/5178/1 -- To view, visit http://gerrit.cloudera.org:8080/5178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib74595fcafff5a1963af3d452e48161cdda82a65 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp
[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
Alex Behm has posted comments on this change. Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE. .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
David Knupp has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5177/1/testdata/datasets/tpcds/tpcds_schema_template.sql File testdata/datasets/tpcds/tpcds_schema_template.sql: PS1, Line 370: {table_name} > Shouldn't this be {db_name}{db_suffix}.{table_name}? Maybe? :-) I actually copied the format from the ALTER statements in the functional_schema_template.sql file, which just use {table_name}. But your suggestion makes sense -- I'll make the change and reload/retest. -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables
Alex Behm has posted comments on this change. Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables .. Patch Set 4: (9 comments) Last round, we're pretty much done. http://gerrit.cloudera.org:8080/#/c/5136/4/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: Line 200: // If true, an error is raised if an error occurs while adding/dropping a range If false, errors while adding/dropping a range partition are ignored. Line 202: 2: required bool throw_on_error ignore_errors? (throw_on_error is a little weird because that seems to be the conventional behavior) http://gerrit.cloudera.org:8080/#/c/5136/4/fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java: Line 33: private final boolean throwOnError_; ignoreErrors_? Line 91: AlterTableStmt.analyzeAddDropRangePartition(rangePartitionSpec_, (KuduTable) table, move analyzeAddDropRangePartition() inside this class, might even just inline it here http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java: Line 123: if (c.hasEncoding() || c.hasCompression() || c.hasBlockSize()) { > Unfortunately we can't do that in this case. There is no Kudu API call that Makes sense. Do you know if there's a Kudu JIRA? We should file an Impala one to track making Impala's behavior regarding handling of these options consistent. http://gerrit.cloudera.org:8080/#/c/5136/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: Line 511: private boolean altersKuduTable(TAlterTableType type) { brief comment, something like: Returns true if the given alteration type changes the underlying table stored in Kudu in addition to the HMS table. Line 556: if (setResultSet) { dead code block (no more alteration types that produce result sets) http://gerrit.cloudera.org:8080/#/c/5136/4/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test: Line 136: select count(*) from tbl_to_alter let's also add a count(col) and two where-clause predicates, one that can be pushed to Kudu and one that must be evaluated by Impala to make sure that works on empty tables Line 184: # Insert a row that has nulls on non-nullable columns non-nullable columns with default values -- To view, visit http://gerrit.cloudera.org:8080/5136 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes