[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.
Alex Behm has posted comments on this change. Change subject: IMPALA-4336: Cast exprs after unnesting union operands. .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/4815/1//COMMIT_MSG Commit Message: PS1, Line 15: Also clarifies the use of resultExprs vs. baseTblResultExprs : in unions. We should consistently use resultExprs because the : final expr substitution happens during planning. : The only place where baseTblResultExprs should be used is in : UnionStmt.materializeRequiredSlots() because that is called : before plan generation and we need to mark the slots of resolved : exprs as materialized. > I am not sure this belongs to the commit message. If you don't have it alre Shrunk comment here. Moved expanded comment to UnionStmt class comment. http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java: PS1, Line 202: Propagates DISTINCT from left to right, and checks that all union operands are :* are union compatible, adding implicit casts if necessary. > I think you need to update this comment. This function seems to be doing a I removed this comment because it's redundant with the class comment. Let me know if you feel the class comment still needs to be expanded/moved. PS1, Line 203: are > remove Done PS1, Line 218: // Analyze all operands and make sure they return an equal number of exprs. : for (int i = 0; i < operands_.size(); ++i) { : try { : operands_.get(i).analyze(analyzer); : QueryStmt firstQuery = operands_.get(0).getQueryStmt(); : List firstExprs = operands_.get(0).getQueryStmt().getResultExprs(); : QueryStmt query = operands_.get(i).getQueryStmt(); : List exprs = query.getResultExprs(); : if (firstExprs.size() != exprs.size()) { : throw new AnalysisException("Operands have unequal number of columns:\n" + : "'" + queryStmtToSql(firstQuery) + "' has " + : firstExprs.size() + " column(s)\n" + : "'" + queryStmtToSql(query) + "' has " + exprs.size() + " column(s)"); : } : } catch (AnalysisException e) { : if (analyzer.getMissingTbls().isEmpty()) throw e; : } : } > This function has grown a bit too much. I would put this block is a separat Done Line 241: // Remember SQL string before unnesting operands. > the Done PS1, Line 258: // Cast all result exprs to a compatible type. > move above L263 We need to collect them first before we can analyzer.castToUnionCompatibleTypes(), so seems relevant to this loop as well. Modified comment. PS1, Line 282: // Should never happen : throw new AnalysisException("Error creating agg info in UnionStmt.analyze()"); > Maybe convert it to Preconditions.checkState(false, "Error")? Modified to an ISE which is what the Preconditions would also throw. I prefer the ISe because we can give it a cause. http://gerrit.cloudera.org:8080/#/c/4815/1/testdata/workloads/functional-query/queries/QueryTest/union.test File testdata/workloads/functional-query/queries/QueryTest/union.test: PS1, Line 1030: (select 80 union all select 90) > Maybe add a test case with nested union distinct, if not already there. A nested union distinct cannot be unnested, so is not really related to this bug. We have tests for that elsewhere. -- To view, visit http://gerrit.cloudera.org:8080/4815 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.
Alex Behm has uploaded a new patch set (#2). Change subject: IMPALA-4336: Cast exprs after unnesting union operands. .. IMPALA-4336: Cast exprs after unnesting union operands. The bug was that we cast the result exprs of operands before unnesting them. If we unnested an operands, casts were missing on those unnested operands' result exprs. The fix is to first unnest operands and then cast result exprs. Also clarifies the use of resultExprs vs. baseTblResultExprs. Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M testdata/workloads/functional-query/queries/QueryTest/union.test 5 files changed, 137 insertions(+), 104 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4815/2 -- To view, visit http://gerrit.cloudera.org:8080/4815 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-4369: Avoid DCHECK in Parquet scanner with MT DOP > 0.
Alex Behm has uploaded a new change for review. http://gerrit.cloudera.org:8080/4851 Change subject: IMPALA-4369: Avoid DCHECK in Parquet scanner with MT_DOP > 0. .. IMPALA-4369: Avoid DCHECK in Parquet scanner with MT_DOP > 0. When HdfsParquetScanner::Open() failed we used to hit a DCHECK when trying to access HdfsParquetScanner::batch() which is only valid to call for non-MT scan nodes. Change-Id: Ifbfdde505dbbd2742e7ab79a2415ff317a9bfa2f --- M be/src/exec/hdfs-scan-node-base.cc A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test M tests/query_test/test_mt_dop.py 3 files changed, 26 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4851/2 -- To view, visit http://gerrit.cloudera.org:8080/4851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ifbfdde505dbbd2742e7ab79a2415ff317a9bfa2f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-4336: Cast exprs after unnesting union operands.
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4336: Cast exprs after unnesting union operands. .. Patch Set 1: (8 comments) Flushing some initial comments. Still need to wrap my head around all the moving parts. http://gerrit.cloudera.org:8080/#/c/4815/1//COMMIT_MSG Commit Message: PS1, Line 15: Also clarifies the use of resultExprs vs. baseTblResultExprs : in unions. We should consistently use resultExprs because the : final expr substitution happens during planning. : The only place where baseTblResultExprs should be used is in : UnionStmt.materializeRequiredSlots() because that is called : before plan generation and we need to mark the slots of resolved : exprs as materialized. I am not sure this belongs to the commit message. If you don't have it already, you may want to put it as a comment somewhere in the code. http://gerrit.cloudera.org:8080/#/c/4815/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java: PS1, Line 202: Propagates DISTINCT from left to right, and checks that all union operands are :* are union compatible, adding implicit casts if necessary. I think you need to update this comment. This function seems to be doing a lot more than simply propagating distinct and checking union operand compatibility. PS1, Line 203: are remove PS1, Line 218: // Analyze all operands and make sure they return an equal number of exprs. : for (int i = 0; i < operands_.size(); ++i) { : try { : operands_.get(i).analyze(analyzer); : QueryStmt firstQuery = operands_.get(0).getQueryStmt(); : List firstExprs = operands_.get(0).getQueryStmt().getResultExprs(); : QueryStmt query = operands_.get(i).getQueryStmt(); : List exprs = query.getResultExprs(); : if (firstExprs.size() != exprs.size()) { : throw new AnalysisException("Operands have unequal number of columns:\n" + : "'" + queryStmtToSql(firstQuery) + "' has " + : firstExprs.size() + " column(s)\n" + : "'" + queryStmtToSql(query) + "' has " + exprs.size() + " column(s)"); : } : } catch (AnalysisException e) { : if (analyzer.getMissingTbls().isEmpty()) throw e; : } : } This function has grown a bit too much. I would put this block is a separate function called analyzeOperands(). Line 241: // Remember SQL string before unnesting operands. the PS1, Line 258: // Cast all result exprs to a compatible type. move above L263 PS1, Line 282: // Should never happen : throw new AnalysisException("Error creating agg info in UnionStmt.analyze()"); Maybe convert it to Preconditions.checkState(false, "Error")? http://gerrit.cloudera.org:8080/#/c/4815/1/testdata/workloads/functional-query/queries/QueryTest/union.test File testdata/workloads/functional-query/queries/QueryTest/union.test: PS1, Line 1030: (select 80 union all select 90) Maybe add a test case with nested union distinct, if not already there. -- To view, visit http://gerrit.cloudera.org:8080/4815 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5e3ab7349df7d67d0d9c2baf4a56342d3f04e76d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-2521: Add clustered hint to insert statements .. IMPALA-2521: Add clustered hint to insert statements This change introduces a clustered/noclustered hint for insert statements. Specifying this hint adds an additional sort node to the plan, just before the table sink. This has the effect that data will be clustered by its partition prior to writing partitions, which therefore can be written sequentially. Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Reviewed-on: http://gerrit.cloudera.org:8080/4745 Reviewed-by: Alex Behm Tested-by: Internal Jenkins --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SortInfo.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test 10 files changed, 418 insertions(+), 44 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. IMPALA-4023: don't attach buffered tuple streams to batches This simplifies the memory transfer model by eliminating one category of resources that can be attached. This patch also separates the concepts of attaching resources and flushing resources. Previously RowBatch::AddTupleStream() implicitly flushed resources from the ExecNode pipeline, which various ExecNodes relied on to free up memory reservations for subsequent processing. In a subsequent patch I want the FlushResources() API to become stronger: it will force streaming ExecNodes to flush their batches or forces blocking ExecNodes to acquire ownership of the memory resources. We can't do this right now since we don't have a way to transfer ownership of BufferedBlockMgr Blocks. Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Reviewed-on: http://gerrit.cloudera.org:8080/4448 Reviewed-by: Tim Armstrong Tested-by: Internal Jenkins --- M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/nested-loop-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/sorted-run-merger.cc M be/src/runtime/sorted-run-merger.h M be/src/runtime/sorter.cc 16 files changed, 240 insertions(+), 159 deletions(-) Approvals: Internal Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.
Hello Marcel Kornacker, Bharath Vissapragada, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3942 to look at the new patch set (#19). Change subject: IMPALA-1654: General partition exprs in DDL operations. .. IMPALA-1654: General partition exprs in DDL operations. This commit handles partition related DDL in a more general way. We can now use compound predicates to specify a list of partitions in statements like ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL STATS, etc. It will also make sure some statements only accept one partition at a time, such as PARTITION SET LOCATION and LOAD DATA. ALTER TABLE ADD PARTITION remains using the old PartitionKeyValue's logic. The changed partition related DDLs are as follows, Table: p (i int) partitioned by (j int, k string) Partitions: +---+---+---++--+--+---+ | j | k | #Rows | #Files | Size | Bytes Cached | Cache Replication | +---+---+---++--+--+---+ | 1 | a | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 1 | b | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 1 | c | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 2 | d | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 2 | e | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 2 | f | -1| 0 | 0B | NOT CACHED | NOT CACHED| | Total | | -1| 0 | 0B | 0B | | +---+---+---++--+--+---+ 1. show files in p partition (j<2, k='a'); 2. alter table p partition (j<2, k in ("b","c") set cached in 'testPool'; // j can appear more than once, 3.1. alter table p partition (j<2, j>0, k<>"d") set uncached; // it is the same as 3.2. alter table p partition (j<2 and j>0, not k="e") set uncached; // we can also do 'or' 3.3. alter table p partition (j<2 or j>0, k like "%") set uncached; // missing 'k' matches all values of k 4. alter table p partition (j<2) set fileformat textfile; 5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v"); 6. alter table p partition (j is not null) set tblproperties ("k"="v"); 7. alter table p drop partition (j<2); 8. compute incremental stats p partition(j<2); The remaining old partition related DDLs are as follows, 1. load data inpath '/path/from' into table p partition (j=2, k="d"); 2. alter table p add partition (j=2, k="g"); 3. alter table p partition (j=2, k="g") set location '/path/to'; 4. insert into p partition (j=2, k="g") values (1), (2), (3); General partition expressions or partially specified partition specs allows partition predicates to return empty partition set no matter 'IF EXISTS' is specified. Examples: [localhost.localdomain:21000] > alter table p drop partition (j=2, k="f"); Query: alter table p drop partition (j=2, k="f") +-+ | summary | +-+ | Dropped 1 partition(s). | +-+ Fetched 1 row(s) in 0.78s [localhost.localdomain:21000] > alter table p drop partition (j=2, k<"f"); Query: alter table p drop partition (j=2, k<"f") +-+ | summary | +-+ | Dropped 2 partition(s). | +-+ Fetched 1 row(s) in 0.41s [localhost.localdomain:21000] > alter table p drop partition (k="a"); Query: alter table p drop partition (k="a") +-+ | summary | +-+ | Dropped 1 partition(s). | +-+ Fetched 1 row(s) in 0.25s [localhost.localdomain:21000] > show partitions p; Query: show partitions p +---+---+---++--+--+---+ | j | k | #Rows | #Files | Size | Bytes Cached | Cache Replication | +---+---+---++--+--+---+ | 1 | b | -1| 0 | 0B | NOT CACHED | NOT CACHED| | 1 | c | -1| 0 | 0B | NOT CACHED | NOT CACHED| | Total | | -1| 0 | 0B | 0B | | +---+---+---++--+--+---+ Fetched 3 row(s) in 0.01s Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f --- M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M common/thrift/CatalogService.thrift M common/thrift/Frontend.thrift M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetCachedStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/i
[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after load .. Patch Set 7: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/364/ -- To view, visit http://gerrit.cloudera.org:8080/4617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4362: Misc. fixes for PFE counters .. IMPALA-4362: Misc. fixes for PFE counters * ExecTime was always 0, because it wasn't updated before the profile was reported to the coordinator * Made ExecTree* counters children of their respective parents * Moved ExecTreePrepareTime into the right profile * Renamed timings profile to something a bit more obvious. Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84 Reviewed-on: http://gerrit.cloudera.org:8080/4829 Reviewed-by: Henry Robinson Tested-by: Internal Jenkins --- M be/src/runtime/plan-fragment-executor.cc 1 file changed, 23 insertions(+), 11 deletions(-) Approvals: Henry Robinson: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4829 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4362: Misc. fixes for PFE counters .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4829 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4277: remove unneeded LegacyTCLIService .. IMPALA-4277: remove unneeded LegacyTCLIService Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b Reviewed-on: http://gerrit.cloudera.org:8080/4844 Reviewed-by: Henry Robinson Tested-by: Internal Jenkins --- M common/thrift/CMakeLists.txt D common/thrift/cli_service.thrift M testdata/bin/wait-for-hiveserver2.py 3 files changed, 8 insertions(+), 1,138 deletions(-) Approvals: Henry Robinson: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4844 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins
[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3725 Support Kudu UPSERT in Impala .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4277: remove unneeded LegacyTCLIService .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4844 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3838: Codegen EvalRuntimeFilters().
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3838: Codegen EvalRuntimeFilters(). .. Patch Set 1: (7 comments) Mostly looks good - would be good to know whether any queries regressed with the change. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: PS1, Line 1415: ExternalLinkage Why not private linkage? http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: Line 90: // @expr_type_arg = constant %"struct.impala::ColumnType" { i32 4, i32 -1, i32 -1, What query produced this? Line 155: codegen->GetFunction(IRFunction::RUNTIME_FILTER_EVAL, true); Do we need to clone this function? It doesn't look like we modify it. Line 205: return Status("Codegen'ed FilterContext::Eval() fails verification, " Weird wrapping. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/runtime/runtime-filter.h File be/src/runtime/runtime-filter.h: Line 64: bool Eval(void* val, const ColumnType& col_type) const noexcept; Document 'val' - not self-explanatory Line 86: static const char* LLVM_CLASS_NAME; Missed this one http://gerrit.cloudera.org:8080/#/c/4833/1/tests/query_test/test_tpch_queries.py File tests/query_test/test_tpch_queries.py: Line 39: v.get_value('table_format').file_format in ['text', 'parquet', 'kudu']) Does this provide meaningful coverage? It doesn't seem like it's worth running on both text and parquet in core. -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Tim Armstrong has posted comments on this change. Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions .. Patch Set 9: (8 comments) Rebased to pick up IMPALA-3884 http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: PS7, Line 459: > that Done http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: Line 184: /// lowered type. This *Val should be non-null. The output variable is called 'name'. > nit: space Done http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: PS7, Line 347: : /// TODO: Return Status inst > Avoid cloning if possible to reduce compilation time. Done http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS7, Line 1660: ond, the : // value of the slot was initialized in the right way in InitAggSlots() (e.g. 0 for : // SUM) that we get the right result if UpdateSlot() pretends that the NULL bit of : // 'dst' is unse > Please also see comments above. I tried to clean this up a bit. My intention was to document the details of the initialisation in InitAggSlot() and provide a pointer here. PS7, Line 1675: dst.SetIsNull(slot_desc->CodegenIsNull(codegen, &builder, agg_tuple_arg)); > Why is this not in the old code or was it done somewhere else ? It wasn't in the old code - the old code didn't handle arbitrary UDAs, just a subset of the builtins. E.g. the old code would have given wrong results if we changed SUM() so that it returned NULL on overflow. http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.h File be/src/exec/partitioned-aggregation-node.h: PS7, Line 649: 'dst_val' should contain the previous value of the aggregate : /// function, and 'updated_dst_val' is set to the new value after the Update or Merge : /// operation is applied. > Just curious why we separate 'dst_val' and 'result_val'. Doesn't 'result_va We need two different CodegenAnyVal, since it really represents the value (which changes) rather than the memory location. We could use an in-out argument instead of separate input and output arguments but this seemed simpler to me. I renamed 'result_val' to 'updated_dst_val' to hopefully make the connection clearer (the naming was confusing). http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exprs/scalar-fn-call.cc File be/src/exprs/scalar-fn-call.cc: PS7, Line 496: false > Please feel free to ignore but you can consider setting this second argumen It seems clearer to make it explicit, given it's not obvious what the "default" behaviour should be. http://gerrit.cloudera.org:8080/#/c/4655/7/tests/query_test/test_aggregation.py File tests/query_test/test_aggregation.py: Line 141: # Verify codegen was enabled for all stages of the aggregation. > Can remove this restriction now that IMPALA-3884 is in master Done -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Tim Armstrong has uploaded a new patch set (#9). Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions .. IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions This change enables codegen for all builtin aggregate functions, e.g. timestamp functions and group_concat. There are several parts to the change: * Adding support for generic UDAs. Previous the codegen code did not handle multiple input arguments or NULL return values. * Defaulting to using the UDA interface when there is not a special codegen path (we have implementations of all builtin aggregate functions for the interpreted path). * Remove all the logic to disable codegen for the special cases that now are supported. Also fix the generation of code to get/set NULL bits since I needed to add functionality there anyway. Testing: Add tests that check that codegen was enabled for builtin aggregate functions. Also fix some gaps in the preexisting tests. Also add tests for UDAs that check input/output nulls are handled correctly, in anticipation of enabling codegen for arbitrary UDAs. Perf: Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1, since my original approach regressed it ~5%. In the end the problem was to do with the ordering of loads/stores to the slot and null bit in the generated code: the previous version of the code exploited some properties of the particular aggregate function. I ended up replicating this behaviour to avoid regressing perf. Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/aggregation-node.cc M be/src/exec/exec-node.cc M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/old-hash-table.cc M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/text-converter.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/case-expr.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.cc M be/src/exprs/literal.cc M be/src/exprs/null-literal.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/tuple.cc M be/src/runtime/types.h M be/src/testutil/test-udas.cc M be/src/testutil/test-udfs.cc M be/src/util/tuple-row-compare.cc M testdata/workloads/functional-query/queries/QueryTest/uda.test M tests/common/test_result_verifier.py M tests/query_test/test_aggregation.py M tests/query_test/test_udfs.py 39 files changed, 842 insertions(+), 562 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/9 -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Move vim-specific config file to top level directory
Tim Armstrong has posted comments on this change. Change subject: Move vim-specific config file to top level directory .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency
Dan Hecht has posted comments on this change. Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: Line 547: } > I think this could go in UnregisterQuery too. I put it here because I thoug I don't have a strong preference either way -- mostly wondering what the reasoning was. The division between the two don't seem very clear (i.e. its not clear why some stuff in Unregister() isn't in Done()), but I agree Unregister() doesn't seem any better. -- To view, visit http://gerrit.cloudera.org:8080/4779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Tim Armstrong has uploaded a new patch set (#8). Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions .. IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions This change enables codegen for all builtin aggregate functions, e.g. timestamp functions and group_concat. There are several parts to the change: * Adding support for generic UDAs. Previous the codegen code did not handle multiple input arguments or NULL return values. * Defaulting to using the UDA interface when there is not a special codegen path (we have implementations of all builtin aggregate functions for the interpreted path). * Remove all the logic to disable codegen for the special cases that now are supported. Also fix the generation of code to get/set NULL bits since I needed to add functionality there anyway. Testing: Add tests that check that codegen was enabled for builtin aggregate functions. Also fix some gaps in the preexisting tests. Also add tests for UDAs that check input/output nulls are handled correctly, in anticipation of enabling codegen for arbitrary UDAs. Perf: Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1, since my original approach regressed it ~5%. In the end the problem was to do with the ordering of loads/stores to the slot and null bit in the generated code: the previous version of the code exploited some properties of the particular aggregate function. I ended up replicating this behaviour to avoid regressing perf. Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/aggregation-node.cc M be/src/exec/exec-node.cc M be/src/exec/hash-join-node.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/old-hash-table.cc M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/text-converter.cc M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/case-expr.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr.cc M be/src/exprs/literal.cc M be/src/exprs/null-literal.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/tuple.cc M be/src/runtime/types.h M be/src/testutil/test-udas.cc M be/src/testutil/test-udfs.cc M be/src/util/tuple-row-compare.cc M testdata/workloads/functional-query/queries/QueryTest/uda.test M tests/common/test_result_verifier.py M tests/query_test/test_aggregation.py M tests/query_test/test_udfs.py 39 files changed, 846 insertions(+), 564 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/8 -- To view, visit http://gerrit.cloudera.org:8080/4655 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Move vim-specific config file to top level directory
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/4850 Change subject: Move vim-specific config file to top level directory .. Move vim-specific config file to top level directory Files in be/ get wiped out by clean.sh if they're listed in .gitignore. Moving this file up one directory seems promising to prevent it from being wiped out. Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8 --- M .gitignore 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/4850/1 -- To view, visit http://gerrit.cloudera.org:8080/4850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I10599fdec22be3fb5c1e5105f23d96c6955052d8 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster
Harrison Sheinblatt has posted comments on this change. Change subject: Enabling end-to-end tests on a remote cluster .. Patch Set 1: (3 comments) Responded to comments. http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/compute-table-stats.sh File testdata/bin/compute-table-stats.sh: PS1, Line 27: IMPALAD > IMPALA-4346 has been filed. Can you reference the Jira in a comment? http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: PS1, Line 38: HS2_HOST_PORT > I think the latter is preferable -- corral all the required configs in one Is it reasonable to add a comment referencing the Jira here? http://gerrit.cloudera.org:8080/#/c/4769/1/testdata/bin/setup-hdfs-env.sh File testdata/bin/setup-hdfs-env.sh: PS1, Line 53: CACHEADMIN_ARGS > Clarification: can you be more explicit about the check you want? Something If the is_kerberized block is executed above, then the CACHADMIN_ARGS would include '-owner ${PREVIOUS_USER}'. If HADOOP_USER_NAME is also true, then we add another '-owner ${USER}' to this, which probably breaks it. I think there are probably 4 bugs: 1) The is_kerberized block above probably isn't supported and should be removed and 2) the CACHEADMIN_ARGS definition logic needs a clear conditional, ideally in a single location, that sets the user/group/owner information properly in a way that you can easily tell it's always well-defined. Here it looks like the logic is intended to be that if it's kerberized it sets owner one way, if it's not kerberized and the hadoop user is defined it's set another way and if it's not kerberized and the hadoop user is not defined it stays undefined. If we want to keep the is_kerberized logic in one place, then we can have it set another parameter about owner fields and here only update it if it's set already. 3) CACHEADMIN_ARGS is prob! ably the wrong name as it is for the -addPool command and sets a subset of the args 4) We should explicitly set all arguments to cacheadmin, -addPool if possible (e.g. mode maxTtl) -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster
David Knupp has posted comments on this change. Change subject: Enabling end-to-end tests on a remote cluster .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/4769/1/bin/remote_data_load.py File bin/remote_data_load.py: PS1, Line 88: RemoteDataLoad > I'd separate out the common functionality needed for dealing with remote cl IMPALA-4367 has been filed. PS1, Line 132: v10 > Hardcoding v10. Is this necessary? I think URL may be missing with later IMPALA-4367 has been filed. PS1, Line 155: get_service_client_configurations > A lot of this seems like it could be in comparisons/cluster.py, or at least IMPALA-4367 has been filed. PS1, Line 212: find_snapshot_file > It would be good to start converting the snapshot file management into pyth IMPALA-4367 has been filed. -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David Knupp Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3771: Expose kudu client timeout and set default
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/4849 Change subject: IMPALA-3771: Expose kudu client timeout and set default .. IMPALA-3771: Expose kudu client timeout and set default The Kudu client timeout was too low for Impala usage. This sets the default timeout to 3 minutes and exposes it as a gflag. New timeout tests were added. Note: Kudu Java client exceptions on timeouts are very verbose and not useful, so the code that catches exceptions around the client APIs that are likely to time out do not set the inner exception otherwise the query errors will be unusable. Instead, the Kudu exceptions are logged and a simple Impala exception is thrown. KUDU-1734 tracks fixing the Kudu errors. After the error messages are fixed, Kudu exceptions can be added back to the Impala exceptions. Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045 --- M be/src/catalog/catalog.cc M be/src/common/global-flags.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink.cc M be/src/service/frontend.cc M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java A testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-catalogd.test A testdata/workloads/functional-query/queries/QueryTest/kudu-timeouts-impalad.test M tests/common/impala_test_suite.py M tests/custom_cluster/test_kudu.py 16 files changed, 194 insertions(+), 75 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/4849/1 -- To view, visit http://gerrit.cloudera.org:8080/4849 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iad95e8e38aad4f76d21bac6879db6c02b3c3e045 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency
Matthew Jacobs has uploaded a new patch set (#4). Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency .. IMPALA-3788: Support for Kudu 'read-your-writes' consistency Kudu provides an API to get/set a 'latest observed timestamp' on clients to allow a client which inserts to capture and send this timestamp to another client before a read to ensure that data as of that timestamp is visible. This adds support for this feature _for reads within a session_ by capturing the latest observed timestamp when the KuduTableSink is sending its last update to the coordinator. The timestamp is sent with other post-write information, and is aggregated (i.e. taking the max) at the coordinator. The max is stored in the session, and that value is then set in the Kudu client on future scans. This is being tested by running the Kudu tests after removing delays that were introduced to work around the issue that reads might not be visible after a write. Before this change, if there were no delay, inconsistent results could be returned. Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd --- M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M common/thrift/ImpalaInternalService.thrift M tests/common/impala_test_suite.py M tests/query_test/test_kudu.py 11 files changed, 51 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4779/4 -- To view, visit http://gerrit.cloudera.org:8080/4779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Alex Behm has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool
Jim Apple has posted comments on this change. Change subject: IMPALA-3676: Use clang as a static analysis tool .. Patch Set 7: > > I rearranged them all. Tim convinced me to roll some back. > > That doesn't really answer my question - given the current state of > the patch, what are the criteria we should use, going forward, to > decide whether a class should be CacheAligned? > > I suggest we reserve CacheAligned / grouping for classes that > really need it, given the reduction in comprehension. Or am I > underestimating how big the impact of properly aligning even > non-performance critical classes will be? CacheAligned and grouping I view separately. clang-tidy has a warning about cache alignment that is about correctness - If X must be cache aligned because we wanted it to be so for performance reasons, than clang will warn when NEWing classes that contain that class because we are breaking our own expectations about alignment. The cache alignments I added were only to quiet that warning. I did not set the BlockingQueue to be cache-aligned - that was there when I started. For grouping, I think you and I are on the same page. -- To view, visit http://gerrit.cloudera.org:8080/4758 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Lars Volker has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 13: Alex, can you +2 this again? The tests got renamed and I didn't catch the syntax error when running locally. :( -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Hello Marcel Kornacker, Internal Jenkins, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4745 to look at the new patch set (#13). Change subject: IMPALA-2521: Add clustered hint to insert statements .. IMPALA-2521: Add clustered hint to insert statements This change introduces a clustered/noclustered hint for insert statements. Specifying this hint adds an additional sort node to the plan, just before the table sink. This has the effect that data will be clustered by its partition prior to writing partitions, which therefore can be written sequentially. Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SortInfo.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test 10 files changed, 418 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/13 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool
Henry Robinson has posted comments on this change. Change subject: IMPALA-3676: Use clang as a static analysis tool .. Patch Set 6: > I rearranged them all. Tim convinced me to roll some back. That doesn't really answer my question - given the current state of the patch, what are the criteria we should use, going forward, to decide whether a class should be CacheAligned? I suggest we reserve CacheAligned / grouping for classes that really need it, given the reduction in comprehension. Or am I underestimating how big the impact of properly aligning even non-performance critical classes will be? -- To view, visit http://gerrit.cloudera.org:8080/4758 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4153: Fix count(*) on all blank('') columns - test
Alex Behm has posted comments on this change. Change subject: IMPALA-4153: Fix count(*) on all blank('') columns - test .. Patch Set 1: Laszlo, can you try merging this again or is there something wrong with the patch? -- To view, visit http://gerrit.cloudera.org:8080/4755 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I23923f95f43d67977ee1520a1fc09ce297548b3f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 13: Code-Review+2 (1 comment) Carry +2 http://gerrit.cloudera.org:8080/#/c/4448/11/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS11, Line 385: copied out now or sent up the : // plan and copied out by a blo > copied out now or sent up the tree and copied out by a blocking ancestor. Done -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Hello Internal Jenkins, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4448 to look at the new patch set (#12). Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. IMPALA-4023: don't attach buffered tuple streams to batches This simplifies the memory transfer model by eliminating one category of resources that can be attached. This patch also separates the concepts of attaching resources and flushing resources. Previously RowBatch::AddTupleStream() implicitly flushed resources from the ExecNode pipeline, which various ExecNodes relied on to free up memory reservations for subsequent processing. In a subsequent patch I want the FlushResources() API to become stronger: it will force streaming ExecNodes to flush their batches or forces blocking ExecNodes to acquire ownership of the memory resources. We can't do this right now since we don't have a way to transfer ownership of BufferedBlockMgr Blocks. Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff --- M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/nested-loop-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/sorted-run-merger.cc M be/src/runtime/sorted-run-merger.h M be/src/runtime/sorter.cc 16 files changed, 240 insertions(+), 159 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4448/12 -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Hello Michael Brown, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4770 to look at the new patch set (#4). Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. IMPALA-3872: allow providing PyPi mirror for python packages We still rely on the python.org json API, which doesn't seem to be mirrored (instead there's a html-based index format implemented by the mirrors). Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c --- M infra/python/deps/pip_download.py 1 file changed, 8 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4770/4 -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. Patch Set 4: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4770/3/infra/python/deps/pip_download.py File infra/python/deps/pip_download.py: PS3, Line 31: from urlparse import urljoin > Remove Done -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: PS1, Line 193: BufferOpts > const& ? Done http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: PS1, Line 642: without the client providing a " : "buffer" > is this going to make sense to a user? Probably not. The old message wasn't actionable either by a user (beyond reporting the bug). Reworded to make it clear that it's an internal error. Line 708: DCHECK_LE(buffer_size, max_buffer_size_); > how do we ensure this is true? is it the check in DiskIoMgr::Read() that wi DiskIoMgr::ReadRange() checks if there's a client buffer before calling TryAllocateNextBufferForRange(). http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: PS1, Line 147: these buffers are returned to the caller wrapped : /// in BufferDescriptors. > took a few seconds to parse and understand. how about: Yes that's better... Line 150: /// another read. In error cases, the IoMgr will recycle the buffers more promptly but > is this relevant for all cases, or only cases (a) and (b)? Tried to clarify. Also removed the last sentence since I think it was confusing and not really useful to the reader. PS1, Line 392: bool > maybe make all of these members 'const' since they are "options" that don't Done PS1, Line 422: BufferOpts > const BufferOpts& ? Done Line 506: int64_t client_buffer_len_; > maybe move this right before cached_buffer_ since it's related in the sense Switched it to a tagged union. We don't really have much precedent in the codebase for this pattern so I invented some conventions that made sense to me. -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. IMPALA-3202: DiskIoMgr improvements for new buffer pool The main goal of this patch is to add support to the DiskIoMgr to read into client-provided buffers, instead of the DiskIoMgr allocating intermediate buffers. This is important for the new buffer pool because we want it to entirely manage its own memory, rather than delegating part of its memory management to the DiskIoMgr. Also do some cleanup: * Consolidate some correlated ScanRange parameters into a parameter struct. * Remove the "untracked" buffers mem tracker, which is no longer necessary. * Change the buffer type in DiskIoMgr to use uint8_t* to be consistent with the rest of Impala. Testing: Added a unit test. We also get coverage from the BufferedBlockMgr unit tests, any spilling tests and the Parquet tests with large footers. Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec --- M be/src/exec/base-sequence-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/scanner-context.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr-stress-test.cc M be/src/runtime/disk-io-mgr-stress.cc M be/src/runtime/disk-io-mgr-stress.h M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h 15 files changed, 404 insertions(+), 224 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/2 -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after load .. Patch Set 7: Code-Review+2 Rebased. Carrying +2. Thanks Alex and Henry. -- To view, visit http://gerrit.cloudera.org:8080/4617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Internal Jenkins has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 12: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/360/ -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1163: if (kudu_latest_observed_ts > 0) { > why not carry the > 0 convention and just always set the session state? lo Done http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: Line 547: } > why is it better to do here rather than, say, ImpalaServer::UnregisterQuery I think this could go in UnregisterQuery too. I put it here because I thought it made more sense to do some post-query accounting in QES rather than UnregisterQuery(), which feels like it should be focusing on unregistering (not that these we are particularly consistent to begin with...). It shouldn't matter if lock_ is held or not. I should be able to call this after BlockOnWait(), or I can move it to Unregister if you prefer. -- To view, visit http://gerrit.cloudera.org:8080/4779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4340: explain how to install postgresql-9.5 or higher
Alex Behm has posted comments on this change. Change subject: IMPALA-4340: explain how to install postgresql-9.5 or higher .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4846/1/tests/comparison/POSTGRES.txt File tests/comparison/POSTGRES.txt: Line 41: $ sudo perl -pi -e 's/(?:peer|md5)/trust/g' /etc/postgresql/9.5/main/pg_hba.conf nice -- To view, visit http://gerrit.cloudera.org:8080/4846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters
Henry Robinson has posted comments on this change. Change subject: IMPALA-4362: Misc. fixes for PFE counters .. Patch Set 2: Code-Review+2 Rebase. -- To view, visit http://gerrit.cloudera.org:8080/4829 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4362: Misc. fixes for PFE counters
Dan Hecht has posted comments on this change. Change subject: IMPALA-4362: Misc. fixes for PFE counters .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4829 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaa9dccdcd91297a11a093b921f8d84d32f32be84 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency
Dan Hecht has posted comments on this change. Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1163: if (kudu_latest_observed_ts > 0) { why not carry the > 0 convention and just always set the session state? looks like scan node already checks > 0 anyway. http://gerrit.cloudera.org:8080/#/c/4779/3/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: Line 547: } why is it better to do here rather than, say, ImpalaServer::UnregisterQuery()? is it important that we're holding QES::lock_? http://gerrit.cloudera.org:8080/#/c/4779/3/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: Line 322 nice to see this go -- To view, visit http://gerrit.cloudera.org:8080/4779 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads
Dan Hecht has posted comments on this change. Change subject: IMPALA-3823: Add timer to measure Parquet footer reads .. Patch Set 9: After you address Matt's comments I can take a final look. -- To view, visit http://gerrit.cloudera.org:8080/4371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4830 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Dan Hecht has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 6: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4756/6/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS6, Line 906: !FLAGS_disable_admission_control slightly easier to read if you reverse the if-branches and remove the ! to get rid of the double negative. -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node .. IMPALA-4350: Crash with vlog level 2 in hash join node Testing: Reproduced by running test_mem_usage_scaling against an Impala with -v=2. Confirmed the fix avoided the crash. We don't have any routine testing of high log levels so there isn't a clear way to get good coverage of all code paths where there might be similar bugs. Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f Reviewed-on: http://gerrit.cloudera.org:8080/4830 Reviewed-by: Tim Armstrong Tested-by: Internal Jenkins --- M be/src/exec/partitioned-hash-join-node.cc 1 file changed, 13 insertions(+), 6 deletions(-) Approvals: Internal Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4830 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Michael Brown has posted comments on this change. Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. Patch Set 3: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/4770/2/infra/python/deps/pip_download.py File infra/python/deps/pip_download.py: PS2, Line 73: return True > This doesn't work for me - if I have PYPI_MIRROR = "https://pypi.infra.clou Too bad. I think the URL python libraries are needlessly complicated. http://gerrit.cloudera.org:8080/#/c/4770/3/infra/python/deps/pip_download.py File infra/python/deps/pip_download.py: PS3, Line 31: from urlparse import urljoin Remove -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Dan Hecht has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 11: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4448/11/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS11, Line 385: copied out or sent up the plan : // tree via MarkNeedsDeepCopy() copied out now or sent up the tree and copied out by a blocking ancestor. -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/4770/2/infra/python/deps/pip_download.py File infra/python/deps/pip_download.py: PS2, Line 34: pypi_mirror > Use all-caps for the name? Done PS2, Line 73: pkg_url = "{0}/packages/{1}".format(pypi_mirror, pkg['path']) > What about: This doesn't work for me - if I have PYPI_MIRROR = "https://pypi.infra.cloudera.com/api/pypi/pypi-public"; then it strips off api/pypi/pypi-public automatically. I'm not sure why it's doing that. -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3872: allow providing PyPi mirror for python packages
Tim Armstrong has uploaded a new patch set (#3). Change subject: IMPALA-3872: allow providing PyPi mirror for python packages .. IMPALA-3872: allow providing PyPi mirror for python packages We still rely on the python.org json API, which doesn't seem to be mirrored (instead there's a html-based index format implemented by the mirrors). Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c --- M infra/python/deps/pip_download.py 1 file changed, 9 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4770/3 -- To view, visit http://gerrit.cloudera.org:8080/4770 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibc11f010332c0225121c86c9930e35c7ac01409c Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Michael Brown
[Impala-ASF-CR](hadoop-next) Merge remote-tracking branch 'gerrit/master' into HEAD
Tim Armstrong has submitted this change and it was merged. Change subject: Merge remote-tracking branch 'gerrit/master' into HEAD .. Merge remote-tracking branch 'gerrit/master' into HEAD Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f --- M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java 1 file changed, 0 insertions(+), 4 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/4841 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: hadoop-next Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Hello Internal Jenkins, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4448 to look at the new patch set (#10). Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. IMPALA-4023: don't attach buffered tuple streams to batches This simplifies the memory transfer model by eliminating one category of resources that can be attached. This patch also separates the concepts of attaching resources and flushing resources. Previously RowBatch::AddTupleStream() implicitly flushed resources from the ExecNode pipeline, which various ExecNodes relied on to free up memory reservations for subsequent processing. In a subsequent patch I want the FlushResources() API to become stronger: it will force streaming ExecNodes to flush their batches or forces blocking ExecNodes to acquire ownership of the memory resources. We can't do this right now since we don't have a way to transfer ownership of BufferedBlockMgr Blocks. Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff --- M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/nested-loop-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/buffered-tuple-stream.cc M be/src/runtime/buffered-tuple-stream.h M be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/sorted-run-merger.cc M be/src/runtime/sorted-run-merger.h M be/src/runtime/sorter.cc 16 files changed, 239 insertions(+), 158 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4448/10 -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 9: (7 comments) http://gerrit.cloudera.org:8080/#/c/4448/9/be/src/runtime/buffered-tuple-stream.cc File be/src/runtime/buffered-tuple-stream.cc: Line 653: // the caller can pass the rows up the operator tree. > let's make this comment actually be useful. how about something like: Done http://gerrit.cloudera.org:8080/#/c/4448/9/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: PS9, Line 77: Different modes for flushing resources > these are really different modes for flushing resources. isn't it just a f Done PS9, Line 227: as soon as possible > this is pretty vague. how about saying what the rule is specifically: That's clearer, thanks. PS9, Line 242: > (that have not been attached to the batch) Done PS9, Line 243: . > and deep copied. Done. Had a few tries at wording this paragraph hopefully it's clearer now. PS9, Line 244: This is used to control memory management for streaming rows > this is kind of misleading and confusing now especially since "flush" contr Just deleted it, I think it's unnecessary. Line 246: /// are not allowed to accumulate batches with the 'needs_deep_copy' flag. > how about saying this is deprecated? Added a TODO and reference to the JIRA. -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4260: Alter table add column drops all the column stats
Alex Behm has posted comments on this change. Change subject: IMPALA-4260: Alter table add column drops all the column stats .. Patch Set 1: I think we recently broke something with respect to casing of column types. Jenny recently sent an email about DESCRIBE FORMATTED printing the types in uppercase. Can you take a quick look? These issues might be related (and the fix might be elsewhere) -- To view, visit http://gerrit.cloudera.org:8080/4845 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 11: Code-Review+1 Rebase, carry Alex's +1 -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4023: don't attach buffered tuple streams to batches .. Patch Set 9: I feel like it's a bit less error-prone with the explicit flags, so I'll stick with that approach. -- To view, visit http://gerrit.cloudera.org:8080/4448 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3725 Support Kudu UPSERT in Impala .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4260: Alter table add column drops all the column stats
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4260: Alter table add column drops all the column stats .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4845/1/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test: Line 121: Can you also add a test case for CHANGE and REPLACE columns as well? -- To view, visit http://gerrit.cloudera.org:8080/4845 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4340: explain how to install postgresql-9.5 or higher
Michael Brown has uploaded a new change for review. http://gerrit.cloudera.org:8080/4846 Change subject: IMPALA-4340: explain how to install postgresql-9.5 or higher .. IMPALA-4340: explain how to install postgresql-9.5 or higher The random query generator needs to compare against PostgresQL 9.5 or higher to take advantage of some of the more recent features, especially as it pertains to Impala/Kudu INSERT and UPSERT queries. Developers will need assistance setting up their development environments if they need to use the random query generator. This patch provides instructions on how to do so. We provide instructions, not automation, since this will have side-effects on developers' workstations: we can't presume to know how a developer might want to install or configure postgres, and we haven't tested on anything except our own development environment. Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9 --- A tests/comparison/POSTGRES.txt M tests/comparison/README 2 files changed, 88 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4846/1 -- To view, visit http://gerrit.cloudera.org:8080/4846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I1e3b510120451fcb5af97145fa47ccb4c53f00d9 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Brown
[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService
Henry Robinson has posted comments on this change. Change subject: IMPALA-4277: remove unneeded LegacyTCLIService .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4844 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: PS5, Line 801: o handle i > I had suggested moving it to be closer to the register call if this is call Done http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 157: > Blank line before this. Done Line 159: /// been created and had the pool set yet, or this StmtType doesn't go through admission > missing / Done PS5, Line 161: ol() > nullptr Done -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Thomas Tauber-Marshall has uploaded a new patch set (#6). Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. IMPALA-1169: Admission control info on the queries debug webpage This patch adds a new event, 'Queued', to the query event log to indicate when a query is queued by the admission controller. This means that queries on the '/queries' page that are currently queued will display this as their 'Last Event', making it possible to see which queries are current queued. It also adds a column to show the resource pool associated with the queries, and it updates the wording of the first event that gets marked for each query from 'Start execution' to 'Query submitted', since this is before planning and admission control and therefore execution hasn't actually startd yet. Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.h M tests/common/impala_service.py M tests/custom_cluster/test_admission_controller.py M www/queries.tmpl 9 files changed, 72 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4756/6 -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4277: remove unneeded LegacyTCLIService
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4844 Change subject: IMPALA-4277: remove unneeded LegacyTCLIService .. IMPALA-4277: remove unneeded LegacyTCLIService Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b --- M common/thrift/CMakeLists.txt D common/thrift/cli_service.thrift M testdata/bin/wait-for-hiveserver2.py 3 files changed, 8 insertions(+), 1,138 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4844/1 -- To view, visit http://gerrit.cloudera.org:8080/4844 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I827590b19dc542f6256ae2e0d541eaa32a76520b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4260: Alter table add column drops all the column stats
Thomas Tauber-Marshall has uploaded a new change for review. http://gerrit.cloudera.org:8080/4845 Change subject: IMPALA-4260: Alter table add column drops all the column stats .. IMPALA-4260: Alter table add column drops all the column stats Hive expects types for column stats to be specified as all lower case. For some reason, it doesn't check this when the stats are first written, but it does check when performing an 'alter table'. This causes it to drop stats that Impala wrote because we specify type names in upper case. This patch converts the types that Impala sends to Hive for the column stats to all lower case and adds a regression test. I also filed HIVE-15061 to track the issue from the Hive end. Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M testdata/workloads/functional-query/queries/QueryTest/compute-stats.test 2 files changed, 54 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4845/1 -- To view, visit http://gerrit.cloudera.org:8080/4845 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ia373ec917efa7ab9f2a59b8a870b7ebc30175dda Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.
Alex Behm has submitted this change and it was merged. Change subject: IMPALA-1702: Enforce single-table consistency in query analysis. .. IMPALA-1702: Enforce single-table consistency in query analysis. Catalogd managed table id generation causes problems when new updates arrive at frontend while queries are being analyzed: references to the same table may suddenly refer to a different version, and different tables may share the same table id. This causes problems when one table overrides another one in the thrift descriptor table sent to backend. This commit removes the table id from the catalog Table object; instead frontend assigns a unique id to each table in DescriptorTable.toThrift(). It also implements a referencedTables_ cache in Analyzer::globalState_ so that calling Analyzer::getTable() on the same table returns the same reference for the same query. Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Reviewed-on: http://gerrit.cloudera.org:8080/4349 Tested-by: Internal Jenkins Reviewed-by: Alex Behm --- M common/thrift/CatalogObjects.thrift M common/thrift/Descriptors.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IncompleteTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java D fe/src/main/java/org/apache/impala/catalog/TableId.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java M fe/src/main/java/org/apache/impala/catalog/View.java M fe/src/main/java/org/apache/impala/common/Id.java M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/JoinTableId.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 27 files changed, 204 insertions(+), 219 deletions(-) Approvals: Internal Jenkins: Verified Alex Behm: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.
Alex Behm has posted comments on this change. Change subject: IMPALA-1702: Enforce single-table consistency in query analysis. .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.
Internal Jenkins has posted comments on this change. Change subject: IMPALA-1702: Enforce single-table consistency in query analysis. .. Patch Set 17: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Huaisi Xu Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Huaisi Xu Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: PS5, Line 801: Registered > I don't think 'Registered' is very meaningful to users (and also it's being I had suggested moving it to be closer to the register call if this is called Registered, but 'Query submitted' is a good and then it makes sense to put this back where it was. We mostly just wanted to avoid the confusion of using 'Start execution' before this goes through planning and AC. -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala
Hello Matthew Jacobs, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4047 to look at the new patch set (#13). Change subject: IMPALA-3725 Support Kudu UPSERT in Impala .. IMPALA-3725 Support Kudu UPSERT in Impala This patch introduces a new query statement, UPSERT, for Kudu tables which operates like an INSERT and uses all of the analysis, planning, and execution machinery as INSERT, except that if there's a primary key collision instead of returning an error an update is performed. New syntax: [with_clause] UPSERT INTO [TABLE] table_name [(column list)] { query_stmt | VALUES (value [, value...]) [, (value [, (value...)]) ...] } where column list must contain all of the key columns in table_name, if specified, and table_name must be a Kudu table. This patch also improves the behavior of INSERTing into Kudu tables without specifying all of the key columns - this now results in an analysis exception, rather than attempting the INSERT and receiving an error back from Kudu. Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9 --- M be/src/exec/kudu-table-sink.cc M common/thrift/DataSinks.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java A fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test 17 files changed, 656 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/4047/13 -- To view, visit http://gerrit.cloudera.org:8080/4047 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler
Lars Volker has posted comments on this change. Change subject: IMPALA-4086: Add benchmark for simple scheduler .. Patch Set 3: PS3 is rebase only. Still needs changes to work on top of Schedule() instead of ComputeScanRangeAssignments(). -- To view, visit http://gerrit.cloudera.org:8080/4554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler
Lars Volker has uploaded a new patch set (#3). Change subject: IMPALA-4086: Add benchmark for simple scheduler .. IMPALA-4086: Add benchmark for simple scheduler Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/simple-scheduler-benchmark.cc M be/src/util/benchmark.cc M be/src/util/debug-util.cc M be/src/util/debug-util.h 5 files changed, 176 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4554/3 -- To view, visit http://gerrit.cloudera.org:8080/4554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load
Henry Robinson has posted comments on this change. Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after load .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Dan Hecht has posted comments on this change. Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: PS1, Line 193: BufferOpts const& ? http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: PS1, Line 642: without the client providing a " : "buffer" is this going to make sense to a user? Line 708: DCHECK_LE(buffer_size, max_buffer_size_); how do we ensure this is true? is it the check in DiskIoMgr::Read() that will always enforce? http://gerrit.cloudera.org:8080/#/c/4631/1/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: PS1, Line 147: these buffers are returned to the caller wrapped : /// in BufferDescriptors. took a few seconds to parse and understand. how about: ... these buffers are wrapped in BufferDescriptors and return to the caller. Line 150: /// another read. In error cases, the IoMgr will recycle the buffers more promptly but is this relevant for all cases, or only cases (a) and (b)? PS1, Line 392: bool maybe make all of these members 'const' since they are "options" that don't change, right? PS1, Line 422: BufferOpts const BufferOpts& ? Line 506: int64_t client_buffer_len_; maybe move this right before cached_buffer_ since it's related in the sense that it's another possible source of the buffer. Alternatively, consider even making a union to make it clear that they are mutually exclusive, and the union tag indicates the buffer "mode" (Managed, HdfsCached, Client - or something like that). Though it could be that doesn't work out cleanly given that HdfsCached can fallback to Managed mode. -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has uploaded a new patch set (#19). Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION .. IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION Just like Hive, Implala should support multiple partitions in ALTER TABLE ADD PARTITION statements. The syntax is as follows: ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec1 [location_spec1] [cache_spec1] PARTITION partition_spec2 [location_spec2] [cache_spec2] ... Grammar was modified to handle the new syntax. Introduced PartitionParams class to capture the repeatable part of the statement. TPartitionParams is the name of the corresponding thrift class. AlterTableAddPartitionStmt and CatalogOpExecutor classes were also modified to work with a list of partitions. Duplicate partition specs are rejected in AlterTableAddPartitionStmt.analyze(). Added FE, E2E and integration tests. Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test M tests/common/impala_test_suite.py M tests/metadata/test_hms_integration.py M tests/metadata/test_refresh_partition.py 16 files changed, 940 insertions(+), 215 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/19 -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has posted comments on this change. Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION .. Patch Set 18: Reordered import statements -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION
Attila Jeges has uploaded a new patch set (#19). Change subject: IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION .. IMPALA-1670,IMPALA-4141: Support multiple partitions in ALTER TABLE ADD PARTITION Just like Hive, Implala should support multiple partitions in ALTER TABLE ADD PARTITION statements. The syntax is as follows: ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec1 [location_spec1] [cache_spec1] PARTITION partition_spec2 [location_spec2] [cache_spec2] ... Grammar was modified to handle the new syntax. Introduced PartitionParams class to capture the repeatable part of the statement. TPartitionParams is the name of the corresponding thrift class. AlterTableAddPartitionStmt and CatalogOpExecutor classes were also modified to work with a list of partitions. Duplicate partition specs are rejected in AlterTableAddPartitionStmt.analyze(). Added FE, E2E and integration tests. Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 --- M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java A fe/src/main/java/org/apache/impala/analysis/PartitionParams.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M testdata/workloads/functional-query/queries/QueryTest/alter-table.test M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test M tests/common/impala_test_suite.py M tests/metadata/test_hms_integration.py M tests/metadata/test_refresh_partition.py 16 files changed, 940 insertions(+), 215 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/4144/19 -- To view, visit http://gerrit.cloudera.org:8080/4144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iddbc951f2931f488f7048c9780260f6b49100750 Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache
Dan Hecht has posted comments on this change. Change subject: IMPALA-4223: Handle truncated file read from HDFS cache .. Patch Set 1: (1 comment) > (1 comment) > > > (1 comment) > > > > Can we test this by having the test load metadata and then > truncate > > a cached file? > > We can try to do this in a custom cluster test. It needs to follow > the steps outlined here: > https://issues.cloudera.org/browse/IMPALA-4223?focusedCommentId=212776&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-212776 > > They require changes to the system limits to allow for larger > cached files, and to the data nodes to increase the file cache as > well, so they might be rather disruptive. Once these are changed we > can write a custom cluster test to download, truncate, and upload > files and run queries over them, checking that the correct log > messages appear. > > Should we break this out into several Jiras / changes? The limits > should be change in impala-setup, too. The datanode settings change > will be required to integrate this into fuzz testing, too. > > I will also have a look at the scanner fuzz test and see if it is > easy to inject these error there. Yes, if it's a good amount of work to set up that kind of test, let's save it for another step. It's probably better to spend time with getting the fuzzer working rather than this one particular test case. http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS1, Line 448: disk > I followed the error from L433, assuming that since we already report a sim Okay. I'm fine either leaving this one as "disk" or changing both. -- To view, visit http://gerrit.cloudera.org:8080/4828 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler
Lars Volker has uploaded a new patch set (#2). Change subject: IMPALA-4086: Add benchmark for simple scheduler .. IMPALA-4086: Add benchmark for simple scheduler Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/simple-scheduler-benchmark.cc M be/src/util/benchmark.cc M be/src/util/debug-util.cc M be/src/util/debug-util.h 5 files changed, 191 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4554/2 -- To view, visit http://gerrit.cloudera.org:8080/4554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4086: Add benchmark for simple scheduler
Lars Volker has posted comments on this change. Change subject: IMPALA-4086: Add benchmark for simple scheduler .. Patch Set 1: (3 comments) Thanks Tim for the review. I addressed your comments in PS2. I will rebase this next, then pick up work on it, so no need to review again as of now. http://gerrit.cloudera.org:8080/#/c/4554/1/be/src/benchmarks/simple-scheduler-benchmark.cc File be/src/benchmarks/simple-scheduler-benchmark.cc: Line 130: /// according to the parameter 'replica_preference'. > Mention that it uses the default # blocks? Done Line 147: /// Build and run a benchmark suite for various table sizes. Scheduling will be done > Mention that it's done with the default cluster size? Done http://gerrit.cloudera.org:8080/#/c/4554/1/be/src/util/benchmark.cc File be/src/util/benchmark.cc: PS1, Line 51: by > I don't think this "by" is needed Done -- To view, visit http://gerrit.cloudera.org:8080/4554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I89ec1c6c1828bb0b86d1e13ce4dfc5a8df865c2e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Alex Behm has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR](hadoop-next) Merge remote-tracking branch 'gerrit/master' into HEAD
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4841 Change subject: Merge remote-tracking branch 'gerrit/master' into HEAD .. Merge remote-tracking branch 'gerrit/master' into HEAD Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f --- M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java 1 file changed, 0 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/4841/1 -- To view, visit http://gerrit.cloudera.org:8080/4841 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: hadoop-next Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR](hadoop-next) Merge remote-tracking branch 'gerrit/master' into HEAD
Tim Armstrong has posted comments on this change. Change subject: Merge remote-tracking branch 'gerrit/master' into HEAD .. Patch Set 1: Code-Review+2 Verified+1 Clean merge -- To view, visit http://gerrit.cloudera.org:8080/4841 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1b56e4dbb67889bbe4ff8462ccc2b49968821a1f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: hadoop-next Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Lars Volker has posted comments on this change. Change subject: IMPALA-2521: Add clustered hint to insert statements .. Patch Set 12: Rebase done in PS12 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements
Hello Marcel Kornacker, Internal Jenkins, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4745 to look at the new patch set (#12). Change subject: IMPALA-2521: Add clustered hint to insert statements .. IMPALA-2521: Add clustered hint to insert statements This change introduces a clustered/noclustered hint for insert statements. Specifying this hint adds an additional sort node to the plan, just before the table sink. This has the effect that data will be clustered by its partition prior to writing partitions, which therefore can be written sequentially. Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 --- M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/SortInfo.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/insert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test 10 files changed, 417 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/12 -- To view, visit http://gerrit.cloudera.org:8080/4745 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 5: Thanks! Do the exhaustive AC tests pass? -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Henry Robinson has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: PS5, Line 801: Registered I don't think 'Registered' is very meaningful to users (and also it's being set *before* registration, so the time is wrong. "Query submitted" or something might be clearer. The idea is an event that shows that Impala has started processing a query. I preferred it in its earlier place - what was the rationale for moving it? http://gerrit.cloudera.org:8080/#/c/4756/5/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 157: /// Resource pool associated with this query, or an empty string if the schedule has not Blank line before this. Line 159: // control. missing / PS5, Line 161: NULL nullptr -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache
Lars Volker has posted comments on this change. Change subject: IMPALA-4223: Handle truncated file read from HDFS cache .. Patch Set 1: (1 comment) > (1 comment) > > Can we test this by having the test load metadata and then truncate > a cached file? We can try to do this in a custom cluster test. It needs to follow the steps outlined here: https://issues.cloudera.org/browse/IMPALA-4223?focusedCommentId=212776&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-212776 They require changes to the system limits to allow for larger cached files, and to the data nodes to increase the file cache as well, so they might be rather disruptive. Once these are changed we can write a custom cluster test to download, truncate, and upload files and run queries over them, checking that the correct log messages appear. Should we break this out into several Jiras / changes? The limits should be change in impala-setup, too. The datanode settings change will be required to integrate this into fuzz testing, too. I will also have a look at the scanner fuzz test and see if it is easy to inject these error there. http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS1, Line 448: disk > how about "uncached" instead? I followed the error from L433, assuming that since we already report a similar case it made sense to be consistent. Should I change both? -- To view, visit http://gerrit.cloudera.org:8080/4828 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4350: Crash with vlog level 2 in hash join node
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4350: Crash with vlog level 2 in hash join node .. Patch Set 2: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/4830 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieedd49d8a17709177a622ddc15b78a3f48e12d3f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Remove seemingly incorrect DCHECK-s.
Tim Armstrong has posted comments on this change. Change subject: Remove seemingly incorrect DCHECK-s. .. Patch Set 1: (1 comment) I agree that this looks like a bug. http://gerrit.cloudera.org:8080/#/c/4835/1//COMMIT_MSG Commit Message: Line 7: Remove seemingly incorrect DCHECK-s. Can you file an Impala JIRA for this? -- To view, visit http://gerrit.cloudera.org:8080/4835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. Patch Set 5: (3 comments) For COMPUTE STATS, the "compute stats" query and both its child queries will be displayed on the /queries page. The child queries operate just like regular queries, eg. they are subject to admission control and can be shown as "Queued". The "compute stats" query is not subject to admission control, and its "Last Event" will be shown as "Planning Finished" while the child queries are running, before moving to "Child queries finished" http://gerrit.cloudera.org:8080/#/c/4756/4/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: PS4, Line 609: queries_json = impa > remove Done PS4, Line 613: assert query['last_event'] != 'Registered' and \ : query['last_event'] != 'Planning finished' : assert query['resource_pool'] == self.pool_name : else: : assert query['resource_pool'] == '' : > This else clause seems a bit weird. The logic would probably be easier if f Done PS4, Line 620: turns the number of queries currently in the 'queued' state from t > In my comment before I meant to give an example rather than something we ca Done -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.
Alex Behm has posted comments on this change. Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting classes. .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/4746/6/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java: Line 117: // canEvalUsingPartitionMd() to fold constant expressions in-place. > comment why the cloning is necessary. It's not really necessary but seems saner not to modify the original Expr registered in the analyzer. Modified comment. http://gerrit.cloudera.org:8080/#/c/4746/6/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java: Line 40: public Expr rewrite(Expr expr, Analyzer analyzer) throws AnalysisException { > i would have expected this to apply a single rule to the entire expr tree u I implemented your proposed alternative so you can see the code. I'm fine with either version. Line 52: Expr origExpr = rewrittenExpr.clone(); > why is this necessary? if the convention is that apply() always returns the Good point, done. -- To view, visit http://gerrit.cloudera.org:8080/4746 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage
Thomas Tauber-Marshall has uploaded a new patch set (#5). Change subject: IMPALA-1169: Admission control info on the queries debug webpage .. IMPALA-1169: Admission control info on the queries debug webpage This patch adds a new event, 'Queued', to the query event log to indicate when a query is queued by the admission controller. This means that queries on the '/queries' page that are currently queued will display this as their 'Last Event', making it possible to see which queries are current queued. It also adds a column to show the resource pool associated with the queries. Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 --- M be/src/scheduling/admission-controller.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.h M tests/common/impala_service.py M tests/custom_cluster/test_admission_controller.py M www/queries.tmpl 9 files changed, 71 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4756/5 -- To view, visit http://gerrit.cloudera.org:8080/4756 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.
Alex Behm has uploaded a new patch set (#7). Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting classes. .. IMPALA-4309: Introduce Expr rewrite phase and supporting classes. Introduces a new phase for rewriting Exprs after analysis and before subquery rewriting. The transformed Exprs replace the original ones in analyzed statements. If Exprs were changed, the whole statement is reset() and re-analyzed, similar to how subqueries are rewritten. If both Exprs and subqueries are rewritten there is only one re-analysis of the changed statement. The following new classes work together to perform transformations: 1. ExprRewriteRule - base class for Expr transformation rules 2. ExprRewriter - drives the transformation of Exprs using a list of ExprRewriteRules Statements that have exprs to be rewritten need to implement a new method rewriteExprs() that accepts an ExprRewriter. As an example, this patch adds a rule for converting BetweenPredicates into their equivalent CompoundPredicates. The BetweenPredicate has been notoriously buggy due to a lack of such a separate rewrite phase and is now cleaned up. Testing: 1. Added a new test for checking that the rewrite framework covers all relevant statements, clauses and can properly handle nested statements and subqueries. 2. There are many existing tests for BetweePredicates and they all exercise the new rewrite rule/phase. 3. Ran a private core/hdfs run and it passed. Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e --- M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectList.java M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java A fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java A fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java A fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java A fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test 29 files changed, 551 insertions(+), 216 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4746/7 -- To view, visit http://gerrit.cloudera.org:8080/4746 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache
Dan Hecht has posted comments on this change. Change subject: IMPALA-4223: Handle truncated file read from HDFS cache .. Patch Set 1: (1 comment) Can we test this by having the test load metadata and then truncate a cached file? http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: PS1, Line 448: disk how about "uncached" instead? -- To view, visit http://gerrit.cloudera.org:8080/4828 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove seemingly incorrect DCHECK-s.
Dan Hecht has posted comments on this change. Change subject: Remove seemingly incorrect DCHECK-s. .. Patch Set 1: Is it reasonably possible to create a file that demonstrates this and can be added to the regression test suite? -- To view, visit http://gerrit.cloudera.org:8080/4835 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52e1b1354e9ea056b49331e75e53759952a81b76 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No