[Impala-ASF-CR] IMPALA-4564,IMPALA-4565: mt dop fixes for old aggs and joins
Dan Hecht has posted comments on this change. Change subject: IMPALA-4564,IMPALA-4565: mt_dop fixes for old aggs and joins .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48c50c8aa0c23710eb099fba252bc3c0cb74b313 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4564,IMPALA-4565: mt dop fixes for old aggs and joins
Michael Ho has posted comments on this change. Change subject: IMPALA-4564,IMPALA-4565: mt_dop fixes for old aggs and joins .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I48c50c8aa0c23710eb099fba252bc3c0cb74b313 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4564,IMPALA-4565: mt dop fixes for old aggs and joins
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/5302 Change subject: IMPALA-4564,IMPALA-4565: mt_dop fixes for old aggs and joins .. IMPALA-4564,IMPALA-4565: mt_dop fixes for old aggs and joins Fix a test bug where we need to skip nested types tests for the old aggs and joins. Fix a product bug where *eos is not initialised by the MT scan node. This causes incorrect results when the calling ExecNode does not initialise the eos variable, e.g. the sort node and the old agg and join nodes. Testing: Added a test that reproduces the incorrect results with the sort node when run under ASAN Tested the mt_dop tests locally with old aggs and joins to ensure they pass. Change-Id: I48c50c8aa0c23710eb099fba252bc3c0cb74b313 --- M be/src/exec/hdfs-scan-node-mt.cc A testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-nested.test M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test M tests/query_test/test_mt_dop.py 4 files changed, 50 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/5302/1 -- To view, visit http://gerrit.cloudera.org:8080/5302 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I48c50c8aa0c23710eb099fba252bc3c0cb74b313 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4303: Do not reset() qualifier of union operands.
Alex Behm has posted comments on this change. Change subject: IMPALA-4303: Do not reset() qualifier of union operands. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4963/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java: Line 56:* Contains a query statement and the all/distinct qualifier > adapt or remove that last sentence, which is redundant anyway. Done Line 61: // distinct propagation and unnesting that are needed after rewriting Subqueries. > i don't understand the semantics of UnionStmt.reset() anymore. if we're try Your summary seems correct to me. UnionStmt.reset() undoes all analysis state, except unnesting and distinct propagation. There is no need to preserve distinctOperands_ and allOperands_ because operands_ also contains those after analysis. A saner version of reset() would of course undo all changes made in analyze(). Unfortunately, that is rather difficult because analyze() does in-place modifications to the operands and even nested operands_ lists during unnesting, and also overwrites the original operands_. Added comment to reset(). -- To view, visit http://gerrit.cloudera.org:8080/4963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I157bb0f08c4a94fd779487d7c23edd64a537a1f6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] Bracketing Java logging output with log level checks part 2.
Internal Jenkins has posted comments on this change. Change subject: Bracketing Java logging output with log level checks part 2. .. Patch Set 1: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/572/ -- To view, visit http://gerrit.cloudera.org:8080/5297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib94b3a20d439d854f579d9086755eb19699fcb68 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Alex Behm has posted comments on this change. Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. Patch Set 6: Code-Review+2 trivial fix due to conflict after rebase -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Hello Marcel Kornacker, Internal Jenkins, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5274 to look at the new patch set (#6). Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. The bug was that HdfsScanNodeMt::Close() did not properly clean up all in-flight resources when called through the query cancellation path. The main change is to clean up all resources when passing a NULL batch into HdfsparquetScanner::Close() which also needed similar changes in the scanner context. Testing: Ran test_cancellation.py, test_scanners.py and test_nested_types.py with MT_DOP=3. Added a test query with a limit that was failing before. A regular private hdfs/core test run succeeded. Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/parquet-column-readers.h M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test 6 files changed, 53 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/6 -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support .. Patch Set 1: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/571/ -- To view, visit http://gerrit.cloudera.org:8080/5295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib517d17ab12e215fe87f35bc5d03cdda736ff672 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Impala Public Jenkins has submitted this change and it was merged. Change subject: Fix undefined calls to __builtin_ctz. .. Fix undefined calls to __builtin_ctz. GCC's __builtin_ctz[l[l]] functions return undefined results when the argument is 0. This patch handles that 0 case, which could otherwise produce results that depend on the architecture. Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Reviewed-on: http://gerrit.cloudera.org:8080/5004 Reviewed-by: Jim AppleReviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M be/src/exprs/aggregate-functions-ir.cc M be/src/udf_samples/hyperloglog-uda.cc M be/src/util/bit-util.h 3 files changed, 45 insertions(+), 21 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, but someone else must approve Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Impala Public Jenkins has posted comments on this change. Change subject: Fix undefined calls to __builtin_ctz. .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4498: crash in to utc timestamp/from utc timestamp
Dan Hecht has posted comments on this change. Change subject: IMPALA-4498: crash in to_utc_timestamp/from_utc_timestamp .. Patch Set 4: > > What about the try/catch block in timestamp-functions-ir.cc. > > Shouldn't that code be moved into the native code so that it > > actually works? > > > > Also, maybe we should add try/catch in TimestampValue::DebugString() > > to be safe? > > The question is where to stop. We should really audit all of our > boost calls for uncaught exceptions. Sure, but these ones are already known to be problematic. -- To view, visit http://gerrit.cloudera.org:8080/5251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc427b06ac33ec874a05cb98d01c00e970d3dde6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Bracketing Java logging output with log level checks.
Internal Jenkins has submitted this change and it was merged. Change subject: Bracketing Java logging output with log level checks. .. Bracketing Java logging output with log level checks. This reduces creation of intermediate objects and improves performance. Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Reviewed-on: http://gerrit.cloudera.org:8080/5284 Reviewed-by: Marcel KornackerTested-by: Internal Jenkins --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/extdatasource/ExternalDataSourceExecutor.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsPartitionFilter.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/main/java/org/apache/impala/util/RequestPoolService.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java 39 files changed, 388 insertions(+), 193 deletions(-) Approvals: Marcel Kornacker: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] Bracketing Java logging output with log level checks.
Internal Jenkins has posted comments on this change. Change subject: Bracketing Java logging output with log level checks. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3125: Fix assignment of equality predicates from an outer-join On-clause.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3125: Fix assignment of equality predicates from an outer-join On-clause. .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/4986/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 1322:* whether it is correct to evaluate 'e' at a node materializing 'tids'. rephrase as "returns true if..." Line 1331:* Checks whether 'e' originates from an outer-join On-clause, and if so, checks same here -- To view, visit http://gerrit.cloudera.org:8080/4986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I719e0eeacccad070b1f9509d80aaf761b572add0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3126: Conservative assignment of inner-join On-clause predicates.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3126: Conservative assignment of inner-join On-clause predicates. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4982/1/testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test File testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test: Line 459: | | other predicates: a.tinyint_col < 10, b.tinyint_col > 20 i'm not following: this predicate is from the inner join on clause and references an outer-joined tuple, but now it's *not* evaluated by the inner join? seems to contradict "If the predicate references an outer-joined tuple, then evaluate it at the inner join that the On-clause belongs to." -- To view, visit http://gerrit.cloudera.org:8080/4982 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idf45323ed9102ffb45c9d94a130ea3692286f215 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join. .. Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/4960/2//COMMIT_MSG Commit Message: Line 7: IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join. long lines Line 28: The fix is to conservatively not mark bound predicates as assigned if there exists if there are Line 30: duplicate assignments of predicates. Those are simply deduped now. i don't understand that last part. http://gerrit.cloudera.org:8080/#/c/4960/2/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: Line 1548: (evalAfterJoin(srcConjunct) move to preceding line and fix indentation, like l1551 Line 1550: != globalState_.outerJoinedTupleIds.get(srcTid))) fix indentation Line 1553:!= globalState_.outerJoinedTupleIds.get(destTid))); same here http://gerrit.cloudera.org:8080/#/c/4960/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1194: Expr.removeDuplicates(conjuncts); how do you end up with duplicate predicates in a *scan* node due to the changes in analyzer? (and why is this only relevant for hdfs scans?) -- To view, visit http://gerrit.cloudera.org:8080/4960 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #27 Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL"
Alex Behm has posted comments on this change. Change subject: IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL" .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5259/2/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: Line 552: CREATE TABLE {db_name}{db_suffix}.{table_name}_idx ( Can we defer changes to this file until later? I'm worried having a stale data snapshot and causing all GVMs and private builds to do a full data load. -- To view, visit http://gerrit.cloudera.org:8080/5259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I950d9a9c64e3851e11a641573617790b340ece94 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store_sales .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/570/ -- To view, visit http://gerrit.cloudera.org:8080/5177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Internal Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4357: Fix DROP TABLE to pass analysis if the table fails to load
Alex Behm has posted comments on this change. Change subject: IMPALA-4357: Fix DROP TABLE to pass analysis if the table fails to load .. Patch Set 2: (1 comment) This is a great fix! Pretty minimal. I think there is an existing JIRA to fix this behavior for DROP of tables that fail to load. Let me know if you cannot find it, I can help you dig it up. http://gerrit.cloudera.org:8080/#/c/5144/2/fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java File fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java: Line 100: // We should still try to DROP tables that failed to load, so that tables that are We need to add an access event for auditing in this case, because we are indeed going to access (drop) the table. We should also add a test for this case. You can take a look at AnalyzerTest.TestUnsupportedTypes() for suitable test tables. We should also add a test in AnalyzeDDL that shows such tables pass analysis. -- To view, visit http://gerrit.cloudera.org:8080/5144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6b41fc3c0e95508ab67f1d420b033b02ec75a5da Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] Bracketing Java logging output with log level checks part 2.
Marcel Kornacker has posted comments on this change. Change subject: Bracketing Java logging output with log level checks part 2. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib94b3a20d439d854f579d9086755eb19699fcb68 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4303: Do not reset() qualifier of union operands.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4303: Do not reset() qualifier of union operands. .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4963/1/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java: Line 56:* Contains a query statement and the all/distinct qualifier adapt or remove that last sentence, which is redundant anyway. Line 61: // distinct propagation and unnesting that are needed after rewriting Subqueries. i don't understand the semantics of UnionStmt.reset() anymore. if we're trying to preserve the changes made during distinct propagation/unnesting, why are we resetting distinct-/allOperands_? can you describe the semantics of reset() succinctly? -- To view, visit http://gerrit.cloudera.org:8080/4963 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I157bb0f08c4a94fd779487d7c23edd64a537a1f6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] Bracketing Java logging output with log level checks part 2.
Alex Behm has uploaded a new change for review. http://gerrit.cloudera.org:8080/5297 Change subject: Bracketing Java logging output with log level checks part 2. .. Bracketing Java logging output with log level checks part 2. This reduces creation of intermediate objects and improves performance. Change-Id: Ib94b3a20d439d854f579d9086755eb19699fcb68 --- M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateUdfStmt.java M fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/AuthorizeableDb.java M fe/src/main/java/org/apache/impala/authorization/AuthorizeableUri.java M fe/src/main/java/org/apache/impala/authorization/ImpalaInternalAdminUser.java M fe/src/main/java/org/apache/impala/authorization/Privilege.java M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequest.java M fe/src/main/java/org/apache/impala/authorization/User.java 30 files changed, 148 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/5297/1 -- To view, visit http://gerrit.cloudera.org:8080/5297 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib94b3a20d439d854f579d9086755eb19699fcb68 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 11: Code-Review+1 (1 comment) Carry +1 http://gerrit.cloudera.org:8080/#/c/5250/11/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 83: per_host_mem_usage_(NULL), later patchset depended on this being NULL but it was left uninitialized which can sometimes lead to a crash, so fixing this. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#11). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 67 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/11 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4363: Add Parquet timestamp validation .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/4968/9/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 657: Unnecessary blank line. http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: Line 151: const static int64_t NUMBER_OF_NANOSECONDS_IN_24H = boost provides a direct way to construct the min and max date with boost::gregorian::date(boost::date_time::min_date_time) boost::gregorian::date(boost::date_time::max_date_time) PS7, Line 160: g buffer. The size of the buffer should be a > Yes, that's true, some days can have an extra second: https://en.wikipedia. It looks like boost date_time doesn't support leap seconds: https://groups.google.com/forum/#!topic/boost-list/i_0h_amkk-c "Leap second support never got added to the library -- most applications can ignore them. In principle, of course, you can use the library to help you do the calculations. Basically they are like leap years except that they are irregular -- so you have to use a collection to perform the adjustment instead of an algorithm." I think we should allow for one leap second per day just so we don't regress anything where the data is potentially valid. It would be good to add some testing of this (maybe as a follow) to make sure we don't crash, even if we do return bogus results. -- To view, visit http://gerrit.cloudera.org:8080/4968 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4498: crash in to utc timestamp/from utc timestamp
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4498: crash in to_utc_timestamp/from_utc_timestamp .. Patch Set 4: > What about the try/catch block in timestamp-functions-ir.cc. > Shouldn't that code be moved into the native code so that it > actually works? > > Also, maybe we should add try/catch in TimestampValue::DebugString() > to be safe? The question is where to stop. We should really audit all of our boost calls for uncaught exceptions. -- To view, visit http://gerrit.cloudera.org:8080/5251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc427b06ac33ec874a05cb98d01c00e970d3dde6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Dan Hecht has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Dan Hecht has posted comments on this change. Change subject: Fix undefined calls to __builtin_ctz. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4498: crash in to utc timestamp/from utc timestamp
Dan Hecht has posted comments on this change. Change subject: IMPALA-4498: crash in to_utc_timestamp/from_utc_timestamp .. Patch Set 4: What about the try/catch block in timestamp-functions-ir.cc. Shouldn't that code be moved into the native code so that it actually works? Also, maybe we should add try/catch in TimestampValue::DebugString() to be safe? -- To view, visit http://gerrit.cloudera.org:8080/5251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc427b06ac33ec874a05cb98d01c00e970d3dde6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support
Henry Robinson has posted comments on this change. Change subject: IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib517d17ab12e215fe87f35bc5d03cdda736ff672 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/5295 Change subject: IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support .. IMPALA-4562: Fix for crash on kerberized clusters w/o Kudu support commit de88f0c4af3a07ae6bd6b8c94edcb8748468f522 for "IMPALA-4497: Fix Kudu client crash w/ SASL initialization" causes a crash on secure clusters where kudu is not supported. kudu::client::DisableSaslInitialization() from libkudu_client.so.0 impala::InitAuth(std::string const&) () impala::InitCommonRuntime() () ImpaladMain(int, char**) () main () This ensures Impala does not call the Kudu client to handle SASL init on systems where libkudu_client.so is a stub. Change-Id: Ib517d17ab12e215fe87f35bc5d03cdda736ff672 --- M be/src/rpc/authentication.cc 1 file changed, 4 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/5295/1 -- To view, visit http://gerrit.cloudera.org:8080/5295 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib517d17ab12e215fe87f35bc5d03cdda736ff672 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 16: Code-Review+2 Carrying +2. I'm running S3 core/hdfs core/hdfs exhaustive tests on this just to make sure we aren't breaking anything. I'll run the GVO once they pass. Thanks Dimitris and Alex. -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Hello Dimitris Tsirogiannis, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5148 to look at the new patch set (#16). Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. IMPALA-4172/IMPALA-3653: Improvements to block metadata loading This patch improves the block metadata loading (locations and disk storage IDs) for partitioned and un-partitioned tables in the Catalog server. Without this patch: -- We loop through each and every file in the table/partition directories and call getFileBlockLocations() on it to obtain the block metadata. This results in large number of RPC calls to the Namenode, especially for tables with large no. of files/partitions. With this patch: --- We move the block metadata querying to use listStatus() call which accepts a directory as input and fetches the 'BlockLocation' objects for every file recursively in that directory. This improves the metadata loading in the following ways. - For non-partitioned tables, we query all the BlockLocations in a single RPC call in the base table directory and load the corresponding disk IDs. - For partitioned tables, we query the BlockLocations for all the partitions residing under the base table directories in a single RPC and then load every partition with non-default partition directory separately. - REFRESH on a table reloads the block metadata from scratch for every data file every time. So it can be used as a replacement for invalidate in situations like HDFS block rebalancing which needs block metadata update. Also, this patch does away with VolumeIds returned by the HDFS NN and uses the new StorageIDs returned by the BlockLocation class. These StorageIDs are UUID strings and hence are mapped to a per-node 0-based index as expected by the backend. In the upcoming versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated and instead the listStatus() returns BlockLocations with storage IDs embedded. This patch makes use of this improvement to reduce an additional RPC to the NN to fetch the storage locations. Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java 6 files changed, 365 insertions(+), 338 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/16 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/5148/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS15, Line 794: for (Path location: locations) { : loadBlockMetadata(fs, location, partsByPath); : } > nit: single line? Done PS15, Line 1034: while reusing existing file descriptors to avoid loading metadata for files :* that haven't changed. > This is no longer valid. Remove? Done -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Alex Behm has posted comments on this change. Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Hello Marcel Kornacker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5274 to look at the new patch set (#5). Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. The bug was that HdfsScanNodeMt::Close() did not properly clean up all in-flight resources when called through the query cancellation path. The main change is to clean up all resources when passing a NULL batch into HdfsparquetScanner::Close() which also needed similar changes in the scanner context. Testing: Ran test_cancellation.py, test_scanners.py and test_nested_types.py with MT_DOP=3. Added a test query with a limit that was failing before. A regular private hdfs/core test run succeeded. Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/parquet-column-readers.h M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test 6 files changed, 52 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/5274/5 -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Alex Behm has posted comments on this change. Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/5274/4/be/src/exec/scanner-context.cc File be/src/exec/scanner-context.cc: PS4, Line 110: bufDesc > Wrong case for C++ code. Maybe just buffer? Oops, sorry. Done. http://gerrit.cloudera.org:8080/#/c/5274/4/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test: Line 12: from tpch_nested_parquet.customer c, c.c_orders o > out of curiosity, why does this require a nested table to invoke the cancel I don't think it's required but I had a hard time finding another query on our existing test data that could reliably reproduce the issue. The cancellation tests repro the problem every time, but I wanted to add another "simpler" test here as well. -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4163: Add sortby() query hint
Lars Volker has posted comments on this change. Change subject: IMPALA-4163: Add sortby() query hint .. Patch Set 3: I moved the parsing code into the lexer and parser. This changed some of the error behavior, where errors that used to be caught in analysis are now caught during parsing. I also rebased the change since it depended on the "clustered" hint which was recently merged. -- To view, visit http://gerrit.cloudera.org:8080/5051 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I37a3ffab99aaa5d5a4fd1ac674b3e8b394a3c4c0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4498: crash in to utc timestamp/from utc timestamp
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4498: crash in to_utc_timestamp/from_utc_timestamp .. Patch Set 4: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/5251/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 3616: "as timestamp), interval 13 months) as string)", odd indentation. could you try to fix the formatting rules? http://gerrit.cloudera.org:8080/#/c/5251/4/be/src/exprs/timestamp-functions.cc File be/src/exprs/timestamp-functions.cc: Line 54: void ThrowExceptionIfTimestampOutOfRange(const boost::gregorian::date& date) { > We need to catch exceptions anyway in the caller anyway, since other boost remove Exception, it's implied in Throw. remove Timestamp, you're not checking a timestamp. -- To view, visit http://gerrit.cloudera.org:8080/5251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc427b06ac33ec874a05cb98d01c00e970d3dde6 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Alex Behm has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 15: Code-Review+1 New changes lgtm, Nice catch with the useless map, Dimitris! -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] Bracketing Java logging output with log level checks.
Marcel Kornacker has posted comments on this change. Change subject: Bracketing Java logging output with log level checks. .. Patch Set 2: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/5284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Jim Apple has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Michael Ho has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5281/1/be/src/runtime/free-pool.h File be/src/runtime/free-pool.h: Line 214: static AtomicInt32 alloc_counts_; > I think it's clearer to explicitly initialize the value. Done -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5281 to look at the new patch set (#3). Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc FLAGS_stress_free_pool_alloc is a gflag for simulating malloc failure in debug builds. If set, FreePool::Allocate() and FreePool::Reallocate() will return NULL on every FLAGS_stress_free_pool_alloc allocations. The problem is that the counter for tracking number of allocations is updated racily by multiple threads, leading to non-determinism and flaky tests. This change fixes the problem by making the counter an AtomicInt32. Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 --- M be/src/common/atomic.h M be/src/runtime/CMakeLists.txt A be/src/runtime/free-pool.cc M be/src/runtime/free-pool.h 4 files changed, 41 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/5281/3 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Jim Apple has posted comments on this change. Change subject: Fix undefined calls to __builtin_ctz. .. Patch Set 5: Code-Review+1 rebase carry tim's +1 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL"
Hello Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5259 to look at the new patch set (#2). Change subject: IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL" .. IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL" This commit reverts the behavior introduced by IMPALA-3719 which used the Kudu default behavior for column nullability if none was specified in the CREATE TABLE statement. With this commit, non-key columns of Kudu tables that are created from Impala are by default nullable unless specified otherwise. Change-Id: I950d9a9c64e3851e11a641573617790b340ece94 --- M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test 4 files changed, 51 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/5259/2 -- To view, visit http://gerrit.cloudera.org:8080/5259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I950d9a9c64e3851e11a641573617790b340ece94 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL"
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL" .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5259/1/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java: Line 104: if (column.isSetIs_nullable()) { > This won't handle the ALTER TABLE ADD COLUMN case. Let's think about whethe There is no such place to handle both these cases. The fact that primary keys can be defined in both the column definitions and as a separate list of column names complicates things, i.e. we can't be setting the nullability while analyzing the ColumnDef. Changed the code to handle the addColumn as well. -- To view, visit http://gerrit.cloudera.org:8080/5259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I950d9a9c64e3851e11a641573617790b340ece94 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[Impala-ASF-CR] Bracketing Java logging output with log level checks.
Alex Behm has posted comments on this change. Change subject: Bracketing Java logging output with log level checks. .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 15: Code-Review+2 (2 comments) Nice! http://gerrit.cloudera.org:8080/#/c/5148/15/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS15, Line 794: for (Path location: locations) { : loadBlockMetadata(fs, location, partsByPath); : } nit: single line? PS15, Line 1034: while reusing existing file descriptors to avoid loading metadata for files :* that haven't changed. This is no longer valid. Remove? -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Jim Apple has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5281/1/be/src/runtime/free-pool.h File be/src/runtime/free-pool.h: Line 214: static AtomicInt32 alloc_counts_; > https://github.com/apache/incubator-impala/blob/master/be/src/common/atomic I think it's clearer to explicitly initialize the value. While you're doing this, I think it's also best to turn the DCHECK in atomic.h into a static_assert to make it impossible for there to be a Static Initialization Order Fiasco bug with DCHECK's logging facilities: https://isocpp.org/wiki/faq/ctors#static-init-order -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS9, Line 94: prepared_promise_.Set(status); > How about moving this above line 93, so anyone waiting on it won't have to Yeah, I thought the semantics of the promise were clearer to do it last (it's a barrier to the completion of Prepare()), but I don't feel too strongly about it so I've reversed it. http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS9, Line 109: > "Open() fails" Done -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#10). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 63 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/10 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Michael Ho has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5281/1/be/src/runtime/free-pool.h File be/src/runtime/free-pool.h: Line 214: static AtomicInt32 alloc_counts_; > I do not see that. https://github.com/apache/incubator-impala/blob/master/be/src/common/atomic.h#L73 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3788: Add flag for Kudu read-your-writes
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3788: Add flag for Kudu read-your-writes .. Patch Set 1: This is still being tested. David is going to test this with some Kudu changes going in to master soon, then will test this in our stress tests. -- To view, visit http://gerrit.cloudera.org:8080/5288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I003aba410548bc9158d1e11abbdcf710c31a82ff Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS5, Line 72: qs->refcnt_.Add(1); > Another example: an ExecPlanFragment RPC can arrive late after some instanc what you're describing is a complete corner case. i don't see a need to optimize for that. Line 141: > With the current state of things, after all fragment instances finish execu i do not want to introduce another counter to keep track of fragment instances, the whole point of this is to keep the qs around until all references to it are gone (and not just those needed for instance execution - i'd prefer to have the final status report rpc be sent when the qs gets gc'ed). those late-arriving rpcs you're describing sound like another corner case, and i don't see how they would end up consuming a lot of resources. i would be worried about them if they increased the map lock contention by a sizable amount, say 30 or 50 or 100%, but i just don't see how that's possible. aside from that, it's very easy to reduce the lock contention by an arbitrary amount (by partitioning the key space and having a separate map plus lock per partition). i just don't think we need that quite yet. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS5, Line 351: TUniqueId no_instance_id_; > Sorry I meant that it doesn't seem to be initialized. Where is it initializ the default c'tor initializes it. -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3788: Add flag for Kudu read-your-writes
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/5288 Change subject: IMPALA-3788: Add flag for Kudu read-your-writes .. IMPALA-3788: Add flag for Kudu read-your-writes The previous attempt to support for Kudu 'read-your-writes' consistency successfully captured the latest observed ts from the Kudu client after a write, and to propagate it to future Kudu clients within the same session. That alone made writes within a session linearizable, but it did not fully address 'read-your-writes' semantics because the Kudu client in the KuduScanner needed further configuration. The Kudu client exposes an option to set the 'ReadMode', which can be either READ_LATEST or READ_AT_SNAPSHOT. The former is the default and allows the client to read the latest known value for every row, and there is no consistency among the version of the rows read within that scan. When READ_AT_SNAPSHOT is enabled, the client will pick a ts that is after the latest observed session ts (propagated and set with SetLatestObservedTimestamp() by the previous commit for IMPALA-3788) and perform a snapshot read at that time. This timestamp is still determined per-client, so that does not mean that the entire query performs a snapshot read at the same timestamp-- doing that requires further work in Kudu and will require another change in Impala as well. That said, this behavior is sufficient to satisfy 'read-your-writes' consistency in all cases _except_ when a DML statement is reading and writing the same table, e.g. INSERT INTO foo SELECT ... from foo This case may result in reading rows that were inserted by a different node of the same query. This case will be handled when a global snapshot timestamp is supported and configured by Impala. Because this is performing a snapshot read, some rows may be read from lagging replicas and thus those replicas will have to wait before returning rows. This has implications for the query execution behavior (e.g. queries may be more likely to time out, may affect number of queries that can be run), so the behavior is not yet enabled by default. It can be enabled with the flag --kudu_read_snapshot. The goal is to make this the default behavior after sufficient testing. Change-Id: I003aba410548bc9158d1e11abbdcf710c31a82ff --- M be/src/exec/kudu-scanner.cc 1 file changed, 6 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/5288/1 -- To view, visit http://gerrit.cloudera.org:8080/5288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I003aba410548bc9158d1e11abbdcf710c31a82ff Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Jim Apple has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5281/1/be/src/runtime/free-pool.h File be/src/runtime/free-pool.h: Line 214: static AtomicInt32 alloc_counts_; > Also surround this with NDEBUG? And initialize it? -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5281 to look at the new patch set (#2). Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc FLAGS_stress_free_pool_alloc is a gflag for simulating malloc failure in debug builds. If set, FreePool::Allocate() and FreePool::Reallocate() will return NULL on every FLAGS_stress_free_pool_alloc allocations. The problem is that the counter for tracking number of allocations is updated racily by multiple threads, leading to non-determinism and flaky tests. This change fixes the problem by making the counter an AtomicInt32. Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/free-pool.cc M be/src/runtime/free-pool.h 3 files changed, 38 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/5281/2 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Henry Robinson has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS9, Line 94: prepared_promise_.Set(status); How about moving this above line 93, so anyone waiting on it won't have to wait for the report RPC to complete? http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS9, Line 109: "Open() fails" -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5148 to look at the new patch set (#15). Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. IMPALA-4172/IMPALA-3653: Improvements to block metadata loading This patch improves the block metadata loading (locations and disk storage IDs) for partitioned and un-partitioned tables in the Catalog server. Without this patch: -- We loop through each and every file in the table/partition directories and call getFileBlockLocations() on it to obtain the block metadata. This results in large number of RPC calls to the Namenode, especially for tables with large no. of files/partitions. With this patch: --- We move the block metadata querying to use listStatus() call which accepts a directory as input and fetches the 'BlockLocation' objects for every file recursively in that directory. This improves the metadata loading in the following ways. - For non-partitioned tables, we query all the BlockLocations in a single RPC call in the base table directory and load the corresponding disk IDs. - For partitioned tables, we query the BlockLocations for all the partitions residing under the base table directories in a single RPC and then load every partition with non-default partition directory separately. - REFRESH on a table reloads the block metadata from scratch for every data file every time. So it can be used as a replacement for invalidate in situations like HDFS block rebalancing which needs block metadata update. Also, this patch does away with VolumeIds returned by the HDFS NN and uses the new StorageIDs returned by the BlockLocation class. These StorageIDs are UUID strings and hence are mapped to a per-node 0-based index as expected by the backend. In the upcoming versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated and instead the listStatus() returns BlockLocations with storage IDs embedded. This patch makes use of this improvement to reduce an additional RPC to the NN to fetch the storage locations. Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 --- M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java 6 files changed, 366 insertions(+), 336 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/15 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Tim Armstrong has posted comments on this change. Change subject: Fix undefined calls to __builtin_ctz. .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS5, Line 72: qs->refcnt_.Add(1); > i'm not sure what you're talking about. the qs is reused for subsequent fra Another example: an ExecPlanFragment RPC can arrive late after some instances for the same query have already arrived and finished executing and GC'd the QS. The example you mention also sounds scary, since it doesn't seem like the code anticipates that. That could mean FInstances that arrive late may start execution after the query has been cancelled, wasting resources. PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1); > why? it's not just the execution threads that need qs references. For the same reason I mention in the comments in L141. Line 141: > i'm not sure what you mean by 'unnecessarily'. the qs is gc'd when the refc With the current state of things, after all fragment instances finish executing (either successfully or not), there is no use of servicing any of the async calls. Those are UpdateFilter, TransmitData, CancelPlanFragment, etc. Having the code this way could lead to a chain effect where these late RPCs that arrive after all FInstances have completed (successful or not) keep getting references to the QS only to find that the query is already complete, causing unnecessary contention on the qs_map_lock_ and having the QS longer than necessary. Late RPCs will be more evident when we move to a model where we queue RPCs rather than send them immediately (i.e. after we move to KuduRPC), so I feel we need to keep that in mind with future patches moving forward. So what I'm proposing is to know when all FInstances have completed execution (via another counter) and not give away new references to async RPCs after that. Of course this is all moot if you think we will be introducing RPCs in the future that will still need to be serviced after ALL FInstances have completed executing. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS5, Line 338: local_query_ctx_ > the .h file points out that it's only used when there's no querystate, ie, Ok, makes sense. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS5, Line 351: /// Use pointer to avoid i > it's initialized to an invalid id and used in l127 of this file. Sorry I meant that it doesn't seem to be initialized. Where is it initialized to an invalid ID? -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4543: Properly escape ignored tests subdirectories.
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4543: Properly escape ignored tests subdirectories. .. IMPALA-4543: Properly escape ignored tests subdirectories. In the shell, double-quoted strings are not very close to "raw" strings; double quotes end the string, but parameter expansion is also performed forstrings like "${FOO}". To pass strings from Python to the shell, I have replaced double quotes with single quotes and escaped the single quote characters in the strings. While I am here, add better logging in TestExecutor.run_tests to make errors like this easier to diagnose. Change-Id: I006eb559ec5f5b5b0379997fab945116dfc7e8d7 Reviewed-on: http://gerrit.cloudera.org:8080/5242 Reviewed-by: Jim AppleTested-by: Impala Public Jenkins --- M tests/run-tests.py 1 file changed, 11 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I006eb559ec5f5b5b0379997fab945116dfc7e8d7 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4543: Properly escape ignored tests subdirectories.
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4543: Properly escape ignored tests subdirectories. .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I006eb559ec5f5b5b0379997fab945116dfc7e8d7 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] Start a docs build system.
Jim Apple has posted comments on this change. Change subject: Start a docs build system. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5238/2/docs/Makefile File docs/Makefile: > I guess my feeling is that make is quirky enough that it's not simple to le As far as I know, it will remain simple. -- To view, visit http://gerrit.cloudera.org:8080/5238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Anonymous Coward #250 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Start a docs build system.
Tim Armstrong has posted comments on this change. Change subject: Start a docs build system. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5238/2/docs/Makefile File docs/Makefile: > I considered it, but I thought for a build this primitive it would be best I guess my feeling is that make is quirky enough that it's not simple to learn and tricky to debug once you have non-trivial makefiles. It seems like it we could probably write an equivalent cmake file with add_custom_command, etc and it wouldn't be any more complex. I think it depends how much this is likely to grow. If it's going to get a lot more complex I'm in favour of avoiding make. If it's going to stay like this, make seems fine. -- To view, visit http://gerrit.cloudera.org:8080/5238 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Anonymous Coward #250 Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Jim Apple has posted comments on this change. Change subject: Fix undefined calls to __builtin_ctz. .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/5004/3/be/src/udf_samples/hyperloglog-uda.cc File be/src/udf_samples/hyperloglog-uda.cc: PS3, Line 78: uint > uint64_t? Seems clearer to specify the type and it's not too verbose. Done PS3, Line 80: (hash_top_bit > hash_top_bits != 0 would make the intent clearer (it's a matter of taste, b Done http://gerrit.cloudera.org:8080/#/c/5004/3/be/src/util/bit-util.h File be/src/util/bit-util.h: Line 32: #include > Is this used? Yes, but I was just moving it in the #include list, not using it. It's used elsewhere. Line 247: /// zero. > We don't use the optional argument, so it seems like we should just remove We do use the optional argument in be/src/exprs/aggregate-functions-ir.cc. Explained the undefinedness in a comment. -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix undefined calls to builtin ctz.
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5004 to look at the new patch set (#4). Change subject: Fix undefined calls to __builtin_ctz. .. Fix undefined calls to __builtin_ctz. GCC's __builtin_ctz[l[l]] functions return undefined results when the argument is 0. This patch handles that 0 case, which could otherwise produce results that depend on the architecture. Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe --- M be/src/exprs/aggregate-functions-ir.cc M be/src/udf_samples/hyperloglog-uda.cc M be/src/util/bit-util.h 3 files changed, 45 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/5004/4 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Marcel Kornacker has uploaded a new patch set (#6). Change subject: IMPALA-4014: Introduce query-wide execution state. .. IMPALA-4014: Introduce query-wide execution state. This introduces a global structure to coordinate execution of fragment instances on a backend for a single query. New classes: - QueryExecMgr: subsumes FragmentMgr - QueryState - FragmentInstanceState: replaces FragmentExecState It also adds a static function Thread::Run() to run a function in a newly created thread. This is now used to execute a fragment instance, instead of managing and destroying Thread objects via shared_ptrs, etc. Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b --- M be/src/benchmarks/expr-benchmark.cc M be/src/exec/catalog-op-executor.cc M be/src/exec/data-source-scan-node.cc M be/src/exec/exchange-node.cc M be/src/exec/hash-table-test.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/union-node.cc M be/src/exprs/expr-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h R be/src/runtime/fragment-instance-state.cc A be/src/runtime/fragment-instance-state.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h A be/src/runtime/query-exec-mgr.cc A be/src/runtime/query-exec-mgr.h A be/src/runtime/query-state.cc A be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/thread-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/request-pool-service.cc M be/src/scheduling/simple-scheduler.cc M be/src/service/CMakeLists.txt M be/src/service/fe-support.cc D be/src/service/fragment-exec-state.h D be/src/service/fragment-mgr.cc D be/src/service/fragment-mgr.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-http-handler.cc A be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M be/src/testutil/desc-tbl-builder.h M be/src/udf/udf.cc M be/src/util/container-util.h M be/src/util/thread.cc M be/src/util/thread.h M be/src/util/uid-util.h M common/thrift/ImpalaInternalService.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M fe/src/test/java/org/apache/impala/testutil/TestUtils.java 67 files changed, 1,215 insertions(+), 778 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/6 -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes. .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/5274/4/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test: Line 12: from tpch_nested_parquet.customer c, c.c_orders o out of curiosity, why does this require a nested table to invoke the cancellation path? -- To view, visit http://gerrit.cloudera.org:8080/5274 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Fix undefined cals to builtin ctz.
Tim Armstrong has posted comments on this change. Change subject: Fix undefined cals to __builtin_ctz. .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/5004/3/be/src/udf_samples/hyperloglog-uda.cc File be/src/udf_samples/hyperloglog-uda.cc: PS3, Line 78: auto uint64_t? Seems clearer to specify the type and it's not too verbose. PS3, Line 80: hash_top_bits hash_top_bits != 0 would make the intent clearer (it's a matter of taste, but that's generally what we do in the codebase). http://gerrit.cloudera.org:8080/#/c/5004/3/be/src/util/bit-util.h File be/src/util/bit-util.h: Line 32: #include Is this used? Line 247: unsigned int v, int otherwise = sizeof(unsigned int) * CHAR_BIT) { We don't use the optional argument, so it seems like we should just remove that generality. Would be good to explain that __builtin_ctz(0) is undefined too. -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS5, Line 938: || > joining condition should go before the line break? no http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS5, Line 72: qs->refcnt_.Add(1); > If this is the logic we're going with, I would suggest that this patch and i'm not sure what you're talking about. the qs is reused for subsequent fragment instances, unless you have some corner case (e.g., first instance fails and gc's qs). PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1); > After we have a per-query RPC, I think we should disallow getting the QS if why? it's not just the execution threads that need qs references. PS5, Line 135: qs->refcnt_.Load() > I think it will be better to put this LOG after getting the local 'cnt' val Done Line 141: > One problem I see with this is, every time we hit refcount=0, there is some i'm not sure what you mean by 'unnecessarily'. the qs is gc'd when the refcnt goes to 0. i don't see a problem with another async thread increment the refcnt again, given that that same thread will eventually also decrement it. why would that be a problem? http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: PS5, Line 66: /// lock order: : /// 1. qs_map_lock_ : /// 2. QueryState::refcnt_lock_ > Remove now that refcnt is an AtomicInt Done PS5, Line 69: boost::mutex qs_map_lock_ > This will soon need to be a RWlock, or it could introduce scaling issues. B i don't think it'll cause scaling issues, at least not at typical qps levels, but i'll let mostafa look into it. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS5, Line 338: local_query_ctx_ > If we're going to have a local_query_ctx_ anyway, why bother with accessing the .h file points out that it's only used when there's no querystate, ie, for tests. we definitely do not want to make a gratuitous copy of the queryctx, which can be a *very* large structure. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS5, Line 49: class ExecEnv; : class DataStreamMgr; : class HBaseTableFactory; : class TPlanFragmentCtx; : class TPlanFragmentInstanceCtx; : class QueryState; > Alphabetical order? that's immaterial to correctness or comprehensibility. PS5, Line 329: // needed? : // static const int DEFAULT_BATCH_SIZE = 1024; > Can remove? since its moved to query-state.h Done PS5, Line 351: TUniqueId no_instance_id_; > Who sets this? It doesn't seem to be used now. My guess is it should be set it's initialized to an invalid id and used in l127 of this file. -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4397 addendum: remove stray semicolon .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397 addendum: remove stray semicolon .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5261 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Hello Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5148 to look at the new patch set (#14). Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. IMPALA-4172/IMPALA-3653: Improvements to block metadata loading This patch improves the block metadata loading (locations and disk storage IDs) for partitioned and un-partitioned tables in the Catalog server. Without this patch: -- We loop through each and every file in the table/partition directories and call getFileBlockLocations() on it to obtain the block metadata. This results in large number of RPC calls to the Namenode, especially for tables with large no. of files/partitions. With this patch: --- We move the block metadata querying to use listStatus() call which accepts a directory as input and fetches the 'BlockLocation' objects for every file recursively in that directory. This improves the metadata loading in the following ways. - For non-partitioned tables, we query all the BlockLocations in a single RPC call in the base table directory and load the corresponding disk IDs. - For partitioned tables, we query the BlockLocations for all the partitions residing under the base table directories in a single RPC and then load every partition with non-default partition directory separately. - REFRESH on a table reloads the block metadata from scratch for every data file every time. So it can be used as a replacement for invalidate in situations like HDFS block rebalancing which needs block metadata update. Also, this patch does away with VolumeIds returned by the HDFS NN and uses the new StorageIDs returned by the BlockLocation class. These StorageIDs are UUID strings and hence are mapped to a per-node 0-based index as expected by the backend. In the upcoming versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated and instead the listStatus() returns BlockLocations with storage IDs embedded. This patch makes use of this improvement to reduce an additional RPC to the NN to fetch the storage locations. Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 --- A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java 5 files changed, 359 insertions(+), 335 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/14 -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 13: (13 comments) http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: PS13, Line 41: private DiskIdMapper() { : } > nit: single line Done PS13, Line 44: "storage ID UUID string" > nit: why quoted? Unquoted Line 55: * Returns a disk id (0-based) index for storageId on host 'host'. > This function doesn't only return a disk id (implying this is a getter), it Made the comment little detailed. PS13, Line 68: .intValue() > why do you need this? Won't this be unboxed automatically? Done PS13, Line 69: synchronized (storageIdGenerator) { > The common path now includes two calls to storageUuidToDiskId.get(), one sy Discussed in person. It was put outside the synchronized block so that callers with already mapped storage ids needn't enter the synchronized block. PS13, Line 74: stoprageUuid > typo: storageUuid Done http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS13, Line 315: if (!FileSystemUtil.isChildPath(partDir, dirPath)) continue; > Is this check really needed? Haven't you done this check while populating p Currently we don't do it at the callers. So I added the check here. partsByPath includes all the partitions that need to be updated. However this method does only for descendants of dirPath. PS13, Line 316: perPartitionFileDescMap_ > If I recall correctly, I added this to speedup incremental loading of file Looking closely, yes this map doesn't make sense anymore. We are unnecessarily doing puts and gets into it. Removed.Nice catch btw. PS13, Line 431: > nit: extra space Done PS13, Line 441: // Synthesize the block metadata for the file descriptor. : long start = 0; : long remaining = fd.getFileLength(); : // Workaround HADOOP-11584 by using the filesystem default block size rather than : // the block size from the FileStatus. : // TODO: after HADOOP-11584 is resolved, get the block size from the FileStatus. : long blockSize = fs.getDefaultBlockSize(); : if (blockSize < MIN_SYNTHETIC_BLOCK_SIZE) blockSize = MIN_SYNTHETIC_BLOCK_SIZE; : Preconditions.checkState(partitions.size() > 0); : // For the purpose of synthesizing block metadata, we assume that all partitions : // with the same location have the same file format. : HdfsFileFormat fileFormat = partitions.get(0).getFileFormat(); : if (!fileFormat.isSplittable(HdfsCompression.fromFileName(fd.getFileName( { : blockSize = remaining; : } : while (remaining > 0) { : long len = Math.min(remaining, blockSize); : List replicas = Lists.newArrayList( : new BlockReplica(hostIndex_.getIndex(REMOTE_NETWORK_ADDRESS), false)); : fd.addFileBlock(new FileBlock(start, len, replicas)); : remaining -= len; : start += len; : } > I would put this in a separate function. Done PS13, Line 474: numHdfsFiles_++; : totalHdfsBytes_ += partition.getSize(); > Is this counting correct here? We increment them for every fd (and its corresponding partition). We seem to be implementing reload() as drop() + add(). dropPartition() is subtracting them when required. Also loadPartitionFileMetadata() subtracts when we reload a subset of partitions. Do you think anything could be done better here? I agree its confusing since its done in multiple places. http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: PS13, Line 289: BlockLocation > Is this the name of the API call? If not, can you use the exact function na Rephrased it a little. Its no specific API call. Its just that only 'DistributedFileSystem's actually set storageIds in the BlockLocations and others don't. PS13, Line 418: child > child or descendant? Descendant. Updated the method name and callers. -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master
[Impala-ASF-CR] IMPALA-4498: crash in to utc timestamp/from utc timestamp
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4498: crash in to_utc_timestamp/from_utc_timestamp .. Patch Set 3: Rebased -- To view, visit http://gerrit.cloudera.org:8080/5251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idc427b06ac33ec874a05cb98d01c00e970d3dde6 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4014: Introduce query-wide execution state. .. Patch Set 5: (11 comments) Did a quick first pass. http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS5, Line 938: || joining condition should go before the line break? http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS5, Line 72: qs->refcnt_.Add(1); If this is the logic we're going with, I would suggest that this patch and the per-query RPC patch make it in with a relatively short gap between them. Else we will have the penalty of tearing down and setting up a QS multiple times for the same query. PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1); After we have a per-query RPC, I think we should disallow getting the QS if the refcount has already reached 0 (since that would mean all the fragments have completed). If you agree, you can add that as a TODO here. PS5, Line 135: qs->refcnt_.Load() I think it will be better to put this LOG after getting the local 'cnt' value and printing it as "refcnt=" << cnt + 1. So we can at least via the logs understand what decision was made by this function (attempt to gc vs return) based on the log line. Eg: If we see "refcnt=1", we will know that this would have attempted a GC. Also, its good to not lock the bus twice. Line 141: One problem I see with this is, every time we hit refcount=0, there is some leeway for some async call to get a reference. However, those async calls (late reports, late filters, etc.) will mostly be of no use since if the refcount hit 0, that means all the fragments are done executing. So why keep the QS around unnecessarily? http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: PS5, Line 66: /// lock order: : /// 1. qs_map_lock_ : /// 2. QueryState::refcnt_lock_ Remove now that refcnt is an AtomicInt PS5, Line 69: boost::mutex qs_map_lock_ This will soon need to be a RWlock, or it could introduce scaling issues. But that will have its own issues of reader or writer starvation, so won't be a trivial change. What do you think? http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS5, Line 338: local_query_ctx_ If we're going to have a local_query_ctx_ anyway, why bother with accessing the query_state_->query_ctx() ? http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS5, Line 49: class ExecEnv; : class DataStreamMgr; : class HBaseTableFactory; : class TPlanFragmentCtx; : class TPlanFragmentInstanceCtx; : class QueryState; Alphabetical order? PS5, Line 329: // needed? : // static const int DEFAULT_BATCH_SIZE = 1024; Can remove? since its moved to query-state.h PS5, Line 351: TUniqueId no_instance_id_; Who sets this? It doesn't seem to be used now. My guess is it should be set in the second constructor? -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/5281/1/be/src/runtime/free-pool.h File be/src/runtime/free-pool.h: Line 214: static AtomicInt32 alloc_counts_; Also surround this with NDEBUG? -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Bracketing Java logging output with log level checks.
Marcel Kornacker has uploaded a new change for review. http://gerrit.cloudera.org:8080/5284 Change subject: Bracketing Java logging output with log level checks. .. Bracketing Java logging output with log level checks. This reduces creation of intermediate objects and improves performance. Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 --- M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/Column.java M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/TableLoader.java M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java M fe/src/main/java/org/apache/impala/extdatasource/ExternalDataSourceExecutor.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsPartitionFilter.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java M fe/src/main/java/org/apache/impala/util/RequestPoolService.java M fe/src/main/java/org/apache/impala/util/SentryPolicyService.java 39 files changed, 392 insertions(+), 193 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/5284/1 -- To view, visit http://gerrit.cloudera.org:8080/5284 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie0f5123dbf2caf3b03183c76820599920baa9785 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Alex Behm has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: PS13, Line 69: synchronized (storageIdGenerator) { > The common path now includes two calls to storageUuidToDiskId.get(), one sy The common case is that at some point this mapping will be complete, and most requests will be served by doing a get() on the concurrent hashmap. The CC hashmap does not lock on get(), i.e., in the common case we can get away with any locking. -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4557: Fix flakiness with FLAGS stress free pool alloc
Michael Ho has uploaded a new change for review. http://gerrit.cloudera.org:8080/5281 Change subject: IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc .. IMPALA-4557: Fix flakiness with FLAGS_stress_free_pool_alloc FLAGS_stress_free_pool_alloc is a gflag for simulating malloc failure in debug builds. If set, FreePool::Allocate() and FreePool::Reallocate() will return NULL on every FLAGS_stress_free_pool_alloc allocations. The problem is that the counter for tracking number of allocations is updated racily by multiple threads, leading to non-determinism and flaky tests. This change fixes the problem by making the counter an AtomicInt32. Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/free-pool.cc M be/src/runtime/free-pool.h 3 files changed, 32 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/5281/1 -- To view, visit http://gerrit.cloudera.org:8080/5281 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I373035d209a0f3477650336ee6da458fa289dbe0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho
[Impala-ASF-CR] IMPALA-4433: Fix undefined NDV calculations
Jim Apple has restored this change. Change subject: IMPALA-4433: Fix undefined NDV calculations .. Restored No longer needed for stats, but might as well eliminate undefined behavior -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: restore Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix undefined cals to builtin ctz.
Jim Apple has abandoned this change. Change subject: Fix undefined cals to __builtin_ctz. .. Abandoned https://gerrit.cloudera.org/#/c/5004/ -- To view, visit http://gerrit.cloudera.org:8080/5278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: Id569c48824d1c4575d6bfde106df0eb80ab9623c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] Fix undefined cals to builtin ctz.
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5004 to look at the new patch set (#3). Change subject: Fix undefined cals to __builtin_ctz. .. Fix undefined cals to __builtin_ctz. GCC's __builtin_ctz[l[l]] functions return undefined results when the argument is 0. This patch handles that 0 case, which could otherwise produce results that depend on the architecture. Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe --- M be/src/exprs/aggregate-functions-ir.cc M be/src/udf_samples/hyperloglog-uda.cc M be/src/util/bit-util.h 3 files changed, 43 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/5004/3 -- To view, visit http://gerrit.cloudera.org:8080/5004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8460bc3f7e510ce07b8673387c9440accc432abe Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Amos Bird Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Fix undefined cals to builtin ctz.
Jim Apple has uploaded a new change for review. http://gerrit.cloudera.org:8080/5278 Change subject: Fix undefined cals to __builtin_ctz. .. Fix undefined cals to __builtin_ctz. GCC's __builtin_ctz[l[l]] functions return undefined results when the argument is 0. This patch handles that 0 case, which could otherwise produce results that depend on the architecture. Change-Id: Id569c48824d1c4575d6bfde106df0eb80ab9623c --- M be/src/exprs/aggregate-functions-ir.cc M be/src/udf_samples/hyperloglog-uda.cc M be/src/util/bit-util.h 3 files changed, 43 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/5278/1 -- To view, visit http://gerrit.cloudera.org:8080/5278 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id569c48824d1c4575d6bfde106df0eb80ab9623c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok()) FragmentComplete(status); > Yeah, but that's also how the old code worked. It's benign since in case o Changed this up a bit so we still call FragmentComplete() here. http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 488: DCHECK(!report_thread_active_); this patch was mostly a rebase, but also added these DCHECKs. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#9). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 62 insertions(+), 120 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/9 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#8). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 59 insertions(+), 120 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/8 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc File be/src/service/fragment-exec-state.cc: PS7, Line 51: prepare_promise_ > Can't we just use the PFEs prepared_promise_ ? They both track the same val Yes, if we expose a timeout interface from PFE. Given this is preexisting and these classes will probably need major refactoring for MT soon, I'd rather not muck with it now. PS7, Line 62: DCHECK(status.ok() || done); // if !status.ok() => done > Redundant with SendReport() Yes, but it is a precondition to both, and since this is the dcheck that caught the bug in the first place, I'm inclined to leave it. (there's no reason there can't be other callers to this method, and SendReport() doesn't always call this-- well it does after the coordinator fragment change, but the code hasn't been fully cleaned up to reflect that yet). -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok() && !report_status_cb_.empty()) { > There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), Yeah, but that's also how the old code worked. It's benign since in case of failure the resource pool will go away shortly, but I agree it's best to address this (and it's the original reason why I made Exec() always call FragmentComplete(), whereas it used to not on failure). PS6, Line 94: // FragmentComplete() depends on Prepare() having executed successfully, so must : // call report_status_cb_ directly if Prepare() failed. > Is this only because of the usage of per_host_mem_usage_ in SendReport()? I Also the profile() is in an "undefined" state. Let me fix this up. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok() && !report_status_cb_.empty()) { There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), the token will never be released. PS6, Line 94: // FragmentComplete() depends on Prepare() having executed successfully, so must : // call report_status_cb_ directly if Prepare() failed. Is this only because of the usage of per_host_mem_usage_ in SendReport()? If so, you can just check if that is 'nullptr' before using it there. Or we can call ReleaseThreadToken() here. Whatever seems better. http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc File be/src/service/fragment-exec-state.cc: PS7, Line 51: prepare_promise_ Can't we just use the PFEs prepared_promise_ ? They both track the same value. PS7, Line 62: DCHECK(status.ok() || done); // if !status.ok() => done Redundant with SendReport() -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: > Uploaded patch set 7. Fixed a comment about cancellation which, I don't think is true. And added comments to document the lifecycle requirements. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#7). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 58 insertions(+), 118 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/7 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading .. Patch Set 13: (13 comments) http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java: PS13, Line 41: private DiskIdMapper() { : } nit: single line PS13, Line 44: "storage ID UUID string" nit: why quoted? Line 55: * Returns a disk id (0-based) index for storageId on host 'host'. This function doesn't only return a disk id (implying this is a getter), it also generates and adds a disk id for a non-existentpair. That needs to be documented. PS13, Line 68: .intValue() why do you need this? Won't this be unboxed automatically? PS13, Line 69: synchronized (storageIdGenerator) { The common path now includes two calls to storageUuidToDiskId.get(), one synchronized block. Won't it be better if you make this function synchronized and remove the test and test and set logic? In this case you only need one call to storageUuidToDiskId.get(), this doesn't need to be a concurrent hash map and you still have 1 synchronized block with the same number of operations in it. PS13, Line 74: stoprageUuid typo: storageUuid http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: PS13, Line 315: if (!FileSystemUtil.isChildPath(partDir, dirPath)) continue; Is this check really needed? Haven't you done this check while populating partsByPath? PS13, Line 316: perPartitionFileDescMap_ If I recall correctly, I added this to speedup incremental loading of file descriptors for files that already existed in the metadata and for which their blocks hadn't changed. My understanding is that this code is no longer here (i.e. we already reload the file/block metadata from scratch). So, is this map still needed? PS13, Line 431: nit: extra space PS13, Line 441: // Synthesize the block metadata for the file descriptor. : long start = 0; : long remaining = fd.getFileLength(); : // Workaround HADOOP-11584 by using the filesystem default block size rather than : // the block size from the FileStatus. : // TODO: after HADOOP-11584 is resolved, get the block size from the FileStatus. : long blockSize = fs.getDefaultBlockSize(); : if (blockSize < MIN_SYNTHETIC_BLOCK_SIZE) blockSize = MIN_SYNTHETIC_BLOCK_SIZE; : Preconditions.checkState(partitions.size() > 0); : // For the purpose of synthesizing block metadata, we assume that all partitions : // with the same location have the same file format. : HdfsFileFormat fileFormat = partitions.get(0).getFileFormat(); : if (!fileFormat.isSplittable(HdfsCompression.fromFileName(fd.getFileName( { : blockSize = remaining; : } : while (remaining > 0) { : long len = Math.min(remaining, blockSize); : List replicas = Lists.newArrayList( : new BlockReplica(hostIndex_.getIndex(REMOTE_NETWORK_ADDRESS), false)); : fd.addFileBlock(new FileBlock(start, len, replicas)); : remaining -= len; : start += len; : } I would put this in a separate function. PS13, Line 474: numHdfsFiles_++; : totalHdfsBytes_ += partition.getSize(); Is this counting correct here? http://gerrit.cloudera.org:8080/#/c/5148/13/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: PS13, Line 289: BlockLocation Is this the name of the API call? If not, can you use the exact function name? PS13, Line 418: child child or descendant? -- To view, visit http://gerrit.cloudera.org:8080/5148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 5: (5 comments) Henry, could you take a look at patchset 6? The changes to address your comments were non-trivial to please re-review. http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS4, Line 294: VLOG_QUERY << "Open(): instance_id=" << runtime_state_->fragment_instance_id(); : S > nit: one line Done PS4, Line 298: status; > here it seems clearer to return status. Oops, yes, done. PS4, Line 331: { > I'm still of the opinion that this logic is better placed in FragmentExecSt I see what you mean now. Yeah, made that change, and also moved the callback invocation for when prepare fails into this class, so that at least it's consistent. I still think we should remove the callback indirection and simplify more, but agree this is a better intermediate state. PS4, Line 447: > the Done http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS4, Line 249: > Nit: extra space. Done -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#6). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 50 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/6 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#5). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 46 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/5 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4543: Properly escape ignored tests subdirectories.
Jim Apple has posted comments on this change. Change subject: IMPALA-4543: Properly escape ignored tests subdirectories. .. Patch Set 3: Code-Review+2 (1 comment) carry +2 http://gerrit.cloudera.org:8080/#/c/5242/2/tests/run-tests.py File tests/run-tests.py: PS2, Line 144: , single quotes cannot > Would be nice to add a comment about the complication with single and doub Done -- To view, visit http://gerrit.cloudera.org:8080/5242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I006eb559ec5f5b5b0379997fab945116dfc7e8d7 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes