[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Henry Robinson has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS1, Line 300: RuntimeState::Close() > The ownership is actually really simple - TestEnv owns the RuntimeStates an Oh, I see - no the problem is that TestEnv a) calls them query states and b) allocates and returns a naked pointer before finally wrapping it in a shared_ptr at the end of the method. I can make that change. -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.
Alex Behm has posted comments on this change. Change subject: IMPALA-1286: Extract common conjuncts from disjunctions. .. Patch Set 1: (11 comments) Still working on a minor test fix, sending out to continue CR. http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java File fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java: Line 31: * Extracts common conjuncts from disjunctive CompoundPredicates. > Might be worth commenting on how this handles > 2 disjuncts. I.e. you need Expanded comments on ExprRewriter and modified comment here. (I don't think we should document how the framework works in every rule.) Line 37: public class ExtractCommonConjunctRule implements ExprRewriteRule { > where is this being applied? At the end of analysis, made those changes (had forgotten this after rebasing/cleaning) Line 42: if (!Expr.IS_OR_PREDICATE.apply(expr)) return expr; > Isn't this check a little weak in the sense that it only checks the root's Having separate rules for normalization certainly makes sense, but I see them as being independent of this change. I don't think we'd change the code/behavior of the current rule even if we had normalization. In your example, we would not return 'a' because the single conjunct of child0 would be (a OR b) which is not common to any of the child1 conjuncts "a" and "c". Maybe you can come up with another example that breaks? Line 49: if (child1Conjuncts.contains(conjunct)) { > This is n^2 so will get very slow if we have long lists of disjuncts, e.g. Agree that this is problematic. Unfortunately, also non-trivial to fix and make sure the hashCode() and equals() functions are correct. I added an arbitrary threshold on the maximum number of Expr.equals() comparisons to bail on these pathological cases. It's also not clear whether this rule would be much worse than the codegen in the BE. Not a good argument to avoid the improvement, but the improvement here alone might not make a material impact for the overall query. Line 72: if (!child0Conjuncts.isEmpty()) { > At this point both child0Conjuncts and child1Conjuncts are non-empty becaus Right, forgot to correct after some cleanup. Done. PS1, Line 84: newDisjuncts.size() > 1 > See above - this should always be true. You are right, thanks. Done. http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: Line 146: "(!(int_col=5 or tinyint_col > 9) and double_col = 7) or " + > Slightly tangential, but should we apply De Morgan's first to normalise the Completely agree about having additional rules for normalization. I view those as somewhat independent of the current rule. I don't think the code/approach of the current rule would change much even if we had normalization. Of course, normalizing would make the current rule useful in more cases. Let's continue this train of thought and come up with useful normalization rules in a separate change set. Definitely needed. Line 153: "(int_col < 10 and id between 5 and 6) or " + > Does common conjunct extraction work if you have a BETWEEN expression and t Good point, there was a bug. Fixed bug and added tests. http://gerrit.cloudera.org:8080/#/c/4877/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test: Line 3384: p_brand = 'Brand#12' > i agree, i think we need normalization as a separate transformation step. t Agree. Line 3384: p_brand = 'Brand#12' > I'm not really sure how this patch handles a bushy predicate. Do we flatten Agree we need normalization as a separate set of rules. I'm not sure normalization is required for your specific example though. The interesting conjuncts are those that bubble up all the way to the top so that, e.g., we can assign them as scan predicates. I think the current implementation does that, although some opportunities for simplifying disjunctive subexpressions are missed. Line 3393: p_brand = 'Brand#23' > additional idea for transformation rule: derive additional, non-essential d Agree. Sounds like IMPALA-2108. -- To view, visit http://gerrit.cloudera.org:8080/4877 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.
Alex Behm has restored this change. Change subject: IMPALA-1286: Extract common conjuncts from disjunctions. .. Restored -- To view, visit http://gerrit.cloudera.org:8080/4877 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: restore Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4415: Fix unassigned scan range of size 1 .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4907 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster
David Knupp has posted comments on this change. Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster .. Patch Set 9: Ignore remote_load_data.py in patch set 8. I accidentally pushed an interim in-progress version of the file in that patch set. Patch set 9 has the right code. Sorry about the confusion. -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4352: test infra: store Impala/Kudu primary keys in object model
Taras Bobrovytsky has posted comments on this change. Change subject: IMPALA-4352: test infra: store Impala/Kudu primary keys in object model .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4873/1/tests/comparison/db_connection.py File tests/comparison/db_connection.py: Line 407: col.is_primary_key = True col.is_primary_key = col_name in primary_key_names -- To view, visit http://gerrit.cloudera.org:8080/4873 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib654b6cd0e8c2a172ffb7330497be4d4a751e6e5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael BrownGerrit-Reviewer: David Knupp Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3710: Kudu DML should ignore conflicts by default
Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/4911 Change subject: IMPALA-3710: Kudu DML should ignore conflicts by default .. IMPALA-3710: Kudu DML should ignore conflicts by default Removes the non-standard IGNORE syntax that was allowed for DML into Kudu tables to indicate that certain errors should be ignored, i.e. not fail the query and continue. However, because there is no way to 'roll back' mutations that occurred before an error occurs, tables are left in an inconsistent state and it's difficult to know what rows were successfully modified vs which rows were not. Instead, this change makes it so that we always 'ignore' these conflicts, i.e. a 'best effort'. In the future, when Kudu will provide the mechanisms Impala needs to provide a notion of isolation levels, then Impala will be able to provide options for more traditional semantics. After this change, the following errors are ignored: * INSERT where the PK already exists * UPDATE/DELETE where the PK doesn't exist Another follow-up patch will change other violations to be handled in this way as well, e.g. nulls inserted in non-nullable cols. Reporting: The number of rows inserted is reported to the coordinator, which makes the aggregate available to the shell and via the profile. TODO: Return rows modified for INSERT via HS2 (IMPALA-1789). TODO: Return rows modified for other CRUD (beeswax+hs2) (IMPALA-3713). TODO: Return error counts for specific warnings (IMPALA-4416). Testing: Updated tests. Ran all functional tests. More tests will be needed when other conflicts are handled in the same way. Change-Id: I83b5beaa982d006da4997a2af061ef7c22cad3f1 --- M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-table-sink.h M common/thrift/DataSinks.thrift M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeModifyStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test 15 files changed, 70 insertions(+), 146 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/4911/1 -- To view, visit http://gerrit.cloudera.org:8080/4911 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I83b5beaa982d006da4997a2af061ef7c22cad3f1 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS1, Line 300: RuntimeState::Close() > Fair point. I'm going to leave TestEnv for now - looking into that code it' The ownership is actually really simple - TestEnv owns the RuntimeStates and destroys them in TestEnv::TearDownQueryStates(). So you could just close them in that function. Maybe the red herring is that TestEnv::query_states_ uses shared_ptr. It could actually be switched to unique_ptr - the ownership is actually exclusive, the code was just written before C++11 support. -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1
Henry Robinson has posted comments on this change. Change subject: IMPALA-4415: Fix unassigned scan range of size 1 .. Patch Set 1: (2 comments) Started a GVO, but won't commit until I know that one of the Local / S3 builds are passing. http://gerrit.cloudera.org:8080/#/c/4907/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 518: // reached, move onto the next instance. > "move on to" Done Line 530: // If this assignment pushes this instance past the threshold, move onto the next > on to Done -- To view, visit http://gerrit.cloudera.org:8080/4907 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4907 to look at the new patch set (#2). Change subject: IMPALA-4415: Fix unassigned scan range of size 1 .. IMPALA-4415: Fix unassigned scan range of size 1 ComputeScanRangeAssignment() computes the per-node scan-range load as the total number of bytes to be scanned / number of nodes, and casts that to a float. However, that computation suffers from a precision issue where the average * num_nodes may be 1 byte less than the total. The scan range assignment loop continues until at least average * num_nodes bytes have been assigned. If the last scan range has only 1 byte, it will not be assigned (if it has 2 or more bytes, it will be considered for assignment before the loop exit condition is met). The fix is to make sure that all instances are assigned in the last iteration of the assignment loop, even if the per-node threshold is already met. Testing: No local repro was found - only S3 and LocalFS builds have it. A unit test requires a lot of infrastructure from simple-scheduler-util that doesn't exist yet (e.g. ComputeScanRangeAssignment() is not called). S3 and LocalFS full test builds are in progress. Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 --- M be/src/scheduling/simple-scheduler.cc 1 file changed, 14 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/4907/2 -- To view, visit http://gerrit.cloudera.org:8080/4907 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4415: Fix unassigned scan range of size 1 .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/4907/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 518: // reached, move onto the next instance. "move on to" Line 530: // If this assignment pushes this instance past the threshold, move onto the next on to -- To view, visit http://gerrit.cloudera.org:8080/4907 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Henry Robinson has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS1, Line 300: RuntimeState::Close() > Agreed. TestEnv also creates RuntimeStates so we should call Close() there Fair point. I'm going to leave TestEnv for now - looking into that code it's a rat's nest, and we leak RuntimeStates everywhere as far as I can tell. Added it to fe-support - I added a utility to call Close() on scope exit, because there are so many paths out of that method... Line 304: ExecEnv::GetInstance()->thread_mgr()->UnregisterPool(resource_pool_); > Any reason for changing the order in which you are closing now? (Unregister No reason, don't think they're dependent. -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4910 Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. IMPALA-4410: Safer tear-down of RuntimeState * Add RuntimeState::Close() which is guaranteed to release resources safely, rather than having PFE::Close() do the same piecemeal. * Fix for crash where PFE::Prepare() fails before descriptor table is created. Change-Id: I5f8244f41b247a82464f1e102a3b82e2001ccc1e Testing: Found by debug-actions, which has a reproducible test where PFE::Prepare() fails. Manually tested on master. --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc A be/src/util/scope-exit-trigger.h 5 files changed, 61 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/4910/1 -- To view, visit http://gerrit.cloudera.org:8080/4910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I5f8244f41b247a82464f1e102a3b82e2001ccc1e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: (1 comment) Augmented the comment. Also explained the relationship to the Coordinator locks. http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 215: /// holding this lock. > what about this? Updated. The condition was kind-of trivially true since it's only acquired in a couple of short functions. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Hello Marcel Kornacker, Matthew Jacobs, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4896 to look at the new patch set (#5). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/runtime/coordinator.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h 4 files changed, 66 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/5 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4415: Fix unassigned scan range of size 1
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4907 Change subject: IMPALA-4415: Fix unassigned scan range of size 1 .. IMPALA-4415: Fix unassigned scan range of size 1 ComputeScanRangeAssignment() computes the per-node scan-range load as the total number of bytes to be scanned / number of nodes, and casts that to a float. However, that computation suffers from a precision issue where the average * num_nodes may be 1 byte less than the total. The scan range assignment loop continues until at least average * num_nodes bytes have been assigned. If the last scan range has only 1 byte, it will not be assigned (if it has 2 or more bytes, it will be considered for assignment before the loop exit condition is met). The fix is to make sure that all instances are assigned in the last iteration of the assignment loop, even if the per-node threshold is already met. Testing: No local repro was found - only S3 and LocalFS builds have it. A unit test requires a lot of infrastructure from simple-scheduler-util that doesn't exist yet (e.g. ComputeScanRangeAssignment() is not called). S3 and LocalFS full test builds are in progress. Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 --- M be/src/scheduling/simple-scheduler.cc 1 file changed, 14 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/4907/1 -- To view, visit http://gerrit.cloudera.org:8080/4907 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id3af767ee9d121ca62ac383ef9e696a18dc903d6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. Patch Set 7: Fixed the bug found with IMPALA-4414 -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. Patch Set 7: Code-Review+2 Carry -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3202: DiskIoMgr improvements for new buffer pool
Hello Internal Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4631 to look at the new patch set (#6). Change subject: IMPALA-3202: DiskIoMgr improvements for new buffer pool .. IMPALA-3202: DiskIoMgr improvements for new buffer pool The main goal of this patch is to add support to the DiskIoMgr to read into client-provided buffers, instead of the DiskIoMgr allocating intermediate buffers. This is important for the new buffer pool because we want it to entirely manage its own memory, rather than delegating part of its memory management to the DiskIoMgr. Also do some cleanup: * Consolidate some correlated ScanRange parameters into a parameter struct. * Remove the "untracked" buffers mem tracker, which is no longer necessary. * Change the buffer type in DiskIoMgr to use uint8_t* to be consistent with the rest of Impala. Testing: Added a unit test. We also get coverage from the BufferedBlockMgr unit tests, any spilling tests and the Parquet tests with large footers. Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec --- M be/src/exec/base-sequence-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/scanner-context.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr-stress-test.cc M be/src/runtime/disk-io-mgr-stress.cc M be/src/runtime/disk-io-mgr-stress.h M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h 15 files changed, 404 insertions(+), 224 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/4631/6 -- To view, visit http://gerrit.cloudera.org:8080/4631 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I913fbb8ae6c242f1585992eb2a622509344dccec Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster
David Knupp has posted comments on this change. Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster .. Patch Set 7: (18 comments) http://gerrit.cloudera.org:8080/#/c/4769/7/bin/remote_data_load.py File bin/remote_data_load.py: PS7, Line 17: This script is a setup script > How about simply "this is a setup script" Done PS7, Line 21: The script expects a setup Impala development environment. > How about: Done PS7, Line 22: appropriately > It's not clear what this means. Maybe expand on this a little? Done PS7, Line 47: import logging > import statements should be above from x import y Done PS7, Line 111: , > remove Done PS7, Line 116: that, > remove Done PS7, Line 127: not > having a not here makes it harder to reason about, how about: Done PS7, Line 130: password=options.cm_pass) > Is this indent correct according to the style guide? I thought it was suppo This passes PEP-8 linter inspection. PS7, Line 204: Only pick the first cluster configured > This comment does not really add anything, it's obvious what the code is do Done PS7, Line 209: for service in cluster.get_all_services(): > Just an observation: it looks like we don't check that all services we need I think that confirming the readiness of the cluster is a good idea. I'm adding a method to do this as part of the init procedure. Right now, all it does is check that the required services are installed and running. It can be extended later to do other checks as well. Line 281: # See https://issues.cloudera.org/browse/IMPALA-4365 > It looks like the issue is resolved, is this still necessary? Removed. PS7, Line 338: "database_name" > Why would it ever be called "database_name"? Where is this constant coming I figured out it's the header from the beeline output. (Much of this code, I've inherited, and I'll confess I didn't catch every line.) I'll add a comment. PS7, Line 432: wmay > may Done PS7, Line 451: # "--skip_local_tests", > why is this commented out? Bascially, this is another remainder from Martin was last working on the script. In fact, this entire method may go away (it's not currently working to use runtest to run remote cluster tests, but you can do it calilng impala-py.test directly.) http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: Line 492: run-step "Loading nested data" load-nested.log \ > The command is very similar in both cases. The only difference is --cm-host Done http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: Line 411: if os.getenv("REMOTE_LOAD"): > base_load_dir = os.getenv("REMOTE_LOAD") or os.getenv("IMPALA_HOME") Done http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/load_nested.py File testdata/bin/load_nested.py: Line 301: parser.add_argument("-s", "--source-db", default="tpch_parquet") > don't we also need cm-host parameter here? Yup, but we pick that up from L300: cli_options.add_cluster_options(parser) Good eye though. I'll add a comment. http://gerrit.cloudera.org:8080/#/c/4769/7/testdata/bin/setup-hdfs-env.sh File testdata/bin/setup-hdfs-env.sh: Line 20: set -e > we are already setting -euo pipefail below Done -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster
David Knupp has uploaded a new patch set (#8). Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster .. IMPALA-4365: Enabling end-to-end tests on a remote cluster This patch lays the groundwork for loading data and running end-to-end tests on a remote CDH cluster. The requirements for the cluster to run the tests are: - Managed by Cloudera Manager (CM) - GPL Extras need to be installed - KMS and KeyTrustee installed and available as a service - SERDEPROPERTIES in the Hive DB modified to accept wide tables - Hive warehouse dir points to /test-warehouse The actual data loading is done via a new script, remote_data_load.py, which takes the CM host as an argument. It can be run from a client machine that is not a node of the cluster, but it needs to have the Impala repo checked out and Impala built. This insures that all of the necessary data load scripts are available, as well as setting up the environment properly (client binaries like beeline and the hbase shell are available, python libraries like cm_api are installed, necessary environment variables are defined, etc.) It should be noted that running remote_data_load.py will overwrite any local XML config files with the configurations downloaded from the remote cluster. Usage: remote_data_load.py [options] Options: -h, --helpshow this help message and exit --snapshot-file=SNAPSHOT_FILE Path to the test-warehouse archive --cm-user=CM_USER Cloudera Manager admin user --cm-pass=CM_PASS Cloudera Manager admin user password --gateway=GATEWAY Gateway host to upload the data from. If not set, uses the CM host as gateway. --ssh-user=SSH_USER System user on the remote machine with passwordless SSH configured. --no-load Do not try to load the snapshot --exploration-strategy=EXPLORATION_STRATEGY --testRun end-to-end tests against cluster Testing: This patch is being submitted with the understanding that there are still problems to work out with the remote data load script itself. However, since many of the existing build scripts also had to be modified, it is more important to make sure that no regressions were inadvertently introduced into the existing data load process. Loading data to a local mini-cluster was checked repeatedly while this patch was being developed, as well as running it against the Jenkins job that provides the test-warehouse snapshot used by the many other Impala CI builds that run daily. Remote data loading is working for the most part, although recent Kudu-related changes have introduced unforeseen problems: https://github.com/apache/incubator-impala/commit/041fa6d In the meantime, setting KUDU_IS_SUPPORTED to false provides a temporary workaround. Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 --- M bin/load-data.py A bin/remote_data_load.py M testdata/bin/compute-table-stats.sh M testdata/bin/create-load-data.sh M testdata/bin/create-table-many-blocks.sh M testdata/bin/generate-schema-statements.py M testdata/bin/load-test-warehouse-snapshot.sh M testdata/bin/load_nested.py M testdata/bin/run-step.sh M testdata/bin/setup-hdfs-env.sh 10 files changed, 794 insertions(+), 64 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/4769/8 -- To view, visit http://gerrit.cloudera.org:8080/4769 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: David Knupp Gerrit-Reviewer: Harrison Sheinblatt Gerrit-Reviewer: Martin Grund Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4229: Download aux test data if Impala-aux repo is available.
David Knupp has abandoned this change. Change subject: IMPALA-4229: Download aux test data if Impala-aux repo is available. .. Abandoned After a further discussion with Jim, it was decided that we need fewer (read: zero) references to Impala-aux in our ASF repo, so the fix may be larger than just this one line change. -- To view, visit http://gerrit.cloudera.org:8080/4901 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: abandon Gerrit-Change-Id: I089e61477a71f7f248aec74d01db6655fd203801 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-3853: More RAT cleaning.
Jim Apple has uploaded a new change for review. http://gerrit.cloudera.org:8080/4904 Change subject: IMPALA-3853: More RAT cleaning. .. IMPALA-3853: More RAT cleaning. Apache RAT is a tool to audit code repositories for the ASF copyright rules. Our wrapper script around it found a few more things; this patch fixes those things. Change-Id: I01367ea26feaf6a3e2cf4ac04f1c6a63f6e66195 --- M bin/check-rat-report.py M bin/rat_exclude_files.txt M testdata/datasets/tpcds/tpcds_kudu_template.sql M testdata/datasets/tpch/tpch_kudu_template.sql 4 files changed, 41 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/4904/1 -- To view, visit http://gerrit.cloudera.org:8080/4904 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I01367ea26feaf6a3e2cf4ac04f1c6a63f6e66195 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: (1 comment) if you could augment the comment to include the locks that aren't part of the hierarchy yet, that would be helpful. http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 215: /// holding this lock. what about this? -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: this needs a +2 from an owner. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review-1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4896 to look at the new patch set (#4). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/runtime/coordinator.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h 4 files changed, 50 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/4 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Revert "IMPALA-4314: Standardize on MT-related data structures"
Marcel Kornacker has posted comments on this change. Change subject: Revert "IMPALA-4314: Standardize on MT-related data structures" .. Patch Set 1: Code-Review-1 Henry is working on a fix. -- To view, visit http://gerrit.cloudera.org:8080/4903 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review+2 Thanks for adding the comment. Will be very helpful in the future. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h File be/src/service/impala-server.h: PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or : /// QueryExecState::lock_ if they will be held simulataneously. > Instead of just having this above this lock, it would be better to spell ou Done -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h File be/src/service/impala-server.h: PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or : /// QueryExecState::lock_ if they will be held simulataneously. Instead of just having this above this lock, it would be better to spell out the lock ordering of the main locks in question at the top of this header file, based on the DAG you posted in the JIRA. I have a feeling there might be other places where we mess up the lock ordering too that we don't know of yet. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Revert "IMPALA-4314: Standardize on MT-related data structures"
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/4903 Change subject: Revert "IMPALA-4314: Standardize on MT-related data structures" .. Revert "IMPALA-4314: Standardize on MT-related data structures" This reverts commit 0d857237a8d5231caf9af4c30d4134b151cd1e0d. Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d --- M be/src/benchmarks/expr-benchmark.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler-test-util.cc M be/src/scheduling/simple-scheduler-test-util.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M common/thrift/Frontend.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test 23 files changed, 1,174 insertions(+), 637 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/4903/1 -- To view, visit http://gerrit.cloudera.org:8080/4903 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ia484b70e5545fe696b8aadcbb9a0b36555a0918d Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-4411: Kudu inserts violate lock ordering and could deadlock
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4411: Kudu inserts violate lock ordering and could deadlock .. Patch Set 2: Code-Review+2 carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/4895 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2aa36fce78525a80a6b880e1b668105006bc1425 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4411: Kudu inserts violate lock ordering and could deadlock
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4411: Kudu inserts violate lock ordering and could deadlock .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/4895/1//COMMIT_MSG Commit Message: PS1, Line 17: Running an exhaustive test job, and stress tests : manually. > Did you reproduce the hang? It should be easy to do if you add a sleep call I wasn't able to reproduce it easily. I tried inserting sleeps and running stress and cancellation tests, but no luck. I think it may have required a rare combination of events across a number of threads, e.g. like the one mentioned by Tim in his related patch for fixing another case of mis-ordering the same locks. http://gerrit.cloudera.org:8080/#/c/4895/1/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: PS1, Line 532: // This should be safe to access on coord_ after Wait() has been called. > This is a bit vague. Is it safe or not? Done PS1, Line 535: "Updating session latest observed Kudu timestamp: " << latest_kudu_ts > Add the session ID here, otherwise this message is not going to be useful ( Done -- To view, visit http://gerrit.cloudera.org:8080/4895 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2aa36fce78525a80a6b880e1b668105006bc1425 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.
Alex Behm has posted comments on this change. Change subject: Add functional tests for compute stats with mt_dop > 0. .. Patch Set 3: Code-Review-1 Hits DCHECK now, need to debug. -- To view, visit http://gerrit.cloudera.org:8080/4879 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4229: Download auxiliary test data if Impala-auxiliary-tests is available.
David Knupp has uploaded a new change for review. http://gerrit.cloudera.org:8080/4900 Change subject: IMPALA-4229: Download auxiliary test data if Impala-auxiliary-tests is available. .. IMPALA-4229: Download auxiliary test data if Impala-auxiliary-tests is available. This was a slight annoyance for internal Cloudera devs setting up new environments. If the Impala-auxiliary-tests repo is checked out, our data load scripts assume that the auxiliary test data archives have been downloaded and unpacked. But this requires devs to run another script by hand. This patch just inserts the download step into the data load process. If the test data is in fact already present, nothing is downloaded. Change-Id: Iebe605bfb610c2f0529100d79ceecb07cc7c6165 --- M testdata/bin/create-load-data.sh 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4900/1 -- To view, visit http://gerrit.cloudera.org:8080/4900 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iebe605bfb610c2f0529100d79ceecb07cc7c6165 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: David KnuppGerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3202: refactor scratch file management into TmpFileMgr
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4898 Change subject: IMPALA-3202: refactor scratch file management into TmpFileMgr .. IMPALA-3202: refactor scratch file management into TmpFileMgr This is a pure refactoring patch that moves all of the logic for allocating scratch file ranges into TmpFileMgr in anticipation of this logic being used by the new BufferPool. There should be no behavioural changes. Also remove a bunch of TODOs that we're not going to fix. Testing: Covered by TmpFileMgr and BufferedBlockMgr unit tests. Code is exercised heavily by spilling system tests and the stress test. Change-Id: I0c56c195f3f28d520034f8c384494e566635fc62 --- M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/buffered-block-mgr.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h 6 files changed, 193 insertions(+), 203 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4898/1 -- To view, visit http://gerrit.cloudera.org:8080/4898 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0c56c195f3f28d520034f8c384494e566635fc62 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-4410: Safer tear-down of RuntimeState
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4410: Safer tear-down of RuntimeState .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4893/1/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS1, Line 300: RuntimeState::Close() > Nothing very pressing but wanted to bring this up: Agreed. TestEnv also creates RuntimeStates so we should call Close() there too. -- To view, visit http://gerrit.cloudera.org:8080/4893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie416e4d57240142bf685385299b749c3a6792c45 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4381: Incorrect AVX version of BloomFilter::Or
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4381: Incorrect AVX version of BloomFilter::Or .. Patch Set 4: Code-Review+2 Sorry about that - had this patch on a branch for testing and accidentally pushed it. -- To view, visit http://gerrit.cloudera.org:8080/4866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I18480c598fbf6b842398581acde6fc719c40ce27 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: PS1, Line 833: if (check_inflight) { : DCHECK(session_lock.owns_lock()); : if (session_->inflight_queries.find(query_id()) == : session_->inflight_queries.end()) { : return Status("Query not yet running"); : } > Are you aware of any reason we can't just do this check before taking lock_ It doesn't look like it would protect against anything - I did a pass through the code. Based on the history of the code, 'check_inflight' is meant to guard against the query being cancelled early when it is in a state with partially set-up data structures, so the race where it is removed from 'inflight_queries' during cancellation should be benign. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/service/impala-server.h M be/src/service/query-exec-state.cc 2 files changed, 13 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/2 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs