[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Dan Hecht has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc File be/src/service/query-exec-mgr.cc: Line 85: if (i->second->num_current_references_ == 0) return NULL; > No that's not how it works. The decrementing of ref cnt occurs in ReleaseQueryExecState, which will remove from the map when the cnt hits 0. So we never have a QES in the map with cnt == 0. -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 9: (3 comments) In this patch the following is addressed: - DescriptorTbl is not shared anymore. I've moved it back to RuntimeState as we still need to change how the serialization/deserialization is done before we can make it shared. It will be done in a separate patch. - QueryExecMgr now starts fragment threads so that refcount management stays in the scope of QueryExecMgr and is not passed to a lower level. - FragmentInstanceExecState refcount mechanism removed. Now these objects have the same lifetime as QueryExecState and are destroyed when QueryExecState is destroyed. - fragment_inst_exec_state_map_ is now indexed by fragment instance index and not fragment instance IDs. (Rebased on top of IMPALA-3988) http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc File be/src/service/query-exec-mgr.cc: Line 85: > how is this possible? it looks like an invariant that anything in the map w No that's not how it works. Everytime a new fragment comes in, FindOrCreateQueryExecState() is called (in RegisterFragmentInstanceWithQuery()) which increases the refcount by 1. Everytime a fragment completes (in QEM->ExecInstance()), ReleaseQES() is called which decrements the refcount by 1. This means that this refcount will ultimately reach 0. Alternatively CancelPlanFragment() and PublishFilter() use QESGuard() which calls a GetQES() at the beginning of its scope and a ReleaseQES() when it goes out of scope which will effectively balance the refcount. http://gerrit.cloudera.org:8080/#/c/3817/8/be/src/service/query-exec-mgr.cc File be/src/service/query-exec-mgr.cc: Line 54: QueryExecState* query_exec_state = FindOrCreateQueryExecState(exec_params.query_ctx); > comment seems redundant with the code (especially if you rename the method Done PS8, Line 100: > FindOrCreateQueryExecState() given that it creates (as opposed to inserting Done -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Sailesh Mukil has uploaded a new patch set (#9). Change subject: IMPALA-2550 Introduce query-wide execution context. .. IMPALA-2550 Introduce query-wide execution context. This patch is a header preview of a query wide execution state for a backend. Changes in design are as follows: FragmentMgr -> QueryExecMgr The QueryExecMgr now receives incoming fragments, creates a new QueryExecState if it is the first fragment to arrive for that query, and passes this received fragment along to the QueryExecState. This class is responsible for cleaning up the QueryExecState. QueryExecState (new): This is the query wide state for the backend. It is initialized by the first fragment to arrive to the QueryExecMgr. This class is now responsible for creating FragmentExecStates and executing them. Its life is protected by a ref counting mechanism and it is scheduled for destruction once the ref count reaches zero. Once scheduled for destruction, a thread in the QueryExecMgr will destroy the QueryExecState. Every user of the QueryExecState must access it within the scope of ScopedQueryExecStateRef which guarantees ref count incrementing and decrementing at entry/exit of the scope. QueryExecState (old) -> ClientRequestExecState: This is just a class name change. We still do not include shared state into the QueryExecState because there needs to be changes to DescriptorTbl, etc. before we can incorporate them into the QueryExecState. Will be done as separate patches. P.S: This only shows what the changes would look like and as of yet still does not compile. Excludes: - Some class renames in places not important to the core patch logic. - Most .cc files to make use of changes made. Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c --- M be/src/runtime/CMakeLists.txt M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h R be/src/runtime/fragment-instance-exec-state.cc R be/src/runtime/fragment-instance-exec-state.h R be/src/runtime/fragment-instance-executor.cc R be/src/runtime/fragment-instance-executor.h A be/src/runtime/query-exec-state.cc A be/src/runtime/query-exec-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/CMakeLists.txt R be/src/service/client-request-exec-state.cc R be/src/service/client-request-exec-state.h D be/src/service/fragment-mgr.cc D be/src/service/fragment-mgr.h M be/src/service/impala-internal-service.h A be/src/service/query-exec-mgr.cc A be/src/service/query-exec-mgr.h M common/thrift/ImpalaInternalService.thrift 20 files changed, 735 insertions(+), 360 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/3817/9 -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Dan Hecht has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 8: (3 comments) Just to summarize some key points from the discussion today so we don't forget: - We agreed that ref counting is the way to go and explicit ref counting avoids "accidental" leakage of shared_ptr copies. - We agreed to restructure the fragment instance thread code so that the reference held by that thread would be done from the top-level thread method, rather than the reference being passed implicitly between classes like in the current patchset. - We agreed to try to have a hierarchical structure to the code to make ownership and lifetime as clear as possible. - We agreed that we only need reference counting at the QES level. Everything "below" would have the same lifetime as QES and does not need separate ref counting. http://gerrit.cloudera.org:8080/#/c/3817/8/be/src/service/query-exec-mgr.cc File be/src/service/query-exec-mgr.cc: Line 54: // Create or retrieve the QueryExecState for this query id. comment seems redundant with the code (especially if you rename the method as suggested below). Line 85: if (i->second->num_current_references_ == 0) return NULL; how is this possible? it looks like an invariant that anything in the map will have refcnt > 0. PS8, Line 100: Insert FindOrCreateQueryExecState() given that it creates (as opposed to inserting one that was created by the caller). -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Dan Hecht has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.cc File be/src/service/query-exec-mgr.cc: Line 85: if (i->second->num_current_references_ == 0) return NULL; how is this possible? it looks like an invariant that anything in the map will have refcnt > 0. http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.h File be/src/service/query-exec-mgr.h: PS7, Line 127: /// The thread pool that is in charge of destroying a QueryExecState when it is no : /// longer in use. : ThreadPool destroy_thread_; > Currently I've made it that way because if a fragment is the last user of t The QES doesn't delete itself. The QEM does it, so there isn't a problem of doing it directly. -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 8: (9 comments) http://gerrit.cloudera.org:8080/#/c/3817/7//COMMIT_MSG Commit Message: PS7, Line 17: passes this received fragment along to the Query > which fragment? the received fragment. I've clarified the comment. PS7, Line 26: Once scheduled for destruction, a thread in the QueryExecMgr will : destroy the QueryExecState. > Why do we need a special thread for this? I've explained this in the CR comment in the query-exec-mgr.h file. PS7, Line 28: Every user of the QueryExecState must access it within the scope of : ScopedQueryExecStateRef which guarantees ref count incrementing and : decrementing at entry/exit of the scope. > It looks like we're implementing something that looks like a shared_ptr. I How I see it, reference counting is necessary because the lifetime of the QES is defined by how many fragments are around to use it. If needed, I can quickly go over the overall design with you in person tomorrow and also talk about the shared_ptr vs refcount decision, so that it'll be a lot easier to review this patch. http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.h File be/src/service/query-exec-mgr.h: PS7, Line 54: and that the fragment instance must either run to completion : /// (with or withou > or it can encounter an error? Yes the fragment is executed on another thread. I've clarified that the fragment instance can encounter an error too. PS7, Line 60: RegisterFragmentInstanceWithQuery > How about just 'RegisterFragment'? This is a fragment instance, so we would have to change it to RegisterFragmentInstance(). But that name is already used in QueryExecState::RegisterFragmentInstance(). So I added a 'WithQuery' here to curb confusion. If you still think it's better to change the name, I'll go ahead and do that. PS7, Line 68: QueryExecState* GetQueryExecState(const TUniqueId& query_id); : : /// Decrements query_exec_state->num_current_references and schedules it for destruction : /// if the reference count hits zero. : void ReleaseQueryExecState(QueryExecState* query_exec_state); > Who are the consumers of these APIs? I'm wondering if ref counting is reall The consumers of these 2 functions are: - The fragment itself in QueryExecState::ExecInstance(). - CancelPlanFragment() when a Cancel RPC is received. - PublishFilter() when a publish filter RPC is received. PS7, Line 74: e que > query's Done PS7, Line 74: > a Done PS7, Line 127: /// The thread pool that is in charge of destroying a QueryExecState when it is no : /// longer in use. : ThreadPool destroy_thread_; > why does the QES destruction need to happen on a separate thread? Currently I've made it that way because if a fragment is the last user of the QES, that fragment will decrease the refcount to 0, which means it should be deleted. This happens in QueryExecState::ExecInstance(). If we delete it in the context of the same thread, it will be as if the QueryExecState is deleting itself which could cause memory corruption, etc. Which is why the destruction of the QES is handed off to the QueryExecMgr on a separate thread. -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Sailesh Mukil has uploaded a new patch set (#8). Change subject: IMPALA-2550 Introduce query-wide execution context. .. IMPALA-2550 Introduce query-wide execution context. This patch is a header preview of a query wide execution state for a backend. Changes in design are as follows: FragmentMgr -> QueryExecMgr The QueryExecMgr now receives incoming fragments, creates a new QueryExecState if it is the first fragment to arrive for that query, and passes this received fragment along to the QueryExecState. This class is responsible for cleaning up the QueryExecState. QueryExecState (new): This is the query wide state for the backend. It is initialized by the first fragment to arrive to the QueryExecMgr. This class is now responsible for creating FragmentExecStates and executing them. Its life is protected by a ref counting mechanism and it is scheduled for destruction once the ref count reaches zero. Once scheduled for destruction, a thread in the QueryExecMgr will destroy the QueryExecState. Every user of the QueryExecState must access it within the scope of ScopedQueryExecStateRef which guarantees ref count incrementing and decrementing at entry/exit of the scope. QueryExecState (old) -> ClientRequestExecState: This is just a class name change. P.S: This only shows what the changes would look like and as of yet still does not compile. Excludes: - Some class renames in places not important to the core patch logic. - Most .cc files to make use of changes made. Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c --- M be/src/runtime/CMakeLists.txt M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h R be/src/runtime/fragment-instance-exec-state.cc R be/src/runtime/fragment-instance-exec-state.h R be/src/runtime/fragment-instance-executor.cc R be/src/runtime/fragment-instance-executor.h A be/src/runtime/query-exec-state.cc A be/src/runtime/query-exec-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/CMakeLists.txt R be/src/service/client-request-exec-state.cc R be/src/service/client-request-exec-state.h D be/src/service/fragment-mgr.cc D be/src/service/fragment-mgr.h M be/src/service/impala-internal-service.h A be/src/service/query-exec-mgr.cc A be/src/service/query-exec-mgr.h M common/thrift/ImpalaInternalService.thrift 20 files changed, 743 insertions(+), 374 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/3817/8 -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. .. Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/3817/7//COMMIT_MSG Commit Message: PS7, Line 17: passes this fragment along to the QueryExecState which fragment? PS7, Line 26: Once scheduled for destruction, a thread in the QueryExecMgr will : destroy the QueryExecState. Why do we need a special thread for this? PS7, Line 28: Every user of the QueryExecState must access it within the scope of : ScopedQueryExecStateRef which guarantees ref count incrementing and : decrementing at entry/exit of the scope. It looks like we're implementing something that looks like a shared_ptr. I know we don't like shared_ptrs, but what we're doing looks a bit error prone. Also, I'm wondering if we can't avoid reference counting to begin with, if the ownership and lifecycle of the QES is more explicitly defined from the beginning. I don't see enough context to reason about that though. http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.h File be/src/service/query-exec-mgr.h: PS7, Line 54: and that the fragment instance must either run to completion : /// or be cancelled or it can encounter an error? I assume this is executed on another thread? PS7, Line 60: RegisterFragmentInstanceWithQuery How about just 'RegisterFragment'? PS7, Line 68: QueryExecState* GetQueryExecState(const TUniqueId& query_id); : : /// Decrements query_exec_state->num_current_references and schedules it for destruction : /// if the reference count hits zero. : void ReleaseQueryExecState(QueryExecState* query_exec_state); Who are the consumers of these APIs? I'm wondering if ref counting is really necessary. Maybe I'll be able to answer this as I read more through the code. PS7, Line 74: right query's PS7, Line 74: a PS7, Line 127: /// The thread pool that is in charge of destroying a QueryExecState when it is no : /// longer in use. : ThreadPool destroy_thread_; why does the QES destruction need to happen on a separate thread? -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2550 Introduce query-wide execution context.
Sailesh Mukil has uploaded a new patch set (#7). Change subject: IMPALA-2550 Introduce query-wide execution context. .. IMPALA-2550 Introduce query-wide execution context. This patch is a header preview of a query wide execution state for a backend. Changes in design are as follows: FragmentMgr -> QueryExecMgr The QueryExecMgr now receives incoming fragments, creates a new QueryExecState if it is the first fragment to arrive for that query, and passes this fragment along to the QueryExecState. This class is responsible for cleaning up the QueryExecState. QueryExecState (new): This is the query wide state for the backend. It is initialized by the first fragment to arrive to the QueryExecMgr. This class is now responsible for creating FragmentExecStates and executing them. Its life is protected by a ref counting mechanism and it is scheduled for destruction once the ref count reaches zero. Once scheduled for destruction, a thread in the QueryExecMgr will destroy the QueryExecState. Every user of the QueryExecState must access it within the scope of ScopedQueryExecStateRef which guarantees ref count incrementing and decrementing at entry/exit of the scope. QueryExecState (old) -> ClientRequestExecState: This is just a class name change. P.S: This only shows what the changes would look like and as of yet still does not compile. Excludes: - Some class renames in places not important to the core patch logic. - Most .cc files to make use of changes made. Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c --- M be/src/runtime/CMakeLists.txt M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h R be/src/runtime/fragment-instance-exec-state.cc R be/src/runtime/fragment-instance-exec-state.h R be/src/runtime/fragment-instance-executor.cc R be/src/runtime/fragment-instance-executor.h A be/src/runtime/query-exec-state.cc A be/src/runtime/query-exec-state.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/CMakeLists.txt R be/src/service/client-request-exec-state.cc R be/src/service/client-request-exec-state.h D be/src/service/fragment-mgr.cc D be/src/service/fragment-mgr.h M be/src/service/impala-internal-service.h A be/src/service/query-exec-mgr.cc A be/src/service/query-exec-mgr.h M common/thrift/ImpalaInternalService.thrift 20 files changed, 740 insertions(+), 374 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/3817/7 -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil