Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11:

(43 comments)

this needs a follow-on discussion regarding the existing class hierarchy (and 
what it should turn into).

we should disentangle that before adding the per-query rpc functionality.

http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-exec-state.h
File be/src/runtime/fragment-instance-exec-state.h:

Line 39:       executor_(query_exec_state, exec_env, lambda(
please use a regular c++ lambda


Line 54:   Status Prepare();
yet more indirection.

the division of labor between this class and FragmentInstanceExecutor is 
unclear (who does the execution). i find the control flow very hard to follow.

here you have to go through the state to get to the executor. and the state in 
here is basically all static (exec params) aside from things like status, which 
are really execution-related.


Line 77:   /// Contains the context for the current fragment instance.
in general: separate w/ blank line if you have member variables with comments

this particular member is redundant, because it's already contained in 
exec_params_


Line 84:   boost::scoped_ptr<Thread> exec_thread_;
unused?

if not: who is doing the execution then (who has the execution logic), this 
class or InstanceExecutor? this seems convoluted.


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-executor.cc
File be/src/runtime/fragment-instance-executor.cc:

Line 113:     cgroup = 
ExecEnv::GetInstance()->cgroups_mgr()->UniqueIdToCgroup(PrintId(query_id_, 
"_"));
long lines (here and elsewhere)


Line 336:             &FragmentInstanceExecutor::ReportProfile, this));
indent 2 spaces for the 2nd level


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-executor.h
File be/src/runtime/fragment-instance-executor.h:

Line 75:   /// TODO: Made reports per-query vs per fragment-instance.
make


Line 76:   typedef boost::function<
std::


Line 82:   FragmentInstanceExecutor(QueryState* query_exec_state, ExecEnv* 
exec_env,
might as well also call the param query_state.

drop exec_env, you can call GetInstance()


Line 145:   QueryState* query_exec_state() { return query_exec_state_; }
now that we're doing a global rename, let's drop 'state' uniformly


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

Line 64:       CreateFragmentInstanceState(exec_params);
does this get called from anywhere else? if not, inline here and get rid of 
function. there is too much indirection here, and no clear separation of 
responsibilities (in the current code Create does the registration, not 
Register).


Line 82:     fragment_inst_exec_state_map_.insert(make_pair(GetInstanceIdx(
change line breaks to improve readability (don't break up a function call if 
you don't have to)


Line 106: Status QueryState::CancelPlanFragment(const TUniqueId& 
fragment_instance_id) {
this and the next function are simply wrapper around FIS member functions, it 
looks like you only need GetFIS and the caller can do the rest. this will make 
the code easier to follow because it removes a level of indirection


Line 109:   if (fragment_inst_exec_state.get() == NULL) {
why not move this into GetInstanceState()?


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 57:   // TODO: Remove once we refcator code in coordinator.
spelling


Line 75:   void CleanFragmentInstanceStates();
clear?

also, shouldn't this be a general TearDown()?


Line 82:   FragmentInstanceState* CreateFragmentInstanceState(
does this need to be public, in addition to the preceding function? who calls 
this?


Line 114:   /// decrement this to 0 and clean up the query exec state.
move this comment to the actor class (qem).

rephrase (it's the last one to dereference this that will remove the state) if 
still useful.


Line 115:   int32_t num_current_references_;
not atomic? (or does it even make a difference?)


Line 118:   SpinLock fragment_inst_exec_state_map_lock_;
i started using finstance instead of frag_inst or fragment_inst or frag_instance


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 71:       QueryState* query_state = NULL);
pass in FIS (which contains a pointer to the query state as well as the exec 
params)


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/client-request-exec-state.h
File be/src/service/client-request-exec-state.h:

Line 47: /// Execution state of a query and also captures all state required 
for servicing
this is limited to client-related state. differentiate more clearly against 
QueryState.


Line 57: class ImpalaServer::ClientRequestExecState {
in the interest of removing extraneous 'Exec' qualifiers, drop this here as well


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 39:         bind<void>(mem_fn(&QueryExecMgr::DestroyQueryState), this, _1, 
_2)) {
lambda is clearer because it contains a signature


Line 42: Status QueryExecMgr::RegisterFragmentInstanceWithQuery(
this does more than just registration (it starts a thread that executes the 
instance).

i find the control flow hard to follow because there are so many indirections 
between the classes, and there doesn't seem to be a clear separation of 
responsibilities.

maybe the qem should only deal with creation/registration and destruction of 
qes, and the qes does everything else (including starting the execution 
thread). or we also add a QueryExecutor which does the execution, but that may 
be very little functionality, and qes already does other activities. in that 
case, we can also call this class QueryStateRegistry or something like that, 
not anything with 'mgr' in it, which implies active involvement.

have this Register only do the registration and return a QES*, and then have 
the caller do an explicit qes->ExecInstance().

let's talk about the details tomorrow.


Line 48:              << exec_params.fragment_instance_ctx.instance_state_idx;
doesn't exist anymore


Line 59:   ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L);
why not move this into ExecInstance as well, so it's symmetric?


Line 62:   FragmentInstanceState* fragment_inst_exec_state =
fix var name


Line 70:   // TODO: Move this responsibility to the QueryExecutor when such an 
abstraction is
collect those todo's in the .h class comment


Line 74:       Substitute("exec-plan-fragment-$0", PrintId(fragment_id)),
fix label


Line 80: void QueryExecMgr::ExecInstance(QueryState* query_state,
the instance state already has the query state, drop this param


Line 139:   TUniqueId query_id = query_ctx.query_id;
const ref


Line 145:   ImpaladMetrics::IMPALA_SERVER_NUM_RUNNING_QUERIES->Increment(1L);
indentation


Line 155:   if (scoped_qes.query_state_ == nullptr) {
might as well use the getter


Line 158:         PrintId(fragment_instance_id))));
indentation


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 31: /// Manages execution of queries by creating a QueryState object which 
owns
you're missing one of the most important aspects of this class: it's a 
singleton across the process


Line 47: /// aware of FragmentInstanceStates. The API should be more like 
RegisterQuery().
we also decided to add another class at that point, QueryExecutor, that runs in 
a separate thread (created by QEM) and creates global shared state, creates the 
instance threads, etc.

augment your todo here to contain a full list of all items that will get done 
once the per-query rpc arrives (new classes, which functions migrate, etc.)


Line 62:   Status RegisterFragmentInstanceWithQuery(const 
TExecPlanFragmentParams& params);
remove 'with query' (is there one without?)


Line 78:   Status CancelPlanFragment(const TUniqueId& query_id,
remove query_id in situations where you already have a fragment instance id, 
you can now derive the former from the latter.

also, rename everything 'fragment' or 'plan fragment' to 'fragment instance' if 
applicable.

also, i started using FInstance as an abbreviation for FragmentInstance (which 
is very long)


Line 83:   Status PublishFilter(const TUniqueId& query_id, const TUniqueId& 
dst_instance_id,
query_id is redundant


Line 93:   class QESGuard {
move into QES itself and call it Guard.


Line 136:   /// must previously have been registered in 
fragment_inst_exec_state_map_. Runs in the
this map doesn't exist here.

migrate into QueryState


Line 140:       FragmentInstanceState* fragment_inst_exec_state);
have param name match its type


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to