Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10158 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
......................................................................


Patch Set 16:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@322
PS15, Line 322: cu
> double space snuck back in
Done


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@341
PS15, Line 341: ith 'instance_ho
> nit: update to "failed_finstance"
Done


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.h@414
PS15, Line 414:
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@a115
PS14, Line 115:
              :
> put that back
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@121
PS14, Line 121: may call any
> should we call this a non-OK status, because for cancellation through Cance
Done


http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@131
PS14, Line 131:
> do you mean cancels?
I think I meant to delete that.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@147
PS15, Line 147: Status prepare_status = query_state_->WaitForPrepare();
              :       DCHECK(!prepare_status.ok());
              :       return UpdateExecState
> should we call UpdateExecStateIfError() here?
yes, done.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@486
PS15, Line 486: !status.ok() && exec_status_.IsCancelled()
> can we have a case with <state, status> = <ERROR, Cancelled> ?
Ideally no. but I think the backend can sometimes do it due to complications in 
the error propagation path. I would like to get to a point where this can't be 
cancelled, but I don't think we're there yet.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@522
PS15, Line 522: // TODO: is this needed? This will also happen on the "backend" 
side of cancel rpc.
              :   // And in the case of EOS, sink already knows this.
              :   if (coord_sink_ != nullptr) coord_sink_->CloseConsumer();
> I dont think we need this, I looked at why we originally needed this and it
Yeah, I agree, but would prefer to do it in an independent commit in case there 
is some unforeseen issue.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@530
PS15, Line 530:
              :   } else {
> in case of cancelBackends(), should we also wait for resources to be freed?
I think the argument can be made in both directions, but we eventually settled 
on the current behavior. The reasoning is documented in the header comment for 
ReleaseAdmissionControlResources(). I'll update this todo as to not add 
confusion.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@605
PS15, Line 605:
> maybe change the name to something like UpdateExecState, since this is also
The "if error" was meant to say that the exec state is only updated if the 
status is an not-ok and only to move into an error state (as opposed to, say, 
cancell state). But I can see the confusion so i'll change it since you feel 
it'd be clearer. I guess the header comment should clarify that this shouldn't 
be used for cancellation.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@639
PS15, Line 639: DCHECK(exec_rpcs_complete_barrier_ != nullptr &&
              :       exec_rpcs_complete_barrier_->pending() <= 0) << "Exec() 
must be called first";
              :   discard_result(SetNonErrorTerminalSt
> does this mean we can call cancel before Exec() completes? what if cancel i
Yeah, this is bogus. Fixed. The comments in the header already say Exec() needs 
to precede this.


http://gerrit.cloudera.org:8080/#/c/10158/15/be/src/runtime/coordinator.cc@943
PS15, Line 943: e : backend_states_) {
> Do we expect these methods to be called before Exec() completes? If not, we
In principle, I agree that's how we'd want it to work. But given the way 
ClientRequestState currently works and that things like the webserver just 
reach in and get the coord() object, I think they can currently call these 
webserver support rourtines before Exec() completes. How about I revisit this 
after your patch is merged?



--
To view, visit http://gerrit.cloudera.org:8080/10158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e
Gerrit-Change-Number: 10158
Gerrit-PatchSet: 16
Gerrit-Owner: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 10 May 2018 00:22:42 +0000
Gerrit-HasComments: Yes

Reply via email to