Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16323 )
Change subject: IMPALA-9225: Add query option for retryable queries to spool all results before returning any to the client ...................................................................... Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/16323/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16323/3//COMMIT_MSG@25 PS3, Line 25: The coordinator fragment instance will only update its : opened_promise_ when the sender is blocked by a full queue or all : results are spooled, or when any errors happen. The DataSink::Send() : interface is extended to pass in a reference of the promise. So the sink : can update it to signal the fetch() request which is waiting for the : opened_promise_ to continue and actually fetch results. conceptually I think this approach makes sense, but I think it would make more sense to move some of the code around. see my other comments, but I think we probably don't want to change what opened_promise_ means. I think this change will effectively change the meaning of opened_promise_. instead of being set when Open() completes, it will be set when all results are spooled, which would probably be after Exec() completes. which is a bit confusing. conceptually, I think you the approach is correct. you basically want Coordinator::Wait() to block until all results have been spooled, which make sense. I just think we can move the logic directly in Coordinator::Wait() because it will be cleaner. http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/exec/data-sink.h File be/src/exec/data-sink.h: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/exec/data-sink.h@136 PS3, Line 136: virtual Status Send(RuntimeState* state, RowBatch* batch, : Promise<Status>* sender_blocked=nullptr) = 0; See my comment in fragment-instance-state.cc, but I think this can be moved into the BufferedPlanRootSink class. You might have to do a cast in from the PlanRootSink to BufferedPlanRootSink, but that should be fine. BufferedPlanRootSink is always used when result spooling is enabled. It doesn't look like this used anywhere else but BufferedPlanRootSink anyway. You probably can't add it to the Send method in BufferedPlanRootSink since it inherits from DataSink::Send, so you might just have to add a new public method, something like IsBufferFull(). http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/fragment-instance-state.cc@430 PS3, Line 430: Promise<Status>* sender_blocked_promise = !opened_promise_.IsSet() ? : &opened_promise_ : nullptr; : RETURN_IF_ERROR(sink_->Send(runtime_state_, row_batch_.get(), : sender_blocked_promise)); I think it might make more sense to move all this logic into coordinator.cc There are a few benefits: (1) Most of this logic is really Coordinator specific. The opened_promised_ is actually only used by the Coordinator in Coordinator::Wait() (2) The Coordinator has a pointer to the PlanRootSink (see PlanRootSink* coord_sink_). So you can potentially avoid making the change to the Send() method in DataSink, which thus agains all the re-factoring to unrelated classes like KuduTableSink http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/query-driver.cc@312 PS3, Line 312: if (query_ctx.client_request.query_options.safely_retry_queries) { : // Reset this flag in the retry query since we won't retry again, so results can be : // returned immediately. : query_ctx.client_request.query_options.__set_safely_retry_queries(false); : VLOG_QUERY << "Unset safely_retry_queries when retrying query " : << PrintId(client_request_state_->query_id()); yeah this makes sense, nice catch http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/spillable-row-batch-queue.h File be/src/runtime/spillable-row-batch-queue.h: http://gerrit.cloudera.org:8080/#/c/16323/3/be/src/runtime/spillable-row-batch-queue.h@133 PS3, Line 133: MAX_SPILLED_RESULT_SPOOLING_MEM thanks for catching this http://gerrit.cloudera.org:8080/#/c/16323/3/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/16323/3/common/thrift/ImpalaService.thrift@561 PS3, Line 561: SAFELY_RETRY_QUERIES I do like the idea of making this configurable. some users might find the overhead of spooling all results too high - yet may still want queries to be retried. I think we might want to change the name though. If I'm a user, I would probably always want this to be set to true, because why wouldn't I want my queries to be retried safely. perhaps a better name would be SPOOL_ALL_RESULTS_FOR_RETRIES. -- To view, visit http://gerrit.cloudera.org:8080/16323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I462dbfef9ddab9060b30a6937fca9122484a24a5 Gerrit-Change-Number: 16323 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Comment-Date: Wed, 12 Aug 2020 20:23:57 +0000 Gerrit-HasComments: Yes