hvanhovell commented on code in PR #46570:
URL: https://github.com/apache/spark/pull/46570#discussion_r1605509972


##########
connector/connect/common/src/main/protobuf/spark/connect/base.proto:
##########
@@ -199,6 +200,17 @@ message AnalyzePlanRequest {
     // (Required) The logical plan to get the storage level.
     Relation relation = 1;
   }
+
+  message Checkpoint {

Review Comment:
   Caching is lazy, so it cannot trigger execution of the main query. 
Checkpoint can actually do this, which is weird for an analyze request. 
Moreover analyse does not work well with long running operations (i.e. an eager 
checkpoint) for the following reasons::
   - Analyze needs a thread in the server's thread pool which in extreme cases 
can lead to exhaustion, or in a less extreme scenario can lead to reduced 
availability.
   - Long running request can be killed by intermediate proxies. Analyze does 
not support reattach which is an issue when the checkpoint takes too long. In 
that case you are basically dead in the water.
   
   The argument that ExecutePlanResponse does not support this is IMO a bit 
flimsy. Look for example at the `SqlCommandResult` this can be returned as part 
of the `ExecutePlanResponse` and it returns a relation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to