GitHub user mccheah commented on the issue:

    https://github.com/apache/spark/pull/19954
  
    > It's nice that there is some documentation somewhere, but that documentation doesn't really seem to address my comments. For one example, it only explicitly talks about the driver - which sort of makes sense because the document is about submission. But why aren't orchestrators used when starting executors too? It seems there's similar code baked into another class instead.
    
    > What I'm asking is for this to be documented properly so that someone who didn't write the code has enough information to know that it's working as it should. Right now I don't see what some of these abstractions are for at all - for example, as far as I can see, the orchestrator can be replaced by a method call instead of being a completely separate type; it's not really abstracting anything. Look at where it's used:
    
    @vanzin Thanks for the feedback. I'd suggest that documentation isn't the main issue; rather, the abstractions themselves may not be self-documenting. We're open to alternative representations of the submission logic that are easier for an outside reader to follow, and this dialogue is very productive.
    
    > Right now I don't see what some of these abstractions are for at all - for example, as far as I can see, the orchestrator can be replaced by a method call instead of being a completely separate type; it's not really abstracting anything.
    
    As with any abstraction, we could inline the orchestrator into the submission client, but then we should ask whether that makes the `Client` do too many things, and what the unit tests would look like as a result. I like that we can test the orchestrator's selection of which steps to apply independently from testing that the `Client` takes the API objects and creates them correctly, particularly with the owner references hooked up. A goal of this design is that every class has as few responsibilities as possible; ideally each module is responsible for exactly one thing. That said, the orchestrator's long constructor argument list does suggest it is tied fairly tightly to the `Client`.
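    To make that testing seam concrete, here is a minimal sketch of the shape I mean - the class names and signatures below are simplified stand-ins, not the exact types in this PR:

    ```scala
    // Illustrative sketch only; simplified stand-ins for the PR's actual types.
    case class KubernetesDriverSpec(podName: String, labels: Map[String, String])

    trait DriverConfigurationStep {
      // Each step takes the spec built so far and returns an amended copy.
      def configureDriver(spec: KubernetesDriverSpec): KubernetesDriverSpec
    }

    class BaseDriverStep(appName: String) extends DriverConfigurationStep {
      override def configureDriver(spec: KubernetesDriverSpec): KubernetesDriverSpec =
        spec.copy(podName = s"$appName-driver")
    }

    class PythonStep(mainPyFile: String) extends DriverConfigurationStep {
      override def configureDriver(spec: KubernetesDriverSpec): KubernetesDriverSpec =
        spec.copy(labels = spec.labels + ("main-py-file" -> mainPyFile))
    }

    // The orchestrator's only job is choosing which steps apply; that branching
    // can be unit-tested without ever touching the Kubernetes API server.
    class StepsOrchestrator(appName: String, mainAppResource: Option[String]) {
      def getAllConfigurationSteps(): Seq[DriverConfigurationStep] = {
        val pythonStep = mainAppResource.filter(_.endsWith(".py")).map(new PythonStep(_))
        Seq(new BaseDriverStep(appName)) ++ pythonStep
      }
    }
    ```

    A unit test can then assert exactly which steps `getAllConfigurationSteps` selects for a given submission, while a separate test drives the `Client` with stub steps.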
    
    > But why aren't orchestrators used when starting executors too? It seems there's similar code baked into another class instead.
    
    A common situation is that the submission client needs to do strictly more work than the driver: setting up a Spark application and creating its API objects requires more than bootstrapping executor pods with objects that already exist. For example, the submission client has to create the config map that sets up the init-container and mount the config map volume into the driver. The driver does not create the config map - it uses the pre-existing one - but it still performs the same mounting into the executor pods that the submission client performs for the driver pod. We express these semantics by first having a submodule that attaches the config map volumes to an arbitrary pod (the “bootstrap”). The driver then uses that submodule in isolation, while the submission client wraps it in a step that both applies the submodule and creates the API objects in the API server.
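    A rough sketch of that layering, with invented stand-in types rather than the PR's exact classes:

    ```scala
    import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder, Pod}

    // Shared submodule: attaches the init-container config map volume to any
    // pod. Both the submission client (for the driver pod) and the driver
    // (for executor pods) can call this.
    class InitContainerBootstrap(configMapName: String) {
      def bootstrapPod(pod: Pod): Pod = {
        // ... add a volume backed by configMapName and mount it into the pod ...
        pod
      }
    }

    // Submission-client-only step: wraps the shared submodule and additionally
    // creates the config map itself, which the driver never has to do.
    class InitContainerConfigurationStep(bootstrap: InitContainerBootstrap) {
      def configureDriver(driverPod: Pod): (Pod, Seq[ConfigMap]) = {
        val configMap = new ConfigMapBuilder()
          .withNewMetadata().withName("init-container-properties").endMetadata()
          .build()
        (bootstrap.bootstrapPod(driverPod), Seq(configMap))
      }
    }
    ```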
    
    It would be worthwhile to explore a step-based system shared between creating the executors and creating the driver, but we have to think carefully about the right abstraction. For example, `KubernetesDriverSpec` has the notions of `additionalKubernetesResources` and the `driverSparkConf`, constructs that do not apply when creating executor pods. This illustrates the observation above: constructing the application is strictly more work than creating the executors. So would the submission client have two orchestrators - one for configuring the driver pod itself, and one for configuring everything else in the application, e.g. creating config maps? The orchestration and step selection for the pod-configuration steps could then be shared between the submission client and the driver, while the orchestrator that strictly creates Kubernetes API objects and builds the driver Spark conf would be used only by the submission client. There are directions we can explore here, and we are open to ideas.
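    In code, the hypothetical split might look roughly like the following, where the trait names are invented and only `additionalKubernetesResources` and `driverSparkConf` come from the existing spec:

    ```scala
    import io.fabric8.kubernetes.api.model.{HasMetadata, Pod}
    import org.apache.spark.SparkConf

    // Pod-level steps apply to driver and executor pods alike, so their
    // orchestration could be shared by the submission client and the driver.
    trait PodConfigurationStep {
      def configurePod(pod: Pod): Pod
    }

    // Application-level steps produce the submission-only constructs: extra
    // API objects to create, and the driver's Spark conf. Executors need
    // neither, so only the submission client would orchestrate these.
    trait ApplicationConfigurationStep {
      def additionalKubernetesResources(): Seq[HasMetadata]
      def amendDriverSparkConf(conf: SparkConf): SparkConf
    }
    ```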
    
    > So aside from not really being able to infer the structure of how these things work, the current abstraction seems to be creating a lot of constructors and methods with long lists of arguments, which is another thing that hurts the readability of the code.
    
    I think that regardless of the abstractions we choose, we will eventually end up with many classes with unique constructors, simply because of the complexity of the submission client itself. Creating a Spark application in Kubernetes has many steps, and we want each step in its own submodule, regardless of whether the submodules share a trait as they do now or are independent classes with independent APIs. Each submodule will have an argument list befitting the properties it needs to configure the driver. See 
https://github.com/apache-spark-on-k8s/spark/tree/branch-2.2-kubernetes/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/submitsteps
 for all the different submodules the submission client will eventually need.
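    Concretely, even with a shared trait, constructor lists stay heterogeneous because each step carries only the settings it configures (names invented for illustration):

    ```scala
    // Each step's constructor takes exactly the configuration that step applies.
    class DriverServiceStep(serviceName: String, driverPort: Int, blockManagerPort: Int)
    class MountSecretsStep(secretNamesToMountPaths: Map[String, String])
    class DependencyResolutionStep(sparkJars: Seq[String], sparkFiles: Seq[String])
    ```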
    
    I do think we should remove the `InitContainer` level of steps and orchestration, as having two layers of steps and orchestration is confusing. There’s only one code branch in the init-container steps orchestrator, which applies only when we are also submitting local files - and that isn’t even done in this PR - so it shouldn’t be too big a burden to inline that logic into the top-level step when the time comes.
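    Sketched out, the flattening would reduce the nested orchestrator to a single conditional inside the top-level step (illustrative names again):

    ```scala
    // Instead of a second orchestrator choosing among init-container
    // sub-steps, the lone branch moves into the top-level step itself.
    class InitContainerStep(submittedLocalFiles: Seq[String]) {
      def configureInitContainer(baseProperties: Map[String, String]): Map[String, String] = {
        if (submittedLocalFiles.nonEmpty) {
          // Local-file staging (not part of this PR) is the only conditional case.
          baseProperties + ("submitted-files" -> submittedLocalFiles.mkString(","))
        } else {
          baseProperties
        }
      }
    }
    ```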
    
    Again, the feedback is very much appreciated. What are some ways we can 
improve and move forward here?

