Hey all

 

Those of you following the K8S backend are probably aware of SPARK-24434 [1] 
(and PR 22146 [2]), which proposes a mechanism to allow for advanced pod 
customisation via pod templates.  This is motivated by the fact that 
introducing an additional Spark configuration property for each aspect of the 
pod specification a user might wish to customise was becoming unwieldy.

 

However, I am concerned that the current implementation doesn’t go far enough 
and actually limits the utility of the proposed new feature.  The problem stems 
from the fact that the implementation simply uses the pod template as a base 
and then Spark attempts to build a pod spec on top of that.  As the code that 
does this doesn’t do any kind of validation or inspection of the incoming 
template, it is possible to provide a template that causes Spark to generate an 
invalid pod spec, ultimately causing the job to be rejected by Kubernetes.

 

Now clearly Spark code cannot attempt to account for every possible 
customisation that a user may attempt to make via pod templates, nor should it 
be responsible for ensuring that the user doesn’t start from an invalid 
template in the first place.  However, it seems like we could be more 
intelligent in how we build our pod specs, to avoid generating invalid specs in 
cases where we have a clear use case for advanced customisation.  For example, 
the current implementation does not allow users to customise the volumes used 
to back SPARK_LOCAL_DIRS to better suit the compute environment the K8S cluster 
is running on.  Trying to do so with a pod template results in an invalid 
spec due to duplicate volumes.
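
To make the duplicate-volume failure concrete, here is a minimal Python sketch 
of the pattern described above.  It is not the Spark code: pod specs are 
modelled as plain dicts, the function names are hypothetical, and the volume 
name spark-local-dir-1 and the host path are assumptions chosen for 
illustration.  The only fact relied on is that Kubernetes requires volume names 
to be unique within a pod.

```python
from collections import Counter


def add_local_dir_volumes(pod_spec, num_dirs=1):
    """Hypothetical build step: append one emptyDir volume per local dir
    without inspecting what the template already defines."""
    volumes = list(pod_spec.get("volumes", []))
    for i in range(1, num_dirs + 1):
        # Assumed naming scheme, for illustration only.
        volumes.append({"name": f"spark-local-dir-{i}", "emptyDir": {}})
    return {**pod_spec, "volumes": volumes}


def duplicate_volume_names(pod_spec):
    """Return the volume names that appear more than once in the spec."""
    counts = Counter(v["name"] for v in pod_spec.get("volumes", []))
    return sorted(name for name, n in counts.items() if n > 1)


# A user template that tries to back the local dir with a host path instead.
template = {
    "volumes": [
        {"name": "spark-local-dir-1", "hostPath": {"path": "/mnt/fast-ssd"}}
    ]
}

built = add_local_dir_volumes(template)
print(duplicate_volume_names(built))
```

The built spec carries two volumes with the same name, which is the kind of 
spec Kubernetes would reject.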

 

I think there are a few ways the community could address this:

 
1. Status quo: provide the pod template feature as-is and simply tell users 
   that certain customisations are never supported and may result in invalid 
   pod specs
2. Provide the ability for advanced users to explicitly skip pod spec building 
   steps they know interfere with their pod templates via configuration 
   properties
3. Modify the pod spec building code to be aware of known desirable user 
   customisation points and avoid generating invalid specs in those cases
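
As a rough illustration of how options 2 and 3 differ, here is a small Python 
sketch.  Everything in it is hypothetical: pod specs are plain dicts, the step 
names and functions are invented, and the property "example.skipSteps" is not 
a real Spark setting.  It only shows the shape of each approach, not a proposed 
implementation.

```python
def naive_local_dirs(spec):
    """Always append the local-dir volume, whatever the template contains."""
    volumes = list(spec.get("volumes", []))
    volumes.append({"name": "spark-local-dir-1", "emptyDir": {}})
    return {**spec, "volumes": volumes}


def aware_local_dirs(spec):
    """Option 3: leave a volume the template already defines untouched."""
    names = {v["name"] for v in spec.get("volumes", [])}
    if "spark-local-dir-1" in names:
        return spec
    return naive_local_dirs(spec)


def build_pod_spec(template, steps, conf):
    """Option 2: skip any step named in a hypothetical comma-separated
    property ("example.skipSteps" is made up for this sketch)."""
    raw = conf.get("example.skipSteps", "")
    skipped = {name.strip() for name in raw.split(",") if name.strip()}
    spec = dict(template)
    for name, step in steps.items():
        if name not in skipped:
            spec = step(spec)
    return spec


template = {
    "volumes": [
        {"name": "spark-local-dir-1", "hostPath": {"path": "/mnt/fast-ssd"}}
    ]
}

# Option 2: the user explicitly disables the step that would interfere.
option2 = build_pod_spec(
    template, {"local-dirs": naive_local_dirs}, {"example.skipSteps": "local-dirs"}
)

# Option 3: the step itself notices the volume supplied by the template.
option3 = build_pod_spec(template, {"local-dirs": aware_local_dirs}, {})

print(len(option2["volumes"]), len(option3["volumes"]))
```

In both cases the user's volume survives and no duplicate is added.  The 
difference is who carries the knowledge: with option 2 the user has to know 
which step to disable, while with option 3 the build step handles the case 
itself.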
 

Currently committers seem to be going for option 1.  Personally I would like to 
see the community adopt option 3, but I have already received considerable 
pushback when I proposed that in one of my PRs, hence the suggestion of 
option 2 as a compromise.  Yes, this still leaves the possibility of ending up 
with invalid specs if users are over-zealous in the spec building steps they 
disable, but since this is a power user feature I think this is a risk power 
users would be willing to assume.  If we are going to provide features for 
power users we should avoid unnecessarily limiting the utility of those features.

 

What do other K8S folks think about this issue?

 

Thanks,

 

Rob

 

[1] https://issues.apache.org/jira/browse/SPARK-24434

[2] https://github.com/apache/spark/pull/22146

 
