Thanks for bringing this up. My opinion is that this feature really targets
advanced use cases that need more customization than the basic k8s-related
Spark config properties offer. So I think it's fair to assume that users who
want to use this feature know the risks and are responsible for making sure
the supplied pod template is valid. Clear documentation on which fields of
the pod spec are customizable and which are not is critical.

With that being said, however, I agree that it's not a good experience for
end users if the pod spec generated from a template, plus the overlay of
customization from other Spark config properties and the k8s backend itself,
is invalid. Ideally, I think option #3 is preferable: the k8s backend code,
or specifically the code that builds the pod spec, should validate the
generated pod spec, e.g., JSON schema validation at a bare minimum. I'm not
sure how well this is supported by the client (fabric8 Java client) we use,
though.
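
For illustration, a rough sketch of what a cheap post-build sanity check
could look like (the helper name is made up; it uses the fabric8 model
getters, and the duplicate-volume check mirrors the example Rob gives below):

    import io.fabric8.kubernetes.api.model.Pod
    import scala.collection.JavaConverters._

    // Hypothetical sanity check, run on the fully built pod before submission.
    def checkGeneratedPodSpec(pod: Pod): Unit = {
      val volumeNames = pod.getSpec.getVolumes.asScala.map(_.getName)
      val dupVolumes = volumeNames.diff(volumeNames.distinct).distinct
      require(dupVolumes.isEmpty,
        s"Duplicate volumes in generated pod spec: ${dupVolumes.mkString(", ")}")

      val containerNames = pod.getSpec.getContainers.asScala.map(_.getName)
      val dupContainers = containerNames.diff(containerNames.distinct).distinct
      require(dupContainers.isEmpty,
        s"Duplicate containers in generated pod spec: ${dupContainers.mkString(", ")}")
    }

This only catches specific classes of mistakes, of course; full JSON schema
validation would go further, if the client supports it.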

Regarding option #2, I personally think the risks and complexity it brings
far outweigh the benefits, and it is much more error-prone. For example, the
k8s backend code sets some environment variables in the container that get
used by the entrypoint script. If the backend skips those pod spec building
steps, the end users become responsible for properly populating them, and
possibly other things, themselves.
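
To show the kind of coupling I mean, the backend injects env vars roughly
like this (a simplified sketch using the fabric8 builder API, not the actual
Spark code; SPARK_EXECUTOR_ID is one example of a variable the entrypoint
script expects):

    import io.fabric8.kubernetes.api.model.{Container, ContainerBuilder}

    // Sketch: the backend adds env vars that the image's entrypoint relies on.
    def withExecutorEnv(container: Container, executorId: String): Container =
      new ContainerBuilder(container)
        .addNewEnv()
          .withName("SPARK_EXECUTOR_ID") // read by the entrypoint script
          .withValue(executorId)
        .endEnv()
        .build()

Skipping the step that does this means the template author has to get all of
these exactly right by hand.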

So my opinion is that initially #1 plus good documentation is the way to go,
and we can improve on this in subsequent releases to move it towards #3.

Yinan

On Wed, Sep 19, 2018 at 9:12 AM Rob Vesse <rve...@dotnetrdf.org> wrote:

> Hey all
>
> For those following the K8S backend, you are probably aware of SPARK-24434
> [1] (and PR 22146 [2]), which proposes a mechanism to allow for advanced pod
> customisation via pod templates.  This is motivated by the fact that
> introducing an additional Spark configuration property for each aspect of
> pod specification a user might wish to customise was becoming unwieldy.
>
> However, I am concerned that the current implementation doesn’t go far
> enough and actually limits the utility of the proposed new feature.  The
> problem stems from the fact that the implementation simply uses the pod
> template as a base and then Spark attempts to build a pod spec on top of
> that.  Because the code that does this performs no validation or inspection
> of the incoming template, it is possible to provide a template that causes
> Spark to generate an invalid pod spec, ultimately causing the job to be
> rejected by Kubernetes.
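>
> In code terms, the overlay works roughly like this (a simplified sketch of
> the pattern, not the actual classes):
>
>     import io.fabric8.kubernetes.api.model.Pod
>
>     trait FeatureStep { def configurePod(pod: Pod): Pod }
>
>     // The template is just the starting value; every step layers changes on
>     // top, and no step inspects what the template already contains.
>     def buildPodSpec(template: Pod, steps: Seq[FeatureStep]): Pod =
>       steps.foldLeft(template) { (pod, step) => step.configurePod(pod) }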
>
> Now clearly Spark code cannot attempt to account for every possible
> customisation that a user may attempt to make via pod templates, nor should
> it be responsible for ensuring that the user doesn’t start from an invalid
> template in the first place.  However, it seems like we could be more
> intelligent in how we build our pod specs to avoid generating invalid specs
> in cases where we have a clear use case for advanced customisation.  For
> example, the current implementation does not allow users to customise the
> volumes used to back SPARK_LOCAL_DIRS to better suit the compute
> environment the K8S cluster is running on, and trying to do so with a pod
> template will result in an invalid spec due to duplicate volumes.
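>
> A template-aware version of that step could be as simple as this (again an
> illustrative sketch with a made-up helper name, using the fabric8 model):
>
>     import io.fabric8.kubernetes.api.model.{Pod, VolumeBuilder}
>     import scala.collection.JavaConverters._
>
>     // Only add the local-dir volume if the template has not already
>     // supplied a volume with the same name.
>     def addLocalDirVolume(pod: Pod, name: String): Pod = {
>       val exists = pod.getSpec.getVolumes.asScala.exists(_.getName == name)
>       if (!exists) {
>         pod.getSpec.getVolumes.add(
>           new VolumeBuilder().withName(name).withNewEmptyDir().endEmptyDir().build())
>       }
>       pod
>     }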
>
> I think there are a few ways the community could address this:
>
>    1. Status quo – provide the pod template feature as-is and simply tell
>    users that certain customisations are never supported and may result in
>    invalid pod specs
>    2. Provide the ability for advanced users to explicitly skip pod spec
>    building steps they know interfere with their pod templates via
>    configuration properties (see the sketch after this list)
>    3. Modify the pod spec building code to be aware of known desirable
>    user customisation points and avoid generating invalid specs in those
>    cases
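>
> From the user’s side, option 2 might look something like this (the property
> name here is purely hypothetical, just to illustrate the shape of the idea):
>
>     import org.apache.spark.SparkConf
>
>     // Hypothetical property name, used only for illustration.
>     val conf = new SparkConf()
>       .set("spark.kubernetes.driver.podSpecBuilder.skippedSteps", "localDirs")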
>
> Currently committers seem to be going for option 1.  Personally I would
> like to see the community adopt option 3, but I have already received
> considerable pushback when I proposed that in one of my PRs, hence the
> suggestion of the compromise option 2.  Yes, this still has the possibility
> of ending up with invalid specs if users are over-zealous in the spec
> building steps they disable, but since this is a power user feature I think
> this is a risk power users would be willing to assume.  If we are going to
> provide features for power users, we should avoid unnecessarily limiting
> their utility.
>
> What do other K8S folks think about this issue?
>
> Thanks,
>
> Rob
>
> [1] https://issues.apache.org/jira/browse/SPARK-24434
>
> [2] https://github.com/apache/spark/pull/22146