Github user echarles commented on the issue:
https://github.com/apache/zeppelin/pull/2637
@matyix I have tested your last commits and was able to make it work in my
env (with Zeppelin both in and outside the k8s cluster).
You implement a new launcher and remote executor, specific to spark-k8s.
In another local branch, I have tried to stick as closely as possible to the
current Zeppelin paradigm (thrift servers on both sides of the interpreter
processes, with `CallbackInfo`) and two parameters (host, port) for
`interpreter.sh`. I still have issues with the callback, so I finally think
the approach you propose is good and does the job.
My feedback:
+ The branch as such needs basic updates: I had to fix compilation issues
with the new classes (`SparkK8sInterpreterLauncher` and
`SparkK8sRemoteIntepreterManagedProcess`) and to add
`${ZEPPELIN_SPARK_CONF}` to the `interpreter.sh` script.
+ To find the running driver pod, you currently poll on a regular basis.
Ideally we would be notified when the pod is ready (I am not sure whether the
k8s client supports this). That would closely map the current mechanism of
thrift notification via `CallbackInfo`, but with a pure k8s mechanism. This
could also be extended to other interpreters we would want to see in k8s.
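To make the polling idea concrete, here is a minimal sketch of a poll-until-ready helper (the class and method names are mine, not from the PR); a watch-based notification from the k8s client would replace this loop entirely:

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating the poll-on-a-regular-basis approach
// the PR currently uses to detect the running driver pod.
public class PodReadyPoller {

  /**
   * Calls the readiness check every intervalMillis until it returns true
   * or timeoutMillis elapses. Returns whether the pod became ready in time.
   */
  public static boolean waitUntilReady(BooleanSupplier isReady,
                                       long intervalMillis,
                                       long timeoutMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      if (isReady.getAsBoolean()) {
        return true;
      }
      if (System.currentTimeMillis() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

With a notification mechanism, the `isReady` check would instead be triggered by the k8s client when the pod transitions to ready, removing the sleep interval entirely.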
+ We need to set `spark.app.name` to a value starting with `zri-`.
If you don't set this in the interpreter settings, the k8s client will not find
the driver pod. I wonder if we can make this more configurable, say using
metadata, or simply using the `InterpreterContext`, which contains a
`properties` attribute with all the given props; the launcher could retrieve
this and search for a pod starting with a dynamic prefix rather than with this
hardcoded one.
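Something along these lines for the dynamic prefix (a sketch only; the property name `zeppelin.k8s.driver.pod.prefix` is my invention for illustration, not an existing Zeppelin property):

```java
import java.util.Map;

// Hypothetical sketch: derive the driver-pod name prefix from the
// interpreter properties instead of hardcoding "zri-".
public class DriverPodPrefix {

  static final String DEFAULT_PREFIX = "zri-";

  public static String podPrefix(Map<String, String> properties) {
    // Prefer an explicit (hypothetical) prefix property, then fall back
    // to the configured spark.app.name, then to the hardcoded default.
    String explicit = properties.get("zeppelin.k8s.driver.pod.prefix");
    if (explicit != null && !explicit.isEmpty()) {
      return explicit;
    }
    String appName = properties.get("spark.app.name");
    if (appName != null && !appName.isEmpty()) {
      return appName;
    }
    return DEFAULT_PREFIX;
  }

  public static boolean matchesDriverPod(String podName,
                                         Map<String, String> properties) {
    return podName.startsWith(podPrefix(properties));
  }
}
```

The launcher would then search the pod list with `matchesDriverPod` instead of a literal `zri-` comparison.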
+ The current vanilla Zeppelin supports the spark-k8s `client` mode
out of the box (assuming you are using
https://github.com/apache/zeppelin/pull/2637). The condition to use the
`SparkK8sInterpreterLauncher` needs to check that `spark.submit.deployMode`
is `cluster`, and continue to use the normal ManagedProcess for `client`.
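The deploy-mode check could look roughly like this (a sketch under my assumptions; the class name is made up, and I also guard on the `k8s://` master scheme, which the PR may handle elsewhere):

```java
import java.util.Properties;

// Hypothetical sketch of the launcher-selection condition: use the
// k8s-specific launcher only for cluster deploy mode, and keep the
// normal managed process for client mode.
public class LauncherChoice {

  public static boolean useK8sLauncher(Properties props) {
    String master = props.getProperty("master", "");
    String deployMode = props.getProperty("spark.submit.deployMode", "client");
    return master.startsWith("k8s://") && "cluster".equals(deployMode);
  }
}
```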
+ On the documentation level, we should certainly mention that the app name
must start with `zri-`. Also, relying on the kubespark docker image would be
better, to ensure nothing special is added to the docker image.
WDYT?
Would you prefer me to submit a PR on top of your PR, or will you make another push?
---