I would like to run a Spark master (driver) from my Jupyter notebook. Currently I have JupyterHub running in Kubernetes successfully, and Kubespawner creates the notebook pods, which is awesome.
The problem is that in order to run the Spark master (driver) from my notebook against a YARN cluster (client mode, not spark-submit, where the driver would run inside the cluster), the YARN executors cannot reach back to my notebook, because the driver's ports aren't exposed as a Kubernetes service.

Looking at the Kubespawner code, I see there is a function that looks like it builds a service for each pod, but I'm not sure how it gets invoked:
https://github.com/jupyterhub/kubespawner/blob/master/kubespawner/objects.py#L916

def make_service(
    name,
    port,
    servername,
    owner_references,
    labels=None,
    annotations=None,
):
    """
    Make a k8s service specification for using dns to communicate with the notebook.

    Parameters
    ----------
    name:
        Name of the service. Must be unique within the namespace the object is
        going to be created in.
    env:
        Dictionary of environment variables.
    labels:
        Labels to add to the service.
    annotations:
        Annotations to add to the service.
    """

And then there's a call to make an ingress as well:
https://github.com/jupyterhub/kubespawner/blob/master/kubespawner/objects.py#L720

What am I missing? I can't see anything in the Kubespawner documentation about creating service ports (ideally behind a LoadBalancer IP).

Side question: should I be using Enterprise Gateway? I'm slightly confused about what the mainstream way to deploy Jupyter notebooks in a multi-user environment is. I hadn't heard of EG until I started researching this issue, and now I'm wondering whether I'm deploying the right thing at all.

Thanks in advance,
Greg
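
P.S. To make the question more concrete, here is a rough sketch of the kind of setup I imagine I need to end up with: pin the driver ports in the Spark conf, then expose those same ports on the notebook pod through a Service. The service name, namespace, selector label, external hostname, and port numbers below are all placeholders I made up, not anything Kubespawner provides.

from kubernetes import client, config
from pyspark.sql import SparkSession

# Placeholder ports, fixed up front so the Service below can target them.
DRIVER_PORT = 29413
BLOCKMGR_PORT = 29414

# 1) Expose the driver ports on the notebook pod with a Service.
#    The selector label and namespace are guesses; they would need to match
#    whatever labels Kubespawner actually puts on my user pod.
config.load_incluster_config()
core_v1 = client.CoreV1Api()
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="spark-driver-greg"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"hub.jupyter.org/username": "greg"},
        ports=[
            client.V1ServicePort(name="driver", port=DRIVER_PORT, target_port=DRIVER_PORT),
            client.V1ServicePort(name="blockmanager", port=BLOCKMGR_PORT, target_port=BLOCKMGR_PORT),
        ],
    ),
)
core_v1.create_namespaced_service(namespace="jupyterhub", body=service)

# 2) Start the driver in client mode, pinned to those ports, advertising an
#    address the YARN node managers can actually route to (the Service /
#    LoadBalancer address, not the pod IP).
spark = (
    SparkSession.builder
    .master("yarn")
    .appName("notebook-driver")
    .config("spark.submit.deployMode", "client")
    .config("spark.driver.bindAddress", "0.0.0.0")
    .config("spark.driver.host", "spark-driver-greg.example.com")  # placeholder external address
    .config("spark.driver.port", str(DRIVER_PORT))
    .config("spark.driver.blockManager.port", str(BLOCKMGR_PORT))
    .getOrCreate()
)

What I can't figure out is whether Kubespawner can create that Service for me (via make_service or some config option), or whether I'm expected to create it myself from inside the notebook, roughly as above.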
