An Update on Spark on Kubernetes [Jun 23]

2017-06-23 Thread Anirudh Ramanathan
*Project Description: *Kubernetes cluster manager integration that enables native support for submitting Spark applications to a Kubernetes cluster. The submitted applications can make use of Kubernetes-native constructs. *JIRA*: SPARK-18278 *Upstream
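For context, the submission path this integration exposes is the standard spark-submit front end pointed at a Kubernetes API server. Below is a minimal sketch of such a submission (shown via Python's subprocess purely for illustration, assuming spark-submit is on PATH); the API server address, image name, and example jar path are placeholders, not values from the thread.

```python
# Minimal sketch of submitting a Spark application to Kubernetes in cluster mode.
# All angle-bracketed values are placeholders (assumptions, not from the original mail).
import subprocess

cmd = [
    "spark-submit",
    "--master", "k8s://https://<k8s-apiserver-host>:6443",      # Kubernetes API server
    "--deploy-mode", "cluster",
    "--name", "spark-pi",
    "--class", "org.apache.spark.examples.SparkPi",
    "--conf", "spark.executor.instances=2",
    "--conf", "spark.kubernetes.container.image=<spark-image>",  # image for driver/executor pods
    "local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar",
]
subprocess.run(cmd, check=True)
```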

Re: Thoughts on release cadence?

2017-07-31 Thread Anirudh Ramanathan
Is now the best time for submitting a SPIP for a feature targeting the 2.3 release? On Mon, Jul 31, 2017 at 12:22 PM, Sean Owen wrote: > Done at https://spark.apache.org/versioning-policy.html > > On Mon, Jul 31, 2017 at 6:22 PM Reynold Xin

Re: SPIP: Spark on Kubernetes

2017-08-21 Thread Anirudh Ramanathan
t seem necessary for kubernetes support, or specific to it. > If it's nice to have, and you want to add it to kubernetes first before > other cluster managers, fine, but seems separate from this proposal. > > > > On Tue, Aug 15, 2017 at 10:32 AM, Anirudh Ramanathan < > fo

Re: SPIP: Spark on Kubernetes

2017-08-31 Thread Anirudh Ramanathan
-19700>. This vote has passed. So far, there have been 4 binding +1 votes, ~25 non-binding votes, and no -1 votes. Thanks all! +1 votes (binding): Reynold Xin Matei Zaharia Marcelo Vanzin Mark Hamstra +1 votes (non-binding): Anirudh Ramanathan Erik Erlandson Ilan Filonenko Sean Suchter Kimoon

Re: Timeline for Spark 2.3

2017-11-09 Thread Anirudh Ramanathan
This would help the community on the Kubernetes effort quite a bit - giving us additional time for reviews and testing for the 2.3 release. On Thu, Nov 9, 2017 at 3:56 PM, Justin Miller wrote: > That sounds fine to me. I’m hoping that this ticket can make it into Spark > 2.3: https://issues.apac

Publishing official docker images for KubernetesSchedulerBackend

2017-11-29 Thread Anirudh Ramanathan
thoughts of the community regarding this. -- Thanks, Anirudh Ramanathan

Re: Publishing official docker images for KubernetesSchedulerBackend

2017-11-29 Thread Anirudh Ramanathan
ink: pyspark on PyPI as well) that makes sense, but is that the >> situation? if it's more of an extension or alternate presentation of Spark >> components, that typically wouldn't be part of a Spark release. The ones >> the PMC takes responsibility for maintaining ought to be the

Kubernetes Scheduler Backend for Spark: Pre Holiday Update

2017-12-22 Thread Anirudh Ramanathan
Hi all, We've all been hard at work on the Spark + Kubernetes effort over the past few weeks. Here's an update on what we've been up to. TL; DR - we're *nearly* there! Spark 2.3 promises to be an exciting release for the Apache Spark and Kubernetes communities. *What's done?* (please see SPARK-18

Re: Kubernetes backend and docker images

2018-01-08 Thread Anirudh Ramanathan
+1 We discussed some alternatives early on - including using a single dockerfile and different spec.container.command and spec.container.args from the Kubernetes driver/executor specification (which override entrypoint in docker). No reason that won't work also - except that it reduced the transpa
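As a rough illustration of the single-image alternative described here (a sketch, not code from the fork), a Kubernetes pod spec's container command/args override the Docker image's ENTRYPOINT/CMD, so one shared image could serve both roles. The image tag and entrypoint path below are hypothetical; the example uses the official kubernetes Python client.

```python
# Sketch: one shared Spark image, with the role selected via command/args
# (which override the image's ENTRYPOINT/CMD) rather than separate Dockerfiles.
from kubernetes import client

def spark_container(role: str) -> client.V1Container:
    """Build a driver or executor container definition from a single shared image."""
    return client.V1Container(
        name=f"spark-{role}",
        image="spark:2.3.0",             # hypothetical shared image tag
        command=["/opt/entrypoint.sh"],  # overrides the image ENTRYPOINT
        args=[role],                     # e.g. "driver" or "executor"
    )

driver_container = spark_container("driver")
executor_container = spark_container("executor")
```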

Re: Kubernetes backend and docker images

2018-01-08 Thread Anirudh Ramanathan
Is there a reason why that approach would not work? You could still >>> create separate images for driver and executor if wanted, but there's >>> no reason I can see why we should need 3 images for the simple case. >>> >>> Note that the code there can be cleaned up still, and I don't love the >>> idea of using env variables to propagate arguments to the container, >>> but that works for now. >>> >>> -- >>> Marcelo >>> >>> - >>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >>> >>> >> -- Anirudh Ramanathan

Integration testing and Scheduler Backends

2018-01-08 Thread Anirudh Ramanathan
This is with regard to the Kubernetes Scheduler Backend and scaling the process to accept contributions. Given we're moving past upstreaming changes from our fork, and into getting *new* patches, I wanted to start this discussion sooner than later. This is more of a post-2.3 question - not somethin

Re: Integration testing and Scheduler Backends

2018-01-08 Thread Anirudh Ramanathan
10:16 PM Anirudh Ramanathan > wrote: > >> This is with regard to the Kubernetes Scheduler Backend and scaling the >> process to accept contributions. Given we're moving past upstreaming >> changes from our fork, and into getting *new* patches, I wanted to start >> thi

Re: Kubernetes: why use init containers?

2018-01-09 Thread Anirudh Ramanathan
We were running a change in our fork which was similar to this at one point early on. My biggest concerns off the top of my head with this change would be localization performance with large numbers of executors, and what we lose in terms of separation of concerns. Init containers are a standard co
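For readers unfamiliar with the pattern under discussion, here is a minimal sketch of the init-container approach: a short-lived container localizes remote dependencies into a volume shared with the main Spark container before it starts. This is an illustration only, not the spec Spark actually generates; the image names and dependency URL are made up.

```python
# Sketch of the init-container pattern: fetch dependencies into an emptyDir volume
# before the main container starts, keeping download logic out of the Spark image.
from kubernetes import client

shared_volume = client.V1Volume(
    name="spark-files",
    empty_dir=client.V1EmptyDirVolumeSource(),
)

init = client.V1Container(
    name="spark-init",
    image="spark-init:2.3.0",                   # hypothetical init image
    args=["https://example.com/deps/app.jar"],  # remote dependency to localize
    volume_mounts=[client.V1VolumeMount(name="spark-files", mount_path="/var/spark-data")],
)

main = client.V1Container(
    name="spark-executor",
    image="spark:2.3.0",                        # hypothetical executor image
    volume_mounts=[client.V1VolumeMount(name="spark-files", mount_path="/var/spark-data")],
)

pod_spec = client.V1PodSpec(
    init_containers=[init],   # runs to completion before `containers` start
    containers=[main],
    volumes=[shared_volume],
    restart_policy="Never",
)
```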

Re: Kubernetes: why use init containers?

2018-01-09 Thread Anirudh Ramanathan
explicit > when using an init-container, even though the code does end up being more > complex. > > > > *From: *Yinan Li > *Date: *Tuesday, January 9, 2018 at 7:16 PM > *To: *Nicholas Chammas > *Cc: *Anirudh Ramanathan , Marcelo Vanzin > , Matt Cheah , Kimoon Kim <

Re: Kubernetes: why use init containers?

2018-01-09 Thread Anirudh Ramanathan
Marcelo, I can see that we might be misunderstanding what this change implies for performance and some of the deeper implementation details here. We have a community meeting tomorrow (at 10am PT), and we'll be sure to explore this idea in detail, and understand the implications and then get back to

Re: Kubernetes: why use init containers?

2018-01-10 Thread Anirudh Ramanathan
contract is that the Spark driver pod does not have an init > container, and the driver handles its own dependencies, then by > definition that situation cannot exist. > > -- > Marcelo > -- Anirudh Ramanathan

Re: Kubernetes: why use init containers?

2018-01-11 Thread Anirudh Ramanathan
time to make the change, test and release with confidence. On Wed, Jan 10, 2018 at 3:45 PM, Marcelo Vanzin wrote: > On Wed, Jan 10, 2018 at 3:00 PM, Anirudh Ramanathan > wrote: > > We can start by getting a PR going perhaps, and start augmenting the > > integration testing to e

Re: Kubernetes: why use init containers?

2018-01-12 Thread Anirudh Ramanathan
n, and mainly testing, and we're pretty > far into the 2.3 cycle for all of those to be sorted out. > > On Thu, Jan 11, 2018 at 8:19 AM, Anirudh Ramanathan > wrote: > > If we can separate concerns those out, that might make sense in the short > > term IMO. > >

Re: Kubernetes: why use init containers?

2018-01-12 Thread Anirudh Ramanathan
e. Thanks, Anirudh On Jan 12, 2018 2:00 PM, "Marcelo Vanzin" wrote: > On Fri, Jan 12, 2018 at 1:53 PM, Anirudh Ramanathan > wrote: > > As I understand, the bigger change discussed here are like the init > > containers, which will be more on the implementation side than

Re: [VOTE] Spark 2.3.0 (RC1)

2018-01-12 Thread Anirudh Ramanathan
ctively worked on: > > 1. SPARK-23051 that tracks a regression in the Spark UI > 2. SPARK-23020 and SPARK-23000 that track a couple of flaky tests that are > responsible for build failures. Additionally, https://github.com/apache/spark/pull/20242 fixes a few Java linter errors in RC1. > > Given that these blockers are fairly isolated, in the spirit of starting a > thorough QA early, this RC1 aims to serve as a good approximation of the > functionality of the final release. > > Regards, > Sameer > -- Anirudh Ramanathan

Re: [Kubernetes] structured-streaming driver restarts / roadmap

2018-03-28 Thread Anirudh Ramanathan
We discussed this early on in our fork and I think we should have this in a JIRA and discuss it further. It's something we want to address in the future. One proposed method is using a StatefulSet of size 1 for the driver. This ensures recovery but at the same time takes away from the completion s
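A sketch of what the proposed "StatefulSet of size 1" for the driver could look like (an assumption for illustration, not an agreed design; names and image are hypothetical): Kubernetes recreates the single pod with a stable identity after a failure, which is what would give the restart/recovery behavior mentioned above.

```python
# Sketch: run the Spark driver under a StatefulSet with replicas=1 so Kubernetes
# restarts it (with stable identity) if the pod or node fails.
from kubernetes import client

labels = {"app": "spark-driver"}

driver_statefulset = client.V1StatefulSet(
    metadata=client.V1ObjectMeta(name="spark-driver"),
    spec=client.V1StatefulSetSpec(
        replicas=1,                      # exactly one driver pod at a time
        service_name="spark-driver",     # headless service providing the stable identity
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(containers=[
                client.V1Container(name="spark-driver", image="spark:2.3.0"),
            ]),
        ),
    ),
)
```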

Re: Build issues with apache-spark-on-k8s.

2018-03-28 Thread Anirudh Ramanathan
As Lucas said, those directories are generated and copied when you run a full maven build with the -Pkubernetes flag specified (or use instructions in https://spark.apache.org/docs/latest/building-spark.html#building-a-runnable-distribution ). Also, using the Kubernetes integration in the main Ap
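For reference, the two build routes mentioned look roughly like the following (a sketch run from the Spark source root; exact flags can vary by version, and the Python subprocess wrapper is only for illustration).

```python
# Sketch: build Spark with the Kubernetes profile, either as a full Maven build
# or as a runnable distribution (which also lays out the docker image build context).
import subprocess

# Option 1: full Maven build with the kubernetes profile enabled
subprocess.run(["./build/mvn", "-Pkubernetes", "-DskipTests", "clean", "package"], check=True)

# Option 2: runnable distribution, per building-spark.html#building-a-runnable-distribution
subprocess.run(["./dev/make-distribution.sh", "--tgz", "-Pkubernetes"], check=True)
```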

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-04-02 Thread Anirudh Ramanathan
>>> with more details and made a PR <https://github.com/apache/spark/pull/20943>.
>>>
>>> *For CPU:*
>>> As it turns out, there can be performance problems if we only have `executor.cores` available (which means we have one core per task). This was raised here <https://github.com/apache-spark-on-k8s/spark/issues/352> and is the reason that the cpu limit was set to unlimited.
>>> This issue stems from the fact that in general there will be more than one thread per task, resulting in performance impacts if there is only one core available.
>>> However, I am not sure that just setting the limit to unlimited is the best solution because it means that even if the Kubernetes cluster can perfectly satisfy the resource requests, performance might be very bad.
>>>
>>> I think we should guarantee that an executor is able to do its work well (without performance issues or getting killed - as could happen in the memory case) with the resources it gets guaranteed from Kubernetes.
>>>
>>> One way to solve this could be to request more than 1 core from Kubernetes per task. The exact amount we should request is unclear to me (it largely depends on how many threads actually get spawned for a task).
>>> We would need to find a way to determine this somehow automatically or at least come up with a better default value than 1 core per task.
>>>
>>> Does somebody have ideas or thoughts on how to solve this best?
>>>
>>> Best,
>>> David
-- Anirudh Ramanathan
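Tying this back to configuration, here is a hedged sketch of the knobs involved. The spark.kubernetes.*.cores and memoryOverheadFactor property names below come from later Spark-on-Kubernetes releases and are an assumption relative to this thread; the values are illustrative only. The idea is to decouple the number of task slots Spark schedules per executor from the CPU request and limit sent to Kubernetes.

```python
# Sketch: separate Spark's per-executor task slots from the Kubernetes-side
# CPU request/limit, so the pod gets a guaranteed share larger than 1 core per task.
# Property names are assumptions (from later Spark-on-K8s releases), values illustrative.
k8s_resource_conf = {
    "spark.executor.cores": "4",                     # task slots per executor
    "spark.kubernetes.executor.request.cores": "2",  # CPU *request* sent to Kubernetes
    "spark.kubernetes.executor.limit.cores": "4",    # CPU *limit*; omit to leave it unlimited
    "spark.executor.memory": "4g",
    "spark.kubernetes.memoryOverheadFactor": "0.1",  # headroom so the pod is not OOM-killed
}

# Flatten into spark-submit arguments: ["--conf", "key=value", ...]
conf_args = [arg for k, v in k8s_resource_conf.items() for arg in ("--conf", f"{k}={v}")]
```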