Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Mark Hamstra
Sure. Obviously, there is going to be some overlap as the project
transitions to being part of mainline Spark development. As long as you are
consciously working toward moving discussions into this dev list, then all
is good.

On Mon, Feb 5, 2018 at 1:56 PM, Matt Cheah  wrote:

> I think in this case, the original design, proposed before this
> document, was implemented on the Spark on K8s fork, which we took some time
> to build separately before proposing that the fork be merged into the main
> line.
>
>
>
> Specifically, the timeline of events was:
>
>
>
> 1. We started building Spark on Kubernetes on a fork and were prepared
>    to merge our work directly into master.
> 2. Discussion on https://issues.apache.org/jira/browse/SPARK-18278 led
>    us to move down the path of working on a fork first. We would harden the
>    fork and have it become more widely used to prove its value and
>    robustness in practice. See https://github.com/apache-spark-on-k8s/spark
> 3. On said fork, we made the original design decisions to use a
>    step-based builder pattern for the driver but not the same design for the
>    executors. This original discussion took place among the collaborators of
>    the fork, as much of the work on the fork in general was not done on the
>    mailing list.
> 4. We eventually decided to merge the fork into the main line, and got
>    the feedback in the corresponding PRs.
>
>
>
> Therefore the question may be less about this specific design and more
> about whether the overarching approach we took, building Spark on K8s on
> a fork first before merging into mainline, was the correct one in the
> first place. There’s also the issue that the work done on the fork was
> isolated from the dev mailing list. Moving forward as we push our work into
> mainline Spark, we aim to be transparent with the Spark community via the
> Spark mailing list and Spark JIRA tickets. We’re specifically aiming to
> deprecate the fork and migrate all the work done on the fork into the main
> line.
>
>
>
> -Matt Cheah
>
>
>
> *From: *Mark Hamstra 
> *Date: *Monday, February 5, 2018 at 1:44 PM
> *To: *Matt Cheah 
> *Cc: *"dev@spark.apache.org", "ramanath...@google.com",
> Ilan Filonenko <i...@cornell.edu>, Erik,
> Marcelo Vanzin <van...@cloudera.com>
> *Subject: *Re: Spark on Kubernetes Builder Pattern Design Document
>
>
>
> That's good, but you should probably stop and consider whether the
> discussions that led up to this document's creation could have taken place
> on this dev list -- because if they could have, then they probably should
> have as part of the whole spark-on-k8s project becoming part of mainline
> spark development, not a separate fork.
>
>
>
> On Mon, Feb 5, 2018 at 1:17 PM, Matt Cheah  wrote:
>
> Hi everyone,
>
>
>
> While we were building the Spark on Kubernetes integration, we realized
> that some of the abstractions we introduced for building the driver
> application in spark-submit, and building executor pods in the scheduler
> backend, could be improved for better readability and clarity. We received
> feedback in this pull request
> <https://github.com/apache/spark/pull/19954>
> in particular. In response to this feedback, we’ve put together a design
> document that proposes a possible refactor to address the given feedback.
>
>
>
> You may comment on the proposed design at this link:
> https://docs.google.com/document/d/1XPLh3E2JJ7yeJSDLZWXh_lUcjZ1P0dy9QeUEyxIlfak/edit#
>
>
>
> I hope that we can have a productive discussion and continue improving the
> Kubernetes integration further.
>
>
>
> Thanks,
>
>
>
> -Matt Cheah
>
>
>


Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Matt Cheah
I think in this case, the original design, proposed before this document, 
was implemented on the Spark on K8s fork, which we took some time to build 
separately before proposing that the fork be merged into the main line.

 

Specifically, the timeline of events was:

 
1. We started building Spark on Kubernetes on a fork and were prepared to merge 
   our work directly into master.
2. Discussion on https://issues.apache.org/jira/browse/SPARK-18278 led us to 
   move down the path of working on a fork first. We would harden the fork and 
   have it become more widely used to prove its value and robustness in 
   practice. See https://github.com/apache-spark-on-k8s/spark
3. On said fork, we made the original design decisions to use a step-based 
   builder pattern for the driver but not the same design for the executors. 
   This original discussion took place among the collaborators of the fork, as 
   much of the work on the fork in general was not done on the mailing list.
4. We eventually decided to merge the fork into the main line, and got the 
   feedback in the corresponding PRs.
 

Therefore the question may be less about this specific design and more about 
whether the overarching approach we took, building Spark on K8s on a fork first 
before merging into mainline, was the correct one in the first place. There’s 
also the issue that the work done on the fork was isolated from the dev mailing 
list. Moving forward as we push our work into mainline Spark, we aim to be 
transparent with the Spark community via the Spark mailing list and Spark JIRA 
tickets. We’re specifically aiming to deprecate the fork and migrate all the 
work done on the fork into the main line.

 

-Matt Cheah

 

From: Mark Hamstra 
Date: Monday, February 5, 2018 at 1:44 PM
To: Matt Cheah 
Cc: "dev@spark.apache.org", "ramanath...@google.com", Ilan Filonenko, Erik, 
Marcelo Vanzin
Subject: Re: Spark on Kubernetes Builder Pattern Design Document

 

That's good, but you should probably stop and consider whether the discussions 
that led up to this document's creation could have taken place on this dev list 
-- because if they could have, then they probably should have as part of the 
whole spark-on-k8s project becoming part of mainline spark development, not a 
separate fork. 

 

On Mon, Feb 5, 2018 at 1:17 PM, Matt Cheah  wrote:

Hi everyone,

 

While we were building the Spark on Kubernetes integration, we realized that 
some of the abstractions we introduced for building the driver application in 
spark-submit, and building executor pods in the scheduler backend, could be 
improved for better readability and clarity. We received feedback in this pull 
request in particular. In response to this feedback, we’ve put 
together a design document that proposes a possible refactor to address the 
given feedback.

 

You may comment on the proposed design at this link: 
https://docs.google.com/document/d/1XPLh3E2JJ7yeJSDLZWXh_lUcjZ1P0dy9QeUEyxIlfak/edit#

 

I hope that we can have a productive discussion and continue improving the 
Kubernetes integration further.

 

Thanks,

 

-Matt Cheah

 





Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Mark Hamstra
That's good, but you should probably stop and consider whether the
discussions that led up to this document's creation could have taken place
on this dev list -- because if they could have, then they probably should
have as part of the whole spark-on-k8s project becoming part of mainline
spark development, not a separate fork.

On Mon, Feb 5, 2018 at 1:17 PM, Matt Cheah  wrote:

> Hi everyone,
>
>
>
> While we were building the Spark on Kubernetes integration, we realized
> that some of the abstractions we introduced for building the driver
> application in spark-submit, and building executor pods in the scheduler
> backend, could be improved for better readability and clarity. We received
> feedback in this pull request 
> in particular. In response to this feedback, we’ve put together a design
> document that proposes a possible refactor to address the given feedback.
>
>
>
> You may comment on the proposed design at this link:
> https://docs.google.com/document/d/1XPLh3E2JJ7yeJSDLZWXh_lUcjZ1P0dy9QeUEyxIlfak/edit#
>
>
>
> I hope that we can have a productive discussion and continue improving the
> Kubernetes integration further.
>
>
>
> Thanks,
>
>
>
> -Matt Cheah
>


Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Matt Cheah
Hi everyone,

 

While we were building the Spark on Kubernetes integration, we realized that 
some of the abstractions we introduced for building the driver application in 
spark-submit, and building executor pods in the scheduler backend, could be 
improved for better readability and clarity. We received feedback in this pull 
request in particular. In response to this feedback, we’ve put together a 
design document that proposes a possible refactor to address the given feedback.

 

You may comment on the proposed design at this link: 
https://docs.google.com/document/d/1XPLh3E2JJ7yeJSDLZWXh_lUcjZ1P0dy9QeUEyxIlfak/edit#

 

I hope that we can have a productive discussion and continue improving the 
Kubernetes integration further.

 

Thanks,

 

-Matt Cheah


