If the goal is a reproducible test environment, then I think that's what
Jenkins already provides. Granted, you can only ask it to run a test. But
presumably you'd get the same result if you started from the same VM image
Jenkins uses and ran the same steps.

I bet it's not hard to set up and maintain, and probably easier than a VM.
But unless Jenkins itself uses it, aren't we just creating yet another build
environment in an effort to standardize? If it isn't identical to the
reference build environment, it loses the value of being exactly the same.
Has a concrete problem come up that this solves?

If the goal is just easing developer setup, then what does a Docker image
do for me - what does it set up? Beyond the IDE, I don't know of anything I
need set up on OS X.
On Jan 21, 2015 7:30 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:

> To respond to the original suggestion by Nick. I always thought it
> would be useful to have a Docker image on which we run the tests and
> build releases, so that we could have a consistent environment that
> other packagers or people trying to exhaustively run Spark tests could
> replicate (or at least look at) to understand exactly how we recommend
> building Spark. Sean - do you think that is too high of overhead?
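>
> Concretely, I'm picturing something quite small. As a sketch, the image (or
> an equivalent setup script) would pin down roughly the following, with all
> versions matching Jenkins (the package names and JAVA_HOME path here are
> assumptions, not a proposal):
>
> # base: a stock CentOS image, plus a pinned toolchain
> yum install -y java-1.6.0-openjdk-devel git tar
> export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk.x86_64
> # ...plus Maven, Python 2.6, etc., at the versions the build matrix expects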
>
> In terms of providing images that we encourage as standard deployment
> images of Spark and want to make portable across environments, that's
> a much larger project and one with higher associated maintenance
> overhead. So I'd be interested in seeing that evolve as its own
> project (spark-deploy) or something associated with bigtop, etc.
>
> - Patrick
>
> On Tue, Jan 20, 2015 at 10:30 PM, Paolo Platter
> <paolo.plat...@agilelab.it> wrote:
> > Hi all,
> > I also tried the Docker way and it works well.
> > I suggest looking at the sequenceiq/spark Docker images; they are very
> > active in that field.
> >
> > Paolo
> >
> > Sent from my Windows Phone
> > ________________________________
> > From: jay vyas<mailto:jayunit100.apa...@gmail.com>
> > Sent: 21/01/2015 04:45
> > To: Nicholas Chammas<mailto:nicholas.cham...@gmail.com>
> > Cc: Will Benton<mailto:wi...@redhat.com>; Spark dev list<mailto:dev@spark.apache.org>
> > Subject: Re: Standardized Spark dev environment
> >
> > I can comment on both...  hi will and nate :)
> >
> > 1) Will's Dockerfile solution is the simplest, most direct answer to the
> > dev environment question: it's an efficient way to build and develop Spark
> > environments for dev/test. It would be cool to put that Dockerfile (and/or
> > maybe a shell script which uses it) in the top level of Spark as the build
> > entry point. For total platform portability, you could wrap it in a
> > Vagrantfile to launch a lightweight VM, so that Windows works equally well.
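> >
> > A sketch of what that entry-point script could look like (the dev/docker/
> > path and image name are made up for the example, and it assumes the image
> > has a JDK and Maven baked in):
> >
> > #!/bin/bash
> > set -e
> > # build the dev/test image from the checked-in Dockerfile
> > docker build -t spark/dev-env dev/docker/
> > # run the standard Maven build inside it, mounting the source tree
> > docker run --rm -v "$PWD":/opt/spark -w /opt/spark spark/dev-env \
> >   mvn -DskipTests clean package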
> >
> > 2) However, since Nate mentioned Vagrant and Bigtop, I have to chime in :)
> > The Vagrant recipes in Bigtop are a nice reference deployment of how to
> > deploy Spark in a heterogeneous Hadoop-style environment, and tighter
> > integration testing with Bigtop for Spark releases would be lovely! The
> > Vagrant stuff uses Puppet to deploy an n-node VM or Docker based cluster,
> > in which users can easily select components (including spark, yarn, hbase,
> > hadoop, etc.) by simply editing a YAML file:
> > https://github.com/apache/bigtop/blob/master/bigtop-deploy/vm/vagrant-puppet/vagrantconfig.yaml
> > As Nate said, it would be a lot of fun to get more cross-collaboration
> > between the Spark and Bigtop communities. Input on how we can better
> > integrate Spark (whether it's Spork, HBase integration, smoke tests around
> > the MLlib stuff, or whatever) is always welcome. The basic flow is sketched
> > below.
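> >
> > For anyone who wants to try it, the flow is roughly (the components key
> > shown here is an assumption; check vagrantconfig.yaml for the real keys
> > and defaults):
> >
> > git clone https://github.com/apache/bigtop.git
> > cd bigtop/bigtop-deploy/vm/vagrant-puppet
> > # edit vagrantconfig.yaml to pick the components to deploy, e.g.
> > #   components: [hadoop, yarn, spark]
> > vagrant up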
> >
> > On Tue, Jan 20, 2015 at 10:21 PM, Nicholas Chammas <
> > nicholas.cham...@gmail.com> wrote:
> >
> >> How many profiles (hadoop / hive / scala) would this development
> >> environment support?
> >>
> >> As many as we want. We probably want to cover a good chunk of the build
> >> matrix <https://issues.apache.org/jira/browse/SPARK-2004> that Spark
> >> officially supports.
> >>
> >> What does this provide, concretely?
> >>
> >> It provides a reliable way to create a "good" Spark development
> >> environment. Roughly speaking, this probably should mean an environment
> >> that matches Jenkins, since that's where we run "official" testing and
> >> builds.
> >>
> >> For example, Spark has to run on Java 6 and Python 2.6. When devs build
> >> and run Spark locally, we can make sure they're doing it on these
> >> versions of the languages with a simple vagrant up.
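> >>
> >> As a sketch, the provisioning step could end with a sanity check along
> >> these lines (the exact checks are just an illustration):
> >>
> >> # warn early if the local toolchain doesn't match what Jenkins uses
> >> java -version 2>&1 | grep -q '"1\.6' || echo "WARNING: JDK is not Java 6"
> >> python2.6 -V >/dev/null 2>&1 || echo "WARNING: python2.6 not found on PATH"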
> >>
> >> Nate, could you comment on how something like this would relate to the
> >> Bigtop effort?
> >>
> >> http://chapeau.freevariable.com/2014/08/jvm-test-docker.html
> >>
> >> Will, that's pretty sweet. I tried something similar a few months ago as
> >> an experiment to try building/testing Spark within a container. Here's
> >> the shell script I used
> >> <https://gist.github.com/nchammas/60b04141f3b9f053faaa>
> >> against the base CentOS Docker image to set up an environment ready to
> >> build and test Spark.
> >>
> >> We want to run Spark unit tests within containers on Jenkins, so it
> >> might make sense to develop a single Docker image that can be used both
> >> as a "dev environment" and as the execution container on Jenkins.
> >>
> >> Perhaps that's the approach to take instead of looking into Vagrant.
> >>
> >> Nick
> >>
> >> On Tue Jan 20 2015 at 8:22:41 PM Will Benton <wi...@redhat.com> wrote:
> >>
> >> > Hey Nick,
> >> >
> >> > I did something similar with a Docker image last summer; I haven't
> >> > updated the images to cache the dependencies for the current Spark
> >> > master, but it would be trivial to do so:
> >> >
> >> > http://chapeau.freevariable.com/2014/08/jvm-test-docker.html
> >> >
> >> >
> >> > best,
> >> > wb
> >> >
> >> >
> >> > ----- Original Message -----
> >> > > From: "Nicholas Chammas" <nicholas.cham...@gmail.com>
> >> > > To: "Spark dev list" <dev@spark.apache.org>
> >> > > Sent: Tuesday, January 20, 2015 6:13:31 PM
> >> > > Subject: Standardized Spark dev environment
> >> > >
> >> > > What do y'all think of creating a standardized Spark development
> >> > > environment, perhaps encoded as a Vagrantfile, and publishing it
> >> > > under `dev/`?
> >> > >
> >> > > The goal would be to make it easier for new developers to get started
> >> > > with all the right configs and tools pre-installed.
> >> > >
> >> > > If we use something like Vagrant, we may even be able to make it so
> >> > > that a single Vagrantfile creates equivalent development environments
> >> > > across OS X, Linux, and Windows, without having to do much (or any)
> >> > > OS-specific work.
> >> > >
> >> > > I imagine for committers and regular contributors, this exercise may
> >> > > seem pointless, since y'all are probably already very comfortable with
> >> > > your workflow.
> >> > >
> >> > > I wonder, though, if any of you think this would be worthwhile as an
> >> > > improvement to the "new Spark developer" experience.
> >> > >
> >> > > Nick
> >> > >
> >> >
> >>
> >>
> >
> >
> >
> > --
> > jay vyas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>
