> If the goal is a reproducible test environment then I think that is what
> Jenkins is. Granted you can only ask it for a test. But presumably you get
> the same result if you start from the same VM image as Jenkins and run the
> same steps.

But the issue arises when users can't reproduce Jenkins failures. We
don't publish the exact set of packages and versions installed on
Jenkins anywhere, and that set can change because Jenkins is
infrastructure shared with other projects. So why not publish this
manifest as a Dockerfile and have Jenkins run the builds using that
image? My point is that this "VM image + steps" is not public anywhere.
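
To make the "published manifest" idea concrete, here is a rough sketch
of how one slice of it (the Python packages) could be recorded and
compared. It is only an illustration: the function names are
hypothetical, it covers Python packages only, and it is not how Jenkins
or the release builds work today.

```python
from importlib import metadata


def installed_packages():
    """Return {package name: version} for the current Python environment."""
    manifest = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            manifest[name.lower()] = dist.version
    return manifest


def diff_manifests(reference, local):
    """Return {name: (reference version, local version)} where the two differ.

    A version of None means the package is absent on that side.
    """
    differences = {}
    for name in sorted(set(reference) | set(local)):
        ref_version = reference.get(name)
        local_version = local.get(name)
        if ref_version != local_version:
            differences[name] = (ref_version, local_version)
    return differences


if __name__ == "__main__":
    # Print a pinned manifest that could be published alongside a build.
    for name, version in sorted(installed_packages().items()):
        print(f"{name}=={version}")
```

With a published reference manifest, a user whose build fails could run
diff_manifests against their own environment and see which versions
differ before digging any further.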

> I bet it is not hard to set up and maintain. I bet it is easier than a VM.
> But unless Jenkins is using it aren't we just making another different
> standard build env in an effort to standardize? If it is not the same then
> it loses value as being exactly the same as the reference build env. Has a
> problem come up that this solves?

Right now the reference build env is an AMI I created and keep adding
stuff to when Spark gets new dependencies (e.g. the version of Ruby we
need to create the docs, new Python stats libraries, etc.). So if we
had a Docker image, I would use that for making the RCs as well, and
it could serve as a definitive reference for people who want to
understand exactly what set of things they need to build Spark.

>
> If the goal is just easing developer set up then what does a Docker image do
> - what does it set up for me? I don't know of stuff I need set up on OS X
> for me beyond the IDE.

There are actually a good number of packages you need to do a full
build of Spark, including a compliant Python version, a compliant Java
version, certain Python packages, Ruby and Jekyll for the docs, etc.
(mentioned a bit earlier).
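
As a rough sketch of the kind of check a documented build env would
make possible, something like the following could report which tools
are missing from a developer's PATH. The tool list is only an
illustrative assumption, not the actual set Spark requires, and
missing_tools is a hypothetical helper.

```python
import shutil

# Illustrative tool list only; the real requirements are not specified here.
EXAMPLE_TOOLS = ["java", "python", "ruby", "jekyll"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(EXAMPLE_TOOLS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All example tools were found on PATH.")
```

This only checks that each tool is present, not that its version is
compliant; a pinned image would cover both.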

- Patrick

