Yeah, this is the same idea behind having Travis cache the .ivy2 folder to
speed up builds. In AMPLab Jenkins, each build workspace has its own Ivy
cache which is preserved across build runs but which is only used by one
active run at a time, in order to avoid SBT Ivy lock contention.
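(For reference, a per-workspace cache can be set up by pointing SBT at a
dedicated Ivy directory. A minimal sketch, assuming a Jenkins-style
$WORKSPACE variable; sbt.ivy.home is the standard system property for
relocating SBT's Ivy cache:)

    # Give each workspace its own Ivy cache so concurrent builds never
    # contend for the same SBT/Ivy lock. $WORKSPACE is set by Jenkins.
    ./build/sbt -Dsbt.ivy.home="$WORKSPACE/.ivy2" package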
Interesting. As long as Spark's dependencies don't change that often, the
same caches could save "from scratch" build time over many months of Spark
development. Is that right?
On Tue, Dec 8, 2015 at 12:33 PM Josh Rosen wrote:
I will echo Steve L's comment about having zinc running (with --nailed).
That provides at least a 2x speedup; sometimes without it, Spark simply
does not build for me.
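(A sketch of what that looks like in practice; Spark's build/mvn wrapper
downloads and launches a zinc server automatically, and the zinc version
directory below is an assumption that may differ on your checkout:)

    # build/mvn launches a long-lived zinc compile server automatically,
    # so repeated Maven builds get incremental Scala compilation.
    ./build/mvn -T 1C -Phadoop-2.6 -DskipTests clean package
    # Shut the zinc server down when finished (path/version may differ):
    ./build/zinc-0.3.5.3/bin/zinc -shutdown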
2015-12-08 9:33 GMT-08:00 Josh Rosen:
@Nick, on a fresh EC2 instance a significant chunk of the initial build
time might be due to artifact resolution + downloading. Putting
pre-populated Ivy and Maven caches onto your EC2 machine could shave a
decent chunk of time off that first build.
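(A minimal sketch of that cache-seeding idea; the host name is
hypothetical, and it assumes some machine has already resolved Spark's
dependencies once:)

    # Seed a fresh EC2 instance with Ivy and Maven caches from a machine
    # that has already built Spark. "fresh-ec2" is a hypothetical host.
    tar czf spark-caches.tgz -C "$HOME" .ivy2 .m2/repository
    scp spark-caches.tgz fresh-ec2:
    ssh fresh-ec2 'tar xzf spark-caches.tgz -C "$HOME"'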
On Tue, Dec 8, 2015 at 9:16 AM, Nicholas Chammas wrote:
Thanks for the tips, Jakob and Steve.
It looks like my original approach is the best for me since I'm installing
Spark on newly launched EC2 instances and can't take advantage of
incremental compilation.
Nick
On Tue, Dec 8, 2015 at 7:01 AM Steve Loughran wrote:
On 7 Dec 2015, at 19:07, Jakob Odersky wrote:
make-distribution and the second code snippet both create a distribution
from a clean state. They therefore require that every source file be
compiled and that takes time (you can maybe tweak some settings or use a
newer compiler to gain some speed).
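(For contrast, an incremental workflow looks something like the sketch
below; it assumes you keep the same checkout and build directory between
runs:)

    # SBT's incremental compiler recompiles only changed sources, so a
    # second invocation after a small edit is far faster than the first.
    ./build/sbt -Phadoop-2.6 package   # first run: compiles everything
    ./build/sbt -Phadoop-2.6 package   # later runs: only what changed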
I'm inferring from your question that for your …
Say I want to build a complete Spark distribution against Hadoop 2.6+ as
fast as possible from scratch.
This is what I’m doing at the moment:
./make-distribution.sh -T 1C -Phadoop-2.6
-T 1C instructs Maven to spin up 1 thread per available core. This takes
around 20 minutes on an m3.large instance.