Re: Fastest way to build Spark from scratch

2015-12-09 Thread Josh Rosen
Yeah, this is the same idea behind having Travis cache the ivy2 folder to speed up builds. In AMPLab Jenkins, each build workspace has its own Ivy cache, which is preserved across build runs but used by only one active run at a time in order to avoid SBT Ivy lock conten

Re: Fastest way to build Spark from scratch

2015-12-08 Thread Nicholas Chammas
Interesting. As long as Spark's dependencies don't change that often, the same caches could save "from scratch" build time over many months of Spark development. Is that right? On Tue, Dec 8, 2015 at 12:33 PM Josh Rosen wrote: > @Nick, on a fresh EC2 instance a significant chunk of the initial b

Re: Fastest way to build Spark from scratch

2015-12-08 Thread Stephen Boesch
I will echo Steve L's comment about having zinc running (with --nailed). That provides at least a 2X speedup - sometimes without it Spark simply does not build for me. 2015-12-08 9:33 GMT-08:00 Josh Rosen: > @Nick, on a fresh EC2 instance a significant chunk of the initial build > time might be

Re: Fastest way to build Spark from scratch

2015-12-08 Thread Josh Rosen
@Nick, on a fresh EC2 instance a significant chunk of the initial build time might be due to artifact resolution + downloading. Putting pre-populated Ivy and Maven caches onto your EC2 machine could shave a decent chunk of time off that first build. On Tue, Dec 8, 2015 at 9:16 AM, Nicholas Chammas

Re: Fastest way to build Spark from scratch

2015-12-08 Thread Nicholas Chammas
Thanks for the tips, Jakob and Steve. It looks like my original approach is the best for me since I'm installing Spark on newly launched EC2 instances and can't take advantage of incremental compilation. Nick On Tue, Dec 8, 2015 at 7:01 AM Steve Loughran wrote: > On 7 Dec 2015, at 19:07, Jakob

Re: Fastest way to build Spark from scratch

2015-12-08 Thread Steve Loughran
On 7 Dec 2015, at 19:07, Jakob Odersky <joder...@gmail.com> wrote: make-distribution and the second code snippet both create a distribution from a clean state. They therefore require that every source file be compiled and that takes time (you can maybe tweak some settings or use a newer

Re: Fastest way to build Spark from scratch

2015-12-07 Thread Jakob Odersky
make-distribution and the second code snippet both create a distribution from a clean state. They therefore require that every source file be compiled and that takes time (you can maybe tweak some settings or use a newer compiler to gain some speed). I'm inferring from your question that for your

Fastest way to build Spark from scratch

2015-11-23 Thread Nicholas Chammas
Say I want to build a complete Spark distribution against Hadoop 2.6+ as fast as possible from scratch. This is what I’m doing at the moment:

    ./make-distribution.sh -T 1C -Phadoop-2.6

-T 1C instructs Maven to spin up 1 thread per available core. This takes around 20 minutes on an m3.large instan