AFAIK the resolver does pick up things from your local ~/.m2. Note, though, that since your ~/.m2 is on NFS, that adds to the amount of filesystem traffic.
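
If ~/.ivy2 is on NFS as well, it might also be worth pointing sbt's ivy cache at local disk and, once everything has been fetched once, skipping remote resolution entirely. A rough, untested sketch -- the /local/scratch path is just a placeholder, and this assumes the sbt/sbt wrapper passes JAVA_OPTS through to the JVM it launches (the memory tip quoted below relies on the same thing):

    # keep sbt's ivy cache off NFS (placeholder path)
    export JAVA_OPTS="-Dsbt.ivy.home=/local/scratch/ivy2"
    sbt/sbt assembly

    # once dependencies are cached locally, skip remote resolution
    # (the 'offline' setting is available in recent sbt 0.13.x releases)
    sbt/sbt "set offline := true" assembly

You could also try adding Resolver.mavenLocal to the resolvers in the build definition so that artifacts already sitting in ~/.m2 are picked up first, though I haven't checked whether SparkBuild.scala already does that.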
Shivaram

On Fri, Apr 25, 2014 at 2:57 PM, Williams, Ken <ken.willi...@windlogics.com> wrote:

> I am indeed, but it's a pretty fast NFS. I don't have any SSD I can use,
> but I could try to use local disk to see what happens.
>
> For me, a large portion of the time seems to be spent on lines like
> "Resolving org.fusesource.jansi#jansi;1.4 ..." or similar. Is this going
> out to find Maven resources? Any way to tell it to just use my local ~/.m2
> repository instead when the resource already exists there? Sometimes I even
> get sporadic errors like this:
>
> [info] Resolving org.apache.hadoop#hadoop-yarn;2.2.0 ...
> [error] SERVER ERROR: Bad Gateway url=
> http://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-yarn-server/2.2.0/hadoop-yarn-server-2.2.0.jar
>
> -Ken
>
> *From:* Shivaram Venkataraman [mailto:shiva...@eecs.berkeley.edu]
> *Sent:* Friday, April 25, 2014 4:31 PM
> *To:* user@spark.apache.org
> *Subject:* Re: Build times for Spark
>
> Are you by any chance building this on NFS? As far as I know the build is
> severely bottlenecked by filesystem calls during assembly (each class file
> in each dependency gets an fstat call or something like that). That is
> partly why building from, say, a local ext4 filesystem or an SSD is much
> faster irrespective of memory / CPU.
>
> Thanks
> Shivaram
>
> On Fri, Apr 25, 2014 at 2:09 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
> You can always increase the sbt memory by setting
>
> export JAVA_OPTS="-Xmx10g"
>
> Thanks
> Best Regards
>
> On Sat, Apr 26, 2014 at 2:17 AM, Williams, Ken <ken.willi...@windlogics.com> wrote:
>
> No, I haven't done any config for SBT. Is there somewhere you might be able
> to point me toward for how to do that?
>
> -Ken
>
> *From:* Josh Rosen [mailto:rosenvi...@gmail.com]
> *Sent:* Friday, April 25, 2014 3:27 PM
> *To:* user@spark.apache.org
> *Subject:* Re: Build times for Spark
>
> Did you configure SBT to use the extra memory?
>
> On Fri, Apr 25, 2014 at 12:53 PM, Williams, Ken <ken.willi...@windlogics.com> wrote:
>
> I've cloned the github repo and I'm building Spark on a pretty beefy
> machine (24 CPUs, 78 GB of RAM) and it takes a pretty long time.
>
> For instance, today I did a 'git pull' for the first time in a week or two,
> and then doing 'sbt/sbt assembly' took 43 minutes of wallclock time (88
> minutes of CPU time). After that, I did 'SPARK_HADOOP_VERSION=2.2.0
> SPARK_YARN=true sbt/sbt assembly' and that took 25 minutes wallclock, 73
> minutes CPU.
>
> Is that typical? Or does that indicate some setup problem in my
> environment?
>
> --
> Ken Williams, Senior Research Scientist
> WindLogics
> http://windlogics.com
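
P.S. To expand a little on the JAVA_OPTS suggestion quoted above: something along these lines should work, again assuming the sbt/sbt wrapper hands JAVA_OPTS to the JVM it launches. The heap and permgen sizes are only illustrative; tune them to your machine.

    # give the sbt JVM more heap (and, on Java 7, more permgen) before building
    export JAVA_OPTS="-Xmx4g -XX:MaxPermSize=512m"
    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly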