+1 (non-binding)
-- Original --
From: Patrick Wendell <pwend...@gmail.com>
Date: Sat, Nov 29, 2014 01:16 PM
To: dev@spark.apache.org
Subject: [VOTE] Release Apache Spark 1.2.0 (RC1)
Please vote on releasing the following candidate
In the course of trying to make contributions to Spark, I have had a lot of
trouble running Spark's tests successfully. The main pain points I've
experienced are:
1) frequent, spurious test failures
2) high latency of running tests
3) difficulty running specific tests in an iterative manner
+1, you aren't alone in this. I certainly would like some clarity on these
things as well, but, as it's been said on this listserv a few times (and you
noted), most developers use `sbt` for their day-to-day compilation to
greatly speed up the iterative testing process. I personally use `sbt` for
all
Hi Ryan,
As a tip (and maybe this isn't documented well), I normally use SBT for
development to avoid the slow build process, and use its interactive console to
run only specific tests. The nice advantage is that SBT can keep the Scala
compiler loaded and JITed across builds, making it faster
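As an illustration of that workflow, a typical interactive session might look
like this (a sketch; `sbt/sbt` is the launcher script shipped in the Spark
repo, and `~compile` is sbt's standard watch mode):

    $ sbt/sbt
    > compile        # first run pays the compiler start-up/JIT cost
    > ~compile       # watch mode: recompiles only changed files on save
    > core/test      # later commands reuse the warm, JITed compiler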
Thanks for the info, Matei and Brennon. I will try to switch my workflow to
using sbt. Other potential action items:
- currently the docs only contain information about building with maven,
and even then don't cover many important cases, as I described in my
previous email. If SBT is as much
- currently the docs only contain information about building with maven,
and even then don’t cover many important cases
All other points aside, I just want to point out that the docs document
both how to use Maven and SBT and clearly state
- Start the SBT interactive console with sbt/sbt
- Build your assembly by running the assembly target in the assembly
project: assembly/assembly
- Run all the tests in one module: core/test
- Run a specific suite: core/test-only org.apache.spark.rdd.RDDSuite (this
also supports tab completion; see the sample session below)
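For illustration, those commands might chain together in a single console
session like this (a sketch; the `-z` name filter assumes a ScalaTest version
that supports it):

    $ sbt/sbt
    > assembly/assembly
    > core/test-only org.apache.spark.rdd.RDDSuite
    > core/test-only *RDDSuite      # glob patterns also work
    > core/test-only org.apache.spark.rdd.RDDSuite -- -z "aggregate"
                                    # only tests whose names contain "aggregate"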
Hey Ryan,
The existing JIRA also covers publishing nightly docs:
https://issues.apache.org/jira/browse/SPARK-1517
- Patrick
On Sun, Nov 30, 2014 at 5:53 PM, Ryan Williams
<ryan.blake.willi...@gmail.com> wrote:
Thanks Nicholas, glad to hear that some of this info will be pushed to the
main site
Btw - the documentation on github represents the source code of our
docs, which is versioned with each release. Unfortunately github will
always try to render .md files, so it could look to a passerby like
this is supposed to represent published docs. This is a feature
limitation of github, AFAIK
Thanks Mark, most of those commands are things I've been using, and used in
my original post, except for "Start zinc". I now see the section about it on
the unpublished building-spark
https://github.com/apache/spark/blob/master/docs/building-spark.md#speeding-up-compilation-with-zinc
page and will try
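For anyone else trying this, a minimal zinc workflow might look like the
following (a sketch, assuming the standalone zinc 0.3.x server installed
e.g. via Homebrew; flags are from zinc's own usage):

    $ brew install zinc       # or download a zinc release manually
    $ zinc -start             # start the long-running incremental-compile server
    $ mvn -DskipTests clean package   # Maven compiles now hit the warm server
    $ zinc -shutdown          # stop the server when finished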
An update on this: After adding the initial maintainer list, we got feedback to
add more maintainers for some components, so we added four others (Josh Rosen
for core API, Mark Hamstra for scheduler, Shivaram Venkataraman for MLlib and
Xiangrui Meng for Python). We also decided to lower the
Thanks Patrick, great to hear that docs-snapshots-via-jenkins is already
JIRA'd; you can interpret some of this thread as a gigantic +1 from me on
prioritizing that, which it looks like you are doing :)
I do understand the limitations of the github vs. official site status
quo; I was mostly
Hi, Patrick - with regard to testing on Jenkins, is the process for this
to submit a pull request for the branch, or is there another interface we
can use to submit a build to Jenkins for testing?
On 11/30/14, 6:49 PM, Patrick Wendell <pwend...@gmail.com> wrote:
Hey Ryan,
A few more things here.
Hi Joseph,
Thank you for your nice work and for sharing the draft with us!
During the next development cycle, new algorithms should be contributed to
spark.mllib. Optionally, wrappers for new (and old) algorithms can be
contributed to spark.ml.
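For concreteness, here is a minimal sketch of what an algorithm call at the
spark.mllib level looks like (toy data; LogisticRegressionWithLBFGS is just a
stand-in for any algorithm contributed there, and the spark.ml wrapper layer
is only described in comments):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    object MLlibSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("sketch").setMaster("local[2]"))
        // spark.mllib: the RDD-based API where new algorithms land first.
        val training = sc.parallelize(Seq(
          LabeledPoint(1.0, Vectors.dense(1.0, 0.5)),
          LabeledPoint(0.0, Vectors.dense(-1.0, -0.5))))
        val model = new LogisticRegressionWithLBFGS().setNumClasses(2).run(training)
        // spark.ml would optionally wrap such an algorithm behind an Estimator
        // so it composes into Pipelines; that is the optional wrapper layer.
        println(model.weights)
        sc.stop()
      }
    }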
I understand that we should contribute new