Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-30 Thread GuoQiang Li
+1 (non-binding) -- Original -- From: Patrick Wendell; pwend...@gmail.com; Date: Sat, Nov 29, 2014 01:16 PM; To: dev@spark.apache.org; Subject: [VOTE] Release Apache Spark 1.2.0 (RC1) Please vote on releasing the following candidate

Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
In the course of trying to make contributions to Spark, I have had a lot of trouble running Spark's tests successfully. The main pain points I've experienced are: 1) frequent, spurious test failures 2) high latency of running tests 3) difficulty running specific tests in an iterative

Re: Spurious test failures, testing best practices

2014-11-30 Thread York, Brennon
+1, you aren't alone in this. I certainly would like some clarity on these things as well, but, as it's been said on this listserv a few times (and you noted), most developers use `sbt` for their day-to-day compilations to greatly speed up the iterative testing process. I personally use `sbt` for all

Re: Spurious test failures, testing best practices

2014-11-30 Thread Matei Zaharia
Hi Ryan, As a tip (and maybe this isn't documented well), I normally use SBT for development to avoid the slow build process, and use its interactive console to run only specific tests. The nice advantage is that SBT can keep the Scala compiler loaded and JITed across builds, making it faster

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
thanks for the info, Matei and Brennon. I will try to switch my workflow to using sbt. Other potential action items: - currently the docs only contain information about building with maven, and even then don't cover many important cases, as I described in my previous email. If SBT is as much

Re: Spurious test failures, testing best practices

2014-11-30 Thread Nicholas Chammas
> - currently the docs only contain information about building with maven, and even then don’t cover many important cases
All other points aside, I just want to point out that the docs document how to use both Maven and SBT and clearly state

Re: Spurious test failures, testing best practices

2014-11-30 Thread Mark Hamstra
- Start the SBT interactive console with sbt/sbt
- Build your assembly by running the assembly target in the assembly project: assembly/assembly
- Run all the tests in one module: core/test
- Run a specific suite: core/test-only org.apache.spark.rdd.RDDSuite (this also supports tab

Re: Spurious test failures, testing best practices

2014-11-30 Thread Patrick Wendell
Hey Ryan, The existing JIRA also covers publishing nightly docs: https://issues.apache.org/jira/browse/SPARK-1517 - Patrick On Sun, Nov 30, 2014 at 5:53 PM, Ryan Williams ryan.blake.willi...@gmail.com wrote: Thanks Nicholas, glad to hear that some of this info will be pushed to the main site

Re: Spurious test failures, testing best practices

2014-11-30 Thread Patrick Wendell
Btw - the documentation on github represents the source code of our docs, which is versioned with each release. Unfortunately github will always try to render .md files, so it could look to a passerby like this is supposed to represent published docs. This is a feature limitation of github, AFAIK

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
Thanks Mark, most of those commands are things I've been using and used in my original post, except for "Start zinc". I now see the section about it on the unpublished building-spark page (https://github.com/apache/spark/blob/master/docs/building-spark.md#speeding-up-compilation-with-zinc) and will try

Re: [RESULT] [VOTE] Designating maintainers for some Spark components

2014-11-30 Thread Matei Zaharia
An update on this: After adding the initial maintainer list, we got feedback to add more maintainers for some components, so we added four others (Josh Rosen for core API, Mark Hamstra for scheduler, Shivaram Venkataraman for MLlib and Xiangrui Meng for Python). We also decided to lower the

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
Thanks Patrick, great to hear that docs-snapshots-via-jenkins is already JIRA'd; you can interpret some of this thread as a gigantic +1 from me on prioritizing that, which it looks like you are doing :) I do understand the limitations of the github vs. official site status quo; I was mostly

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ganelin, Ilya
Hi, Patrick - with regard to testing on Jenkins, is the process for this to submit a pull request for the branch, or is there another interface we can use to submit a build to Jenkins for testing? On 11/30/14, 6:49 PM, Patrick Wendell pwend...@gmail.com wrote: Hey Ryan, A few more things here.

Re: [mllib] Which is the correct package to add a new algorithm?

2014-11-30 Thread Yu Ishikawa
Hi Joseph, Thank you for your nice work and for sharing the draft! During the next development cycle, new algorithms should be contributed to spark.mllib. Optionally, wrappers for new (and old) algorithms can be contributed to spark.ml. I understand that we should contribute new