Re: Announcing the official Spark Job Server repo

2014-03-24 Thread Evan Chan
Andy, doesn't Marathon handle fault tolerance amongst its apps? i.e., if you say that N instances of an app are running, and one shuts off, then it spins up another one, no? The tricky thing was that I was planning to use Akka Cluster to coordinate, but Mesos itself can be used to coordinate as

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
I also have a really minor fix for SPARK-1057 (upgrading fastutil), could that also make it in? -Evan On Sun, Mar 23, 2014 at 11:01 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: Sorry this request is coming in a bit late, but would it be possible to backport SPARK-979[1] to

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Evan Chan
Hi Michael, Congrats, this is really neat! What thoughts do you have regarding adding indexing support and predicate pushdown to this SQL framework? Right now we have custom bitmap indexing to speed up queries, so we're really curious as far as the architectural direction. -Evan On Fri,

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
@Tathagata, the PR is here: https://github.com/apache/spark/pull/215 On Mon, Mar 24, 2014 at 12:02 AM, Tathagata Das tathagata.das1...@gmail.com wrote: @Shivaram, That is a useful patch but I am a bit afraid to merge it in. Randomizing the executor has performance implications, especially for Spark

Re: spark jobserver

2014-03-24 Thread Evan Chan
Suhas, here is the update, which I posted to SPARK-818: An update: we have put up the final job server here: https://github.com/ooyala/spark-jobserver The plan is to have a spark-contrib repo/github account and this would be one of the first projects. See SPARK-1283 for the ticket to track

Re: Spark 0.9.1 release

2014-03-24 Thread Patrick Wendell
Hey Evan and TD, Spark's dependency graph in a maintenance release seems potentially harmful, especially upgrading a minor version (not just a patch version) like this. This could affect other downstream users. For instance, now without knowing their fastutil dependency gets bumped and they hit

Re: Spark 0.9.1 release

2014-03-24 Thread Patrick Wendell
Spark's dependency graph in a maintenance *Modifying* Spark's dependency graph...

Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
Patrick, that is a good point. On Mon, Mar 24, 2014 at 12:14 AM, Patrick Wendell pwend...@gmail.com wrote: Spark's dependency graph in a maintenance *Modifying* Spark's dependency graph...

Re: Announcing the official Spark Job Server repo

2014-03-24 Thread andy petrella
Thx for answering! See inline for my thoughts (or misunderstanding? ^^) Andy, doesn't Marathon handle fault tolerance amongst its apps? i.e., if you say that N instances of an app are running, and one shuts off, then it spins up another one, no? Yes indeed, but my wonder is about how to know how

Re: spark jobserver

2014-03-24 Thread Suhas Satish
Thanks a lot for this update Evan, really appreciate the effort. On Monday, March 24, 2014, Evan Chan e...@ooyala.com wrote: Suhas, here is the update, which I posted to SPARK-818: An update: we have put up the final job server here: https://github.com/ooyala/spark-jobserver The plan is

Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
Patrick, yes, that is indeed a risk. On Mon, Mar 24, 2014 at 12:30 AM, Tathagata Das tathagata.das1...@gmail.com wrote: Patrick, that is a good point. On Mon, Mar 24, 2014 at 12:14 AM, Patrick Wendell pwend...@gmail.com wrote: Spark's dependency graph in a maintenance *Modifying* Spark's

Re: spark jobserver

2014-03-24 Thread Evan Chan
Suhas, You're welcome. We are planning to speak about the job server at the Spark Summit by the way. -Evan On Mon, Mar 24, 2014 at 9:38 AM, Suhas Satish suhas.sat...@gmail.com wrote: Thanks a lot for this update Evan, really appreciate the effort. On Monday, March 24, 2014, Evan Chan

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Usman Ghani
How does it compare against Shark, and what is the future of Shark with this new module in place? On Sun, Mar 23, 2014 at 11:49 PM, Evan Chan e...@ooyala.com wrote: Hi Michael, Congrats, this is really neat! What thoughts do you have regarding adding indexing support and predicate

Re: new Catalyst/SQL component merged into master

2014-03-24 Thread Michael Armbrust
Hi Evan, Index support is definitely something we would like to add, and it is possible that adding support for your custom indexing solution would not be too difficult. We already push predicates into hive table scan operators when the predicates are over partition keys. You can see an example

Re: Spark 0.9.1 release

2014-03-24 Thread Kevin Markey
1051 is essential! I'm not sure about the others, but anything that adds stability to Spark/Yarn would be helpful. Kevin Markey On 03/20/2014 01:12 PM, Tom Graves wrote: I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YARN - JIRA and [SPARK-1051] On Yarn,

Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
1051 has been pulled in! Search for 1051 in https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog;h=refs/heads/branch-0.9 TD On Mon, Mar 24, 2014 at 4:26 PM, Kevin Markey kevin.mar...@oracle.com wrote: 1051 is essential! I'm not sure about the others, but anything that adds stability to

Re: Spark 0.9.1 release

2014-03-24 Thread Kevin Markey
Is there any way that [SPARK-782] (Shade ASM) can be included? I see that it is not currently backported to 0.9. But there is no single issue that has caused us more grief as we integrate spark-core with other project dependencies. There are way too many libraries out there in addition to

Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
Hello Kevin, A fix for SPARK-782 would definitely simplify building against Spark. However, it's possible that a fix for this issue in 0.9.1 will break the builds (that reference spark) of existing 0.9 users, either due to a change in the ASM version, or for being incompatible with their current