Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc3)

2013-12-09 Thread Henry Saputra
Thanks for catching the problems, Patrick and Raymond. - Henry On Mon, Dec 9, 2013 at 10:40 PM, Patrick Wendell wrote: > I'm going to -1 this now because we had two issues reported today. > They were reported off the list so I'm summarizing here: > > (1) Raymond Liu found an issue with the Maven

Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc3)

2013-12-09 Thread Patrick Wendell
I'm going to -1 this now because we had two issues reported today. They were reported off the list so I'm summarizing here: (1) Raymond Liu found an issue with the Maven build for YARN 2.2+. Previously we had only tested the sbt build since this is what we refer to in the docs, but we'd like to su

Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc3)

2013-12-09 Thread Patrick Wendell
I'll go ahead and kick this off with a +1. On Sun, Dec 8, 2013 at 10:30 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark > (incubating) version 0.8.1. > > The tag to be voted on is v0.8.1-incubating (commit c88a9916): > https://git-wip-us.apache.org/re

Re: Kafka not shutting down cleanly; Actor serialization?

2013-12-09 Thread Michael Malak
I haven't seen a response to my September question and was wondering if anyone had any insights into the problem I was having cleanly shutting down Kafka. Note: For the purposes of the Apache 2.0 license, regarding the contents of this and my earlier message: THIS IS NOT A CONTRIBUTION. __

Re: Spark API - support for asynchronous calls - Reactive style [I]

2013-12-09 Thread Mark Hamstra
Spark has supported async jobs for a while now -- https://github.com/apache/incubator-spark/pull/29, and they even work correctly after https://github.com/apache/incubator-spark/pull/232. There are now implicit conversions from RDD to AsyncRDDActions
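
For readers unfamiliar with the future-returning style discussed in this thread, here is a minimal sketch of the pattern in plain Python. It does not use Spark or AsyncRDDActions; `count`, `count_async`, and the `partitions` list are hypothetical stand-ins chosen only to show an action that returns a future instead of blocking the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def count(partitions):
    """Blocking action: count the elements across all partitions."""
    return sum(len(p) for p in partitions)


def count_async(executor, partitions):
    """Non-blocking variant: submit the action and return a Future at once."""
    return executor.submit(count, partitions)


# Hypothetical stand-in for a partitioned dataset.
partitions = [[1, 2, 3], [4, 5], [6]]

with ThreadPoolExecutor(max_workers=2) as pool:
    future = count_async(pool, partitions)
    # The caller is free to do other work here while the job runs.
    total = future.result()  # block only when the value is actually needed
```

The point of the design is that the caller chooses when to wait: the future can be polled, given a completion callback, or cancelled, rather than holding a thread for the duration of the job.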

Spark API - support for asynchronous calls - Reactive style [I]

2013-12-09 Thread Deenar Toraskar
Classification: For internal use only. Hi developers, are there any plans to have Spark (and Shark) APIs that are asynchronous and non-blocking? APIs that return Futures and Iteratee/Enumerators would be very useful to users building scalable apps using Spark, especially when combined with a fully

Re: spark.task.maxFailures

2013-12-09 Thread Grega Kešpret
I see that it is the DAGScheduler that orchestrates task resubmission. This code is responsible for calling submitStage for any failed stages. How does spark.task.maxFa

Re: spark.task.maxFailures

2013-12-09 Thread Grega Kešpret
Hi! I tried this (by setting spark.task.maxFailures to 1) and it still does not fail fast. I started a job and after some time, I killed all JVMs running on one of the two workers. I was expecting the Spark job to fail, however it re-fetched tasks to one of the two workers that was still alive and the
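
As a point of reference for the behavior expected above, here is a toy model of a per-task failure limit in plain Python. It is not Spark's scheduler code; `run_with_max_failures`, `JobAborted`, and `flaky` are hypothetical names, and the sketch only assumes the documented meaning of the setting: the job is given up once a single task has failed that many times, so a value of 1 allows no retries.

```python
class JobAborted(Exception):
    """Raised when a task has failed max_failures times."""


def run_with_max_failures(task, max_failures):
    """Call task(attempt) until it succeeds or has failed max_failures times."""
    if max_failures < 1:
        raise ValueError("max_failures must be >= 1")
    failures = 0
    attempt = 0
    while True:
        try:
            return task(attempt)
        except Exception as exc:
            failures += 1
            if failures >= max_failures:
                # Limit reached: give up on the whole job.
                raise JobAborted(f"task failed {failures} time(s)") from exc
        attempt += 1


def flaky(attempt):
    """Hypothetical task that fails on its first attempt only."""
    if attempt == 0:
        raise RuntimeError("worker lost")
    return "ok"
```

With a limit of 2 the flaky task is retried once and succeeds; with a limit of 1 the first failure aborts the job. Whether a lost worker is counted as a task failure at all is a separate question that this toy model does not answer.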

Re: spark.task.maxFailures

2013-12-09 Thread Grega Kešpret
Hi Reynold, I submitted a pull request here - https://github.com/apache/incubator-spark/pull/245 Do I need to do anything else (perhaps add a ticket in JIRA)? Best, Grega

Re: Spark streaming quantile?

2013-12-09 Thread Sandy Ryza
Thanks all for the suggestions. Exactly what I was looking for. -Sandy On Thu, Dec 5, 2013 at 5:00 AM, Sam Bessalah wrote: > Just as stated before Algebird has many data structures to compute those > like QTree, or Ted's tvdigest. Or you can look at stream-lib q digest > https://github.com/ad