date:20150408

Which method do you think is better for making MIN_REMEMBER_DURATION configurable?

2015-04-08 Thread Emre Sevinc

Hello, This is about SPARK-3276 and I want to make MIN_REMEMBER_DURATION (that is now a constant) a variable (configurable, with a default value). Before spending effort on developing something and creating a pull request, I wanted to consult with the core developers to see which approach makes
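
The general idea of turning a constant into a setting with a default can be sketched as follows. This is only an illustration, not either of the approaches proposed in the thread (the message above is truncated before they are described). The key name `example.minRememberDuration`, the helper `min_remember_seconds`, and the 60-second default are all placeholders assumed for this sketch, and a plain dict stands in for the real configuration object.

```python
# Hypothetical default and key; neither is taken from the thread or from Spark.
DEFAULT_MIN_REMEMBER_SECONDS = 60
MIN_REMEMBER_KEY = "example.minRememberDuration"


def min_remember_seconds(conf):
    """Return the configured duration in seconds, or the default if unset."""
    raw = conf.get(MIN_REMEMBER_KEY)
    if raw is None:
        # Nothing configured: behave exactly as the old constant did.
        return DEFAULT_MIN_REMEMBER_SECONDS
    value = int(raw)
    if value <= 0:
        raise ValueError(f"{MIN_REMEMBER_KEY} must be positive, got {value}")
    return value
```

Callers that never set the key keep the old behaviour, which is what makes a change like this backward compatible.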

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Sean Owen

Still a +1 from me; same result (except that now of course the UISeleniumSuite test does not fail) On Wed, Apr 8, 2015 at 1:46 AM, Patrick Wendell pwend...@gmail.com wrote: Please vote on releasing the following candidate as Apache Spark version 1.3.1! The tag to be voted on is v1.3.1-rc2

finding free ports for tests

2015-04-08 Thread Steve Loughran

I'm writing some functional tests for the SPARK-1537 JIRA, Yarn timeline service integration, for which I need to allocate some free ports. I don't want to hard code them in as that can lead to unreliable tests, especially on Jenkins. Before I implement the logic myself: is there a utility

Re: finding free ports for tests

2015-04-08 Thread Sean Owen

Utils.startServiceOnPort? On Wed, Apr 8, 2015 at 6:16 AM, Steve Loughran ste...@hortonworks.com wrote: I'm writing some functional tests for the SPARK-1537 JIRA, Yarn timeline service integration, for which I need to allocate some free ports. I don't want to hard code them in as that can
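
For readers unfamiliar with the underlying problem, a common OS-level way to get a free port in a test is to bind to port 0 and let the operating system choose. The sketch below shows that generic technique only; it is not the Spark utility named in the reply, and `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for an IPv4 socket.
        return sock.getsockname()[1]
```

The port is released when the socket closes, so another process could claim it before the test binds it again. Handing the service port 0 directly, or retrying on failure, avoids that race.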

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Denny Lee

The RC2 bits are lacking Hadoop 2.4 and Hadoop 2.6 - was that intended (they were included in RC1)? On Wed, Apr 8, 2015 at 9:01 AM Tom Graves tgraves...@yahoo.com.invalid wrote: +1. Tested spark on yarn against hadoop 2.6. Tom On Wednesday, April 8, 2015 6:15 AM, Sean Owen

Re: Which method do you think is better for making MIN_REMEMBER_DURATION configurable?

2015-04-08 Thread Tathagata Das

Approach 2 is definitely better :) Can you tell us more about the use case why you want to do this? TD On Wed, Apr 8, 2015 at 1:44 AM, Emre Sevinc emre.sev...@gmail.com wrote: Hello, This is about SPARK-3276 and I want to make MIN_REMEMBER_DURATION (that is now a constant) a variable

RDD firstParent

2015-04-08 Thread Zoltán Zvara

It does not seem to be safe to call RDD.firstParent from anywhere, as it might throw a java.util.NoSuchElementException: head of empty list. This seems to be a bug for a consumer of the RDD API.

PR 5140

2015-04-08 Thread Nathan Kronenfeld

Could I get someone to look at PR 5140 please? It's been languishing more than two weeks.

Re: RDD firstParent

2015-04-08 Thread Reynold Xin

Why is this a bug? Each RDD implementation should know whether it has a parent or not. For example, if you are a MapPartitionedRDD, there is always a parent since it is a unary operator. On Wed, Apr 8, 2015 at 6:19 AM, Zoltán Zvara zoltan.zv...@gmail.com wrote: It does not seem to be safe
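
The disagreement in this thread is about who is responsible for checking that a parent exists. A toy model makes the two positions concrete. `ToyRDD` and its methods are hypothetical stand-ins written for this sketch and are not Spark's API: `first_parent` fails on a source node, much like taking the head of an empty list, while `first_parent_or_none` lets the caller check.

```python
class ToyRDD:
    """Hypothetical stand-in for a node in a lineage graph (not Spark's RDD)."""

    def __init__(self, name, parents=None):
        self.name = name
        self.parents = list(parents or [])

    def first_parent(self):
        # Unary operators always have a parent; source nodes do not.
        if not self.parents:
            raise LookupError(f"{self.name} has no parent")
        return self.parents[0]

    def first_parent_or_none(self):
        # Safe variant for callers that do not know the node type.
        return self.parents[0] if self.parents else None


source = ToyRDD("source")
mapped = ToyRDD("mapped", [source])
```

Here `mapped.first_parent()` returns `source`, while `source.first_parent()` raises and `source.first_parent_or_none()` returns `None`.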

Re: Which method do you think is better for making MIN_REMEMBER_DURATION configurable?

2015-04-08 Thread Emre Sevinc

Tathagata, Thanks for stating your preference for Approach 2. My use case and motivation are similar to the concerns raised by others in SPARK-3276. In previous versions of Spark, e.g. 1.1.x we had the ability for Spark Streaming applications to process the files in an input directory that

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Matei Zaharia

+1. Tested on Mac OS X and verified that some of the bugs were fixed. Matei On Apr 8, 2015, at 7:13 AM, Sean Owen so...@cloudera.com wrote: Still a +1 from me; same result (except that now of course the UISeleniumSuite test does not fail) On Wed, Apr 8, 2015 at 1:46 AM, Patrick Wendell

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Tom Graves

+1. Tested spark on yarn against hadoop 2.6. Tom On Wednesday, April 8, 2015 6:15 AM, Sean Owen so...@cloudera.com wrote: Still a +1 from me; same result (except that now of course the UISeleniumSuite test does not fail) On Wed, Apr 8, 2015 at 1:46 AM, Patrick Wendell

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Timothy Chen

+1 Tested on 4 nodes Mesos cluster with fine-grain and coarse-grain mode. Tim On Wed, Apr 8, 2015 at 9:32 AM, Denny Lee denny.g@gmail.com wrote: The RC2 bits are lacking Hadoop 2.4 and Hadoop 2.6 - was that intended (they were included in RC1)? On Wed, Apr 8, 2015 at 9:01 AM Tom Graves

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Patrick Wendell

Hey Denny, I believe the 2.4 bits are there. The 2.6 bits I had done specially (we haven't merged that into our upstream build script). I'll do it again now for RC2. - Patrick On Wed, Apr 8, 2015 at 1:53 PM, Timothy Chen tnac...@gmail.com wrote: +1 Tested on 4 nodes Mesos cluster with

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Denny Lee

Oh, it appears the 2.4 bits without hive are there but not the 2.4 bits with hive. Cool stuff on the 2.6. On Wed, Apr 8, 2015 at 12:30 Patrick Wendell pwend...@gmail.com wrote: Hey Denny, I believe the 2.4 bits are there. The 2.6 bits I had done specially (we haven't merged that into our

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Patrick Wendell

Oh I see - ah okay I'm guessing it was a transient build error and I'll get it posted ASAP. On Wed, Apr 8, 2015 at 3:41 PM, Denny Lee denny.g@gmail.com wrote: Oh, it appears the 2.4 bits without hive are there but not the 2.4 bits with hive. Cool stuff on the 2.6. On Wed, Apr 8, 2015 at

Re: PR 5140

2015-04-08 Thread Andrew Or

Hey Nathan, thanks for bringing this up. I will look at this within the next day or two. 2015-04-08 8:03 GMT-07:00 Nathan Kronenfeld nkronenfeld@uncharted.software : Could I get someone to look at PR 5140 please? It's been languishing more than two weeks.

Re: [mllib] Deprecate static train and use builder instead for Scala/Java

2015-04-08 Thread Joseph Bradley

I'll add a note that this is just for ML, not other parts of Spark. (We can discuss more on the JIRA.) Thanks! Joseph On Mon, Apr 6, 2015 at 9:46 PM, Yu Ishikawa yuu.ishikawa+sp...@gmail.com wrote: Hi all, Joseph proposed an idea about using just builder methods, instead of static train()

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Denny Lee

+1 (non-binding) Tested Scala, SparkSQL, and MLLib on OSX against Hadoop 2.6 On Wed, Apr 8, 2015 at 5:35 PM Joseph Bradley jos...@databricks.com wrote: +1 tested ML-related items on Mac OS X On Wed, Apr 8, 2015 at 7:59 PM, Krishna Sankar ksanka...@gmail.com wrote: +1 (non-binding, of

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Sandy Ryza

+1 Built against Hadoop 2.6 and ran some jobs against a pseudo-distributed YARN cluster. -Sandy On Wed, Apr 8, 2015 at 12:49 PM, Patrick Wendell pwend...@gmail.com wrote: Oh I see - ah okay I'm guessing it was a transient build error and I'll get it posted ASAP. On Wed, Apr 8, 2015 at 3:41

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Joseph Bradley

+1 tested ML-related items on Mac OS X On Wed, Apr 8, 2015 at 7:59 PM, Krishna Sankar ksanka...@gmail.com wrote: +1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:16 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0

Re: [VOTE] Release Apache Spark 1.3.1 (RC2)

2015-04-08 Thread Krishna Sankar

+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 14:16 min mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -DskipTests -Dscala-2.11 2. Tested pyspark, mllib - running as well as compare results with 1.3.0 pyspark works

Re: Which method do you think is better for making MIN_REMEMBER_DURATION configurable?

2015-04-08 Thread Jeremy Freeman

+1 for this feature In our use case, we probably wouldn’t use this feature in production, but it can be useful during prototyping and algorithm development to repeatedly perform the same streaming operation on a fixed, already existing set of files. - jeremyfreeman.net

23 matches
