date:20141123

Notes on writing complex spark applications

2014-11-23 Thread Evan R. Sparks

Hi all, Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been working on a short document about writing high performance Spark applications based on our experience developing MLlib, GraphX, ml-matrix, pipelines, etc. It may be a useful document both for users and new Spark

Re: Notes on writing complex spark applications

2014-11-23 Thread andy petrella

Cool! On Sun Nov 23 2014 at 5:58:03 PM Evan R. Sparks evan.spa...@gmail.com wrote: Hi all, Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been working on a short document about writing high performance Spark applications based on our experience developing MLlib, GraphX,

Re: Notes on writing complex spark applications

2014-11-23 Thread Sam Bessalah

Thanks Evan, this is great. On Nov 23, 2014 5:58 PM, Evan R. Sparks evan.spa...@gmail.com wrote: Hi all, Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been working on a short document about writing high performance Spark applications based on our experience developing

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Debasish Das

-1 from me...same FetchFailed issue as what Hector saw... I am running Netflix dataset and dumping out recommendation for all users. It shuffles around 100 GB data on disk to run a reduceByKey per user on utils.BoundedPriorityQueue...The code runs fine with MovieLens1m dataset... I gave Spark 10

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell

+1 (binding). Don't see any evidence of regressions at this point. The issue reported by Hector was not related to this release. On Sun, Nov 23, 2014 at 9:50 AM, Debasish Das debasish.da...@gmail.com wrote: -1 from me...same FetchFailed issue as what Hector saw... I am running Netflix dataset

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman

Hi, I wanted to try 1.1.1-rc2 because we're running into SPARK-3633, but the rc releases not being tagged with -rcX means the pre-built artifacts are basically useless to me. (Pedantically, to test a release, I have to upload it into our internal repo, to compile jobs, start clusters, etc.

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Matei Zaharia

Interesting, perhaps we could publish each one with two IDs, of which the rc one is unofficial. The problem is indeed that you have to vote on a hash for a potentially final artifact. Matei On Nov 23, 2014, at 7:54 PM, Stephen Haberman stephen.haber...@gmail.com wrote: Hi, I wanted to

Re: Notes on writing complex spark applications

2014-11-23 Thread Inkyu Lee

Very helpful!! thank you very much! 2014-11-24 2:17 GMT+09:00 Sam Bessalah samkiller@gmail.com: Thanks Evan, this is great. On Nov 23, 2014 5:58 PM, Evan R. Sparks evan.spa...@gmail.com wrote: Hi all, Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been working

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell

Hey Stephen, Thanks for bringing this up. Technically when we call a release vote it needs to be on the exact commit that will be the final release. However, one thing I've thought of doing for a while would be to publish the maven artifacts using a version tag with $VERSION-rcX even if the

Re: Notes on writing complex spark applications

2014-11-23 Thread Patrick Wendell

Hey Evan, It might be nice to merge this into existing documentation. In particular, a lot of this could serve to update the current tuning section and programming guides. It could also work to paste this wholesale as a reference for Spark users, but in that case it's less likely to get updated

2 spark streaming questions

2014-11-23 Thread tian zhang

Hi, Dear Spark Streaming Developers and Users, We are prototyping using spark streaming and hit the following 2 issues that I would like to seek your expertise. 1) We have a spark streaming application in scala, that reads data from Kafka into a DStream, does some processing and output a

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman

Awesome, sounds great, guys; thanks for understanding. Depending on how badly I need 1.1.1-rc2 (I'll check my jobs tomorrow) I'll just build a local version for now. Should be easy, it's just been awhile. :-) Thanks, Stephen On Sun Nov 23 2014 at 11:01:09 PM Patrick Wendell pwend...@gmail.com

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman

http://maven.apache.org/plugins/maven-install-plugin/examples/specific-local-repo.html Hm, I didn't know about that plugin--assuming it does all of the jar/pom/sources/etc., then, yes, that could work... At first glance, I'm not sure it'll bring over the pom with all of the transitive
