Re: Building and Running Spark on OS X

2014-10-20 Thread Jeremy Freeman
I also prefer sbt on Mac. You might want to add checking for / getting Python 2.6+ (though most modern Macs should have it), and maybe numpy as an optional dependency. I often just point people to Anaconda. — Jeremy - jeremyfreeman.net @thefreemanlab On Oct 20, 2014, a

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread shane knapp
thanks, patrick! :) On Mon, Oct 20, 2014 at 5:35 PM, Patrick Wendell wrote: > I created an issue to fix this: > > https://issues.apache.org/jira/browse/SPARK-4021 > > On Mon, Oct 20, 2014 at 5:32 PM, Patrick Wendell > wrote: > > Thanks Shane - we should fix the source code issues in the Kinesi

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
I created an issue to fix this: https://issues.apache.org/jira/browse/SPARK-4021 On Mon, Oct 20, 2014 at 5:32 PM, Patrick Wendell wrote: > Thanks Shane - we should fix the source code issues in the Kinesis > code that made stricter Java compilers reject it. > > - Patrick > > On Mon, Oct 20, 2014

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
Thanks Shane - we should fix the source code issues in the Kinesis code that made stricter Java compilers reject it. - Patrick On Mon, Oct 20, 2014 at 5:28 PM, shane knapp wrote: > ok, so earlier today i installed a 2nd JDK within jenkins (7u71), which > fixed the SparkR build but apparently mad

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread shane knapp
ok, so earlier today i installed a 2nd JDK within jenkins (7u71), which fixed the SparkR build but apparently made Spark itself quite unhappy. i removed that JDK, triggered a build ( https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21943/console), and it compiled kinesis w/o dyin

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
So back to my original question... :) If we wanted to post this guide to the user list or to a gist for easy reference, would we rather have Maven or SBT listed? And is there anything else about the steps that should be modified? Nick On Mon, Oct 20, 2014 at 8:25 PM, Sean Owen wrote: > Oh righ

Re: Building and Running Spark on OS X

2014-10-20 Thread Sean Owen
Oh right, we're talking about the bundled sbt of course. And I didn't know Maven wasn't installed anymore! On Mon, Oct 20, 2014 at 8:20 PM, Hari Shreedharan wrote: > The sbt executable that is in the spark repo can be used to build sbt > without any other set up (it will download the sbt jars etc

Re: Building and Running Spark on OS X

2014-10-20 Thread Hari Shreedharan
The sbt executable that is in the Spark repo can be used to build Spark without any other setup (it will download the sbt jars etc.). Thanks, Hari On Mon, Oct 20, 2014 at 5:16 PM, Sean Owen wrote: > Maven is at least built in to OS X (well, with dev tools). You don't > even have to brew install

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
I think starting in Mavericks, Maven is no longer included by default. On Mon, Oct 20, 2014 at 8:15 PM, Sean Owen wrote: > Maven is at least built in to OS X (well, with dev tools). You don't > even have to brew

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Patrick Wendell
The failure is in the Kinesis component; can you reproduce this if you build with -Pkinesis-asl? - Patrick On Mon, Oct 20, 2014 at 5:08 PM, shane knapp wrote: > hmm, strange. i'll take a look. > > On Mon, Oct 20, 2014 at 5:11 PM, Nan Zhu wrote: > >> yes, I can compile locally, too >> >> but it

Re: Building and Running Spark on OS X

2014-10-20 Thread Sean Owen
Maven is at least built in to OS X (well, with dev tools). You don't even have to brew install it. Surely SBT isn't in the dev tools even? I recall I had to install it. I'd be surprised to hear it required zero setup. On Mon, Oct 20, 2014 at 8:04 PM, Nicholas Chammas wrote: > Yeah, I would use sb

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread shane knapp
hmm, strange. i'll take a look. On Mon, Oct 20, 2014 at 5:11 PM, Nan Zhu wrote: > yes, I can compile locally, too > > but it seems that Jenkins is not happy now... > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/ > > All failed to compile > > Best, > > -- > Nan Zhu > > > On

Re: Building and Running Spark on OS X

2014-10-20 Thread Denny Lee
+1 huge fan of sbt with OSX > On Oct 20, 2014, at 17:00, Reynold Xin wrote: > > I usually use SBT on Mac and that one doesn't require any setup ... > > > On Mon, Oct 20, 2014 at 4:43 PM, Nicholas Chammas < > nicholas.cham...@gmail.com> wrote: > >> If one were to put together a short but com

Re: Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
Yeah, I would use sbt too, but I thought if I wanted to publish a little reference page for OS X users then I probably should use the “official” build instructions. Nick On Mon, Oct 20, 2014 at 8:00 PM, Reynold Xin wrote: > I usually use SBT on

Re: Building and Running Spark on OS X

2014-10-20 Thread Reynold Xin
I usually use SBT on Mac and that one doesn't require any setup ... On Mon, Oct 20, 2014 at 4:43 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > If one were to put together a short but comprehensive guide to setting up > Spark to run locally on OS X, would it look like this? > > # In

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
yes, I can compile locally, too but it seems that Jenkins is not happy now... https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/ All failed to compile Best, -- Nan Zhu On Monday, October 20, 2014 at 7:56 PM, Ted Yu wrote: > I performed build on latest master branch but d

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Ted Yu
I performed a build on the latest master branch but didn't get a compilation error. FYI On Mon, Oct 20, 2014 at 3:51 PM, Nan Zhu wrote: > Hi, > > I just submitted a patch https://github.com/apache/spark/pull/2864/files > with one line change > > but the Jenkins told me it's failed to compile on the unr

Building and Running Spark on OS X

2014-10-20 Thread Nicholas Chammas
If one were to put together a short but comprehensive guide to setting up Spark to run locally on OS X, would it look like this? # Install Maven. On OS X, we suggest using Homebrew. brew install maven # Set some important Java and Maven environment variables. export JAVA_HOME=$(/usr/libexec/java_ho
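A quick sanity check once the build finishes, assuming the guide above succeeded and spark-shell is launched from the Spark directory (this snippet is illustrative and not part of the original guide):

    // From ./bin/spark-shell; `sc` is the SparkContext the shell provides.
    // A trivial local job confirms the build and JAVA_HOME setup work end to end.
    val nums = sc.parallelize(1 to 1000)
    println(nums.map(_ * 2).reduce(_ + _))  // expect 1001000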

something wrong with Jenkins or something untested merged?

2014-10-20 Thread Nan Zhu
Hi, I just submitted a patch https://github.com/apache/spark/pull/2864/files with a one-line change, but Jenkins told me it failed to compile in unrelated files? https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21935/console Best, Nan

Re: Get attempt number in a closure

2014-10-20 Thread Yin Huai
Yes, it is for (2). I was confused because the doc of TaskContext.attemptId (release 1.1) is "the number of attempts to execute this task". Seems the per-task attempt id used to populate "attempt" field in the UI

Re: Get attempt number in a closure

2014-10-20 Thread Reynold Xin
Yes, as I understand it this is for (2). Imagine a use case in which I want to save some output. In order to make this atomic, the program uses part_[index]_[attempt].dat, and once it finishes writing, it renames this to part_[index].dat. Right now [attempt] is just the TID, which could show up l
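A minimal sketch of the rename-on-completion pattern described above, assuming a per-task attempt number is available inside the closure; the method name, output directory, and file layout are illustrative, not Spark APIs:

    import java.nio.file.{Files, Paths, StandardCopyOption}

    // Write this task's output under a per-attempt temporary name, then rename
    // it to the final part file once the write completes, so readers never see
    // a partially written file. `attempt` is the per-task attempt number this
    // thread asks Spark to expose.
    def commitPartition(outputDir: String, index: Int, attempt: Int, bytes: Array[Byte]): Unit = {
      val tmp  = Paths.get(outputDir, s"part_${index}_${attempt}.dat")
      val dest = Paths.get(outputDir, s"part_${index}.dat")
      Files.createDirectories(tmp.getParent)
      Files.write(tmp, bytes)
      // Atomic within one filesystem; a losing attempt's temp file can be
      // cleaned up separately.
      Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE)
    }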

Re: Get attempt number in a closure

2014-10-20 Thread Kay Ousterhout
Sorry to clarify, there are two issues here: (1) attemptId has different meanings in the codebase (2) we currently don't propagate the 0-based per-task attempt identifier to the executors. (1) should definitely be fixed. It sounds like Yin's original email was requesting that we add (2). On Mon

Re: Get attempt number in a closure

2014-10-20 Thread Kay Ousterhout
Are you guys sure this is a bug? In the task scheduler, we keep two identifiers for each task: the "index", which uniquely identifies the computation+partition, and the "taskId" which is unique across all tasks for that Spark context (See https://github.com/apache/spark/blob/master/core/src/main/

Re: Get attempt number in a closure

2014-10-20 Thread Patrick Wendell
There is a deeper issue here, which is that AFAIK we don't even store a notion of attempt inside of Spark; we just use a new taskId with the same index. On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai wrote: > Yeah, seems we need to pass the attempt id to executors through > TaskDescription. I have created

Re: Get attempt number in a closure

2014-10-20 Thread Yin Huai
Yeah, seems we need to pass the attempt id to executors through TaskDescription. I have created https://issues.apache.org/jira/browse/SPARK-4014. On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin wrote: > I also ran into this earlier. It is a bug. Do you want to file a jira? > > I think part of the p

Re: Get attempt number in a closure

2014-10-20 Thread Reynold Xin
I also ran into this earlier. It is a bug. Do you want to file a jira? I think part of the problem is that we don't actually have the attempt id on the executors. If we do, that's great. If not, we'd need to propagate that over. On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai wrote: > Hello, > > Is t

Get attempt number in a closure

2014-10-20 Thread Yin Huai
Hello, Is there any way to get the attempt number in a closure? Seems TaskContext.attemptId actually returns the taskId of a task (see this and this
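For reference, a hedged sketch of the kind of accessor being asked for, written against names that only landed later under SPARK-4014 (TaskContext.get, partitionId, attemptNumber, taskAttemptId); they did not exist in the release this thread discusses, and `sc` is assumed to come from spark-shell:

    import org.apache.spark.TaskContext

    // Inside a closure, read the current task's identifiers: partitionId is the
    // task index, attemptNumber is the 0-based per-task attempt, and
    // taskAttemptId is the globally unique TID that attemptId was returning
    // at the time of this thread.
    sc.parallelize(1 to 10, 2).foreachPartition { _ =>
      val ctx = TaskContext.get()
      println(s"partition=${ctx.partitionId()} attempt=${ctx.attemptNumber()} tid=${ctx.taskAttemptId()}")
    }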