Stages with non-arithmetic numbering & Timing metrics in event logs

2015-06-07 Thread Mike Hynes
Hi Patrick and Akhil, Thank you both for your responses. This is a bit of an extended email, but I'd like to: 1. Answer your (Patrick) note about the "missing" stages since the IDs do (briefly) appear in the event logs 2. Ask for advice/experience with extracting information from the event logs in

Re: Ivy support in Spark vs. sbt

2015-06-07 Thread Igor Costa
Marcelo, I ran into this problem once, when I was starting with Spark, like you mentioned. I found out that Ivy gets messy with different sbt versions. My solution was to use an earlier version that is compatible with sbt, so that the versions do not conflict. Best Igor On Thu, Jun 4, 2015 at 8:08 PM, Eron Wright wrote: >

Re: createDataframe from s3 results in error

2015-06-07 Thread Igor Costa
Hey there Ignacio, like Reynold said, it's related to your build of Spark; try not to compile with Thrift. Also, try this command to see what the error is, and link to it here: sc.wholeTextFiles("s3://my-directory/2015*/ignacio/*") P.S. (Are you using boto to connect? Which version?) Igor On

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-07 Thread Ajay Singal
+1 On Sun, Jun 7, 2015 at 6:02 PM, Tathagata Das wrote: > +1 > > On Sun, Jun 7, 2015 at 3:01 PM, Joseph Bradley > wrote: > >> +1 >> >> On Sat, Jun 6, 2015 at 7:55 PM, Guoqiang Li wrote: >> >>> +1 (non-binding) >>> >>> >>> -- Original -- >>> *From: * "Reynold Xin

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-07 Thread Tathagata Das
+1 On Sun, Jun 7, 2015 at 3:01 PM, Joseph Bradley wrote: > +1 > > On Sat, Jun 6, 2015 at 7:55 PM, Guoqiang Li wrote: > >> +1 (non-binding) >> >> >> -- Original -- >> *From: * "Reynold Xin";; >> *Date: * Fri, Jun 5, 2015 03:18 PM >> *To: * "Krishna Sankar"; >> *Cc

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-07 Thread Joseph Bradley
+1 On Sat, Jun 6, 2015 at 7:55 PM, Guoqiang Li wrote: > +1 (non-binding) > > > -- Original -- > *From: * "Reynold Xin";; > *Date: * Fri, Jun 5, 2015 03:18 PM > *To: * "Krishna Sankar"; > *Cc: * "Patrick Wendell"; "dev@spark.apache.org"< > dev@spark.apache.org>; >

Re: Scheduler question: stages with non-arithmetic numbering

2015-06-07 Thread Patrick Wendell
Hey Mike, Stage IDs are not guaranteed to be sequential, only increasing, because of the way the DAG scheduler works. In some cases stage ID numbers are skipped when stages are generated. Any stage/ID that appears in the Spark UI is an actual stage, so if you see IDs in there, but they are not
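
A minimal sketch of how one might check for skipped stage IDs in an event log. It assumes the log is one JSON object per line and that stage submissions appear as "SparkListenerStageSubmitted" events carrying a "Stage Info" object with a "Stage ID" field; those field names, the helper names stage_ids and skipped_ids, and the sample lines are assumptions for illustration and are not taken from this thread.

```python
import json


def stage_ids(event_log_lines):
    """Collect the stage IDs found in stage-submitted events, in sorted order."""
    ids = set()
    for line in event_log_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Assumed field names: "Event", "Stage Info", "Stage ID".
        if event.get("Event") == "SparkListenerStageSubmitted":
            ids.add(event["Stage Info"]["Stage ID"])
    return sorted(ids)


def skipped_ids(ids):
    """Return the IDs missing between the smallest and largest observed ID."""
    if not ids:
        return []
    seen = set(ids)
    return [i for i in range(min(ids), max(ids) + 1) if i not in seen]


# Hypothetical sample lines, not real log output.
sample = [
    '{"Event": "SparkListenerStageSubmitted", "Stage Info": {"Stage ID": 0}}',
    '{"Event": "SparkListenerStageSubmitted", "Stage Info": {"Stage ID": 1}}',
    '{"Event": "SparkListenerStageSubmitted", "Stage Info": {"Stage ID": 4}}',
]
ids = stage_ids(sample)
print(ids, skipped_ids(ids))  # prints [0, 1, 4] [2, 3]
```

Gaps reported this way would be consistent with the behavior described above (IDs that increase but are not contiguous) and would not by themselves indicate lost stages.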

Re: Scheduler question: stages with non-arithmetic numbering

2015-06-07 Thread Akhil Das
Are you seeing the same behavior on the driver UI (the one running on port 4040)? If you click on the stage ID header, you can sort the stages by ID. Thanks Best Regards On Fri, Jun 5, 2015 at 10:21 PM, Mike Hynes <91m...@gmail.com> wrote: > Hi folks, > > When I look at the output logs for a