Dataframes filter by count fails with python API

2015-06-28 Thread Andrew Vykhodtsev
Dear developers, I found the following behaviour that I think is a minor bug. If I apply groupBy and count in the Python API, the resulting data frame has the grouped columns and a field named "count". Filtering by that field does not work because it is treated as a keyword: x = sc.parallelize(zip(xr

Re: Question about Spark process and thread

2015-06-28 Thread Reynold Xin
Most of those threads are not for task execution. They are for RPC, scheduling, ... On Sun, Jun 28, 2015 at 8:32 AM, Dogtail Ray wrote: > Hi, > > I was looking at Spark source code, and I found that when launching a > Executor, actually Spark is launching a threadpool; each time the scheduler >

Re: Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Shixiong Zhu
Could you update your maven to 3.3.3? I'm not sure if this is a known issue but the exception message looks the same. See https://github.com/apache/spark/pull/6770 Best Regards, Shixiong Zhu 2015-06-29 9:02 GMT+08:00 Alessandro Baretta : > I am building the current master branch with Scala 2.11 foll

Re: UnusedStubClass in 1.3.0-rc1

2015-06-28 Thread dobashim
Hi all, I found the same situation in Spark Streaming + Kafka of 1.4.0. Is there any progress on this discussion? (I cannot find a JIRA issue about this.) By the way, I can avoid this kind of error by using "exclude" in the SBT configuration of my application like the following. :: libraryDepend

Gossip protocol in Master selection

2015-06-28 Thread Debasish Das
Hi, Akka cluster uses a gossip protocol for Master election. The approach in Spark right now is to use Zookeeper for high availability. Interestingly, Cassandra and Redis clusters both use a gossip protocol. I am not sure what the default behavior is right now. If the master dies and zookeeper

Re: Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Josh Rosen
The 2.11 compile build is going to be green because this is an issue with tests, not compilation. On Sun, Jun 28, 2015 at 6:30 PM, Ted Yu wrote: > Spark-Master-Scala211-Compile build is green. > > However it is not clear what the actual command is: > > [EnvInject] - Variables injected successful

Re: Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Ted Yu
Spark-Master-Scala211-Compile build is green. However it is not clear what the actual command is: [EnvInject] - Variables injected successfully. [Spark-Master-Scala211-Compile] $ /bin/bash /tmp/hudson8945334776362889961.sh FYI On Sun, Jun 28, 2015 at 6:02 PM, Alessandro Baretta wrote: > I a

Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Alessandro Baretta
I am building the current master branch with Scala 2.11 following these instructions: Building for Scala 2.11 To produce a Spark package compiled with Scala 2.11, use the -Dscala-2.11 property: dev/change-version-to-2.11.sh mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package Here's

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-28 Thread Patrick Wendell
Hey Krishna - this is still the current release candidate. - Patrick On Sun, Jun 28, 2015 at 12:14 PM, Krishna Sankar wrote: > Patrick, >Haven't seen any replies on test results. I will byte ;o) - Should I test > this version or is another one in the wings ? > Cheers > > > On Tue, Jun 23, 2

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-28 Thread Krishna Sankar
Patrick, Haven't seen any replies on test results. I will byte ;o) - Should I test this version or is another one in the wings ? Cheers On Tue, Jun 23, 2015 at 10:37 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.4.1! > > This releas

Question about Spark process and thread

2015-06-28 Thread Dogtail Ray
Hi, I was looking at the Spark source code, and I found that when launching an Executor, Spark is actually launching a threadpool; each time the scheduler launches a task, the executor will launch a thread within the threadpool. However, I also found that the Spark process always has approximately 40
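
The pattern described in this question (one pool per executor, one pooled thread per launched task) can be sketched in a few lines. This is only an illustration using Python's standard library, not Spark's actual executor code, which runs on the JVM; the names `run_task`, `launch_tasks`, and the `task-worker` thread prefix are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_task(task_id):
    # Hypothetical task body: report which pooled thread ran this task.
    return task_id, threading.current_thread().name


def launch_tasks(num_tasks, pool_size):
    # One pool stands in for the executor; each submitted task is
    # handed to one of its worker threads.
    with ThreadPoolExecutor(max_workers=pool_size,
                            thread_name_prefix="task-worker") as pool:
        futures = [pool.submit(run_task, i) for i in range(num_tasks)]
        # Collect results in submission order.
        return [f.result() for f in futures]


results = launch_tasks(num_tasks=8, pool_size=4)
for task_id, thread_name in results:
    print(task_id, thread_name)
```

In this sketch the pool never uses more than `pool_size` threads for tasks, so any additional threads seen in a real process would have to come from elsewhere, which is consistent with the reply in this thread that most of the threads are for RPC and scheduling rather than task execution.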

Unable to add to roles in JIRA

2015-06-28 Thread Sean Owen
In case you've tried and failed to add a person to a role in JIRA... https://issues.apache.org/jira/browse/INFRA-9891