Re: Foundation policy on releases and Spark nightly builds

2015-07-11 Thread Sean Owen
From a developer perspective, I also find it surprising to hear that nightly builds should be hidden from non-developer end users. In an age of GitHub, what on earth is the problem with distributing the content of master? However, I do understand why this exists. To the extent the ASF provides any

Re: Questions about Fault tolerance of Spark

2015-07-11 Thread 牛兆捷
I am just getting started; could you share something in this area? 2015-07-10 22:05 GMT+08:00 MIKE HYNES 91m...@gmail.com: Gentle bump on this topic; how to test the fault tolerance and previous benchmark results are both things we are interested in as well. Mike Original

Re: Should spark-ec2 get its own repo?

2015-07-11 Thread Matt Goodman
I wanted to revive the conversation about the spark-ec2 tools, as it seems to have been lost in the 1.4.1 release voting spree. I think that splitting it into its own repository is a really good move, and I would also be happy to help with this transition, as well as help maintain the resulting

Spark application examples

2015-07-11 Thread Vasili I. Galchin
Hello, I am reading slides entitled DATABRICKS written by Holden Karau et al. I am also reading Spark application examples under ../spark/examples/src/main/*. Take the example drivers data-manipulation.R and dataframe.R. Question: where in these Drivers are the worker bees

Re: [PySpark DataFrame] When a Row is not a Row

2015-07-11 Thread Jerry Lam
Hi guys, I just hit the same problem. It is very confusing when a Row is not the same Row type at runtime. The worst thing is that when I use Spark in local mode, the Row is the same Row type, so it passes the test cases but fails when I deploy the application. Can someone suggest a