jdbc/save DataFrameWriter implementation change

2016-04-12 Thread Justin.Pihony
Hi, I have a ticket open on how save should delegate to the jdbc method; however, when I went to implement this, it just didn't seem clean. Please take a look at my comment on https://issues.apache.org/jira/browse/SPARK-14525 and let me know if you agree with the second approach or not. Thanks,

Accessing Secure Hadoop from Mesos cluster

2016-04-12 Thread Tony Kinsley
I have been working towards getting some Spark Streaming jobs to run in Mesos cluster mode (using Docker containers) and write data periodically to a secure HDFS cluster. Unfortunately, this does not seem to be well supported currently in Spark (https://issues.apache.org/jira/browse/SPARK-12909).

Re: Different maxBins value for categorical and continuous features in RandomForest implementation.

2016-04-12 Thread Joseph Bradley
That sounds useful. Would you mind creating a JIRA for it? Thanks! Joseph On Mon, Apr 11, 2016 at 2:06 AM, Rahul Tanwani wrote: > Hi, > > Currently the RandomForest algo takes a single maxBins value to decide the > number of splits to take. This sometimes causes

Re: Spark 1.6.1 packages on S3 corrupt?

2016-04-12 Thread Nicholas Chammas
Yes, this is a known issue. The core devs are already aware of it. [CC dev] FWIW, I believe the Spark 1.6.1 / Hadoop 2.6 package on S3 is not corrupt. It may be the only 1.6.1 package that is not corrupt, though. :/ Nick On Tue, Apr 12, 2016 at 9:00 PM Augustus Hong

Spark on Mesos 0.28 issue

2016-04-12 Thread Yang Lei
I have been able to run Spark submission in a Docker container (HOST network) through Marathon on Mesos, targeting a Mesos cluster (zk address), for at least Spark 1.6 and 1.5.2 over Mesos 0.26 and 0.27. I do need to define SPARK_PUBLIC_DNS and SPARK_LOCAL_IP so that the Spark driver can announce the

Possible deadlock in registering applications in the recovery mode

2016-04-12 Thread Niranda Perera
Hi all, I have encountered a small issue in the standalone recovery mode. Let's say there was an application A running in the cluster. Due to some issue, the entire cluster, together with application A, goes down. Later on, the cluster comes back online, and the master then goes into the