Re: Python to Java object conversion of numpy array

2015-01-12 Thread Meethu Mathew
Hi, this is the function defined in PythonMLLibAPI.scala:

    def findPredict(
        data: JavaRDD[Vector],
        wt: Object,
        mu: Array[Object],
        si: Array[Object]): RDD[Array[Double]] = { }

So the parameter mu should be converted to Array[Object].

mu = (Vectors.dense([0.8786, -0.7855])
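For reference, a minimal sketch of the kind of signature that is usually reachable from PySpark: Python lists and tuples generally cross Py4J as java.util.List, so one common pattern on the Scala side is to accept a JList and convert it internally. The object and method below mirror the snippet above purely for illustration; the JList-based signature and the conversion are assumptions, not the actual code under discussion.

    import java.util.{List => JList}

    import scala.collection.JavaConverters._

    import org.apache.spark.api.java.JavaRDD
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.rdd.RDD

    // Hypothetical, Python-friendly variant: take java.util.List (which Py4J
    // can populate from a Python list/tuple) and turn it into an Array on the
    // Scala side before doing any work.
    object FindPredictSketch {
      def findPredict(
          data: JavaRDD[Vector],
          wt: Object,
          mu: JList[Object],
          si: JList[Object]): RDD[Array[Double]] = {
        val muArray: Array[Object] = mu.asScala.toArray
        val siArray: Array[Object] = si.asScala.toArray
        // ... compute per-point membership values from muArray and siArray ...
        data.rdd.map(_ => Array.empty[Double]) // placeholder body
      }
    }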

Re: Re-use scaling means and variances from StandardScalerModel

2015-01-12 Thread Octavian Geagla
Thanks for the suggestions. I've opened this JIRA ticket: https://issues.apache.org/jira/browse/SPARK-5207 Feel free to modify it, assign it to me, kick off a discussion, etc. I'd be more than happy to own this feature and PR. Thanks, -Octavian
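For concreteness, a rough sketch of the re-use the ticket asks for: fit once, store the summary statistics, and later rebuild an equivalent scaler without re-fitting. The public constructor taking the stored statistics is an assumption about the proposed API (the current constructor is private to mllib), and the field names are as I recall them from the 1.2 model.

    import org.apache.spark.mllib.feature.{StandardScaler, StandardScalerModel}
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.rdd.RDD

    object ScalerReuseSketch {
      // Fit once on the training data and keep the summary statistics around.
      def fitAndExtract(training: RDD[Vector]): (Vector, Vector) = {
        val model = new StandardScaler(withMean = true, withStd = true).fit(training)
        (model.mean, model.variance)
      }

      // Later, rebuild an equivalent scaler from the stored statistics instead
      // of re-fitting. A public constructor like this is what SPARK-5207 asks
      // for; the exact signature is hypothetical.
      def rebuild(mean: Vector, variance: Vector): StandardScalerModel =
        new StandardScalerModel(withMean = true, withStd = true, mean = mean, variance = variance)
    }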

Re: Python to Java object conversion of numpy array

2015-01-12 Thread Davies Liu
On Sun, Jan 11, 2015 at 10:21 PM, Meethu Mathew wrote:
> Hi,
>
> This is the code I am running.
>
> mu = (Vectors.dense([0.8786, -0.7855]), Vectors.dense([-0.1863, 0.7799]))
>
> membershipMatrix = callMLlibFunc("findPredict", rdd.map(_convert_to_vector), mu)

What does the Java API look like? all t

Re: Discussion | SparkContext's setJobGroup and clearJobGroup should return a new instance of SparkContext

2015-01-12 Thread Erik Erlandson
setJobGroup needs fixing: https://issues.apache.org/jira/browse/SPARK-4514

I'm interested in any community input on what the semantics or design "ought" to be changed to.

----- Original Message -----
> Hi spark committers
>
> I would like to discuss the possibility of changing the signature
>
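As background for the semantics question, a minimal sketch of how the current API is used. setJobGroup and clearJobGroup mutate thread-local properties on the shared SparkContext, which is the behaviour the JIRA and the proposal below debate; the app name and group id here are illustrative.

    import org.apache.spark.{SparkConf, SparkContext}

    object JobGroupDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("job-group-demo").setMaster("local[2]"))

        // setJobGroup tags every job submitted from *this thread* until cleared;
        // the state lives in thread-local properties on the shared SparkContext.
        sc.setJobGroup("etl-run-42", "nightly ETL", interruptOnCancel = true)
        sc.parallelize(1 to 1000).map(_ * 2).count()

        // Jobs in a group can be cancelled as a unit; clearJobGroup removes the tag.
        sc.cancelJobGroup("etl-run-42")
        sc.clearJobGroup()

        sc.stop()
      }
    }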

Re: Apache Spark client high availability

2015-01-12 Thread Akhil Das
We usually run Spark in HA with the following stack:

-> Apache Mesos
-> Marathon - init/control system for starting, stopping, and maintaining always-on applications (mainly Spark Streaming)
-> Chronos - general-purpose scheduler for Mesos, supports job dependency graphs.
-> Spark Job Server - prim
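For the Spark Job Server piece, a hedged sketch of what a deployed job looks like with the spark-jobserver API of that era; the package, trait, and object names are from memory and should be treated as assumptions.

    import com.typesafe.config.Config
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

    // The server owns a long-lived SparkContext, so the driver process (and any
    // cached data) stays up independently of the clients that submit jobs.
    object WordCountJob extends SparkJob {
      override def validate(sc: SparkContext, config: Config): SparkJobValidation =
        SparkJobValid

      override def runJob(sc: SparkContext, config: Config): Any =
        sc.parallelize(config.getString("input").split("\\s+").toSeq)
          .map(word => (word, 1))
          .reduceByKey(_ + _)
          .collectAsMap()
    }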

Apache Spark client high availability

2015-01-12 Thread preeze
Dear community, I've been searching the internet for quite a while to find out what the best architecture is to support HA for a Spark client. We run an application that connects to a standalone Spark cluster and caches a big chunk of data for subsequent intensive calculations. To achieve HA we'l

Re: YARN | SPARK-5164 | Submitting jobs from windows to linux YARN

2015-01-12 Thread Aniket Bhatnagar
Ohh right, it is. I will mark my defect as a duplicate and cross-check my notes with the fixes in the pull request. Thanks for pointing it out, Zsolt :)

On Mon, Jan 12, 2015, 7:42 PM Zsolt Tóth wrote:
> Hi Aniket,
>
> I think this is a duplicate of SPARK-1825, isn't it?
>
> Zsolt
>
> 2015-01-12 14:38

Re: YARN | SPARK-5164 | Submitting jobs from windows to linux YARN

2015-01-12 Thread Zsolt Tóth
Hi Aniket,

I think this is a duplicate of SPARK-1825, isn't it?

Zsolt

2015-01-12 14:38 GMT+01:00 Aniket Bhatnagar:
> Hi Spark YARN maintainers
>
> Can anyone please look and comment on SPARK-5164? Basically, this stops
> users from submitting jobs (or using spark shell) from a Windows machine

YARN | SPARK-5164 | Submitting jobs from windows to linux YARN

2015-01-12 Thread Aniket Bhatnagar
Hi Spark YARN maintainers,

Can anyone please look and comment on SPARK-5164? Basically, this stops users from submitting jobs (or using spark shell) from a Windows machine to a YARN cluster running on Linux. I should be able to submit a pull request for this provided the community agrees. This wo

Discussion | SparkContext's setJobGroup and clearJobGroup should return a new instance of SparkContext

2015-01-12 Thread Aniket Bhatnagar
Hi Spark committers,

I would like to discuss the possibility of changing the signature of SparkContext's setJobGroup and clearJobGroup functions to return a replica of SparkContext with the job group set/unset instead of mutating the original context. I am building a spark job server and I am assi
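To make the proposal concrete, a rough contrast between today's mutating calls and one possible non-mutating flavour. The GroupedContext wrapper below is purely illustrative (a thin wrapper rather than a true replica of SparkContext) and is not an existing or agreed-upon API.

    import org.apache.spark.SparkContext

    object JobGroupStyles {
      // Today: setJobGroup mutates thread-local state on the shared context.
      def runWithMutation(sc: SparkContext): Long = {
        sc.setJobGroup("report-7", "weekly report")
        try sc.parallelize(1 to 100).count()
        finally sc.clearJobGroup()
      }

      // Illustrative alternative: a handle that carries the job group and scopes
      // it around the work, so callers never touch the shared context directly.
      class GroupedContext(sc: SparkContext, groupId: String, description: String) {
        def run[T](body: SparkContext => T): T = {
          sc.setJobGroup(groupId, description)
          try body(sc) finally sc.clearJobGroup()
        }
      }

      def runWithWrapper(sc: SparkContext): Long =
        new GroupedContext(sc, "report-7", "weekly report")
          .run(_.parallelize(1 to 100).count())
    }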