Re: Re-use scaling means and variances from StandardScalerModel

2015-01-09 Thread Xiangrui Meng
Feel free to create a JIRA for this issue. We might need to discuss what to put in the public constructors. In the meantime, you can use Java serialization to save/load the model: sc.parallelize(Seq(model), 1).saveAsObjectFile("/tmp/model") val model =

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Tathagata Das
It was not really meant to be hidden, so it's essentially a case of the documentation being insufficient. This code has not gotten much attention for a while, so it could have bugs. If you find any and submit a fix for them, I am happy to take a look! TD On Thu, Jan 8, 2015 at 6:33 PM, Nan Zhu

RE: Results of tests

2015-01-09 Thread Tony Reix
Hi Ted, thanks for the info. However, I'm still unable to understand how this page was built: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/lastSuccessfulBuild/HADOOP_PROFILE=hadoop-2.4,label=centos/testReport/ This page contains details I do not find in

Re: Python to Java object conversion of numpy array

2015-01-09 Thread Davies Liu
Hey Meethu, The Java API accepts only Vector, so you should convert the numpy array into pyspark.mllib.linalg.DenseVector. BTW, which class are you using? KMeansModel.predict() accepts a numpy.array and will do the conversion for you. Davies On Fri, Jan 9, 2015 at 4:45 AM, Meethu Mathew

Re: Results of tests

2015-01-09 Thread Ted Yu
For a build which uses JUnit, we would see a summary such as the following (https://builds.apache.org/job/HBase-TRUNK/6007/console): Tests run: 2199, Failures: 0, Errors: 0, Skipped: 25 In

Present/Future of monitoring spark jobs, MetricsSystem vs. Web UI, etc.

2015-01-09 Thread Ryan Williams
I've long wished the web UI gave me a better sense of how the metrics it reports are changing over time, so I was intrigued to stumble across the MetricsSystem

Re: Results of tests

2015-01-09 Thread Ted Yu
I noticed that org.apache.spark.sql.hive.execution has a lot of tests skipped. Is there a plan to enable these tests on Jenkins (so that there is no regression across releases)? Cheers On Fri, Jan 9, 2015 at 11:46 AM, Josh Rosen rosenvi...@gmail.com wrote: The Test Result pages for Jenkins

Re: Results of tests

2015-01-09 Thread Josh Rosen
The Test Result pages for Jenkins builds show some nice statistics for the test run, including individual test times: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/lastSuccessfulBuild/HADOOP_PROFILE=hadoop-2.4,label=centos/testReport/ Currently this only covers

Re: Results of tests

2015-01-09 Thread Nicholas Chammas
Just created: Integrate Python unit tests into Jenkins https://issues.apache.org/jira/browse/SPARK-5178 Nick On Fri Jan 09 2015 at 2:48:48 PM Josh Rosen rosenvi...@gmail.com wrote: The Test Result pages for Jenkins builds shows some nice statistics for the test run, including individual

Re-use scaling means and variances from StandardScalerModel

2015-01-09 Thread ogeagla
Hello, I would like to re-use the means and variances computed by the fit function in the StandardScaler, as I persist them and my use case requires consistent scaling of data based on some initial data set. The StandardScalerModel's constructor takes means and variances, but is private[mllib].
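
While the constructor is private, one possible workaround is to apply the persisted statistics by hand. The following is a minimal sketch in plain Python, not MLlib's API: the function name `scale_with_saved_stats`, its `with_mean`/`with_std` flags, and the sample values are all hypothetical, and mapping a zero-variance feature to 0.0 is a choice made for this sketch.

```python
import math


def scale_with_saved_stats(vector, means, variances, with_mean=True, with_std=True):
    """Standardize one feature vector using previously persisted statistics.

    Hypothetical helper for illustration; it does not call into Spark.
    """
    if not (len(vector) == len(means) == len(variances)):
        raise ValueError("vector, means and variances must have the same length")

    scaled = []
    for x, mean, var in zip(vector, means, variances):
        # Optionally center the feature on the saved mean.
        value = x - mean if with_mean else x
        if with_std:
            std = math.sqrt(var)
            # Assumption for this sketch: a zero-variance feature maps to 0.0
            # rather than raising a division error.
            value = value / std if std != 0.0 else 0.0
        scaled.append(value)
    return scaled


# Hypothetical statistics saved from an earlier fit on the initial data set.
saved_means = [1.0, 10.0]
saved_variances = [4.0, 0.0]

print(scale_with_saved_stats([3.0, 10.0], saved_means, saved_variances))
```

Because the means and variances are fixed inputs, every later batch is scaled against the same initial data set, which is the consistency the use case asks for.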

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Nan Zhu
Hi, I have created the PR for these two issues. Best, -- Nan Zhu http://codingcat.me On Friday, January 9, 2015 at 7:38 AM, Nan Zhu wrote: Thanks, TD, I just created 2 JIRAs to track these, https://issues.apache.org/jira/browse/SPARK-5174