[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72604505 Thanks @mengxr for your help. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not h

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4059

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72604421 LGTM. Merged into master. Thanks for adding GMM Python API!

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72602454 [Test build #26607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26607/consoleFull) for PR 4059 at commit [`c973ab3`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72602459 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72595692 [Test build #26607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26607/consoleFull) for PR 4059 at commit [`c973ab3`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23982975 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23982793 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23981473 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943651 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943538 --- Diff: python/pyspark/mllib/stat/distribution.py --- @@ -0,0 +1,25 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943359 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +89,98 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943114 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943111 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943123 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943118 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23943115 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -285,6 +286,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72458892 Please review and merge.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72458186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72458177 [Test build #26515 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26515/consoleFull) for PR 4059 at commit [`fa0a142`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72449146 [Test build #26515 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26515/consoleFull) for PR 4059 at commit [`fa0a142`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72445559 Is it possible to start a test build in Jenkins without updating the PR?

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72442051 [Test build #26509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26509/consoleFull) for PR 4059 at commit [`d5b36ab`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72442062 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72434076 [Test build #26509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26509/consoleFull) for PR 4059 at commit [`d5b36ab`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72431354 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72431346 [Test build #26502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26502/consoleFull) for PR 4059 at commit [`ac134f1`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72423418 [Test build #26502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26502/consoleFull) for PR 4059 at commit [`ac134f1`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72420713 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72420711 [Test build #26499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26499/consoleFull) for PR 4059 at commit [`2e9f12a`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-02 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72420623 [Test build #26499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26499/consoleFull) for PR 4059 at commit [`2e9f12a`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72408935 Yes, Array should work.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72406403 So I will go with the current approach. I tried to change Array to ArrayBuffer, but it ends up in exceptions. So can I go with Array itself?

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72405813 They are not attributes but public methods. Did you try `mu()` and `sigma()`? I think the current approach looks good except minor issues commented. We can try other appro

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-02-01 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72404786 Instead of passing mu & sigma as arrays, I tried to directly pass "gaussians" (Array[MultivariateGaussian]) from PythonMLLibAPI. But I was not able to access the attrib

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-30 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72289335 @FlytxtRnD #4290 is merged. So please fetch and merge master, and rename `GaussianMixtureEM` to `GaussianMixture` in your PR. Thanks!

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-30 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23828921 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72083636 Btw, I wanted to make sure you knew about [https://issues.apache.org/jira/browse/SPARK-5400], which I plan to do soon.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-72075492 @FlytxtRnD I've merged the PR that refactors `mllib.stat`. It should be straightforward to add `distribution.py` under `mllib/stat/` now.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754619 --- Diff: python/pyspark/mllib/tests.py --- @@ -167,6 +167,32 @@ def test_kmeans_deterministic(self): # TODO: Allow small numeric difference

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754373 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754387 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754398 --- Diff: python/pyspark/mllib/tests.py --- @@ -167,6 +167,32 @@ def test_kmeans_deterministic(self): # TODO: Allow small numeric difference.

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754380 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754369 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,67 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754367 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,67 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754375 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754381 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754368 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,67 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754372 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -284,6 +285,59 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754389 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23754395 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23753388 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread FlytxtRnD
Github user FlytxtRnD commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23752384 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23751738 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +88,84 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71984393 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71984377 [Test build #26303 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26303/consoleFull) for PR 4059 at commit [`2e14d82`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71978609 [Test build #26303 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26303/consoleFull) for PR 4059 at commit [`2e14d82`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-28 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71978435 The PR is updated according to https://github.com/apache/spark/pull/4088 which modifies GaussianMixtureModel to expose instances of MultivariateGaussian rather than se

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-71146302 @mengxr Thank you for the review and comments. I am changing the code according to #3923 (tgaloppo).

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399968 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399808 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +86,68 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399900 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +86,68 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399765 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -280,6 +280,48 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399804 --- Diff: python/pyspark/mllib/clustering.py --- @@ -86,6 +86,68 @@ def train(cls, rdd, k, maxIterations=100, runs=1, initializationMode="k-means||"

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399756 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399719 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399766 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -280,6 +280,48 @@ class PythonMLLibAPI extends Serializable {

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399715 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/4059#discussion_r23399730 --- Diff: examples/src/main/python/mllib/gaussian_mixture_model.py --- @@ -0,0 +1,65 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70230909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70230901 [Test build #25648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25648/consoleFull) for PR 4059 at commit [`c1d4c71`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70223642 [Test build #25648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25648/consoleFull) for PR 4059 at commit [`c1d4c71`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70208901 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70208896 [Test build #25634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25634/consoleFull) for PR 4059 at commit [`f82750b`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70205823 [Test build #25634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25634/consoleFull) for PR 4059 at commit [`f82750b`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread FlytxtRnD
Github user FlytxtRnD commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70205738 @jkbradley py4j serialization issue has been solved by the commit https://github.com/apache/spark/commit/8ead999fd627b12837fb2f082a0e76e9d121d269

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70142339 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70142336 [Test build #25609 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25609/consoleFull) for PR 4059 at commit [`5c83825`](https://gith

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70142185 [Test build #25609 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25609/consoleFull) for PR 4059 at commit [`5c83825`](https://githu

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70141761 add to whitelist

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70077652 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...

2015-01-15 Thread FlytxtRnD
GitHub user FlytxtRnD opened a pull request: https://github.com/apache/spark/pull/4059 [SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model Python API for the Gaussian Mixture Model clustering algorithm in MLLib. You can merge this pull request into a Git repository by