Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73831296
Thanks @tgaloppo and @mengxr. Any idea what to touch in GaussianMixture
next? The parallelized Gaussian initialization.
---
If your project is set up for it, you can
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73667895
@mengxr Just to make it easier for you, a small description.
GaussianMixture used to support sparse input, by converting it to DenseVectors,
which is non-optimal, in
Github user tgaloppo commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73666468
LGTM
cc: @jkbradley @mengxr
Github user tgaloppo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24397429
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/GaussianMixtureSuite.scala
---
@@ -80,4 +81,60 @@ class GaussianMixtureSuite extends
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24436973
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -235,12 +235,23 @@ private[spark] object BLAS extends Serializable with
Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24436979
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -255,6 +266,15 @@ private[spark] object BLAS extends Serializable with
Logging
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24436968
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -235,12 +235,23 @@ private[spark] object BLAS extends Serializable with
Logging
Github user MechCoder commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24439928
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -235,12 +235,23 @@ private[spark] object BLAS extends Serializable with
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73790723
[Test build #27230 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27230/consoleFull)
for PR 4459 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73790731
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73777170
@mengxr Fixed up your comments. Let me know if there is anything else.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73777944
[Test build #27230 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27230/consoleFull)
for PR 4459 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73779641
[Test build #27231 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27231/consoleFull)
for PR 4459 at commit
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73794267
LGTM. Merged into master and branch-1.3. Thanks!
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73793005
[Test build #27231 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27231/consoleFull)
for PR 4459 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73793016
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/4459
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73642221
[Test build #27171 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27171/consoleFull)
for PR 4459 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73647333
[Test build #27171 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27171/consoleFull)
for PR 4459 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73647337
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73649664
@tgaloppo Alright, thanks for the explanation. What makes you think that
the covariance matrix is wrong? I calculated it manually and it seems to be
right. I added the
Github user tgaloppo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24355867
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/GaussianMixtureSuite.scala
---
@@ -80,4 +81,60 @@ class GaussianMixtureSuite extends
Github user tgaloppo commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73568637
@MechCoder Do you mean the negative values in the covariance (sigma)
matrices? Negative covariance indicates, roughly speaking, that variables move
in opposite
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73655675
[Test build #27179 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27179/consoleFull)
for PR 4459 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73655680
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73551857
[Test build #27108 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27108/consoleFull)
for PR 4459 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73564536
[Test build #27107 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27107/consoleFull)
for PR 4459 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73550904
[Test build #27107 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27107/consoleFull)
for PR 4459 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73565333
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73554216
Also a noob question, but what is the significance of the negative variance
in the tests?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73564547
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73551176
@tgaloppo I fixed it up. Can you have a look?
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73565319
[Test build #27108 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27108/consoleFull)
for PR 4459 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73649958
[Test build #27179 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27179/consoleFull)
for PR 4459 at commit
Github user tgaloppo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24326410
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala ---
@@ -215,20 +217,29 @@ private object ExpectationSum {
def
Github user MechCoder commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73508202
@tgaloppo Thanks for your valuable feedback. Do you have anything more to
add as of now?
Github user tgaloppo commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73510029
@MechCoder Nothing else stands out to me... I will give it another look
after your next commit.
Github user tgaloppo commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73571213
@MechCoder Getting close; just need to finish up the sparse single cluster
test.
Github user tgaloppo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24326951
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala ---
@@ -255,6 +255,20 @@ private[spark] object BLAS extends Serializable with
Github user tgaloppo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4459#discussion_r24326687
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/clustering/GaussianMixtureSuite.scala
---
@@ -40,10 +41,15 @@ class GaussianMixtureSuite extends
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73469069
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4459#issuecomment-73469062
[Test build #27087 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27087/consoleFull)
for PR 4459 at commit