Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-207567445
Yes, I'll try to take a look, thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-207190370
@jkbradley Thanks. BTW, I have three minor doc PRs, and it has been a
while since I opened them. Do you mind if I cc you on those PRs and you give a
glimpse in
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/11419
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-206509721
LGTM
Merging with master
I'll create follow-up JIRAs for Python and for docs
Thanks very much!
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-206116388
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-206116392
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-206116242
**[Test build #55085 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55085/consoleFull)**
for PR 11419 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-206110770
**[Test build #55085 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55085/consoleFull)**
for PR 11419 at commit
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205916377
Haha OK thanks. I just sent a PR to update this PR:
[https://github.com/zhengruifeng/spark/pull/1]
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205586852
@jkbradley You don't need to apologize for your patience and care;
your comments really help me a lot.
You are welcome to update the last items!
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205407379
The update looks good. One more thing: I realized we should add unit test
coverage for the model summary. Apologies for the many iterations of code
review; let me
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205182059
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205181939
**[Test build #54837 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54837/consoleFull)**
for PR 11419 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205182058
Merged build finished. Test PASSed.
Github user zhengruifeng commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58336208
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-205170209
**[Test build #54837 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54837/consoleFull)**
for PR 11419 at commit
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58330615
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204903301
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204902834
**[Test build #54794 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54794/consoleFull)**
for PR 11419 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204903297
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204896121
**[Test build #54794 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54794/consoleFull)**
for PR 11419 at commit
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204888043
@jkbradley I fixed those 2 issues, and I changed the output type of
clusterSizes from Map[Int, Long] to Array[Long].
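
For readers following the thread, the type change can be illustrated with a small sketch. This is not the code from the PR: the helper name `toDenseSizes` is hypothetical, and it assumes the array is indexed by cluster index, has length `k`, and stores 0 for clusters that received no points.

```scala
object ClusterSizesSketch {
  // Hypothetical helper (not from the PR): convert sparse per-cluster counts
  // into a dense array of length k, indexed by cluster index.
  // Assumption: clusters absent from the map are treated as having size 0.
  def toDenseSizes(sizes: Map[Int, Long], k: Int): Array[Long] =
    Array.tabulate(k)(i => sizes.getOrElse(i, 0L))

  def main(args: Array[String]): Unit = {
    // Cluster 1 is missing from the map, i.e. no points were assigned to it.
    val sparse = Map(0 -> 40L, 2 -> 10L)
    val dense = toDenseSizes(sparse, k = 3)
    println(dense.mkString(", "))
  }
}
```

Under these assumptions, a dense array always has one entry per cluster, so a caller does not need to handle missing keys.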
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204763129
There is also 1 comment above (inline response) about load which needs to
be addressed. Those 2 remaining items should be it. Thanks!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58296701
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204638410
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204638409
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204638394
**[Test build #54753 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54753/consoleFull)**
for PR 11419 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204634821
**[Test build #54753 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54753/consoleFull)**
for PR 11419 at commit
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204623567
@jkbradley Sorry, something went wrong with my push operation and it
pulled other people's commits into this PR, so I recreated the commit.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204071362
I only found a couple more items. After those, this should be ready.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58105376
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58103804
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r58103812
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-204063751
Thanks for updating this. By the way, it looks like you're squashing your
commits, which makes it difficult for reviewers to tell what your latest
changes are.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203877187
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203877191
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203877075
**[Test build #54616 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54616/consoleFull)**
for PR 11419 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203875174
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203875170
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203875134
**[Test build #54615 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54615/consoleFull)**
for PR 11419 at commit
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203867515
@jkbradley
I have fixed those issues according to your comments, and I am willing to
keep following up on this PR.
There is something wrong in my force push operation... I
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203866803
**[Test build #54616 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54616/consoleFull)**
for PR 11419 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203864758
**[Test build #54615 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54615/consoleFull)**
for PR 11419 at commit
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983007
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983020
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203689419
Nice PR! I only had minor comments. By the way, I know it has been a
while, so please say if you don't have time to work on this currently.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57982988
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57982984
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57982997
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983002
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983018
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57982986
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57982992
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983012
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/11419#discussion_r57983006
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203688707
**[Test build #2715 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2715/consoleFull)**
for PR 11419 at commit
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203677408
I'll take a look at this
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-203677181
**[Test build #2715 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2715/consoleFull)**
for PR 11419 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190585130
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190585131
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190585033
**[Test build #52218 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52218/consoleFull)**
for PR 11419 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190567555
**[Test build #52218 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52218/consoleFull)**
for PR 11419 at commit
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190566838
ok to test
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190566812
add to whitelist
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190507332
Jenkins test this please
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190486901
@sethah GaussianMixture is a clustering algorithm, not a
classification algorithm.
Also, ProbabilisticClassifier extends Predictor, so it has a "setLabelCol"
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190298309
Is there any reason this estimator should not extend
`ProbabilisticClassifier`?
Github user zhengruifeng commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-190159614
Jenkins, test this please
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11419#issuecomment-189861644
Can one of the admins verify this patch?
GitHub user zhengruifeng opened a pull request:
https://github.com/apache/spark/pull/11419
[SPARK-13538][ML] Add GaussianMixture to ML
JIRA: https://issues.apache.org/jira/browse/SPARK-13538
## What changes were proposed in this pull request?
Add GaussianMixture