Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93973973
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93972115
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93972699
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93981193
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93974759
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93974324
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93974634
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93981206
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93973211
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93982097
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r93995903
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16355
ok to test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94062847
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15212#discussion_r94073643
--- Diff: python/pyspark/ml/feature.py ---
@@ -2629,8 +2629,28 @@ class ChiSqSelector(JavaEstimator, HasFeaturesCol,
HasOutputCol, HasLabelCol, Ja
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15212#discussion_r94073623
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/ChiSqSelectorSuite.scala ---
@@ -79,6 +79,12 @@ class ChiSqSelectorSuite extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15212#discussion_r94073566
--- Diff: docs/ml-features.md ---
@@ -1423,12 +1423,12 @@ for more details on the API.
`ChiSqSelector` stands for Chi-Squared feature selection. It
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15413
I'll take a look, thanks for pinging!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94078123
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94085951
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -126,6 +141,106 @@ class GaussianMixtureSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94086048
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -18,22 +18,37 @@
package
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94084918
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94085211
--- Diff: python/pyspark/ml/clustering.py ---
@@ -95,15 +95,10 @@ class GaussianMixture(JavaEstimator, HasFeaturesCol,
HasPredictionCol, HasMaxIte
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94084884
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94083559
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94086090
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -126,6 +141,106 @@ class GaussianMixtureSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94082859
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94083864
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r94084799
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -323,27 +326,95 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94088566
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15211
Thanks! The updates look good. I'll check out the unit tests now.
Thanks for looking into the default intercept. Also, let me know if you
find literature about convergence analys
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94091205
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94090777
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94090877
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94091453
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94091151
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94091025
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94090785
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,554 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15211#discussion_r94093958
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LinearSVC.scala ---
@@ -0,0 +1,558 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16415
predErrorCheckpointer should already be persisting and unpersisting
predError. This PR's changes will mean:
* persist will use MEMORY_AND_DISK instead of MEMORY_ONLY
* 1 (instead
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15212
Will do!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16434#discussion_r94165525
--- Diff: python/pyspark/ml/feature.py ---
@@ -2629,6 +2629,8 @@ class ChiSqSelector(JavaEstimator, HasFeaturesCol,
HasOutputCol, HasLabelCol, Ja
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16436#discussion_r94174586
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -219,6 +219,16 @@ class StringIndexerSuite
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16434
Also, could you please change the PR description to be self-contained
(rather than just referencing another PR)? The description becomes the commit
message.
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16436
LGTM
Merging with master
Thanks @imatiach-msft !
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16415
Thanks for checking!
Does changing the storageLevel in predErrorCheckpointer fix the problem?
"other use cases": Well, I remember thinking about this a lot when a
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16355
@yu-iskw Pinging on this since you wrote bisecting k-means originally. Do
you have time to take a look? Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16441
Thanks for the PR; I do want to get this fixed. However, I don't think
this is the right way to make predictions of probabilities for GBTs. I believe
it should depend on the loss used.
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/12823
@BenFradet I'm sorry for dropping the ball on this one. Did you close this
due to inactivity? If you're willing, it would be nice to do this cleanup.
To answer your
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16415
@zdh2292390 Thanks for the update. Given that this will change behavior
for existing workloads, I'll ask that we specify it via a Param. Also, I'm
going to create a new JIRA for thi
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16457
+1 for @sethah 's comment: Algorithms should validate input data. Some
already do:
https://github.com/apache/spark/blob/b67b35f76b684c5176dc683e7491fd01b43f4467/mllib/src/main/scala/org/a
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16453
LGTM except for the style nit
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16453#discussion_r94496997
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala ---
@@ -127,13 +127,11 @@ class NaiveBayes @Since("
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15768
Pinging on this: What's a reasonable ETA for updating the PR? Thanks
@yanboliang !
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/14394
@supremekai Thanks for the PR! I'm sorry about the inactivity on this.
However, now that it has been added to the DataFrame-based API (in pyspark.ml),
we will not be adding it to the RDD-
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/12491
@hujy Sorry for the delay!
ok to test
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/11520
Sorry about the inaction on this! As you said on the JIRA, let's redo this
for the DataFrame-based API. In the meantime, could you please close this
issue? Thanks a lot.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12135#discussion_r94640047
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1364,18 +1364,41 @@ def approxQuantile(self, col, probabilities,
relativeError):
Space
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12135#discussion_r94640052
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1364,18 +1364,41 @@ def approxQuantile(self, col, probabilities,
relativeError):
Space
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15671
Taking a look now!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15671#discussion_r94640747
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -225,7 +230,7 @@ class LinearRegression @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15671#discussion_r94641380
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala
---
@@ -227,6 +227,11 @@ class AFTSurvivalRegression @Since
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15671#discussion_r94641388
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala ---
@@ -905,7 +911,10 @@ class LDA @Since("1.6.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15671#discussion_r94641394
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala ---
@@ -888,6 +888,12 @@ class LDA @Since("1.6.0") (
@Si
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15314
Sorry for the delay, will look now
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15314#discussion_r94645280
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -47,18 +47,47 @@ object MLTestingUtils extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15314#discussion_r94645495
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -137,10 +169,11 @@ object MLTestingUtils extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15314#discussion_r94645270
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala ---
@@ -118,12 +148,14 @@ object MLTestingUtils extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94861192
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -344,6 +344,10 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94862191
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -85,9 +86,27 @@ private[spark] class Instrumentation[E <: Estima
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94861627
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala ---
@@ -116,13 +116,17 @@ class TrainValidationSplit @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94860559
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -344,6 +344,10 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94861932
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -85,9 +86,27 @@ private[spark] class Instrumentation[E <: Estima
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94860863
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -344,6 +344,10 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94860281
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -344,6 +344,10 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94861991
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -85,9 +86,27 @@ private[spark] class Instrumentation[E <: Estima
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16434#discussion_r94867603
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/ChiSqSelectorSuite.scala ---
@@ -35,22 +35,63 @@ class ChiSqSelectorSuite extends SparkFunSuite
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16434
Thanks @mpjlu ! The changes look good, except that I'd like to have a code
snippet for verifying with R.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94871938
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -76,6 +77,18 @@ private[ml] trait ValidatorParams extends HasSeed
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94872466
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -339,11 +344,13 @@ final class OneVsRest @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16480#discussion_r94872199
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -76,6 +77,18 @@ private[ml] trait ValidatorParams extends HasSeed
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16480
ok to test
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16480
add to whitelist
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16480
LGTM pending Jenkins tests
Thanks @sueann !
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16355#discussion_r95001542
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala ---
@@ -160,6 +162,17 @@ object KMeansSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16355#discussion_r95001382
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala
---
@@ -29,9 +29,12 @@ class BisectingKMeansSuite
final
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16355#discussion_r95001534
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala
---
@@ -51,6 +54,23 @@ class BisectingKMeansSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16355#discussion_r95001517
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala
---
@@ -51,6 +54,23 @@ class BisectingKMeansSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r95031363
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -126,9 +143,104 @@ class GaussianMixtureSuite extends
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r95031388
--- Diff: python/pyspark/ml/clustering.py ---
@@ -95,15 +95,10 @@ class GaussianMixture(JavaEstimator, HasFeaturesCol,
HasPredictionCol, HasMaxIte
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r95031349
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -356,13 +427,243 @@ class GaussianMixture @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15413#discussion_r95031946
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala
---
@@ -126,9 +143,104 @@ class GaussianMixtureSuite extends
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15413
This LGTM
@sethah Any further comments before we merge it?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16480
Merging with master.
Not backporting unless people request it since this memory leak is very
minor.
Thanks @sueann !
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15413
OK, I'll just wait so @sethah can make a final pass and so @yanboliang can
merge the 2 tests.
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16491
Yikes, thanks for fixing this!
LGTM
Merging with master
I'll also try to merge it with branch-2.1, branch-2.0, branch-1.6 but will
say if I run into issues.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15018#discussion_r95094550
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala
---
@@ -328,74 +336,80 @@ class IsotonicRegression private
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/15018#discussion_r95094551
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala
---
@@ -328,74 +336,80 @@ class IsotonicRegression private
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16495
The key point @vlad17 made was that an operation which should be O(N) is
taking O(N^2) in the current implementation. Let's fix that, regardless of
whether or not we add a stress
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16494
Thanks for the patch. This sounds like it may be the same bug being
addressed in https://issues.apache.org/jira/browse/SPARK-14804 so I'll CC @tdas
If so, then I believe the b
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12064#discussion_r95858775
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala ---
@@ -130,6 +130,25 @@ class GaussianMixtureModel private[ml