Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107801822
--- Diff: python/pyspark/ml/tests.py ---
@@ -1243,6 +1245,43 @@ def test_tweedie_distribution(self):
self.assertTrue(np.isclose(model2
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
@hhbyyh OK that seems reasonable; I could see us adding support for
multiple items in the future as well. Thanks for confirming!
---
If your project is set up for it, you can reply
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17130
Noting here: Please check out the "Issue this PR brought up" here:
https://github.com/apache/spark/pull/17218
It may affect this PR. Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17108
LGTM will merge after tests
Thanks!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107773640
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107774888
--- Diff: python/pyspark/ml/tests.py ---
@@ -1243,6 +1244,45 @@ def test_tweedie_distribution(self):
self.assertTrue(np.isclose(model2
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107773828
--- Diff: python/pyspark/ml/tests.py ---
@@ -60,6 +60,7 @@
from pyspark.ml.regression import LinearRegression, DecisionTreeRegressor
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107770880
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107774188
--- Diff: python/pyspark/ml/tests.py ---
@@ -1243,6 +1244,45 @@ def test_tweedie_distribution(self):
self.assertTrue(np.isclose(model2
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
Issue this PR brought up:
* Background: AssociationRules currently return a 1-element array for the
consequent (predicted item). This makes sense b/c, even though multiple
consequents could
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757886
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757685
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107761241
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107759052
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107758576
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107758565
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107758054
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757520
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757555
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757391
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17218#discussion_r107757033
--- Diff: python/pyspark/ml/fpm.py ---
@@ -0,0 +1,211 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
Sure, I can take a look. Let me ping @mlnick too since he marked himself
as shepherd
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17130
I'll be happy to help get this merged now that the column renaming is done
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16722
Hi all, I can try to track this work now.
> This patch maintains the meaning of minInstancesPerNode, in that the
parameter still corresponds to raw, unweighted counts. It also adds a
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17108
LGTM except for the one doc nit.
When you update this, could you also please make and link JIRAs for the
Python wrapper and doc update?
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107718109
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlation.scala
---
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107717895
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107717637
--- Diff:
mllib-local/src/test/scala/org/apache/spark/ml/util/TestingUtils.scala ---
@@ -32,6 +32,10 @@ object TestingUtils {
* the relative
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107284141
--- Diff: mllib/src/test/scala/org/apache/spark/ml/util/LinalgUtils.scala
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107283647
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107074556
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107074472
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107075473
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17108#discussion_r107283840
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Correlations.scala
---
@@ -0,0 +1,88 @@
+/*
+ * Licensed to the Apache Software
est, distribution, or what. This PR renames it to
ChiSquareTest to clarify this.
## How was this patch tested?
Existing unit tests
Author: Joseph K. Bradley <jos...@databricks.com>
Closes #17368 from jkbradley/SPARK-20039.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: h
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17368
Yep, thanks for confirming that @srowen and checking it out @imatiach-msft
and @MLnick !
Merging with master
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17368
CC @thunterdb @imatiach-msft What do you think?
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/17368
[SPARK-20039][ML] rename ChiSquare to ChiSquareTest
## What changes were proposed in this pull request?
I realized that since ChiSquare is in the package stat, it's pretty unclear
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17108
Taking a look now
Repository: spark
Updated Branches:
refs/heads/master fc7554599 -> bec6b16c1
[SPARK-19899][ML] Replace featuresCol with itemsCol in ml.fpm.FPGrowth
## What changes were proposed in this pull request?
Replaces `featuresCol` `Param` with `itemsCol`. See
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17321
LGTM
Thanks for the PR!
Merging with master
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17321#discussion_r106970923
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -37,7 +37,20 @@ import org.apache.spark.sql.types._
/**
* Common
sed API
Author: Joseph K. Bradley <jos...@databricks.com>
Closes #17110 from jkbradley/df-hypotests.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4c320054
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4c32
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17110
OK merging with master
Thanks @imatiach-msft and @thunterdb !
@imatiach-msft I agree about sparse testing. This has all of the MLlib
tests, but we should add more in the future
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17237
Thanks for the PR! I just merged the fix for
https://issues.apache.org/jira/browse/SPARK-11569 which will affect this PR.
Would you mind updating this PR to include SPARK-11569's handling
Repository: spark
Updated Branches:
refs/heads/master d4a637cd4 -> 85941ecf2
[SPARK-11569][ML] Fix StringIndexer to handle null value properly
## What changes were proposed in this pull request?
This PR is to enhance StringIndexer with NULL values handling.
Before the PR, StringIndexer will
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17233
LGTM
Merging with master
Thanks for the improvement!
Repository: spark
Updated Branches:
refs/heads/master 5e96a57b2 -> d4a637cd4
[SPARK-19940][ML][MINOR] FPGrowthModel.transform should skip duplicated items
## What changes were proposed in this pull request?
This commit moved `distinct` in its intended place to avoid duplicated
predictions
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17283
Thanks for fixing this issue! LGTM
Merging with master
Stating the JIRA number for a bug fix is reasonable, though it's most
useful if the bug appears in an actual release
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17110
Ping @imatiach-msft any more comments after the update?
Author: Joseph K. Bradley <jos...@databricks.com>
Closes #17215 from jkbradley/write-err-msg.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/72c66dbb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/72c66dbb
Diff: h
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17215
Thanks! Merging with master
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
True, if minSupport can be shared, then that's OK. confidence won't be
shared though.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r10571
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -122,6 +122,86 @@ class StringIndexerSuite
assert
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105706598
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -39,20 +39,21 @@ import
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105719023
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -122,6 +122,86 @@ class StringIndexerSuite
assert
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105719645
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -122,6 +122,86 @@ class StringIndexerSuite
assert
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105721463
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -188,35 +189,45 @@ class StringIndexerModel
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17233
I'll take a look
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17130
The updated transform looks good; thanks for pinging!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17130#discussion_r105702725
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -56,8 +56,8 @@ private[fpm] trait FPGrowthParams extends Params
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
Thanks for the PR! I'll wait until this isn't "WIP" to review it
thoroughly, but I'll make two comments now:
* The params should not be added to shared.py since they are not shar
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16002
@yanboliang Sorry for missing earlier discussion. I'm OK with declaring
defeat here, though I still disagree about using exceptions. I agree that
passing an obscure error code up is not ideal
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17110
I just reversed my opinion about a shared "Statistics" object. See
https://github.com/apache/spark/pull/17108#issuecomment-285200613 for details.
I pushed an update per y
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17108
Given further thought, I'd prefer we stick to the API specified in the
design doc, with a Correlations object instead of a generic Statistics object.
In the future, we may want optional Params
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/17215
[MINOR][ML] Improve MLWriter overwrite error message
## What changes were proposed in this pull request?
Give proper syntax for Java and Python in addition to Scala.
## How
Repository: spark
Updated Branches:
refs/heads/branch-2.0 0cc992c89 -> e69902806
[SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe
## What changes were proposed in this pull request?
The `keyword_only` decorator in PySpark is not thread-safe. It writes kwargs
to a
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17195
LGTM
Merging with branch-2.0
Thank you again!
Repository: spark
Updated Branches:
refs/heads/branch-2.1 3b648a626 -> 0ba9ecbea
[SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe
## What changes were proposed in this pull request?
The `keyword_only` decorator in PySpark is not thread-safe. It writes kwargs
to a
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17193
LGTM
Merging with branch-2.1
Thank you!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16811
Thanks! I made a follow-up JIRA for updating the Python API:
https://issues.apache.org/jira/browse/SPARK-19866
Repository: spark
Updated Branches:
refs/heads/master d8830c503 -> 56e1bd337
[SPARK-17629][ML] methods to return synonyms directly
## What changes were proposed in this pull request?
provide methods to return synonyms directly, without wrapping them in a
dataframe
In performance sensitive
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16811
LGTM
Merging with master
Thank you!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16739
I've commented elsewhere, but wanted to here just to make more people
aware: Let's refrain from backporting new APIs into patch versions unless they
are really critical. We do not do
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16512
It would definitely be considered a new API, though I agree with you that
it's probably safe. That said, I'm not a fan of such changes in patch versions
unless they really are necessary
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
Btw, are you interested in updating the Python API too?
https://issues.apache.org/jira/browse/SPARK-19852
Repository: spark
Updated Branches:
refs/heads/master c05baabf1 -> 4a9034b17
[SPARK-17498][ML] StringIndexer enhancement for handling unseen labels
## What changes were proposed in this pull request?
This PR is an enhancement to ML StringIndexer.
Before this PR, String Indexer only supports
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
LGTM
Merging with master
Thanks a lot!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
Thanks @MLnick for the explanation. This is what I'd understood from your
similar description on the JIRA, but definitely more in-depth. (It might be
good to copy to JIRA, or even a design doc
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@MLnick OK I think I misunderstood some of your comments above then. I
see the proposal in SPARK-14409 differs from this PR, so I agree it'd be nice
to resolve it. We can make changes
Repository: spark
Updated Branches:
refs/heads/master 9991c2dad -> 926543664
[SPARK-19382][ML] Test sparse vectors in LinearSVCSuite
## What changes were proposed in this pull request?
Add unit tests for testing SparseVector.
We can't add mixed DenseVector and SparseVector test case, as
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16784
LGTM
Thanks @wangmiao1981 !
Merging with master
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489270
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,9 +92,8 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489264
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,27 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489244
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,27 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489280
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -142,18 +167,17 @@ class StringIndexerModel
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16784
Thanks for the updates. This LGTM pending the conflict resolution. Sorry
for the delay!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15435
+1 for moving implementation to traits, as long as the public methods are
still Java-friendly. (Methods which are implemented in traits often can't be
called from Java.)
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16623
Thanks for the comments. I definitely agree with many of your combined
statements:
* R has not been declared stable. (Though where in the docs is this even
stated? I was unable to find
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16715#discussion_r104331119
--- Diff: python/pyspark/ml/feature.py ---
@@ -120,6 +122,196 @@ def getThreshold(self):
return self.getOrDefault(self.threshold
Repository: spark
Updated Branches:
refs/heads/master 369a148e5 -> 70f9d7f71
[SPARK-19535][ML] RecommendForAllUsers RecommendForAllItems for ALS on Dataframe
## What changes were proposed in this pull request?
This is a simple implementation of RecommendForAllUsers & RecommendForAllItems
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
I'll merge this with master now
Thanks @sueann and @MLnick for feedback. I'll prioritize helping with
your work on transform, metrics, and tuning for ALS next.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16811#discussion_r104330610
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -134,13 +134,20 @@ class Word2VecSuite extends SparkFunSuite
Github user jkbradley closed the pull request at:
https://github.com/apache/spark/pull/17165
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296156
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -142,18 +166,18 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296099
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -105,7 +125,11 @@ class StringIndexer @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296562
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +187,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296367
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296546
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,18 +92,17 @@ class StringIndexer @Since("1.4.0") (