Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17336
Thanks a lot for the second update! This LGTM
Merging with master
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17532#discussion_r109975956
--- Diff: python/pyspark/ml/linalg/__init__.py ---
@@ -72,7 +72,9 @@ def _convert_to_vector(l):
return DenseVector(l)
elif
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17532
Are you able to write a unit test which passes data through
_convert_to_vector and fails before this fix?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17532
Btw, I'd really like to get this into 2.2, which will be cut soon. Let me
know if you'd like me to take it over. Thanks!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17532#discussion_r110057924
--- Diff: python/pyspark/mllib/tests.py ---
@@ -853,6 +853,17 @@ def serialize(l):
self.assertEqual(sv, serialize(lil.tocsr
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17532
LGTM
Merging with master and branch-2.1, branch-2.0
Thanks a lot!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17494
LGTM if others are ok too
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
I'm OK with the current solution, though if it's easy to check using
`inspection` then that seems nice to do.
If there are cases in which the wrapper is still not thread-
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17048
Can you please close this manually? Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15770
Yep, that's correct. Everyone, please let me know if you disagree.
Also, if we do go with Option 2 above, then the input schema could be a few
possible things:
* list of (neighb
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
I'll take a look now, thanks!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103332885
--- Diff: docs/ml-features.md ---
@@ -576,7 +578,22 @@ will be generated:
2 | c| 1.0
-Notice that the row containing &q
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103332764
--- Diff: docs/ml-features.md ---
@@ -502,7 +502,7 @@ for more details on the API.
## StringIndexer
`StringIndexer` encodes a string
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103331444
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103331212
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103330268
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,18 +90,22 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103330303
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,18 +90,22 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103330242
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,18 +90,22 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103330093
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,25 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103332623
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -75,22 +75,32 @@ class StringIndexerSuite
intercept
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103332929
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,25 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103325211
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -17,14 +17,16 @@
package org.apache.spark.ml.feature
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103325403
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,25 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16811#discussion_r103338146
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -144,6 +144,31 @@ class Word2VecSuite extends SparkFunSuite with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16811#discussion_r103338261
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -144,6 +144,31 @@ class Word2VecSuite extends SparkFunSuite with
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/14273
Sorry about the delay here. Do you still have time to work on this?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16965
Github isn't handling the merge well, so you might try rebasing
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/14273
OK apologies @MechCoder for the delay. I guess we can close this issue,
and someone else can open up a PR based on yours.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/12762#discussion_r103352114
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -242,16 +263,19 @@ class ALSModel private[ml
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103354132
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -285,6 +286,55 @@ class ALSModel private[ml] (
@Since
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103353184
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -285,6 +285,43 @@ class ALSModel private[ml] (
@Since
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103353799
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/recommendation/TopByKeyAggregator.scala
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103352432
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -285,6 +285,43 @@ class ALSModel private[ml] (
@Since
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103350750
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -285,6 +285,43 @@ class ALSModel private[ml] (
@Since
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103351299
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -248,18 +248,18 @@ class ALSModel private[ml] (
@Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16715#discussion_r103357342
--- Diff: python/pyspark/ml/feature.py ---
@@ -120,6 +122,196 @@ def getThreshold(self):
return self.getOrDefault(self.threshold
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103396443
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -248,18 +248,18 @@ class ALSModel private[ml] (
@Since("
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@hhbyyh This is different from https://github.com/apache/spark/pull/12574
since it sidesteps the ongoing design discussions about input and output
schema. Eventually, I'd like us to pr
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17090#discussion_r103397271
--- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
---
@@ -285,6 +288,57 @@ class ALSModel private[ml] (
@Since
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
I'd been following the long discussions about a transform-based solution,
but those did not seem to have converged on a clear design. If you feel they
have in your PR, then I'll
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@MLnick Thanks for showing those comparison numbers. If your
implementation is faster, then I'm happy going with it. I do wonder if we
might hit scalability issues with RDDs which we woul
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@MLnick Thanks *a lot* for the detailed tests! I really appreciate it. In
this case, are you OK with the approach in the current PR (pending reviews)?
One thing we should confirm is
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17103
LGTM
Thanks!
Merging with master
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
> it leaves in place the static class variable for all other ML classes
that use the wrapper, and those classes continue to use the static class
variable.
I think this was discus
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15415
test this please
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15415
I'm going to go ahead and merge this after tests to make sure it's in 2.2,
but can you please send a follow-up for my last 2 comments? Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15415
No problem, thanks! Could you please create a subtask for docs?
Merging with master
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16784#discussion_r103582526
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -68,13 +77,21 @@ class LinearSVCSuite extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16784#discussion_r103582543
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -51,6 +54,12 @@ class LinearSVCSuite extends SparkFunSuite
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16784#discussion_r103582514
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LinearSVCSuite.scala ---
@@ -68,13 +77,21 @@ class LinearSVCSuite extends SparkFunSuite
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
Btw, we're near the time when the 2.2 branch will be cut, and I'd like to
get this into 2.2. Let me know if you're busy, and I'd be happy to help
finalize the PR. Thanks!
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/17110
[SPARK-19635][ML] DataFrame-based API for chi square test
## What changes were proposed in this pull request?
Wrapper taking and returning a DataFrame
## How was this patch tested
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
It's a good point about making an implicit decision. We could deprecate
these methods in favor of transform-based ones in the future---we have done
this in the past---but it does push the
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17110#discussion_r104220081
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/ChiSquare.scala ---
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17110#discussion_r104220074
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/ChiSquare.scala ---
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17110#discussion_r104220095
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/stat/ChiSquareSuite.scala ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17110
Actually, synced with @thunterdb and will update design doc to put
everything under a "Statistics" object. I'll wait until
https://github.com/apache/spark/pull/17108 gets merged
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
LGTM
Any other comments before we merge?
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
You're right about the test. I'll take a look now.
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
Clever unit test : )
LGTM
Merging with master
I'll try to backport it to branch-2.1 and branch-2.0 as well.
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
Well, it merged with master, but it will need some manual backports.
@BryanCutler Would you mind sending one for branch-2.1? I'm ambivalent about
2.0; your call (or anyone who's hit t
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/17165
[DO NOT MERGE][TESTING] Vince shieh spark 17498
Temp PR to reproduce Jenkins compilation error
You can merge this pull request into a Git repository by running:
$ git pull https
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296526
--- Diff: docs/ml-features.md ---
@@ -542,12 +543,13 @@ column, we should get the following:
"a" gets index `0` because it is the mos
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296075
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -105,7 +125,11 @@ class StringIndexer @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296045
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,27 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296396
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296546
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,18 +92,17 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296156
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -142,18 +166,18 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296099
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -105,7 +125,11 @@ class StringIndexer @Since("
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296562
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +187,28 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104296367
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel
Github user jkbradley closed the pull request at:
https://github.com/apache/spark/pull/17165
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16811#discussion_r104330610
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala ---
@@ -134,13 +134,20 @@ class Word2VecSuite extends SparkFunSuite with
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
I'll merge this with master now
Thanks @sueann and @MLnick for feedback. I'll prioritize helping with
your work on transform, metrics, and tuning for ALS next.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16715#discussion_r104331119
--- Diff: python/pyspark/ml/feature.py ---
@@ -120,6 +122,196 @@ def getThreshold(self):
return self.getOrDefault(self.threshold
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16623
Thanks for the comments. I definitely agree with many of your combined
statements:
* R has not been declared stable. (Though where in the docs is this even
stated? I was unable to find
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15435
+1 for moving implementation to traits, as long as the public methods are
still Java-friendly. (Methods which are implemented in traits often can't be
called from Java.)
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16784
Thanks for the updates. This LGTM pending the conflict resolution. Sorry
for the delay!
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489280
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -142,18 +167,17 @@ class StringIndexerModel
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489264
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,27 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489270
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -71,9 +92,8 @@ class StringIndexer @Since("1.4.0") (
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r104489244
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -34,8 +36,27 @@ import org.apache.spark.util.collection.OpenHashMap
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16784
LGTM
Thanks @wangmiao1981 !
Merging with master
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@MLnick OK I think I misunderstood some of your comments above then. I
see the proposal in SPARK-14409 differs from this PR, so I agree it'd be nice
to resolve it. We can make changes to
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
Thanks @MLnick for the explanation. This is what I'd understood from your
similar description on the JIRA, but definitely more in-depth. (It might be
good to copy to JIRA, or even a desig
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
LGTM
Merging with master
Thanks a lot!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
Btw, are you interested in updating the Python API too?
https://issues.apache.org/jira/browse/SPARK-19852
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16512
It would definitely be considered a new API, though I agree with you that
it's probably safe. That said, I'm not a fan of such changes in patch versions
unless they really are
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16739
I've commented elsewhere, but wanted to repeat it here to make more people
aware: Let's refrain from backporting new APIs into patch versions unless they
are really critical. We do n
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16883
Thanks!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16811
LGTM
Merging with master
Thank you!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16811
Thanks! I made a follow-up JIRA for updating the Python API:
https://issues.apache.org/jira/browse/SPARK-19866
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17193
LGTM
Merging with branch-2.1
Thank you!
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17195
LGTM
Merging with branch-2.0
Thank you again!
GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/17215
[MINOR][ML] Improve MLWriter overwrite error message
## What changes were proposed in this pull request?
Give proper syntax for Java and Python in addition to Scala.
## How was
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17108
Given further thought, I'd prefer we stick to the API specified in the
design doc, with a Correlations object instead of a generic Statistics object.
In the future, we may want optional P
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17110
I just reversed my opinion about a shared "Statistics" object. See
https://github.com/apache/spark/pull/17108#issuecomment-285200613 for details.
I pushed an update per y
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16002
@yanboliang Sorry for missing earlier discussion. I'm OK with declaring
defeat here, though I still disagree about using exceptions. I agree that
passing an obscure error code up is not
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17218
Thanks for the PR! I'll wait until this isn't "WIP" to review it
thoroughly, but I'll make two comments now:
* The params should not be added to shared.py since th
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/17130#discussion_r105702725
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -56,8 +56,8 @@ private[fpm] trait FPGrowthParams extends Params with