Github user WeichenXu123 closed the pull request at:
https://github.com/apache/spark/pull/16574
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/16574
I need to survey better Cartesian implementations, especially shuffle-based
ones. I am closing this PR for now and will reopen it when the new solution
is done.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/16576
@mridulm code updated. thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15730
@brkyvz I updated the code and attached a screenshot of a run result. Waiting
for your review, thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah Oh, thanks very much! I will update the code immediately. As I
submitted this PR, I will work on it as my first priority. Happy new year!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15730#discussion_r96131333
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala
---
@@ -459,14 +464,155 @@ class BlockMatrix @Since("
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
Jenkins, test this please
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/17373
[SPARK-12664] Expose probability in mlp model
## What changes were proposed in this pull request?
Modify the MLP model to inherit `ProbabilisticClassificationModel` so that
it can
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah Don't worry, I will update the code ASAP and @yanboliang will also
help review it. Thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15435#discussion_r107326057
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala
---
@@ -1786,51 +1793,98 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah Code updated.
cc @yanboliang Pls help boost this PR thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15435#discussion_r107325971
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala
---
@@ -194,15 +207,9 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah Thanks! I have merged your updates and fixed the MiMa file conflicts.
@yanboliang has just come back from a trip and will help review and merge it
into 2.2 so don't worry about
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17706
Jenkins test this please
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
@nicodri Hi, I am modifying this PR and will commit this week! Thanks!
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/17706
fix MLOR coeffs centering when reg == 0
## What changes were proposed in this pull request?
When reg == 0, MLOR has multiple solutions and we need to center the
coeffs to get
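The snippet above is cut off, so the following is only a minimal sketch of why unregularized multinomial logistic regression (MLOR) has multiple solutions and what centering does. It is plain Python rather than the PR's Scala code, and the helper names `softmax`, `class_probs`, and `center` as well as the sample numbers are hypothetical. Subtracting the same vector from every class's coefficients shifts all class scores by the same amount, so the softmax probabilities do not change; centering picks the representative whose per-feature mean across classes is zero.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities (max-shifted for stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(coeffs, x):
    """coeffs holds one coefficient row per class; x is one feature vector."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in coeffs]
    return softmax(scores)

def center(coeffs):
    """Subtract, per feature, the mean coefficient taken across classes."""
    num_classes = len(coeffs)
    num_features = len(coeffs[0])
    means = [sum(row[j] for row in coeffs) / num_classes
             for j in range(num_features)]
    return [[w - m for w, m in zip(row, means)] for row in coeffs]

# Hypothetical 3-class, 2-feature model and a single input.
coeffs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
x = [0.7, -1.3]

centered = center(coeffs)
original_probs = class_probs(coeffs, x)
centered_probs = class_probs(centered, x)
print(original_probs)
print(centered_probs)
```

Both printed lists agree up to floating-point rounding, which is why an optimizer without regularization has no reason to prefer one of these coefficient sets over the other.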
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
cc @yanboliang thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
Done. cc @sethah @jkbradley thanks!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
jenkins, test please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah OK, no problem! I can move the method implementations into the trait.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
@sethah I think `truePositiveRate` is equivalent to `recall`, so I directly
implemented another one in the trait, but if you don't like this approach I
can modify it.
Is there anything else need
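As a small illustration of the equivalence mentioned in this comment, the sketch below computes recall as TP / (TP + FN) and the true positive rate as the fraction of actual positives that were predicted positive. It is plain Python, not the Spark summary API; the function names and the sample labels are hypothetical.

```python
def recall(labels, predictions, positive):
    """Recall for one class: TP / (TP + FN)."""
    tp = sum(1 for y, p in zip(labels, predictions)
             if y == positive and p == positive)
    fn = sum(1 for y, p in zip(labels, predictions)
             if y == positive and p != positive)
    return tp / (tp + fn)

def true_positive_rate(labels, predictions, positive):
    """Fraction of the actual positives that were predicted positive."""
    preds_for_positives = [p for y, p in zip(labels, predictions)
                           if y == positive]
    hits = sum(1 for p in preds_for_positives if p == positive)
    return hits / len(preds_for_positives)

# Hypothetical binary labels and predictions.
labels = [1, 1, 1, 0, 0, 1]
predictions = [1, 0, 1, 0, 1, 1]

r = recall(labels, predictions, positive=1)
tpr = true_positive_rate(labels, predictions, positive=1)
print(r, tpr)
```

The two functions count the same quantities in a different order, so they return the same value for any class that has at least one actual positive.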
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18797
@srowen Yeah, the third case is another problem (I think we can simply
change the iteration number from 7 to 6 in the test case)
I am curious about the first two cases, why trigger the require fail
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18313
@jkbradley
I think this is simple.
When the persist model list param is `false`, we just keep the code logic the
same and **it won't increase the memory cost** (this is the default case
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17849
Thanks for your work on this, but I am curious what the benefit of doing
this is. In PySpark there is currently no param in Model itself; what
problems or bugs would it resolve after adding
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r130746993
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130745275
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r130746893
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18798
@thunterdb
1) Deserializing the DataFrame from binary data adds overhead (there may or
may not be compaction, depending on the data type, cc @liancheng), about
1x performance in my test
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18798
performance data attached. cc @thunterdb @jkbradley
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r130747756
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18797
Thanks! Waiting for the AFT test code author to figure out how to modify the
test case.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18896
@MLnick That's because this bug is triggered only when we standardize the
features first and then train...
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
Jenkins, test this please.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18896
@MLnick Yes, it is always trained in scaled space. But the test case you
mentioned does not take the "scale" step, so it does not trigger the bug...
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18798
@yanboliang I will update ASAP, thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r133121659
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r133119397
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18872#discussion_r133081995
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala
---
@@ -109,14 +112,15 @@ class LibSVMRelationSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18872#discussion_r133081682
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala
---
@@ -109,14 +112,15 @@ class LibSVMRelationSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18872#discussion_r133082255
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/source/libsvm/LibSVMRelationSuite.scala
---
@@ -109,14 +112,15 @@ class LibSVMRelationSuite
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r133080543
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -80,20 +82,31 @@ class HashingTF @Since("1.4.0") (@Si
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r133080201
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/HashingTFSuite.scala ---
@@ -69,6 +69,20 @@ class HashingTFSuite extends SparkFunSuite
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18798
@viirya Sure! comment updated.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
cc @jkbradley Code updated, thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18798#discussion_r133191237
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,593 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15435#discussion_r133636062
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -882,21 +882,28 @@ class LogisticRegression @Since
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18281
@ajaysaini725 @jkbradley Can we avoid re-implementing the logic of OneVsRest
on the Python side? I think it can simplify the Python-side code. Just let
the wrapper inherit `JavaEstimator`, and when we
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
@LeoIV Sorry for the delay! I will update the code soon!
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
The probability should always be between 0 and 1.
Send me your test code and test data to help me find out what is wrong.
In my own test the result is OK.
Sent from my iPhone
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
RawPrediction is not a probability.
Its range is from -inf to inf.
Softmax(raw predictions) gives the probabilities.
Their range is from 0 to 1.
Thanks!
Sent from my iPhone
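To make the distinction in this reply concrete, here is a minimal sketch in plain Python, not the Spark MLP implementation. The `softmax` helper and the sample raw scores are hypothetical. Raw predictions can be any real numbers, while the softmax of those scores always lies between 0 and 1 and sums to 1.

```python
import math

def softmax(raw):
    """Map raw scores (any real numbers) to probabilities that sum to 1."""
    shift = max(raw)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in raw]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw predictions for three classes.
raw_prediction = [-3.2, 0.5, 21.0]
probabilities = softmax(raw_prediction)
print(probabilities)
```

With scores this far apart, the smaller probabilities are tiny positive numbers rather than exact zeros, and the largest one is close to 1.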
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/16571
This PR is very similar to my earlier PR. Is that right? @jkbradley #14950
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
They are. I think some values are something like 4.7532244532E-10 and the
display truncates them.
Thanks
Sent from my iPhone
On 15 Jul 2017, at 12:35 AM, Leonard Hövelmann
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17419
As the DataFrame version is much slower than the RDD version (currently
tested against vectors of size 1),
I also suspect there is some performance issue in
`ObjectAggregationIterator.processInput
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17419#discussion_r128429389
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/stat/SummarizerSuite.scala ---
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17419#discussion_r128428434
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,746 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17419#discussion_r128429254
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,799 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17419#discussion_r128428604
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,746 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r128648171
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -271,7 +273,7 @@ object OneVsRestModel extends
MLReadable
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
cc @jkbradley I think it's OK now.
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
jenkins, test this please.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18313#discussion_r128684617
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -113,15 +122,28 @@ class CrossValidator @Since("1.2.0"
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r128645326
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -337,8 +353,13 @@ final class OneVsRest @Since("
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r128815421
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -271,7 +273,7 @@ object OneVsRestModel extends
MLReadable
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18313#discussion_r12283
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -113,15 +122,28 @@ class CrossValidator @Since("1.2.0"
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18610#discussion_r129413407
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -309,6 +313,23 @@ private[ml] object DefaultParamsWriter {
val
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18313#discussion_r129122995
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala ---
@@ -113,15 +122,28 @@ class CrossValidator @Since("1.2.0"
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17419#discussion_r129173570
--- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
@@ -0,0 +1,799 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18610#discussion_r130015435
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -309,6 +313,23 @@ private[ml] object DefaultParamsWriter {
val
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18610#discussion_r129652515
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala ---
@@ -309,6 +313,23 @@ private[ml] object DefaultParamsWriter {
val
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/9183
@yanboliang I will take over this feature and create a new PR soon.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17373#discussion_r129698281
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -527,9 +544,21 @@ private[ml] class FeedForwardModel private
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17373#discussion_r129697890
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -527,9 +544,21 @@ private[ml] class FeedForwardModel private
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17373#discussion_r129697649
--- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala ---
@@ -463,7 +479,7 @@ private[ml] class FeedForwardModel private(
private
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
cc @yanboliang @jkbradley
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17373
cc @jkbradley @yanboliang
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15435#discussion_r114043519
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -1231,6 +1295,109 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
The `update v7` commit fixes the previous `LogisticRegressionSuite` conflicts,
and the `fix nits` commit addresses some nits.
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/15435#discussion_r113867765
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -1231,6 +1295,109 @@ class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/18797
@srowen Great! thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17583#discussion_r132052106
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/FuncTransformer.scala ---
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r132058692
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -90,10 +92,22 @@ class HashingTF @Since("1.4.0") (@Si
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17583#discussion_r132058898
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/FuncTransformer.scala ---
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/15435
cc @jkbradley @MrBago thanks!
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17373#discussion_r131824713
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifierSuite.scala
---
@@ -82,6 +83,23 @@ class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17894#discussion_r132069046
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -1722,25 +1723,22 @@ private class
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/17894
I am also interested in an implementation using level-3 BLAS. Can you post a
design doc first?
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/17894#discussion_r132068663
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -1722,25 +1723,22 @@ private class
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1#discussion_r132072527
--- Diff: python/pyspark/ml/pipeline.py ---
@@ -242,3 +327,65 @@ def _to_java(self):
JavaParams._new_java_obj
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1#discussion_r132070100
--- Diff: python/pyspark/ml/pipeline.py ---
@@ -204,13 +282,20 @@ def copy(self, extra=None):
@since("2.0.0")
def
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/16774
@BryanCutler You are right. Once the `Future` completes, the model can be
cleaned up by GC. So the memory cost of the code has already been optimized.
I didn't look at the code carefully a few days ago
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/18797
[SPARK-21523] update breeze to 0.13.1 for an emergency bugfix in strong
wolfe line search
## What changes were proposed in this pull request?
Update breeze to 0.13.1 for an emergency
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18746#discussion_r130518773
--- Diff: python/pyspark/ml/base.py ---
@@ -116,3 +121,44 @@ class Model(Transformer):
"""
__metacl
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18746#discussion_r130518147
--- Diff: python/pyspark/ml/base.py ---
@@ -116,3 +121,44 @@ class Model(Transformer):
"""
__metacl
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130520335
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130522066
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130521964
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130523538
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130519716
--- Diff: python/pyspark/ml/param/__init__.py ---
@@ -375,6 +375,18 @@ def copy(self, extra=None):
that._defaultParamMap
Github user WeichenXu123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18742#discussion_r130521214
--- Diff: python/pyspark/ml/util.py ---
@@ -283,3 +289,124 @@ def numFeatures(self):
Returns the number of features the model was trained
GitHub user WeichenXu123 opened a pull request:
https://github.com/apache/spark/pull/18798
[SPARK-19634][ML] Multivariate summarizer - dataframes API
## What changes were proposed in this pull request?
This patch adds the DataFrames API to the multivariate summarizer (mean