Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
Closing this since it has been inactive for quite a long time. Thanks all the same.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18998
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18998#discussion_r179903481
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18736
Closed, as #18998 is taking too long.
---
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18736
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
Closed since its duplicate PR #20632 has been merged.
---
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/17503
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18998#discussion_r172015693
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18998#discussion_r172015644
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18998#discussion_r171412547
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/19666
Thank you, @WeichenXu123 . You can also use the "include the first bin"
condition to filter the left splits. Perhaps it
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/19666
In fact, I'm not sure whether the idea is right, so don't hesitate to correct
me. I assume the algorithm requires O(N^2) complexity
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/19666
Hi, I wrote a demo in Python; I'll be happy if it is useful.
For N bins, say `[x_1, x_2, ..., x_N]`: since every split either contains
`x_1` or not, we can choose the half
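The halving idea can be sketched in plain Python (an illustration of the combinatorics only, not Spark's actual implementation; the function name is made up here):

```python
from itertools import combinations

def unordered_splits(bins):
    """All distinct binary splits of `bins`.

    A split is a nonempty proper subset of the bins, and a subset and its
    complement describe the same split. Enumerating only the subsets that
    contain the first bin x_1 therefore covers every split exactly once:
    2**(N-1) - 1 candidates instead of 2**N - 2.
    """
    first, rest = bins[0], bins[1:]
    splits = []
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            subset = {first, *combo}
            if len(subset) < len(bins):   # drop the full set: not a proper split
                splits.append(subset)
    return splits
```

For three bins `["a", "b", "c"]` this yields the 3 splits `{a}`, `{a, b}`, `{a, c}`, matching 2^(3-1) - 1.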
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/19666
I believe that unordered features will benefit a lot from the idea; however,
I have two questions:
1. I'm a little confused by 964L in `traverseUnorderedSplits`. Is it a
backtracking algorithm
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/19666#discussion_r149313427
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala ---
@@ -631,6 +614,42 @@ class RandomForestSuite extends SparkFunSuite
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
ping @yanboliang
---
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/17383
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Hi, since the work has been pending for a long time, I took on a review
myself.
After careful review: as SparseVector is a compressed sparse row format,
the only benefit of the PR would
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
Hi, @WeichenXu123.
As @srowen said, the benefit of this would be speed at predict time
or for model storage. Hence I'm not sure whether a benchmark is really needed
for the PR
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Sure, @WeichenXu123 , perhaps in one or two weeks; is that OK?
By the way, I think using a sparse representation can only reduce memory
usage, and it comes at the cost of compute performance
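As a rough illustration of that trade-off (a hedged sketch with a made-up helper, not Spark's SparseVector code): reading one feature from a sparse representation costs a binary search over the stored indices, versus O(1) array access for a dense vector.

```python
import bisect

def sparse_get(indices, values, i):
    """Read feature i from a sparse vector stored as parallel arrays of
    sorted indices and their values.

    Random access needs a binary search; a dense array needs none -- the
    memory-vs-compute trade-off discussed in the thread.
    """
    pos = bisect.bisect_left(indices, i)
    if pos < len(indices) and indices[pos] == i:
        return values[pos]
    return 0.0   # an absent index is an implicit zero
```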
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17383
Thank you for the comment.
Very good question; at least for me, the answer to both questions is no. In
most cases, we feed dense raw data into tree models. However, if large dimensions
are required
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
Hi, @yanboliang and @srowen . Thanks for your comments. For HashingTF, I
agree that it is necessary to migrate its implementation so that new methods
can be added easily.
Thanks, any
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18120
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
Hi, @yanboliang . Do you have time to take a look at first? Thanks very
much.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
#18736 depends on this PR. To keep the `setXXX` methods consistent
between Scala and Python, as @yanboliang suggested, it is better to migrate
the HashingTF implementation from mllib to ml
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
Hi, @srowen . Could you take a look at the PR? Thanks.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18998
cc @yanboliang @WeichenXu123 who I believe are interested in this PR. Could
you take a look please?
---
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18998
[SPARK-21748][ML] Migrate the implementation of HashingTF from MLlib to ML
## What changes were proposed in this pull request?
Migrate the implementation of HashingTF from MLlib to ML
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18736
Sure, @yanboliang . Thanks for your suggestion. I'll work on it later,
perhaps next week. Is that OK?
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18736
@yanboliang Hi, Yanbo. Could you help review the PR? Thanks.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r132618802
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -80,20 +82,31 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18736#discussion_r132131171
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala
---
@@ -90,10 +92,22 @@ class HashingTF @Since("1.4.0") (@Si
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18763
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18763
Thanks, all.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Sure, thanks, @yanboliang !
---
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/18764
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Thanks, @yanboliang @gatorsmile
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
@SparkQA Take a test, please.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r131529768
--- Diff: python/pyspark/ml/classification.py ---
@@ -1423,7 +1425,18 @@ def _fit(self, dataset):
numClasses = int(dataset.agg({labelCol
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r131529693
--- Diff: python/pyspark/ml/classification.py ---
@@ -1344,7 +1346,19 @@ def _fit(self, dataset):
numClasses = int(dataset.agg({labelCol
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
@yanboliang Thanks, Yanbo. I am not familiar with Python 2.6, which is too
outdated.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Test failures in pyspark.ml.tests with Python 2.6, but I don't have that
environment.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Jenkins, test this please.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18764
Thanks, @yanboliang . Could you give a hand, @srowen ?
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130213337
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -158,7 +158,7 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130202540
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -158,7 +158,7 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18763#discussion_r130200461
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -157,6 +157,16 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r130200379
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -143,6 +144,16 @@ class OneVsRestSuite extends SparkFunSuite
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18764#discussion_r130200288
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -33,6 +33,7 @@ import
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18764
[SPARK-21306][ML] For branch 2.0, OneVsRest should support setWeightCol
The PR is related to #18554, and is modified for branch 2.0.
## What changes were proposed in this pull request
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18763
[SPARK-21306][ML] OneVsRest should support setWeightCol for branch-2.1
The PR is related to #18554, and is modified for branch 2.1.
## What changes were proposed in this pull request
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r129562237
--- Diff: python/pyspark/ml/tests.py ---
@@ -1255,6 +1255,24 @@ def test_output_columns(self):
output = model.transform(df
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r129562189
--- Diff: python/pyspark/ml/classification.py ---
@@ -1517,20 +1517,22 @@ class OneVsRest(Estimator, OneVsRestParams,
MLReadable, MLWritable
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
ping @holdenk @yanboliang
---
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18736
[SPARK-21481][ML] Add indexOf method for ml.feature.HashingTF
## What changes were proposed in this pull request?
Add indexOf method for ml.feature.HashingTF.
The PR is a hotfix
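For context, the term-to-index hashing that an `indexOf` method exposes can be sketched as follows (a hedged illustration: `hash(term)` here is Python's builtin, whereas the real HashingTF has its own hashing scheme, and the function name is made up):

```python
def index_of(term, num_features):
    """Map a term to a column index in [0, num_features) by hashing.

    Python's % operator already yields a non-negative result for a positive
    modulus, so no extra sign fix-up is needed here.
    """
    return hash(term) % num_features
```

Exposing this lets users ask which column a given term was hashed to, instead of only transforming whole documents.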
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r128158473
--- Diff: python/pyspark/ml/tests.py ---
@@ -1255,6 +1255,17 @@ def test_output_columns(self):
output = model.transform(df
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127972263
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -598,8 +598,23 @@ class LogisticRegression @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127874833
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/optim/loss/DifferentiableRegularization.scala
---
@@ -32,40 +34,45 @@ private[ml] trait
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18305#discussion_r127873828
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
---
@@ -598,8 +598,23 @@ class LogisticRegression @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18554#discussion_r126863072
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -317,7 +318,12 @@ final class OneVsRest @Since("
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126646511
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -36,7 +36,8 @@ import org.apache.spark.util.collection.OpenHashMap
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126645714
--- Diff: python/pyspark/ml/feature.py ---
@@ -3058,26 +3035,37 @@ class RFormula(JavaEstimator, HasFeaturesCol,
HasLabelCol, JavaMLReadable, JavaM
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126643882
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -460,16 +460,16 @@ object LinearRegression extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18582#discussion_r126642928
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -36,7 +36,8 @@ import org.apache.spark.sql.types.{DoubleType
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
@srowen @yanboliang Could you help review the PR? Thanks.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
I'm not familiar with R; I used grep to search for "OneVsRest" and got
nothing. Hence it seems nothing needs to be done on the R side.
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
@SparkQA test again, please.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126050849
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18554
@lins05 Thanks, that's a reasonable suggestion; I will fix it later.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126026388
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18556#discussion_r126023986
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -89,18 +93,17 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125860650
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,15 @@ class VectorAssembler @Since("1.4.0"
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18554
[SPARK-21306][ML] OneVsRest should cache weightCol if necessary
## What changes were proposed in this pull request?
cache weightCol if classifier inherits HasWeightCol trait
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
@SparkQA Jenkins, run tests again, please.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125763918
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,15 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
I don't know how to write a unit test for the PR. Is it necessary?
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18523
Good idea!
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125584572
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
GitHub user facaiy reopened a pull request:
https://github.com/apache/spark/pull/17383
[SPARK-3165][MLlib][WIP] DecisionTree does not use sparsity in data
## What changes were proposed in this pull request?
DecisionTree should take advantage of sparse feature vectors
Github user facaiy closed the pull request at:
https://github.com/apache/spark/pull/17383
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125539040
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/17503
@jkbradley Do you have time to review the PR? I believe it will be a
small improvement for prediction. Thanks.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18523#discussion_r125398010
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -113,12 +113,12 @@ class VectorAssembler @Since("1.4.0"
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18523
[SPARK-21285][ML] VectorAssembler reports the column name of unsupported
data type
## What changes were proposed in this pull request?
Add the column name to the exception that is raised
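The gist of the change can be sketched like this (a hypothetical helper, not the actual VectorAssembler code; the supported-type set below is purely illustrative):

```python
SUPPORTED_TYPES = {"double", "boolean", "vector"}   # illustrative, not Spark's list

def check_input_column(name, data_type):
    """Fail fast with the offending column's NAME in the message, so users
    of wide datasets don't have to hunt for which column has the bad type."""
    if data_type not in SUPPORTED_TYPES:
        raise ValueError(
            f"Data type {data_type} of column {name} is not supported.")
```

With this, the error for a bad column reads "Data type string of column age is not supported." rather than naming only the type.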
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18288
Yes. Example code:
```scala
val df = spark.read.format("libsvm")
  .option("numFeatures", "780")
  .load("data/mllib/sample_libsvm_data.txt")
```
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18288
You might be mistaken. The aim of the code here is to encourage the user to
specify `numFeatures` in any case, rather than to encourage the user to use
only one file.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18288#discussion_r123474003
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -91,12 +91,10 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18288
In my opinion, `numFeatures` is vital for sparse data.
Say our feature space is really 1000-dimensional, while in a small training
set the maximum index seen is 990. It is dangerous (or wrong) to train a
990-dimensional model.
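A tiny sketch of the danger (hypothetical helper, not Spark code): inferring the dimensionality from the maximum index seen in a sample silently underestimates the true feature space whenever the highest-indexed features happen to be absent.

```python
def inferred_num_features(indices):
    """Infer the dimension from data: max 0-based index seen, plus one."""
    return max(indices) + 1

# Suppose the true space has 1000 features, but this small sample never shows
# an index >= 990. The inferred dimension is then 990 -- wrong for later data
# that contains features 990..999. Specifying numFeatures avoids the trap.
sample_indices = [3, 17, 989]
```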
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18288#discussion_r122909919
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -91,12 +91,10 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18288#discussion_r122908140
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/source/libsvm/LibSVMRelation.scala ---
@@ -91,12 +91,10 @@ private[libsvm] class LibSVMFileFormat extends
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18139#discussion_r119346663
--- Diff: python/pyspark/sql/types.py ---
@@ -187,8 +187,11 @@ def needConversion(self):
def toInternal(self, dt):
if dt
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18139#discussion_r119263608
--- Diff: python/pyspark/sql/types.py ---
@@ -187,8 +187,11 @@ def needConversion(self):
def toInternal(self, dt):
if dt
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18120
Thanks, @BryanCutler.
It seems that #17849 copies `Params` from `Estimator` to `Model`
automatically, which is pretty useful. However, the getter methods are still
missing and need to be added
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18120
Hi, @keypointt . It's a feature of Python: a doctest is both documentation
and a unit test.
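A doctest serving as both documentation and test, as noted above (an illustrative example, not the PR's actual code; the function name is made up):

```python
def get_max_depth_example(depth=5):
    """Return the configured maximum tree depth.

    The interactive session below is executable documentation:

    >>> get_max_depth_example()
    5
    >>> get_max_depth_example(3)
    3
    """
    return depth

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs the >>> examples above as tests
```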
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18120
@keypointt Hi, could you help check that the PR is consistent with your #17207?
Thanks.
---
GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/18120
[SPARK-20498][PYSPARK][ML] Expose getMaxDepth for ensemble tree model in
PySpark
## What changes were proposed in this pull request?
add `getMaxDepth` method for ensemble tree models
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18058
Resolved.
By the way,
Which one is preferable, rebase or merge?
---
Github user facaiy commented on the issue:
https://github.com/apache/spark/pull/18058
Hi, I'm not familiar with pyspark. I just wonder whether a unit test is
needed for verification. If yes, how should it be checked? Thanks.
---
Github user facaiy commented on a diff in the pull request:
https://github.com/apache/spark/pull/18058#discussion_r118416434
--- Diff: python/pyspark/ml/fpm.py ---
@@ -49,6 +49,32 @@ def getMinSupport(self):
return self.getOrDefault(self.minSupport