Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/21465#discussion_r194143768
--- Diff: python/pyspark/ml/classification.py ---
@@ -1251,26 +1256,33 @@ class GBTClassifier(JavaEstimator, HasFeaturesCol,
HasLabelCol, HasPredictionCol
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20629
Right - so while it's perhaps a lower quality metric it is different. So I
wonder if deprecation is the right approach (vs say putting the within-cluster
sum of squares into ClusteringEvaluator
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20629
Sorry, I mean putting the metric in the evaluator and then also deprecating
computeCost
On Sun, 18 Feb 2018 at 20:41, Nick Pentreath
wrote:
> Right - so while it's perhaps a lo
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20629
Just want to check - does `computeCost` do the same thing as the silhouette
metric?
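
For readers following this thread: a related comment describes the quantity in question as the within-cluster sum of squares, which is not the same measure as the silhouette. The sketch below is plain Python, not Spark code. It implements the two textbook definitions on toy data; the helper names (`within_cluster_sse`, `mean_silhouette`) and the data are illustrative assumptions, and ordinary Euclidean distance is assumed for the silhouette.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def within_cluster_sse(clusters):
    """Sum of squared distances from every point to its own cluster centroid."""
    total = 0.0
    for pts in clusters:
        c = centroid(pts)
        total += sum(sq_dist(p, c) for p in pts)
    return total


def mean_silhouette(clusters):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    scores = []
    for i, pts in enumerate(clusters):
        for k, p in enumerate(pts):
            if len(pts) == 1:
                # Common convention: a singleton cluster scores 0.
                scores.append(0.0)
                continue
            # a: mean distance to the other points in the same cluster.
            a = sum(
                math.sqrt(sq_dist(p, q)) for m, q in enumerate(pts) if m != k
            ) / (len(pts) - 1)
            # b: smallest mean distance to the points of any other cluster.
            b = min(
                sum(math.sqrt(sq_dist(p, q)) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated clusters, then the same data doubled.
clusters = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
scaled = [[(2 * x, 2 * y) for x, y in pts] for pts in clusters]

print(within_cluster_sse(clusters), within_cluster_sse(scaled))
print(mean_silhouette(clusters), mean_silhouette(scaled))
```

On this data the sum of squares grows when the coordinates are rescaled, while the silhouette stays the same. That is one way to see that the two metrics answer different questions, whichever one is preferred.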
---
-
To unsubscribe, e-mail: reviews
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165575680
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/SummarizerExample.scala ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165568368
--- Diff: docs/ml-statistics.md ---
@@ -89,4 +89,26 @@ Refer to the [`ChiSquareTest` Python
docs](api/python/index.html#pyspark.ml.stat
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165568014
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaSummarizerExample.java
---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165567614
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/SummarizerExample.scala ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165362568
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/SummarizerExample.scala ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165362364
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaSummarizerExample.java
---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165360692
--- Diff: docs/ml-statistics.md ---
@@ -89,4 +89,26 @@ Refer to the [`ChiSquareTest` Python
docs](api/python/index.html#pyspark.ml.stat
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165362533
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/SummarizerExample.scala ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165362148
--- Diff: docs/ml-statistics.md ---
@@ -89,4 +89,26 @@ Refer to the [`ChiSquareTest` Python
docs](api/python/index.html#pyspark.ml.stat
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20446#discussion_r165362440
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaSummarizerExample.java
---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20459
Merged to master / branch-2.3. Thanks @yanboliang !
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20421
Didn't this go into 2.2.1?
On Wed, 31 Jan 2018 at 20:37 WeichenXu wrote:
> @MLnick <https://github.com/mlnick>
> Forget one fix: #18797 <https://github.c
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20421
Merged to master / branch-2.3. Thanks!
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20421
@felixcheung just added a few more behavior changes I found. Should be
final now.
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20332
Merged to master / branch-2.3. Thanks @sethah, and @WeichenXu123 for review.
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r164654897
--- Diff: docs/ml-classification-regression.md ---
@@ -111,10 +110,9 @@ Continuing the earlier example:
[`LogisticRegressionTrainingSummary`](api/java
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r164479596
--- Diff: docs/ml-classification-regression.md ---
@@ -125,7 +123,8 @@ Continuing the earlier example:
[`LogisticRegressionTrainingSummary`](api/python
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r164387272
--- Diff: docs/ml-classification-regression.md ---
@@ -125,7 +123,8 @@ Continuing the earlier example:
[`LogisticRegressionTrainingSummary`](api/python
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r164384660
--- Diff: docs/ml-classification-regression.md ---
@@ -111,10 +110,9 @@ Continuing the earlier example:
[`LogisticRegressionTrainingSummary`](api/java
GitHub user MLnick opened a pull request:
https://github.com/apache/spark/pull/20421
[SPARK-23112][DOC] Update ML migration guide with breaking changes.
Add breaking change note to ML migration guide.
## How was this patch tested?
Doc only
You can merge this pull
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
I reverted this (see #20410 for details) - we can re-open it once that
issue is solved.
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20410
I reverted #19892 in master (f5911d4894700eb48f794133cbd363bf3b7c8c8e) /
branch-2.3 (a8a3e9b7cf7b9346c43cfbbf7b26fd2fd28dd521), so that other test runs
can be unblocked
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20410
I think this is somewhat related to #15113
cc @BryanCutler
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20410
We should just revert SPARK-22797 for now to unblock others.
SPARK-22799 itself is not the cause per se (it passed tests) but after it
was merged SPARK-22797 causes the failure
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
Merged to master / branch-2.3. Thanks!
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19993
Merged to master / branch-2.3
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19993
Thanks @mgaido91 and @jkbradley for working on this and others for review
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20363
Merged to master/branch-2.3
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20363
`SPARK-20047` was in 2.2 (and mentioned in the previous highlights). We
could also mention `SPARK-20619` but I've tried to limit the list.
I checked through the issues for 2.3 and cou
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20363
It's just listing any breaking changes, if I missed them. Will do a pass to
check and then remove WIP. If any folks know of breaking changes ping the
JIRA issue
On Wed, 24 Jan 2018 at
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r163562784
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala ---
@@ -401,15 +390,24 @@ class BucketizerSuite extends SparkFunSuite with
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r163561075
--- Diff: mllib/src/test/scala/org/apache/spark/ml/param/ParamsSuite.scala
---
@@ -20,8 +20,11 @@ package org.apache.spark.ml.param
import java.io
GitHub user MLnick opened a pull request:
https://github.com/apache/spark/pull/20363
[SPARK-23112][WIP][DOC] Add highlights and migration guide for 2.3
Update ML user guide with highlights and migration guide for `2.3`.
## How was this patch tested?
Doc only.
You
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
RC2 has been cut - @jkbradley do you see #19993 as a blocker? I think it
should be merged for `2.3`. And also there are QA JIRAs (sub-tasks of
[SPARK-23105](https://issues.apache.org/jira/browse
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19993
Well yes it would - but the method checks inputCols/inputCol first so will
always fail for that reason here, i.e. we aren't actually testing the full
code path
On Mon, 22 Jan 2018 at 16:43
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19993
Overall looks good with @jkbradley's changes. I just left a comment on the
param test cases as I think they're not quit
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162940665
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala ---
@@ -401,15 +390,14 @@ class BucketizerSuite extends SparkFunSuite with
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
@holdenk everything except my comment in
https://github.com/apache/spark/pull/19892#discussion_r162900053
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19892#discussion_r162900053
--- Diff: python/pyspark/ml/feature.py ---
@@ -315,13 +315,19 @@ class BucketedRandomProjectionLSHModel(LSHModel,
JavaMLReadable, JavaMLWritable
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r162873193
--- Diff:
examples/src/main/python/ml/multiclass_logistic_regression_with_elastic_net.py
---
@@ -43,6 +43,43 @@
# Print the coefficients and
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r162873036
--- Diff: docs/ml-classification-regression.md ---
@@ -97,10 +97,6 @@ only available on the driver.
[`LogisticRegressionTrainingSummary`](api/scala
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r162872261
--- Diff: docs/ml-classification-regression.md ---
@@ -125,7 +117,6 @@ Continuing the earlier example:
[`LogisticRegressionTrainingSummary`](api/python
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20332#discussion_r162873388
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/MulticlassLogisticRegressionWithElasticNetExample.scala
---
@@ -49,6 +49,48 @@ object
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
If it is going to get merged to `branch-2.3` the `since` tags need to be
`2.3.0` again
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20257
Merged to master / branch-2.3, thanks!
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20293
Merged to master / branch-2.3
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19892
I'm generally ok with these small python api wrapper additions getting
merged as long as the risk of breaking anything is low - and here it is
since it's just api parity
On Fri, 19 Jan
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20275#discussion_r162292944
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala ---
@@ -113,6 +113,13 @@ class VectorsSuite extends SparkFunSuite with
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20275#discussion_r162292520
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/linalg/VectorsSuite.scala ---
@@ -113,6 +113,13 @@ class VectorsSuite extends SparkFunSuite with
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162043704
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
@@ -249,6 +250,27 @@ object ParamValidators {
def arrayLengthGt[T
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162042318
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
@@ -249,6 +250,27 @@ object ParamValidators {
def arrayLengthGt[T
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20257
A couple minor comments, otherwise looks fine.
I see we are changing the example names, so effectively removing the old
examples. I'm ok with this, unless others have an obje
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r162040939
--- Diff: docs/ml-features.md ---
@@ -783,11 +783,11 @@ Because this existing `OneHotEncoder` is a stateless
transformer, it is not usab
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r162038403
--- Diff: docs/ml-features.md ---
@@ -783,11 +783,11 @@ Because this existing `OneHotEncoder` is a stateless
transformer, it is not usab
GitHub user MLnick opened a pull request:
https://github.com/apache/spark/pull/20293
[SPARK-23127][DOC] Update FeatureHasher guide for categoricalCols parameter
Update user guide entry for `FeatureHasher` to match the Scala / Python
doc, to describe the `categoricalCols` parameter
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20257
Added a few more small comments
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161989945
--- Diff: docs/ml-features.md ---
@@ -777,17 +777,17 @@ for more details on the API.
## OneHotEncoder (Deprecated since 2.3.0)
-Because
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161989107
--- Diff: docs/ml-features.md ---
@@ -777,17 +777,17 @@ for more details on the API.
## OneHotEncoder (Deprecated since 2.3.0)
-Because
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161988396
--- Diff: docs/ml-features.md ---
@@ -777,17 +777,17 @@ for more details on the API.
## OneHotEncoder (Deprecated since 2.3.0)
-Because
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19876
Do we want to think about an `options` / `option` interface too? I'm not
that familiar with whether there could be important options for PMML export,
but custom user formats may need it (I
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19876
So to be clear this doesn't handle the `read` path at all? Would there be a
plan to implement a similar read API?
Overall I like the idea of an open API for plugging in model serializ
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161722235
--- Diff: docs/ml-features.md ---
@@ -775,35 +775,43 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161740882
--- Diff: examples/src/main/python/ml/onehot_encoder_estimator_example.py
---
@@ -18,32 +18,31 @@
from __future__ import print_function
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161739866
--- Diff: docs/ml-features.md ---
@@ -775,35 +775,43 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161741274
--- Diff: docs/ml-features.md ---
@@ -775,35 +775,43 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161740612
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java
---
@@ -35,41 +34,37 @@
import
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161722104
--- Diff: docs/ml-features.md ---
@@ -775,35 +775,43 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161739788
--- Diff: docs/ml-features.md ---
@@ -775,35 +775,43 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161740927
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/OneHotEncoderEstimatorExample.scala
---
@@ -19,38 +19,34 @@
package
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19892#discussion_r161719111
--- Diff: python/pyspark/ml/feature.py ---
@@ -317,26 +317,34 @@ class BucketedRandomProjectionLSHModel(LSHModel,
JavaMLReadable, JavaMLWritable
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19892#discussion_r161683821
--- Diff: python/pyspark/ml/feature.py ---
@@ -317,13 +317,19 @@ class BucketedRandomProjectionLSHModel(LSHModel,
JavaMLReadable, JavaMLWritable
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19892#discussion_r161683714
--- Diff: python/pyspark/ml/feature.py ---
@@ -347,6 +353,28 @@ class Bucketizer(JavaTransformer, HasInputCol,
HasOutputCol, HasHandleInvalid
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19892#discussion_r161684641
--- Diff: python/pyspark/ml/param/__init__.py ---
@@ -134,6 +134,16 @@ def toListFloat(value):
return [float(v) for v in value
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r161681586
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
@@ -249,6 +250,27 @@ object ParamValidators {
def arrayLengthGt[T
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r161682506
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala ---
@@ -401,15 +390,9 @@ class BucketizerSuite extends SparkFunSuite with
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/17280#discussion_r161679593
--- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala ---
@@ -319,9 +323,11 @@ object FPGrowthModel extends MLReadable[FPGrowthModel
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20164#discussion_r161535696
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ---
@@ -170,21 +170,24 @@ final class OneVsRestModel private[ml
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/18904
@mpjlu could you post the actual results of test runs (timing numbers and
shuffle data)?
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161472191
--- Diff: docs/ml-features.md ---
@@ -775,7 +775,9 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder (Deprecated
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161477464
--- Diff:
examples/src/main/scala/org/apache/spark/examples/ml/OneHotEncoderEstimatorExample.scala
---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161475879
--- Diff: docs/ml-features.md ---
@@ -775,7 +775,9 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder (Deprecated
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161473460
--- Diff: docs/ml-features.md ---
@@ -807,6 +809,36 @@ for more details on the API.
+## OneHotEncoderEstimator
+
+[One-hot
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161472954
--- Diff: docs/ml-features.md ---
@@ -775,7 +775,9 @@ for more details on the API.
-## OneHotEncoder
+## OneHotEncoder (Deprecated
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/20241
LGTM thanks. Merged to master / branch-2.3
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19991
Merged to master.
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19715
Merged to master. If there are any further small comments / clean ups we
can do that during QA for 2.3
Thanks @huaxingao and all others for review
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r159100390
--- Diff: mllib/src/test/scala/org/apache/spark/ml/param/ParamsSuite.scala
---
@@ -430,4 +433,45 @@ object ParamsSuite extends SparkFunSuite
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r159100191
--- Diff: mllib/src/main/scala/org/apache/spark/ml/param/params.scala ---
@@ -249,6 +250,27 @@ object ParamValidators {
def arrayLengthGt[T
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r159099688
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -137,18 +137,10 @@ final class Bucketizer @Since("1.4.0")
(@Si
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r159100299
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala ---
@@ -401,15 +401,9 @@ class BucketizerSuite extends SparkFunSuite with
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19715
Jenkins retest this please
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19715
Thanks for the changes @huaxingao. This LGTM now - any further comments
from others?
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19527
Agree on keeping the new OneHotEncoderEstimator as an alias for 3.0
On Fri, 1 Dec 2017 at 23:29, jkbradley wrote:
> *@jkbradley* commented on this pull requ
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19993
Yes, FeatureHasher doesn't need inputCol and inputCols - since it is a new
multi column transformer. We may see this more in future - as I think new
transformers should be able to work on
Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19715#discussion_r158239692
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/QuantileDiscretizerSuite.scala
---
@@ -386,19 +382,16 @@ class QuantileDiscretizerSuite