Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r162935486
--- Diff: docs/ml-guide.md ---
@@ -122,6 +122,8 @@ There are no deprecations.
* [SPARK-21027](https://issues.apache.org/jira/browse/SPARK-21027
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@WeichenXu123 I have finished my work, please review it. Any suggestions are
welcome. :-)
---
-
To unsubscribe, e-mail: reviews
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r162885968
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -53,7 +53,8 @@ final class Bucketizer @Since("1.4.0") (@Si
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@WeichenXu123 Sorry for missing the message for two days; I'm working on it.
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r106363022
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -105,20 +106,21 @@ final class Bucketizer @Since("1.4.0"
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17233
@jkbradley Hi, I have made some updates according to your comments, please
review it again. :-)
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105820314
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -188,35 +189,45 @@ class StringIndexerModel
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105820279
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -122,6 +122,86 @@ class StringIndexerSuite
assert
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17233#discussion_r105820283
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
@@ -122,6 +122,86 @@ class StringIndexerSuite
assert
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17233
cc @srowen @cloud-fan @MLnick
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@cloud-fan Would you please review my code again? I'm now using `Option` to
handle NULLs. :-)
GitHub user crackcell opened a pull request:
https://github.com/apache/spark/pull/17233
[SPARK-11569][ML] Fix StringIndexer to handle null value properly
## What changes were proposed in this pull request?
This PR is to enhance StringIndexer with NULL values handling
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/16883
Nice work! I was just planning to improve `StringIndexer` in exactly the same
way as you did. Now I can take a rest. :-)
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r104572696
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -105,20 +106,21 @@ final class Bucketizer @Since("1.4.0"
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r104434899
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -105,20 +106,21 @@ final class Bucketizer @Since("1.4.0"
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@imatiach-msft @cloud-fan I updated the code, replacing `java.lang.Double`
with `isNullAt()` and `getDouble()`.
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@srowen @cloud-fan Please review my code. Thanks. :-)
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
@imatiach-msft Hi, Ilya. I have added two tests based on the original tests
for NaN data. Please review my code again. Thanks for your time. :-)
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r103955065
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -171,23 +176,23 @@ object Bucketizer extends
DefaultParamsReadable
Github user crackcell commented on a diff in the pull request:
https://github.com/apache/spark/pull/17123#discussion_r103954857
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala
---
@@ -105,20 +106,24 @@ final class Bucketizer @Since("1.4.0"
Github user crackcell commented on the issue:
https://github.com/apache/spark/pull/17123
Fixed the style errors reported during the unit tests.
GitHub user crackcell opened a pull request:
https://github.com/apache/spark/pull/17123
[SPARK-19781][ML] Handle NULLs as well as NaNs in Bucketizer when
handleInvalid is on
## What changes were proposed in this pull request?
The original Bucketizer can put NaNs into a