[GitHub] spark pull request: [SPARK-13600] [MLlib] [WIP] Incorrect number o...

2016-03-07 Thread oliverpierson
Github user oliverpierson commented on the pull request: https://github.com/apache/spark/pull/11553#issuecomment-193584221 Putting this up for review now. Tests are passing on my machine. Using `approxQuantile` in DataFrame stats reduces the amount of code required by a good bit.
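For readers unfamiliar with the underlying idea, here is a minimal sketch of how quantile-based split points can yield fewer buckets than requested. It is plain Python, not Spark's implementation and not the code in this pull request: `quantile_splits` is a hypothetical helper that uses exact quantiles of an in-memory list, whereas Spark's `approxQuantile` computes approximate quantiles over a DataFrame column.

```python
from typing import List, Sequence


def quantile_splits(values: Sequence[float], num_buckets: int) -> List[float]:
    """Return split points for at most `num_buckets` quantile-based buckets.

    Hypothetical helper for illustration only; it uses exact quantiles
    of an in-memory sample rather than an approximate distributed sketch.
    """
    if num_buckets < 2:
        raise ValueError("num_buckets must be at least 2")
    if not values:
        raise ValueError("values must be non-empty")

    ordered = sorted(values)
    n = len(ordered)

    # Pick the values at the interior probabilities 1/k, 2/k, ..., (k-1)/k.
    candidates = [
        ordered[min(n - 1, (i * n) // num_buckets)]
        for i in range(1, num_buckets)
    ]

    # Repeated quantile values would produce empty buckets, so keep
    # only the distinct ones. This is where buckets can be "lost".
    interior = sorted(set(candidates))

    # Open-ended outer bounds so every input value falls in some bucket.
    return [float("-inf")] + interior + [float("inf")]


uniform = [float(x) for x in range(100)]
skewed = [1.0] * 10 + [2.0]

uniform_splits = quantile_splits(uniform, 4)
skewed_splits = quantile_splits(skewed, 4)

# Number of buckets is one less than the number of split points.
print(len(uniform_splits) - 1, len(skewed_splits) - 1)
```

With evenly spread data the four requested buckets are all produced; with heavily repeated values the distinct quantiles collapse and fewer buckets remain, which is one way a discretizer can end up with a bucket count that differs from the requested one.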

[GitHub] spark pull request: [SPARK-13600] [MLlib] [WIP] Incorrect number o...

2016-03-06 Thread oliverpierson
Github user oliverpierson commented on the pull request: https://github.com/apache/spark/pull/11553#issuecomment-193073200 This is still a work in progress, just wanted to get the PR up so it's on the radar. Still need to:
- [ ] add an external Parameter (with default value)

[GitHub] spark pull request: [SPARK-13600] [MLlib] [WIP] Incorrect number o...

2016-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11553#issuecomment-193072808 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-13600] [MLlib] [WIP] Incorrect number o...

2016-03-06 Thread oliverpierson
GitHub user oliverpierson opened a pull request: https://github.com/apache/spark/pull/11553 [SPARK-13600] [MLlib] [WIP] Incorrect number of buckets in QuantileDiscretizer
## What changes were proposed in this pull request?
QuantileDiscretizer can return an unexpected number of