[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2018-04-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @leliang65 The PySpark support is not added yet. Please refer to #19892. --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2018-04-19 Thread leliang65
Github user leliang65 commented on the issue: https://github.com/apache/spark/pull/17819 Is there any python example for this api? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-09 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 Merged to master. Thanks @viirya and all the reviewers! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 About vector bucketizer, seems it might work similarly as multi-col bucketizer. But some behaviors such as `Bucketizer.SKIP_INVALID` need to address. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 Thanks @MLnick --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-08 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 @MarcKaminski by the way you mentioned a vector bucketizer. I think in principal that might be useful. I'm not sure if it would make sense to add vector type support to the existing `Bucketizer` or

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83519/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #83519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83519/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #83519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83519/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Conflicts resolved. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-11-06 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 @viirya could you resolve conflicts? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-30 Thread AFractalThought
Github user AFractalThought commented on the issue: https://github.com/apache/spark/pull/17819 Thanks @huaxingao @MLnick @viirya this will be super helpful --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-30 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 Thanks @MLnick --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-30 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 I've created https://issues.apache.org/jira/browse/SPARK-22397 to track the changes in `QuantileDiscretizer`. The PR can be submitted once we finalize this one. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Is this ready to go? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-27 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/17819 @AFractalThought @viirya I have made changes for QuantileDiscretizer based on this PR. Once this PR is merged, I will open a jira to submit the PR for QuantileDiscretizer. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-27 Thread AFractalThought
Github user AFractalThought commented on the issue: https://github.com/apache/spark/pull/17819 Does this extension exist for QuantileDiscretizer as well? --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Any more comments or thoughts on this I need to address? --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82632/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82632/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Thanks for leaving the comments. I think I've addressed all of them. Please take a look if you are free. Thanks. --- -

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82632/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82583/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82583/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82583/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 Yes, fair enough On Tue, 10 Oct 2017 at 14:09 Liang-Chi Hsieh wrote: > *@viirya* commented on this pull request. > --

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @gatorsmile The SQL change looks good to you? Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82403/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82403/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82403/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82402/ Test FAILed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82402/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @gatorsmile The related test is added. Please take a look again. Thanks. --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82402/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82388/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82388/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82388 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82388/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick I've updated this. Please take a look. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-02 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 @viirya can you resolve the conflicts now that #19229 was merged? --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82016/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82016/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82016/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread MarcKaminski
Github user MarcKaminski commented on the issue: https://github.com/apache/spark/pull/17819 If I may throw in that e.g. ChiSqSelector issues a warning rather than throwing an exception in cases that unnecessary parameters are set, see:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 I think we can only check `inputCol` and `inputCols`. If both are set, throw an exception. Namely depends on which one is set, we go single column or multiple columns path. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81963/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81963 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81963/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick Yea, you're right, only move `setXXX` to concrete class also work fine. The root cause is the `setXXX` return type. But I think the multi / single logic can be merged, because single

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81963 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81963/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81956/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81956/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81956/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick I have no strong option but @WeichenXu123 seems more preferring merging the new API into current `Bucketizer`. --- - To

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81935/ Test FAILed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81935 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81935/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 The issue is the in the trait `setXXX` returns `this.type` which in Java in the concrete class doesn't work, so the `setXXX` methods need to be implemented in the concrete subclass. See the decision

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81935/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81917/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81917 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81917/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81917/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 Yeah, I'm merging it. I just want to clarify adding trait to a class doesn't necessarily makes java incompatible. :) Thanks. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 Yes you can only move `setInputCols` into the outer class to resolve this issue. But I prefer merge it together. I think we can unify the `transform` method. (First we check param `inputCol`

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 Btw, the reason that this change isn't java compatible, is not mainly because adding a trait to `Bucketizer`. Looks like It is because the params setter methods such as `setInputCols`. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 I see. That's correct this change is not java compatible. Thanks for pointing out. I'm merging the changes into `Bucketizer`. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Oh, I am not saying the compatibility against old version scala application. What I say is about new version `Bucketizer`, when spark user use java language(not scala language), call

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 Sorry I have to reply on a phone, so I may not write codes smoothly. What I mean it doesn't break binary compatibility, is the existing users codes using Bucketizer don't need to

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Scala `with trait` is a complex mechanism and `trait` isn't equivalent to java's `interface`. Scala compiler will precompile and generate many other codes, so java-side code cannot

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 According to https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html#jls-13.4.4 and https://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_Classes, I think adding an

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya It is possible I think. A similar example is, `HasRegParam` trait, do not put `setRegParam` in trait but moved into concrete estimator/transformer class, should be the same reason.

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 I'm ok for that but I think adding an interface doesn't break binary compatibility? --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Yes. But if there is some better design I will be happy to listen. --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 Do you mean we keep both inputCol and inputCols in `Bucketizer`? --- - To unsubscribe, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 ok to test. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81869/ Test FAILed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81869/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @original-brownbear Thanks for letting me know. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread original-brownbear
Github user original-brownbear commented on the issue: https://github.com/apache/spark/pull/17819 @viirya re `HiveExternalCatalogVersionsSuite`, jup it is https://github.com/apache/spark/commit/dbb824125d4d31166d9a47c330f8d51f5d159515#commitcomment-24354358 ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81869/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 `HiveExternalCatalogVersionsSuite` seems flaky? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81867/ Test FAILed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81867 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81867/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81867 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81867/testReport)** for PR 17819 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81864/ Test FAILed. ---

  1   2   >