Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19527
Thank you for all the work in this PR! Here's the follow-up:
https://github.com/apache/spark/pull/20132
---
-
To
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
@jkbradley Thanks for reviewing and merging this. Thanks to everyone who
helped with this too.
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19527
Thanks for the updates! I still think there's some confusion, but since I
think this code is correct & it doesn't affect APIs, I'll go ahead and merge
this. I'll ping you on a follow-up PR to
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85551/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85551 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85551/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85551 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85551/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
@jkbradley Ok. I will update this today.
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19527
@viirya Will you have time to update this soon? I'd like to get it in 2.3,
which will mean merging it by Jan. 1. If you're busy, I can merge it as is and
do a follow-up myself too. Thanks!
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85391/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85391 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85391/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
Unit tests are reformatted too.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85389/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85389 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85389/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85391 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85391/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #85389 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85389/testReport)**
for PR 19527 at commit
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19527
Agree on keeping the new OneHotEncoderEstimator as an alias for 3.0
On Fri, 1 Dec 2017 at 23:29, jkbradley wrote:
> *@jkbradley* commented on this pull
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
@jkbradley Can you review this again? Thanks.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84690/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #84690 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84690/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #84690 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84690/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
OK, I understand. In other words, the extra category is added as the last
category, and the `dropLast` option works as before. That makes sense to me.
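
The semantics summarized above can be sketched in plain Python. This is only an illustration, not the code in the PR: the function name `one_hot` and its parameters are hypothetical, and it assumes that keeping invalid values appends one extra category after the real ones, and that `dropLast` then removes whichever slot is last.

```python
from typing import List


def one_hot(index: int, num_categories: int,
            keep_invalid: bool, drop_last: bool) -> List[float]:
    """Illustrative encoder; names and signature are hypothetical."""
    is_invalid = not (0 <= index < num_categories)
    if is_invalid and not keep_invalid:
        raise ValueError(f"unseen category index: {index}")

    # Assumption: with "keep", invalid values get one extra category
    # appended after the real ones.
    size = num_categories + 1 if keep_invalid else num_categories
    position = num_categories if is_invalid else index

    # dropLast removes the final slot, whatever it currently represents.
    if drop_last:
        size -= 1

    vector = [0.0] * size
    if position < size:
        vector[position] = 1.0
    return vector
```

Under these assumptions, with 5 categories, keep enabled and `dropLast` enabled, the last real category still has its own slot and only an invalid value encodes to all zeros, so the two cases do not collide.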
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19527
> For example, with 5 categories, we don't know whether [0.0, 0.0, 0.0, 0.0, 0.0]
> means the last category or an invalid value.
For the semantics I described ("OPTION 1"), I think it is clear:
* Last
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
The behavior I had in mind is:
* keep=true, dropLast=true ==> error option
* keep=true, dropLast=false ==> vector size n (all-0 only for invalid value)
For the cases of `dropLast =
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19527
Question about this PR description comment:
> Note that keep can't be used at the same time with dropLast as true.
> Because they will conflict in encoded vector by producing a vector of zeros.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83738/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83738 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83738/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83738 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83738/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83258/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83258 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83258/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83258 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83258/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
Thanks for reviewing @zhengruifeng. I will update it later.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83070/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83070 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83070/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83070 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83070/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
@huaxingao Good catch! Thanks.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83000/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83000 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83000/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #83000 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83000/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
@BryanCutler @MLnick @WeichenXu123 Thanks for reviewing. Your comments
should all be addressed now. Please take another look when you have time.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82934/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82934 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82934/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82933/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82933 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82933/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82932/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82932 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82932/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82934 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82934/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82933 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82933/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82932 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82932/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82930/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82930 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82930/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82930 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82930/testReport)**
for PR 19527 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82917/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82917 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82917/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82917 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82917/testReport)**
for PR 19527 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
Benchmark against the existing one-hot encoder.
Because the existing encoder only needs to run `transform`, there is no fitting
time.
Transforming:
numColumns | Existing one
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
Benchmark against the multi-column one-hot encoder.
Multi-Col, Multiple Run: The first commit. Runs multiple `treeAggregate` calls on
the columns.
Multi-Col, Single Run: Run one `treeAggregate` on
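
The difference between the two variants can be sketched in plain Python, with `functools.reduce` standing in for `treeAggregate`. The comment above is cut off and does not say what is aggregated, so this sketch assumes it is each column's maximum category index, and that "multiple run" means one pass per column. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Sequence

Row = Sequence[int]


def max_index_multi_run(rows: List[Row], num_cols: int) -> List[int]:
    # Assumed "multiple run" shape: one aggregation pass over the data
    # for every column.
    return [
        reduce(lambda acc, row, c=c: max(acc, row[c]), rows, -1)
        for c in range(num_cols)
    ]


def max_index_single_run(rows: List[Row], num_cols: int) -> List[int]:
    # Assumed "single run" shape: one pass that carries an accumulator
    # with one slot per column.
    def seq_op(acc: List[int], row: Row) -> List[int]:
        return [max(a, v) for a, v in zip(acc, row)]

    return reduce(seq_op, rows, [-1] * num_cols)
```

Both functions return the same result; the single-run version simply scans the rows once instead of once per column, which is the trade-off the benchmark compares.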
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19527
cc @MLnick @WeichenXu123 @jkbradley This adds a new class
`OneHotEncoderEstimator` which extends `Estimator`. Please review this when you
can. Thanks.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19527
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82879/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82879 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82879/testReport)**
for PR 19527 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19527
**[Test build #82879 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82879/testReport)**
for PR 19527 at commit