Github user yinxusen closed the pull request at:
https://github.com/apache/spark/pull/5731
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is en
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-128139349
@yinxusen In the interests of time, I created a new PR based on this one:
[https://github.com/apache/spark/pull/7972] You will still be the primary
author of it. If
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-127806589
@yinxusen Would you mind if I sent a PR to you (which will update this PR)?
We'd like to squeeze this into 1.5.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-127091229
@yinxusen I think switching to disallowing any overlap in indices and names
will simplify both the API and the implementation.
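
To make the "no overlap" rule concrete, here is a minimal plain-Python sketch of how a selection by indices and by names could be resolved. It is not the code from this PR or from Spark: the helpers `resolve_slice` and `slice_vector` and the `attr_names` list are hypothetical names used only for illustration, and the output ordering (index-specified features first, then name-specified ones) follows the proposal made earlier in this thread.

```python
from typing import List, Sequence


def resolve_slice(indices: Sequence[int],
                  names: Sequence[str],
                  attr_names: Sequence[str]) -> List[int]:
    """Return the feature positions to keep, index-specified ones first.

    Hypothetical helper: attr_names stands in for the attribute names
    attached to the input vector column (assumed unique).
    """
    # Map each attribute name to its position in the input vector.
    position = {name: i for i, name in enumerate(attr_names)}

    for i in indices:
        if not 0 <= i < len(attr_names):
            raise ValueError(f"index {i} is out of range")
    unknown = [n for n in names if n not in position]
    if unknown:
        raise ValueError(f"unknown feature names: {unknown}")

    # Features given by index come first, then features given by name.
    selected = list(indices) + [position[n] for n in names]

    # Disallow any overlap: a feature may be selected only once,
    # whether by index, by name, or by both.
    if len(set(selected)) != len(selected):
        raise ValueError("indices and names must not select a feature twice")
    return selected


def slice_vector(values: Sequence[float], selected: Sequence[int]) -> List[float]:
    """Keep only the selected positions, in the resolved order."""
    return [values[i] for i in selected]
```

With attribute names `f1`, `f2`, `f3`, calling `resolve_slice([1], ["f3"], ["f1", "f2", "f3"])` selects positions 1 and 2, while `resolve_slice([2], ["f3"], ...)` raises an error because both arguments point at the same feature. Rejecting overlap up front means the slicer never has to decide how to deduplicate or order a feature that was requested twice.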
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050372
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050377
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050382
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050378
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050373
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050374
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050379
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050375
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050383
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r36050381
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-127059025
@yinxusen Yeah, good point, what I said last is too complex. I'll take a
look now.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126998721
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126998698
[Test build #39412 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39412/console)
for PR 5731 at commit
[`98c6939`](https://github.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126997005
[Test build #39412 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39412/consoleFull)
for PR 5731 at commit
[`98c6939`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126996943
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126996946
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987226
[Test build #39401 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39401/console)
for PR 5731 at commit
[`ecbf2d3`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987227
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987192
[Test build #39401 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39401/consoleFull)
for PR 5731 at commit
[`ecbf2d3`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987125
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987127
Merged build started.
Github user yinxusen commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126987117
@jkbradley How about we stick to the prior discussion? I think users do
not want to repeat features.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126948155
@yinxusen By the way, it would be great to squeeze this into this release.
Will you be able to send an update soon? Thanks!
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126947540
How about this:
* We use the ordering specified by the user, where we put features
specified by index before features specified by name.
* This will be a well
Github user yinxusen commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126877346
@jkbradley There is already an IntArrayParam in the sharedParam.
Besides, there are some issues to discuss:
- Should we consider the scenario that some att
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126747089
OK thanks! Note: "IntArrayParam" may not exist yet in params.scala, but
please add it based on DoubleArrayParam as needed.
Github user yinxusen commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126688991
@jkbradley Agreed, blending these two selected indices is easy to use. I'll
fix it soon.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126534353
Here are some initial thoughts: We should definitely permit users to
specify features with indices and names. Supporting both within the same type
makes the API prett
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r35938626
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r35938540
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r35938539
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundati
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126533119
[Test build #39130 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39130/console)
for PR 5731 at commit
[`fd154d7`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126533121
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126533047
[Test build #172 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/172/console)
for PR 5731 at commit
[`fd154d7`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126533050
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532937
[Test build #39130 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39130/consoleFull)
for PR 5731 at commit
[`fd154d7`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532847
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532877
[Test build #172 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SlowSparkPullRequestBuilder/172/consoleFull)
for PR 5731 at commit
[`fd154d7`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532835
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532520
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532524
Merged build started.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532438
@yinxusen Apologies for the long wait, but I'm hoping to get this in for
1.5. I'll make a pass now. But if you are too busy, I'd be happy to help
update the PR as ne
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-126532361
Jenkins test this please
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-96906037
[Test build #31104 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31104/consoleFull)
for PR 5731 at commit
[`fd154d7`](https://gith
Github user yinxusen commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r29211169
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala ---
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundatio
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/5731#issuecomment-96893198
[Test build #31104 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31104/consoleFull)
for PR 5731 at commit
[`fd154d7`](https://githu
Github user yinxusen commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r29211027
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/VectorSlicerSuite.scala ---
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Found
Github user yinxusen commented on a diff in the pull request:
https://github.com/apache/spark/pull/5731#discussion_r29211014
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/VectorSlicerSuite.scala ---
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Found
GitHub user yinxusen opened a pull request:
https://github.com/apache/spark/pull/5731
[SPARK-5895][ML] add vector slicer
JIRA issue [here](https://issues.apache.org/jira/browse/SPARK-5895).
I have some thoughts on `AttributeGroup`:
1. It is hard for end users to add `Attrib