Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9092
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enab
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155248366
Merging with master and branch-1.6.
Thank you for the PR!
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155243776
**[Test build #45439 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45439/consoleFull)**
for PR 9092 at commit
[`2663cbf`](https://git
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155244180
Merged build finished. Test PASSed.
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155239682
LGTM pending tests
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155231560
**[Test build #45439 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45439/consoleFull)**
for PR 9092 at commit
[`2663cbf`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155230865
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155230847
Merged build triggered.
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/9092#discussion_r44311675
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
---
@@ -100,10 +100,25 @@ class RegexTokenizer(override val uid: String)
/
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-155150009
Looks good except for that one outdated doc line.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154907032
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154906966
**[Test build #45328 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45328/consoleFull)**
for PR 9092 at commit
[`43fd8e9`](https://git
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154903263
**[Test build #45328 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45328/consoleFull)**
for PR 9092 at commit
[`43fd8e9`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154902998
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154902986
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154861500
**[Test build #45314 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45314/consoleFull)**
for PR 9092 at commit
[`0c07366`](https://git
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154861510
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154859228
**[Test build #45314 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45314/consoleFull)**
for PR 9092 at commit
[`0c07366`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154858828
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154858819
Merged build triggered.
Github user hhbyyh commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154832573
Yes, I agree.
1. Tokenizer and RegexTokenizer should have consistent behavior.
2. Whether to set toLower to true is a matter of preference. I assume for
ML applica
Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-154743585
I'm wondering now if we should set it to convert to lowercase by default.
I know it breaks behavior, but otherwise, we'll introduce an inconsistency in
the API (betwe
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/9092#discussion_r44216481
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/TokenizerSuite.scala ---
@@ -69,6 +69,18 @@ class RegexTokenizerSuite extends SparkFunSuite with
Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/9092#discussion_r44216479
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Tokenizer.scala
---
@@ -100,10 +100,25 @@ class RegexTokenizer(override val uid: String)
/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147640440
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147640443
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147639962
[Test build #43630 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43630/console)
for PR 9092 at commit
[`ce09ef5`](https://github.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147628224
[Test build #43630 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43630/consoleFull)
for PR 9092 at commit
[`ce09ef5`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147627765
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9092#issuecomment-147627734
Merged build triggered.
GitHub user hhbyyh opened a pull request:
https://github.com/apache/spark/pull/9092
[SPARK-11069] [ML] Add RegexTokenizer option to convert to lowercase
jira: https://issues.apache.org/jira/browse/SPARK-11069
quotes from jira:
Tokenizer converts strings to lowercase automat
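
For readers skimming the thread, the behavior under discussion can be illustrated with a minimal plain-Python sketch. This is not the Spark implementation or its API: the function name `regex_tokenize` and the parameters `pattern`, `gaps`, `min_token_length`, and `to_lowercase` are hypothetical names chosen for illustration, and lowercasing before matching is an assumption of this sketch.

```python
import re


def regex_tokenize(text, pattern=r"\s+", gaps=True,
                   min_token_length=1, to_lowercase=True):
    """Split `text` into tokens with a regex (illustrative sketch only).

    Assumes `pattern` contains no capturing groups, since both
    re.split and re.findall change their output when groups are present.
    """
    # Assumption: lowercase the whole input before matching, so the
    # option affects every token uniformly.
    source = text.lower() if to_lowercase else text
    regex = re.compile(pattern)
    # gaps=True: the pattern matches separators; gaps=False: it matches tokens.
    tokens = regex.split(source) if gaps else regex.findall(source)
    # Drop empty strings and tokens shorter than the minimum length.
    return [t for t in tokens if len(t) >= min_token_length]


if __name__ == "__main__":
    print(regex_tokenize("Hello  Spark ML"))
    print(regex_tokenize("Hello  Spark ML", to_lowercase=False))
```

With `to_lowercase=True` as the default, this sketch matches the lowercasing behavior the thread attributes to the plain Tokenizer; passing `to_lowercase=False` preserves the original case.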