[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20464 merged to master, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88039/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #88039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88039/testReport)** for PR 20464 at commit [`8c1a8ec`](https://github.com/apache/spark/commit/8c1a8ec46ea28ce17fcaae42aa7b9955cb34bfc8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1347/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #88039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88039/testReport)** for PR 20464 at commit [`8c1a8ec`](https://github.com/apache/spark/commit/8c1a8ec46ea28ce17fcaae42aa7b9955cb34bfc8).
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20464 The AppVeyor tests failed; could you close and reopen this PR to retrigger them? Strange, I haven't seen anything like this on AppVeyor in a long time.
```
1. Error: create DataFrame with complex types (@test_sparkSQL.R#535) ---
8712org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 114.0 failed 1 times, most recent failure: Lost task 0.0 in stage 114.0 (TID 116, localhost, executor driver): java.net.SocketTimeoutException: Accept timed out
```
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #87993 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87993/testReport)** for PR 20464 at commit [`0ebdf74`](https://github.com/apache/spark/commit/0ebdf74942e0894bfaf6cbede4c03fd3f5d26411). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87993/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1305/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #87993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87993/testReport)** for PR 20464 at commit [`0ebdf74`](https://github.com/apache/spark/commit/0ebdf74942e0894bfaf6cbede4c03fd3f5d26411).
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 Now that 2.3 is released, pinging @felixcheung again.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 @felixcheung Thanks!
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20464 Sorry, I'm a bit occupied with testing the 2.3 RC; I will get back to this afterwards.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 ping @felixcheung
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/20464 I think @felixcheung has the most context here, so I'd suggest we wait for his comments.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86985/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86985 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86985/testReport)** for PR 20464 at commit [`d994d76`](https://github.com/apache/spark/commit/d994d76d45e474b3e4a31fff8250c30efef6a757). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86985/testReport)** for PR 20464 at commit [`d994d76`](https://github.com/apache/spark/commit/d994d76d45e474b3e4a31fff8250c30efef6a757).
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/525/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86924/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86924/testReport)** for PR 20464 at commit [`95c8a4e`](https://github.com/apache/spark/commit/95c8a4e48e8f760bb9ca0df844136d19452521d7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86924/testReport)** for PR 20464 at commit [`95c8a4e`](https://github.com/apache/spark/commit/95c8a4e48e8f760bb9ca0df844136d19452521d7).
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/473/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464
> One followup question is though, would it be difficult to match the behaviour with substr in R when the index is 0 or minus? If i understood #20464 (comment) correctly, it sounds better to match it to substr's behaviour in R. Took a quick look/test and seems we can just set start to 1 for both cases.

If we consider both the starting and ending indices, setting them to 1 is not enough. E.g.,
```R
> substr("abcdef", -2, -3)
[1] ""
> substr("abcdef", 1, 1)
[1] "a"
```
When only the ending index is zero or negative, the result is an empty string regardless of the starting index. When only the starting index is zero or negative, we can set it to 1. When both are zero or negative, the result is an empty string. We can address this in another PR.
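The three cases above can be sketched in base R, without a Spark session. This is only an illustration under stated assumptions: `to_pos_len`, `substr_pos_len`, and `check` are hypothetical helper names, and the (position, length) pair merely stands in for a call that takes a starting position and a substring length.

```R
# Hypothetical helper: map R-style (start, stop) to a (pos, len) pair,
# following the three cases described in the comment above.
to_pos_len <- function(start, stop) {
  if (stop <= 0) {
    # Ending index is zero/negative: the result is always the empty string.
    return(list(pos = 1L, len = 0L))
  }
  # Only the starting index is zero/negative: clamp it to 1.
  start <- max(start, 1L)
  list(pos = as.integer(start), len = as.integer(max(stop - start + 1L, 0L)))
}

# Emulate a position-plus-length substring using base R's substr().
substr_pos_len <- function(x, pos, len) {
  if (len <= 0) "" else substr(x, pos, pos + len - 1L)
}

# Compare the emulated result against base R's substr() for the same inputs.
check <- function(x, start, stop) {
  pl <- to_pos_len(start, stop)
  identical(substr_pos_len(x, pl$pos, pl$len), substr(x, start, stop))
}
```

For example, `check("abcdef", -2, -3)` and `check("abcdef", 0, 3)` compare the sketch against base R for the zero/negative cases discussed here.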
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20464 I am testing with:
```R
df <- createDataFrame(list(list(a="abcdef")))
collect(select(df, substr(df$a, 4, 5)))
substr("abcdef", 4, 5)
```
in case it helps to check and reproduce.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20464 I was just manually double-checking both substr in R and this. It seems correct; however, I think we should add a note in the doc and release notes ... One follow-up question, though: would it be difficult to match the behaviour of substr in R when the index is 0 or negative? If I understood https://github.com/apache/spark/pull/20464#issuecomment-362150632 correctly, it sounds better to match substr's behaviour in R. I took a quick look/test, and it seems we can just set `start` to 1 for both cases. If we are not sure about this follow-up question yet, I think we might be okay as is.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 @shivaram Thanks for pointing it out. I changed the description; hopefully it is clearer now. Basically, I just want to clarify why the R substr tests were correct previously.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86908/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86908/testReport)** for PR 20464 at commit [`a2ffdc1`](https://github.com/apache/spark/commit/a2ffdc14ebfa67656e3598f0a0a0131f18f98aa5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/20464 Thanks for clarifying, @viirya. Is the PR description accurate? I read it as `..SQL's substr also accepts zero-based starting position` while R uses a 1-based starting position.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 @shivaram This fix is to make it correctly 1-based. Previously, the SparkR substr API subtracted 1 from the starting position, so it became zero-based. This fix matches R's substr in the above link, as I tested:
```R
> substr("Michael", 4, 6)
[1] "hae"
```
Before this fix, SparkR's substr returned "cha".
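The off-by-one described above can be reproduced in base R alone. This is a minimal sketch under stated assumptions: `sql_like_substr`, `old_mapping`, and `new_mapping` are hypothetical names, the helper only emulates a 1-based call taking a position and a length, and the length formula `stop - start + 1` is assumed rather than taken from the SparkR source.

```R
# Hypothetical stand-in for a 1-based substr(pos, len) call, emulated with base R.
sql_like_substr <- function(x, pos, len) {
  substr(x, pos, pos + len - 1L)
}

# Assumed pre-fix mapping: the starting position is reduced by 1.
old_mapping <- function(x, start, stop) {
  sql_like_substr(x, start - 1L, stop - start + 1L)
}

# Assumed post-fix mapping: the starting position is passed through unchanged.
new_mapping <- function(x, start, stop) {
  sql_like_substr(x, start, stop - start + 1L)
}

old_result <- old_mapping("Michael", 4, 6)   # shifted one character to the left
new_result <- new_mapping("Michael", 4, 6)   # same span as base R
base_result <- substr("Michael", 4, 6)
```

Under these assumptions, `new_result` agrees with `base_result`, while `old_result` starts one character earlier, which is the discrepancy reported in the comment.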
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/20464 One thing to keep in mind is what the user's perception of the API is. If R users are going to use 1-based indexing, then this might not be the right fix? http://stat.ethz.ch/R-manual/R-devel/library/base/html/substr.html is the base R function, FWIW.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20464 Also @shivaram
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/459/ Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20464 Merged build finished. Test PASSed.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 cc @felixcheung @HyukjinKwon
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20464 One more thing to note is that the two parameters of R's substr API (starting and ending positions) are also not aligned with Scala's substr, which takes a starting position and a substring length.
[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20464 **[Test build #86908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86908/testReport)** for PR 20464 at commit [`a2ffdc1`](https://github.com/apache/spark/commit/a2ffdc14ebfa67656e3598f0a0a0131f18f98aa5).