[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76444282 --- Diff: R/pkg/NAMESPACE --- @@ -363,4 +363,9 @@ S3method(structField, jobj) S3method(structType, jobj) S3method(structType, structField) +export("newJObject") +export("callJMethod") +export("callJStatic") +export("cleanup.jobj") --- End diff -- Thanks @sun-rui and @aloknsingh - I'll update the PR today --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522943 --- Diff: R/pkg/R/backend.R --- @@ -25,9 +25,23 @@ isInstanceOf <- function(jobj, className) { callJMethod(cls, "isInstance", jobj) } -# Call a Java method named methodName on the object -# specified by objId. objId should be a "jobj" returned -# from the SparkRBackend. +#' Call Java Methods +#' +#' Call a Java method in the JVM running the Spark driver. +#' +#' @param objId object to invoke the method on. Should be a "jobj" created by newJObject. +#' @param methodName method name to call. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJStatic, newJObject --- End diff -- Done
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522940 --- Diff: R/pkg/R/backend.R --- @@ -25,9 +25,23 @@ isInstanceOf <- function(jobj, className) { callJMethod(cls, "isInstance", jobj) } -# Call a Java method named methodName on the object -# specified by objId. objId should be a "jobj" returned -# from the SparkRBackend. +#' Call Java Methods +#' +#' Call a Java method in the JVM running the Spark driver. +#' +#' @param objId object to invoke the method on. Should be a "jobj" created by newJObject. --- End diff -- Done
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522957 --- Diff: R/pkg/R/backend.R --- @@ -37,12 +51,42 @@ callJMethod <- function(objId, methodName, ...) { invokeJava(isStatic = FALSE, objId$id, methodName, ...) } -# Call a static method on a specified className +#' Call Static Java Methods +#' +#' Call a static method in the JVM running the Spark driver. +#' +#' @param className class containing the static method to invoke. +#' @param methodName name of static method to invoke. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJMethod, newJObject --- End diff -- Done
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522952 --- Diff: R/pkg/R/backend.R --- @@ -25,9 +25,23 @@ isInstanceOf <- function(jobj, className) { callJMethod(cls, "isInstance", jobj) } -# Call a Java method named methodName on the object -# specified by objId. objId should be a "jobj" returned -# from the SparkRBackend. +#' Call Java Methods +#' +#' Call a Java method in the JVM running the Spark driver. +#' +#' @param objId object to invoke the method on. Should be a "jobj" created by newJObject. +#' @param methodName method name to call. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJStatic, newJObject +#' @examples --- End diff -- Added the `@note` and `@return` to all the methods. I am not sure we need the `@aliases` or `@rdname` because these are S3 methods? The rdnames come out as `sparkR.callJMethod.rd` etc. and look fine to me.
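To give readers of this thread a concrete picture of the API being documented above, here is a minimal usage sketch of `callJMethod` together with `newJObject` (a hypothetical session, not taken from the PR; it assumes SparkR is attached and a Spark JVM has been started via `sparkR.session()`):

```r
library(SparkR)
sparkR.session()  # the backend JVM must be running before any callJ* function

# Create a java.util.HashMap in the driver JVM and invoke instance methods
# on the returned "jobj" reference
jmap <- newJObject("java.util.HashMap")
callJMethod(jmap, "put", "count", 1L)
callJMethod(jmap, "get", "count")  # the stored value is deserialized back to R
```

Note that `jmap` is only a reference to the object on the JVM heap; nothing is copied to R until a method returns a value with an R equivalent.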
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522953 --- Diff: R/pkg/R/backend.R --- @@ -37,12 +51,42 @@ callJMethod <- function(objId, methodName, ...) { invokeJava(isStatic = FALSE, objId$id, methodName, ...) } -# Call a static method on a specified className +#' Call Static Java Methods +#' +#' Call a static method in the JVM running the Spark driver. +#' +#' @param className class containing the static method to invoke. --- End diff -- Done
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522977 --- Diff: R/pkg/R/backend.R --- @@ -37,12 +51,42 @@ callJMethod <- function(objId, methodName, ...) { invokeJava(isStatic = FALSE, objId$id, methodName, ...) } -# Call a static method on a specified className +#' Call Static Java Methods +#' +#' Call a static method in the JVM running the Spark driver. +#' +#' @param className class containing the static method to invoke. +#' @param methodName name of static method to invoke. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJMethod, newJObject +#' @examples +#' \dontrun{ +#' sparkR.session() # Need to have a Spark JVM running before calling callJStatic
+#' callJStatic("java.lang.System", "currentTimeMillis")
+#' callJStatic("java.lang.System", "getProperty", "java.home")
+#' } callJStatic <- function(className, methodName, ...) { invokeJava(isStatic = TRUE, className, methodName, ...) } -# Create a new object of the specified class name +#' Create Java Objects +#' +#' Create a new Java object in the JVM running the Spark driver. +#' +#' @param className name of the class to create --- End diff -- I tried to qualify this in the comment on what gets returned. The trouble is that we don't have externally visible documentation of what types will get converted vs. what will not. Also I think this is bound to change with versions (like the SQL decimal change, for example). However, this is a very low-level API that is only for advanced developers, so I wonder if we should just leave a pointer to the source file?
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522986 --- Diff: R/pkg/R/backend.R --- @@ -37,12 +51,42 @@ callJMethod <- function(objId, methodName, ...) { invokeJava(isStatic = FALSE, objId$id, methodName, ...) } -# Call a static method on a specified className +#' Call Static Java Methods +#' +#' Call a static method in the JVM running the Spark driver. +#' +#' @param className class containing the static method to invoke. +#' @param methodName name of static method to invoke. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJMethod, newJObject +#' @examples +#' \dontrun{ +#' sparkR.session() # Need to have a Spark JVM running before calling callJStatic +#' callJStatic("java.lang.System", "currentTimeMillis") +#' callJStatic("java.lang.System", "getProperty", "java.home") +#' } callJStatic <- function(className, methodName, ...) { invokeJava(isStatic = TRUE, className, methodName, ...) } -# Create a new object of the specified class name +#' Create Java Objects +#' +#' Create a new Java object in the JVM running the Spark driver. +#' +#' @param className name of the class to create +#' @param ... arguments to be passed to the constructor +#' @export +#' @seealso callJMethod, callJStatic --- End diff -- Done for the `seealso` and `return` - let's discuss the `alias` and `rdname` in the other thread
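As the `seealso` links suggest, the three entry points are typically used together. A hedged sketch (requires a live `sparkR.session()`; `StringBuilder` is just an illustrative class, not one used in the PR):

```r
# newJObject forwards its remaining arguments to a matching Java constructor
sb <- newJObject("java.lang.StringBuilder", "spark")
callJMethod(sb, "append", "R")  # mutates the JVM-side object
callJMethod(sb, "toString")

# Static methods need only the fully qualified class name
callJStatic("java.lang.System", "getProperty", "java.version")
```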
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522987 --- Diff: R/pkg/R/jobj.R --- @@ -82,7 +82,20 @@ getClassName.jobj <- function(x) { callJMethod(cls, "getName") } -cleanup.jobj <- function(jobj) { +#' Garbage collect Java Objects +#' +#' Garbage collect an object allocated on Spark driver JVM heap. +#' +#' cleanup.jobj is a low level method that lets developers manually garbage collect objects +#' allocated using newJObject. This is only to be used for advanced use cases as objects allocated +#' on the JVM heap are automatically garbage collected when the corresponding R reference goes out +#' of scope. +#' +#' @param x the Java object that should be garbage collected. +#' @note cleanup.jobj since 2.0.1 --- End diff -- This is removed now
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522995 --- Diff: R/pkg/R/jobj.R --- @@ -82,7 +82,20 @@ getClassName.jobj <- function(x) { callJMethod(cls, "getName") } -cleanup.jobj <- function(jobj) { +#' Garbage collect Java Objects +#' +#' Garbage collect an object allocated on Spark driver JVM heap. +#' +#' cleanup.jobj is a low level method that lets developers manually garbage collect objects +#' allocated using newJObject. This is only to be used for advanced use cases as objects allocated +#' on the JVM heap are automatically garbage collected when the corresponding R reference goes out +#' of scope. +#' +#' @param x the Java object that should be garbage collected. +#' @note cleanup.jobj since 2.0.1 +#' @export +cleanup.jobj <- function(x) { --- End diff -- Yeah, that's a good point - but I've removed this as we don't need to export this for now
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76522998 --- Diff: R/pkg/inst/tests/testthat/test_jvm_api.R --- @@ -0,0 +1,41 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +
+context("JVM API")
+
+sparkSession <- sparkR.session(enableHiveSupport = FALSE)
+
+test_that("Create and call methods on object", {
+  jarr <- newJObject("java.util.ArrayList")
+  # Add an element to the array
+  callJMethod(jarr, "add", 1L)
+  # Check if get returns the same element
+  expect_equal(callJMethod(jarr, "get", 0L), 1L)
+})
+
+test_that("Call static methods", {
+  # Convert a boolean to a string
+  strTrue <- callJStatic("java.lang.String", "valueOf", TRUE)
+  expect_equal(strTrue, "true")
+})
+
+test_that("Manually garbage collect objects", {
+  jarr <- newJObject("java.util.ArrayList")
+  cleanup.jobj(jarr)
+  # Using a jobj after GC should throw an error
+  expect_error(print(jarr), "Error in invokeJava.*")
+}) --- End diff -- Done
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76523010 --- Diff: R/pkg/R/jobj.R --- @@ -82,7 +82,20 @@ getClassName.jobj <- function(x) { callJMethod(cls, "getName") } -cleanup.jobj <- function(jobj) { +#' Garbage collect Java Objects +#' +#' Garbage collect an object allocated on Spark driver JVM heap. +#' +#' cleanup.jobj is a low level method that lets developers manually garbage collect objects +#' allocated using newJObject. This is only to be used for advanced use cases as objects allocated +#' on the JVM heap are automatically garbage collected when the corresponding R reference goes out +#' of scope. +#' +#' @param x the Java object that should be garbage collected. +#' @note cleanup.jobj since 2.0.1 +#' @export +cleanup.jobj <- function(x) { + jobj <- x if (isValidJobj(jobj)) { --- End diff -- I guess since this is back to an internal method we don't need to worry about it. But I can add the check if we still want it?
[GitHub] spark issue #14775: [SPARK-16581][SPARKR] Make JVM backend calling functions...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14775 @felixcheung Good point about having a wrapper -- That will make it easier to update the methods going forward. I added a new file `jvm.R` with the wrapper functions.
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76526127 --- Diff: R/pkg/DESCRIPTION --- @@ -11,7 +11,7 @@ Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), email = "felixche...@apache.org"), person(family = "The Apache Software Foundation", role = c("aut", "cph"))) URL: http://www.apache.org/ http://spark.apache.org/ -BugReports: https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12315420&components=12325400&issuetype=4 +BugReports: http://issues.apache.org/jira/browse/SPARK --- End diff -- Other than being long, that link also doesn't seem to work if you are not logged in (would be good if you can also check this). The other thing we could do is to just link to the wiki page at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingBugReports -- Do you think that is better?
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 Thanks @HyukjinKwon - I for one would at least like the SparkR client to work on Windows, as I think there are quite a few R users who are on Windows. @HyukjinKwon Could you add some documentation on how this build works, where we can check its status, and what we can do to restart a build, etc.?
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76645788 --- Diff: R/pkg/DESCRIPTION --- @@ -11,7 +11,7 @@ Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), email = "felixche...@apache.org"), person(family = "The Apache Software Foundation", role = c("aut", "cph"))) URL: http://www.apache.org/ http://spark.apache.org/ -BugReports: https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12315420&components=12325400&issuetype=4 +BugReports: http://issues.apache.org/jira/browse/SPARK --- End diff -- I updated the wiki - Let me know if it looks better or if you have other suggestions.
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76654243 --- Diff: R/pkg/R/backend.R --- @@ -37,12 +51,42 @@ callJMethod <- function(objId, methodName, ...) { invokeJava(isStatic = FALSE, objId$id, methodName, ...) } -# Call a static method on a specified className +#' Call Static Java Methods +#' +#' Call a static method in the JVM running the Spark driver. +#' +#' @param className class containing the static method to invoke. +#' @param methodName name of static method to invoke. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJMethod, newJObject +#' @examples +#' \dontrun{ +#' sparkR.session() # Need to have a Spark JVM running before calling callJStatic +#' callJStatic("java.lang.System", "currentTimeMillis") +#' callJStatic("java.lang.System", "getProperty", "java.home") +#' } callJStatic <- function(className, methodName, ...) { invokeJava(isStatic = TRUE, className, methodName, ...) } -# Create a new object of the specified class name +#' Create Java Objects +#' +#' Create a new Java object in the JVM running the Spark driver. +#' +#' @param className name of the class to create --- End diff -- I added some comments in a `Details` section. Let me know if it reads ok. I agree that having a md file might be useful in the long run.
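The conversion rules discussed in this thread are only sketched in that `Details` section, but for common primitives the behavior is predictable. A hedged illustration (an observation about the serializer at the time of this PR, not a guarantee: R logical, integer, double, and character map to Java boolean, int, double, and String respectively):

```r
# R arguments are serialized to the matching Java primitive types
callJStatic("java.lang.String", "valueOf", TRUE)  # logical -> boolean, yields "true"
callJStatic("java.lang.String", "valueOf", 1L)    # integer -> int
callJStatic("java.lang.String", "valueOf", 2.5)   # double  -> double

# Return values with no direct R equivalent stay on the R side as "jobj"
# references, which can be passed back into callJMethod
arr <- newJObject("java.util.ArrayList")
class(arr)  # "jobj"
```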
[GitHub] spark pull request #14775: [SPARK-16581][SPARKR] Make JVM backend calling fu...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14775#discussion_r76654300 --- Diff: R/pkg/R/backend.R --- @@ -25,9 +25,23 @@ isInstanceOf <- function(jobj, className) { callJMethod(cls, "isInstance", jobj) } -# Call a Java method named methodName on the object -# specified by objId. objId should be a "jobj" returned -# from the SparkRBackend. +#' Call Java Methods +#' +#' Call a Java method in the JVM running the Spark driver. +#' +#' @param objId object to invoke the method on. Should be a "jobj" created by newJObject. +#' @param methodName method name to call. +#' @param ... parameters to pass to the Java method. +#' @export +#' @seealso callJStatic, newJObject +#' @examples --- End diff -- Added `rdname` for all 3 methods now.
[GitHub] spark issue #14775: [SPARK-16581][SPARKR] Make JVM backend calling functions...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14775 Thanks @felixcheung - Addressed both the comments. Let me know if this looks good.
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SparkR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 Good point on only running this on the master branch. We could even run it periodically (say nightly) instead of on every commit.
[GitHub] spark issue #14744: [SPARK-17178][SPARKR][SPARKSUBMIT] Allow to set sparkr s...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14744 Jenkins, retest this please
[GitHub] spark issue #14889: [SPARK-17326][SPARKR] Fix tests with HiveContext in Spar...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14889 Thanks @HyukjinKwon - This is a great catch. LGTM pending tests.
[GitHub] spark issue #14890: [MINOR][SPARKR] Verbose build comment in WINDOWS.md rath...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14890 LGTM. Thanks @HyukjinKwon - Merging this to master
[GitHub] spark issue #14889: [SPARK-17326][SPARKR] Fix tests with HiveContext in Spar...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14889 I just ran this on one of the Jenkins machines
```
> packageVersion("testthat")
[1] ‘0.11.0’
```
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 Thanks @HyukjinKwon - Some comments:
- I think we can filter commits by `[SPARKR]` in the PR title. Changes to core/ or sql/ could of course break the SparkR tests, but it'll give us some coverage for R PRs.
- The periodic master branch build sounds good to me. Something like nightly might be good enough?
- Could you convert your comment describing how to set up / use AppVeyor into a markdown file in `dev/`?
- I created a JIRA to track the failing tests at https://issues.apache.org/jira/browse/SPARK-17339
[GitHub] spark pull request #14784: [SPARK-17210][SPARKR] sparkr.zip is not distribut...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14784#discussion_r77031075 --- Diff: R/pkg/R/sparkR.R --- @@ -365,6 +365,10 @@ sparkR.session <- function( } overrideEnvs(sparkConfigMap, paramMap) } + if (nzchar(master)) { +assign("spark.master", master, envir = sparkConfigMap) --- End diff -- Yeah, this is a tough one - I think preferring `master=` makes sense, as it follows from the convention that explicit arguments take precedence over more implicit ones passed as a list or read from a file. Also, I think mimicking what `bin/spark-shell` does is probably good for consistency. The other option is to just throw an error and let the user fix this?
[GitHub] spark issue #14889: [SPARK-17326][SPARKR] Fix tests with HiveContext in Spar...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14889 Yeah let's open a separate JIRA to update testthat on the Jenkins boxes. LGTM. Merging this to master and branch-2.0
[GitHub] spark pull request #14784: [SPARK-17210][SPARKR] sparkr.zip is not distribut...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14784#discussion_r77073459 --- Diff: R/pkg/R/sparkR.R --- @@ -365,6 +365,10 @@ sparkR.session <- function( } overrideEnvs(sparkConfigMap, paramMap) } + if (nzchar(master)) { +assign("spark.master", master, envir = sparkConfigMap) --- End diff -- That just seems like an ordering issue ? For example

```
> val a = SparkSession.builder().config("spark.master", "local[3]").master("yarn").getOrCreate()
> a.conf.get("spark.master")
res1: String = yarn
```
[GitHub] spark issue #14903: [SparkR][Minor] Fix windowPartitionBy example
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14903 @junyangq @felixcheung The test error is due to an unrelated issue caused when we upgraded testthat on Jenkins. I'm sending a fix for it now
[GitHub] spark pull request #14904: [SPARKR][SPARK-16581] Fix JVM API tests in SparkR
GitHub user shivaram opened a pull request: https://github.com/apache/spark/pull/14904 [SPARKR][SPARK-16581] Fix JVM API tests in SparkR

## What changes were proposed in this pull request?
Remove cleanup.jobj test. Use JVM wrapper API for other test cases.

## How was this patch tested?
Run R unit tests with testthat 1.0

You can merge this pull request into a Git repository by running: $ git pull https://github.com/shivaram/spark-1 sparkr-jvm-tests-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14904.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14904
[GitHub] spark issue #14903: [SparkR][Minor] Fix windowPartitionBy example
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14903 The fix is in #14904
[GitHub] spark issue #14904: [SPARKR][SPARK-16581] Fix JVM API tests in SparkR
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14904 cc @junyangq @felixcheung
[GitHub] spark issue #14904: [SPARKR][SPARK-16581] Fix JVM API tests in SparkR
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14904 Thanks - merging this to master, branch-2.0
[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14783 Sorry I think this was a break that I just fixed in #14904. Jenkins, retest this please.
[GitHub] spark issue #14903: [SparkR][Minor] Fix windowPartitionBy example
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14903 Jenkins, retest this please
[GitHub] spark issue #14903: [SparkR][Minor] Fix windowPartitionBy example
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14903 Merging this into master and branch-2.0
[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14783 Jenkins, retest this please
[GitHub] spark issue #14856: [SPARK-17241][SparkR][MLlib] SparkR spark.glm should hav...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14856 Thanks @keypointt for the PR and @junyangq @felixcheung for reviewing. Merging this into master
[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14783 @sun-rui Any other comments ?
[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14433#discussion_r77205301 --- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala --- @@ -135,7 +136,12 @@ private[spark] trait Logging { val replLevel = Option(replLogger.getLevel()).getOrElse(Level.WARN) if (replLevel != rootLogger.getEffectiveLevel()) { System.err.printf("Setting default log level to \"%s\".\n", replLevel) - System.err.println("To adjust logging level use sc.setLogLevel(newLevel).") + if (SparkSubmit.isRShell) { --- End diff -- +1 -- I personally find it odd that we would import `deploy.SparkSubmit` into `Logging` which is a base class used throughout the code base. We could also change the log message to be more vague. Something like `To adjust logging level please call setLogLevel(newLevel) using SparkContext` might work for all languages ?
[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14433#discussion_r77261262 --- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala --- @@ -135,7 +136,12 @@ private[spark] trait Logging { val replLevel = Option(replLogger.getLevel()).getOrElse(Level.WARN) if (replLevel != rootLogger.getEffectiveLevel()) { System.err.printf("Setting default log level to \"%s\".\n", replLevel) - System.err.println("To adjust logging level use sc.setLogLevel(newLevel).") + if (SparkSubmit.isRShell) { --- End diff -- Sounds good to me
[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13584 LGTM - @felixcheung Feel free to merge when it's ready
[GitHub] spark issue #14935: [SPARK-17376][SPARKR] Spark version should be available ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14935 LGTM. I actually think this might also be useful in 2.0.1 ? cc @junyangq
[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14433 LGTM. Thanks @wangmiao1981 -- I'll keep this open for a bit to see if there are any more comments.
[GitHub] spark pull request #14934: [SPARKR][DOC] regexp_extract should doc that it r...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14934#discussion_r77375452 --- Diff: R/pkg/R/functions.R --- @@ -2876,7 +2876,8 @@ setMethod("randn", signature(seed = "numeric"), #' regexp_extract #' -#' Extract a specific(idx) group identified by a java regex, from the specified string column. +#' Extract a specific \code{idx} group identified by a Java regex, from the specified string column. --- End diff -- Minor comment: It looks like `idx` was removed in the comment change in https://github.com/apache/spark/pull/14525 -- I'm not sure if it matters, but we could either keep it consistent with the Scala/Python doc or keep the `idx` if you think it helps.
[GitHub] spark issue #14935: [SPARK-17376][SPARKR] Spark version should be available ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14935 Merged into master, branch-2.0
[GitHub] spark pull request #14934: [SPARKR][DOC] regexp_extract should doc that it r...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14934#discussion_r77381083 --- Diff: R/pkg/R/functions.R --- @@ -2876,7 +2876,8 @@ setMethod("randn", signature(seed = "numeric"), #' regexp_extract #' -#' Extract a specific(idx) group identified by a java regex, from the specified string column. +#' Extract a specific \code{idx} group identified by a Java regex, from the specified string column. --- End diff -- Cool sounds good.
[GitHub] spark issue #14934: [SPARKR][DOC] regexp_extract should doc that it returns ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14934 LGTM. Merging into master, branch-2.0
[GitHub] spark issue #14939: [SPARK-17376][SPARKR] followup - change since version
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14939 Ah thanks - I didn't notice this while merging the earlier PR.
[GitHub] spark issue #14942: [SparkR][Minor] Fix docs for sparkR.session and count
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14942 cc @felixcheung
[GitHub] spark issue #14942: [SparkR][Minor] Fix docs for sparkR.session and count
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14942 Thanks - Merging this into master, branch-2.0
[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14433 Merged into master
[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14433#discussion_r77443065 --- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala --- @@ -135,7 +135,8 @@ private[spark] trait Logging { val replLevel = Option(replLogger.getLevel()).getOrElse(Level.WARN) if (replLevel != rootLogger.getEffectiveLevel()) { System.err.printf("Setting default log level to \"%s\".\n", replLevel) - System.err.println("To adjust logging level use sc.setLogLevel(newLevel).") + System.err.println("To adjust logging level use sc.setLogLevel(newLevel). " + + "For SparkR, use setLogLevel(newLevel).") --- End diff -- I fixed this up during the merge
[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13690 @vectorijk Is this ready for another round of review ?
[GitHub] spark pull request #14960: [WIP][SPARK-17339][SPARKR][CORE] Fix Windows path...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77543535 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1900,7 +1900,20 @@ private[spark] object Utils extends Logging { */ def resolveURI(path: String): URI = { try { - val uri = new URI(path) + val osSafePath = if (Path.isWindowsAbsolutePath(path, false)) { +// Make sure C:/ part becomes /C/. +val windowsUri = new URI(path) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" + } else if (Path.isWindowsAbsolutePath(path, true)) { +// Make sure /C:/ part becomes /C/. +val windowsUri = new URI(path.substring(1)) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" --- End diff -- When we pass in a path from R to Scala we first call `normalizePath` and then pass the output of that to the JVM [1, 2]. So the strings in R look something like

```
> d <- normalizePath("./README.md")
> d
[1] "C:\\Users\\shivaram\\Downloads\\spark-1-sparkr-win-tests-fix\\spark-1-sparkr-win-tests-fix\\README.md"
```

Now there is an option in `normalizePath` to use `/` instead for Windows alone. Then the output looks something like

```
> d <- normalizePath("./README.md", winslash="/")
> d
[1] "C:/Users/shivaram/Downloads/spark-1-sparkr-win-tests-fix/spark-1-sparkr-win-tests-fix/README.md"
```

If the second one works better with `resolveURI` we can just change the R code to pass in this option. [1] https://github.com/apache/spark/blob/6d86403d8b252776effcddd71338b4d21a224f9b/R/pkg/R/context.R#L75 [2] https://github.com/apache/spark/blob/6d86403d8b252776effcddd71338b4d21a224f9b/R/pkg/R/mllib.R#L905
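The `winslash` behavior described above is plain separator substitution. A minimal Python sketch of the assumed transformation (illustrative only; in SparkR the real work is done by R's `normalizePath`):

```python
# Illustrative model of what R's normalizePath(winslash = "/") does to a
# Windows path: replace every backslash with a forward slash.
# This is an assumption-level sketch, not SparkR code.
def to_winslash(path: str) -> str:
    """Return the path with backslashes replaced by forward slashes."""
    return path.replace("\\", "/")

print(to_winslash(r"C:\Users\shivaram\Downloads\spark\README.md"))
# C:/Users/shivaram/Downloads/spark/README.md
```

The forward-slash form is what a URI parser can consume without treating `\\` as opaque characters.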
[GitHub] spark pull request #14960: [WIP][SPARK-17339][SPARKR][CORE] Fix Windows path...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77544303 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1900,7 +1900,20 @@ private[spark] object Utils extends Logging { */ def resolveURI(path: String): URI = { try { - val uri = new URI(path) + val osSafePath = if (Path.isWindowsAbsolutePath(path, false)) { +// Make sure C:/ part becomes /C/. +val windowsUri = new URI(path) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" + } else if (Path.isWindowsAbsolutePath(path, true)) { +// Make sure /C:/ part becomes /C/. +val windowsUri = new URI(path.substring(1)) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" --- End diff -- BTW if we can get to the test errors in https://gist.github.com/HyukjinKwon/0c42b2c208e06c59525d91087252d9b0 it will be very cool. The only remaining errors are from the LDA tests and that is about not locating the test data file -- that seems like a different bug that we can fix in a follow up
[GitHub] spark pull request #14960: [WIP][SPARK-17339][SPARKR][CORE] Fix Windows path...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77544947 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1900,7 +1900,20 @@ private[spark] object Utils extends Logging { */ def resolveURI(path: String): URI = { try { - val uri = new URI(path) + val osSafePath = if (Path.isWindowsAbsolutePath(path, false)) { +// Make sure C:/ part becomes /C/. +val windowsUri = new URI(path) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" + } else if (Path.isWindowsAbsolutePath(path, true)) { +// Make sure /C:/ part becomes /C/. +val windowsUri = new URI(path.substring(1)) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" --- End diff -- For the LDA test we can try a fix that looks like https://github.com/shivaram/spark-1/commit/5aaa322c6c4b567df6c89686fe2e2a65f55e1540#diff-51c07c6af7649f6c021e5a5438e31a4f
[GitHub] spark pull request #14960: [WIP][SPARK-17339][SPARKR][CORE] Fix Windows path...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77572574 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1900,7 +1900,20 @@ private[spark] object Utils extends Logging { */ def resolveURI(path: String): URI = { try { - val uri = new URI(path) + val osSafePath = if (Path.isWindowsAbsolutePath(path, false)) { +// Make sure C:/ part becomes /C/. +val windowsUri = new URI(path) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" + } else if (Path.isWindowsAbsolutePath(path, true)) { +// Make sure /C:/ part becomes /C/. +val windowsUri = new URI(path.substring(1)) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" --- End diff -- I see - Unless there is a reason not to handle such paths it will be good to make the change to convert `C:/../..` to valid URIs as well.
[GitHub] spark pull request #14960: [WIP][SPARK-17339][SPARKR][CORE] Fix Windows path...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77573002 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1900,7 +1900,20 @@ private[spark] object Utils extends Logging { */ def resolveURI(path: String): URI = { try { - val uri = new URI(path) + val osSafePath = if (Path.isWindowsAbsolutePath(path, false)) { +// Make sure C:/ part becomes /C/. +val windowsUri = new URI(path) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" + } else if (Path.isWindowsAbsolutePath(path, true)) { +// Make sure /C:/ part becomes /C/. +val windowsUri = new URI(path.substring(1)) +val driveLetter = windowsUri.getScheme +s"/$driveLetter/${windowsUri.getSchemeSpecificPart()}" --- End diff -- The other option is to of course just use the Hadoop `Path` class and do something like `new Path(path).toUri` -- I think they handle `C:/` correctly. I don't know if this affects other functionality though (like SPARK-11227) and we should check with @sarutak
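The drive-letter rewrite discussed in this diff -- `C:/...` or `/C:/...` becoming `/C/...` so the string parses as a valid URI path -- can be sketched in Python. This is a hypothetical model of the intended behavior, not the actual Scala implementation in `Utils.resolveURI`:

```python
import re

# Matches an absolute Windows path, with or without a leading slash,
# e.g. "C:/foo", "C:\foo", or "/C:/foo". Hypothetical helper, not Spark code.
_WIN_ABS = re.compile(r"^/?([A-Za-z]):[/\\]")

def os_safe_path(path: str) -> str:
    """Rewrite the drive letter as a leading path segment: C:/x -> /C/x."""
    m = _WIN_ABS.match(path)
    if not m:
        return path  # non-Windows paths pass through unchanged
    drive = m.group(1)
    rest = path[m.end():].replace("\\", "/")
    return f"/{drive}/{rest}"

print(os_safe_path("C:/Users/foo/README.md"))   # /C/Users/foo/README.md
print(os_safe_path("/C:/Users/foo/README.md"))  # /C/Users/foo/README.md
print(os_safe_path("/tmp/README.md"))           # /tmp/README.md
```

The point of the rewrite is that `C:` would otherwise be parsed as a URI scheme, which is why the bare `new URI(path)` call in the original code mishandles Windows paths.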
[GitHub] spark pull request #14960: [SPARK-17339][SPARKR][CORE] Fix some R tests and ...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77665699 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -992,7 +992,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli // This is a hack to enforce loading hdfs-site.xml. // See SPARK-11227 for details. -FileSystem.get(new URI(path), hadoopConfiguration) +FileSystem.get(new Path(path).toUri, hadoopConfiguration) --- End diff -- One minor question I had was how this would work with a comma separated list of file names as we allow that in textFile (for example, see https://github.com/HyukjinKwon/spark/blob/790d5b2304473555d1edf113f9bbee3034134fac/core/src/test/scala/org/apache/spark/SparkContextSuite.scala#L323)
[GitHub] spark issue #14960: [SPARK-17339][SPARKR][CORE] Fix some R tests and use Pat...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14960 Thanks @HyukjinKwon - LGTM but for a minor comment I had inline.
[GitHub] spark pull request #14960: [SPARK-17339][SPARKR][CORE] Fix some R tests and ...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14960#discussion_r77751910 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
```
@@ -992,7 +992,7 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
       // This is a hack to enforce loading hdfs-site.xml.
       // See SPARK-11227 for details.
-      FileSystem.get(new URI(path), hadoopConfiguration)
+      FileSystem.get(new Path(path).toUri, hadoopConfiguration)
```
--- End diff -- Yeah, I'm not sure what part of the URI we are using here. If it's just the scheme and authority, then I think it's fine to use that from the first path. FWIW there is a method in Hadoop to parse comma-separated path strings, but it's private [1]. IMHO this problem existed even before this PR, so I'm fine not fixing it here if that's okay with @sarutak. [1] https://hadoop.apache.org/docs/r2.7.1/api/src-html/org/apache/hadoop/mapred/FileInputFormat.html#line.467
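The point being made above — that `FileSystem.get` only consumes the scheme and authority of the URI, so taking them from the first entry of a comma-separated list is enough — can be sketched as follows. This is a Python illustration only; Hadoop's actual comma-separated path parser (the private method mentioned in the comment) is not reproduced here, and the function name is made up.

```python
from urllib.parse import urlparse

def scheme_and_authority(paths):
    """Given a comma-separated list of paths (as textFile() accepts),
    return the (scheme, authority) pair of the first entry -- the only
    parts of the URI that a FileSystem lookup actually uses."""
    first = paths.split(",")[0]
    parsed = urlparse(first)
    return parsed.scheme, parsed.netloc
```

For a list like `hdfs://nn:8020/a,hdfs://nn:8020/b` every entry shares the same filesystem, so parsing only the first one loses nothing.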
[GitHub] spark pull request #14783: SPARK-16785 R dapply doesn't return array or raw ...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14783#discussion_r77757807 --- Diff: R/pkg/R/utils.R ---
```
@@ -697,3 +697,18 @@ is_master_local <- function(master) {
 is_sparkR_shell <- function() {
   grepl(".*shell\\.R$", Sys.getenv("R_PROFILE_USER"), perl = TRUE)
 }
+
+# rbind a list of rows with raw (binary) columns
+#
+# @param inputData a list of rows, with each row a list
+# @return data.frame with raw columns as lists
+rbindRaws <- function(inputData){
+  row1 <- inputData[[1]]
+  rawcolumns <- ("raw" == sapply(row1, class))
+
+  listmatrix <- do.call(rbind, inputData)
```
--- End diff -- Do you know what happens if we have a mixed set of columns here? i.e. say one column with "raw", one with "integer" and one with "character" -- From reading some docs it looks like everything is converted to create a `character` matrix when we use `rbind`. I think we have two choices if that's the case: (a) we apply the type conversions after `rbind`, or (b) we only call this method when all columns are `raw`.
[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14783 Sorry for the delay @clarkfitzg - The code change looks pretty good to me. I just had one question about mixed type columns.
[GitHub] spark pull request #14783: SPARK-16785 R dapply doesn't return array or raw ...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14783#discussion_r77762250 --- Diff: R/pkg/R/utils.R ---
```
@@ -697,3 +697,18 @@ is_master_local <- function(master) {
 is_sparkR_shell <- function() {
   grepl(".*shell\\.R$", Sys.getenv("R_PROFILE_USER"), perl = TRUE)
 }
+
+# rbind a list of rows with raw (binary) columns
+#
+# @param inputData a list of rows, with each row a list
+# @return data.frame with raw columns as lists
+rbindRaws <- function(inputData){
+  row1 <- inputData[[1]]
+  rawcolumns <- ("raw" == sapply(row1, class))
+
+  listmatrix <- do.call(rbind, inputData)
```
--- End diff -- I was looking at https://stat.ethz.ch/R-manual/R-devel/library/base/html/cbind.html specifically the section `Value` which says
```
The type of a matrix result determined from the highest type of any of the
inputs in the hierarchy raw < logical < integer < double < complex < character < list.
```
[GitHub] spark pull request #14783: SPARK-16785 R dapply doesn't return array or raw ...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14783#discussion_r77763061 --- Diff: R/pkg/R/utils.R ---
```
@@ -697,3 +697,18 @@ is_master_local <- function(master) {
 is_sparkR_shell <- function() {
   grepl(".*shell\\.R$", Sys.getenv("R_PROFILE_USER"), perl = TRUE)
 }
+
+# rbind a list of rows with raw (binary) columns
+#
+# @param inputData a list of rows, with each row a list
+# @return data.frame with raw columns as lists
+rbindRaws <- function(inputData){
+  row1 <- inputData[[1]]
+  rawcolumns <- ("raw" == sapply(row1, class))
+
+  listmatrix <- do.call(rbind, inputData)
```
--- End diff -- Ah I see - the types are inside the `listmatrix`. Thanks @clarkfitzg for clarifying. Let us know once you have added the test for a single column of raw as well.
[GitHub] spark issue #14960: [SPARK-17339][SPARKR][CORE] Fix some R tests and use Pat...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14960 Looks like it passed! LGTM pending our Jenkins tests
[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14783 Thanks for the update. LGTM. Merging this to master and branch-2.0
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 Jenkins, retest this please
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 @HyukjinKwon Given that the build is green after #14960, is this good to go? Any other comments, @dongjoon-hyun @srowen?
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 FYI - @shaneknapp is helping us get the requisite packages installed to build this on Jenkins. https://issues.apache.org/jira/browse/SPARK-17420 is the relevant JIRA
[GitHub] spark issue #14993: [SPARK-15509][Follow-up][ML][SparkR] R MLlib algorithms ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14993 cc @keypointt
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 Jenkins, retest this please
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 @junyangq - Looks like we should also load SparkR in `create-docs.sh` -- The error in Jenkins is
```
Quitting from lines 22-23 (sparkr-vignettes.Rmd)
Error in library(SparkR) : there is no package called 'SparkR'
Calls: render ... withCallingHandlers -> withVisible -> eval -> eval -> library
```
[GitHub] spark pull request #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14980#discussion_r77873869 --- Diff: R/create-docs.sh ---
```
@@ -43,4 +45,7 @@
 Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knit
 popd
+
+# render creates SparkR vignettes
+Rscript -e 'library(rmarkdown); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); render("pkg/vignettes/sparkr-vignettes.Rmd")'
```
--- End diff -- Similar to the previous line, let's add `libDir <- "../../lib"; library(SparkR, lib.loc=libDir);` here. I think that should make the Jenkins build work.
[GitHub] spark pull request #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Autom...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14859#discussion_r77920251 --- Diff: appveyor.yml ---
```
@@ -0,0 +1,51 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: "{build}-{branch}"
+
+shallow_clone: true
+
+platform: x64
+configuration: Debug
+
+branches:
+  only:
+    - master
+
+only_commits:
+  files:
+    - R/
+
+cache:
+  - C:\Users\appveyor\.m2
+
+install:
+  # Install maven and dependencies
+  - ps: .\dev\appveyor-install-dependencies.ps1
+  # Required package for R unit tests
+  - cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
+
+build_script:
+  - cmd: mvn -DskipTests -Psparkr -Phive -Phive-thriftserver package
```
--- End diff -- Given that we are fixing the hadoop version that we download in `install-dependencies.ps1`, does it make sense to also pass in the same profile here? i.e. `-Phadoop-2.6`?
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 LGTM pending the inline comment about hadoop-2.6
[GitHub] spark pull request #15007: [SPARKR][SPARK-17442] Additional arguments in wri...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/15007#discussion_r77934359 --- Diff: R/pkg/R/DataFrame.R ---
```
@@ -2635,7 +2635,7 @@ setMethod("write.df",
   write <- callJMethod(df@sdf, "write")
   write <- callJMethod(write, "format", source)
   write <- callJMethod(write, "mode", jmode)
-  write <- callJMethod(write, "save", path)
+  write <- callJMethod(write, "save", options)
```
--- End diff -- I am not sure I see a `save` method which takes in `options`. What if we just call `write <- callJMethod(write, "options", options)` in the line before `write <- callJMethod(write, "save", path)`?
[GitHub] spark issue #15008: [SPARK-17339][CORE][BRANCH-2.0] Do not use path to get a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15008 Thanks - Merging this to branch-2.0
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 Thanks. I'll merge this once Jenkins passes
[GitHub] spark issue #15008: [SPARK-17339][CORE][BRANCH-2.0] Do not use path to get a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15008 @HyukjinKwon I think you might need to manually close this PR
[GitHub] spark pull request #15010: [SPARK-17442][SPARKR] Additional arguments in wri...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/15010#discussion_r77944380 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
```
@@ -243,7 +243,15 @@ test_that("read csv as DataFrame", {
   expect_equal(count(withoutna2), 3)
   expect_equal(count(where(withoutna2, withoutna2$make == "Dummy")), 0)
+
+  # writing csv file
+  csvPath2 <- tempfile(pattern = "csvtest2", fileext = ".csv")
+  write.df(df2, path = csvPath2, "csv", header = "true")
+  df3 <- read.df(csvPath2, "csv", header = "true")
```
--- End diff -- We could do this - but I was thinking we could also check if R's `read.csv` is able to read back the file correctly with headers?
[GitHub] spark pull request #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Autom...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14859#discussion_r77944576 --- Diff: appveyor.yml ---
```
@@ -0,0 +1,51 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: "{build}-{branch}"
+
+shallow_clone: true
+
+platform: x64
+configuration: Debug
+
+branches:
+  only:
+    - master
+
+only_commits:
+  files:
+    - R/
+
+cache:
+  - C:\Users\appveyor\.m2
+
+install:
+  # Install maven and dependencies
+  - ps: .\dev\appveyor-install-dependencies.ps1
+  # Required package for R unit tests
+  - cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
```
--- End diff -- That's a good point, but it's actually tricky to specify a version number using `install.packages` - FWIW on Jenkins I see
```
> packageVersion("testthat")
[1] '1.0.2'
```
[GitHub] spark pull request #15010: [SPARK-17442][SPARKR] Additional arguments in wri...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/15010#discussion_r77944989 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
```
@@ -243,7 +243,15 @@ test_that("read csv as DataFrame", {
   expect_equal(count(withoutna2), 3)
   expect_equal(count(where(withoutna2, withoutna2$make == "Dummy")), 0)
+
+  # writing csv file
+  csvPath2 <- tempfile(pattern = "csvtest2", fileext = ".csv")
+  write.df(df2, path = csvPath2, "csv", header = "true")
+  df3 <- read.df(csvPath2, "csv", header = "true")
```
--- End diff -- The path is interpreted as a directory by `write.df`. It then writes a `part-` file, or a sequence of such files, inside the directory.
[GitHub] spark pull request #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Autom...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/14859#discussion_r77945551 --- Diff: appveyor.yml ---
```
@@ -0,0 +1,51 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: "{build}-{branch}"
+
+shallow_clone: true
+
+platform: x64
+configuration: Debug
+
+branches:
+  only:
+    - master
+
+only_commits:
+  files:
+    - R/
+
+cache:
+  - C:\Users\appveyor\.m2
+
+install:
+  # Install maven and dependencies
+  - ps: .\dev\appveyor-install-dependencies.ps1
+  # Required package for R unit tests
+  - cmd: R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
+
+build_script:
+  - cmd: mvn -DskipTests -Phadoop-2.6 -Psparkr -Phive -Phive-thriftserver package
+
+test_script:
+  - cmd: .\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
+
+notifications:
```
--- End diff -- Yeah, I think https://github.com/spark-test/spark/pull/1 was an example @HyukjinKwon shared before. I think we can see the status in Github.
[GitHub] spark issue #15007: [SPARKR][SPARK-17442] Additional arguments in write.df a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15007 @falaki We can use the PR at https://github.com/apache/spark/pull/15010 and close this one?
[GitHub] spark issue #15010: [SPARK-17442][SPARKR] Additional arguments in write.df a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15010 Thanks - the new test looks good
[GitHub] spark issue #15010: [SPARK-17442][SPARKR] Additional arguments in write.df a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15010 LGTM - @falaki any more comments?
[GitHub] spark issue #15010: [SPARK-17442][SPARKR] Additional arguments in write.df a...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15010 Merging into master, branch-2.0
[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14859 Alright, I'm merging this into master. Let's see how this goes in the next few days and we can fine-tune this with some follow-up PRs if necessary.
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 It looks like the tests are failing because of a new CRAN check `NOTE` about the file `derby.log` not being cleaned up from `vignettes/`. I also noticed that `people.parquet` is created on my machine in `R/vignettes` -- Not sure if there is a better way to clean up, but maybe we could just add a line to move files that are not `Rmd` or `md` or `pdf` or `html`?
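A cleanup along the lines suggested above could look like this. This is a hypothetical shell sketch (deleting rather than moving, run from inside the vignettes directory); note that `-type f` only removes plain files, so an output *directory* such as a `people.parquet` folder would still need an explicit `rm -r`.

```shell
# Keep vignette sources and rendered docs (Rmd/md/pdf/html); delete
# build leftovers such as derby.log that trip the CRAN check NOTE.
find . -maxdepth 1 -type f \
  ! -name '*.Rmd' ! -name '*.md' ! -name '*.pdf' ! -name '*.html' \
  -delete
```

Moving the leftovers out of the tree instead of deleting them would work just as well for silencing the check.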
[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14784 @zjffdu Sorry for the delay. I think the change looks pretty good. We could just add a `logWarning` in case we find a collision? Also, could you bring this up to date with the master branch?
[GitHub] spark issue #15015: [SPARK-16445][MLlib][SparkR] Fix @return description for...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15015 @HyukjinKwon I thought it should have run the tests? Any ideas why it's not getting picked up? Is there some other set of steps that we need to do? (I'm now investigating how it works for Apache Thrift)
[GitHub] spark issue #15015: [SPARK-16445][MLlib][SparkR] Fix @return description for...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15015 I think we need to file an INFRA ticket like https://issues.apache.org/jira/browse/INFRA-11294 -- I'll file one now
[GitHub] spark issue #15015: [SPARK-16445][MLlib][SparkR] Fix @return description for...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15015 Created https://issues.apache.org/jira/browse/INFRA-12590
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 Hmm, the Jenkins error was

```
find: cannot delete `pkg/vignettes': Directory not empty
```
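That failure mode is easy to reproduce: `find`'s `-delete` refuses to remove a directory that still contains entries, so any stray file left in `pkg/vignettes` (such as the `derby.log` mentioned earlier) makes the cleanup step fail. A minimal sketch, with hypothetical paths:

```shell
mkdir -p pkg/vignettes
touch pkg/vignettes/derby.log   # a build leftover that the cleanup misses
# -name only matches the directory itself, not the file inside it, and
# -delete will not remove a non-empty directory, so find reports:
#   find: cannot delete `pkg/vignettes': Directory not empty
find pkg -name vignettes -delete || echo "cleanup failed as expected"
```

Deleting (or matching) the leftover files first, before removing the directory, avoids the error.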
[GitHub] spark issue #14980: [SPARK-17317][SparkR] Add SparkR vignette
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/14980 @junyangq @felixcheung Is this good to be merged? Also, some of the ML algorithms are not present in 2.0.0, so we might need to remove them from the vignette when backporting?
[GitHub] spark pull request #15029: [DUMMY][SPARKR] Test AppVeyor with a change in R/
GitHub user shivaram opened a pull request: https://github.com/apache/spark/pull/15029

[DUMMY][SPARKR] Test AppVeyor with a change in R/

Dummy pull request to test AppVeyor integration. Do not merge.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shivaram/spark-1 sparkr-appveyor-check

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15029.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15029

commit 48641cbd403ee165dde0061aba77de1b7c1b1028
Author: Shivaram Venkataraman
Date: 2016-09-09T18:33:23Z

    Test AppVeyor with a change in R/
[GitHub] spark pull request #15031: [DUMMY][SPARKR] Test AppVeyor with a change in R/
GitHub user shivaram opened a pull request: https://github.com/apache/spark/pull/15031

[DUMMY][SPARKR] Test AppVeyor with a change in R/

Dummy pull request to `branch-1.6` to test AppVeyor integration. Do not merge.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shivaram/spark-1 sparkr-appveyor-check-1.6

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15031.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15031

commit a8848391afed44fc68584a104348719b066f68f1
Author: Shivaram Venkataraman
Date: 2016-09-09T18:33:23Z

    Test AppVeyor with a change in R/
[GitHub] spark pull request #15031: [DUMMY][SPARKR] Test AppVeyor with a change in R/
Github user shivaram closed the pull request at: https://github.com/apache/spark/pull/15031