[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r62281151
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1126,66 @@ setMethod("summarize",
 agg(x, ...)
 })
+#' dapply
+#'
+#' Apply a function to each partition of a DataFrame.
+#'
+#' @param x A SparkDataFrame
+#' @param func A function to be applied to each partition of the SparkDataFrame.
+#' func should have only one parameter, to which a data.frame corresponds
+#' to each partition will be passed.
+#' The output of func should be a data.frame.
+#' @param schema The schema of the resulting DataFrame after the function is applied.
+#' It must match the output of func.
+#' @family SparkDataFrame functions
+#' @rdname dapply
+#' @name dapply
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame (sqlContext, iris)
+#' df1 <- dapply(df, function(x) { x }, schema(df))
+#' collect(df1)
+#'
+#' # filter and add a column
+#' df <- createDataFrame (
+#' sqlContext,
+#' list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+#' c("a", "b", "c"))
+#' schema <- structType(structField("a", "integer"), structField("b", "double"),
--- End diff --
Yeah, let me do some investigation.
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r62281013
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1126,66 @@ setMethod("summarize",
 agg(x, ...)
 })
+#' dapply
+#'
+#' Apply a function to each partition of a DataFrame.
+#'
+#' @param x A SparkDataFrame
+#' @param func A function to be applied to each partition of the SparkDataFrame.
+#' func should have only one parameter, to which a data.frame corresponds
+#' to each partition will be passed.
+#' The output of func should be a data.frame.
+#' @param schema The schema of the resulting DataFrame after the function is applied.
+#' It must match the output of func.
+#' @family SparkDataFrame functions
+#' @rdname dapply
+#' @name dapply
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame (sqlContext, iris)
+#' df1 <- dapply(df, function(x) { x }, schema(df))
+#' collect(df1)
+#'
+#' # filter and add a column
+#' df <- createDataFrame (
+#' sqlContext,
+#' list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+#' c("a", "b", "c"))
+#' schema <- structType(structField("a", "integer"), structField("b", "double"),
--- End diff --
@sun-rui - Just a note that it would be great to have the simpler schema specification for 2.0. Let me know whether you have a new JIRA or we will use 11046, so we can track it for the release.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12493
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215908392 Merging this to master
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215644198 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57312/ Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215644197 Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215644086
**[Test build #57312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57312/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215628799 **[Test build #57312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57312/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9).
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215628604 Jenkins, retest this please
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619600 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57296/ Test FAILed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619599 Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619544
**[Test build #57296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57296/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215611550 **[Test build #57296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57296/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9).
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61526656
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1981,6 +1982,23 @@ class Dataset[T] private[sql](
 }
 /**
+ * Returns a new [[DataFrame]] that contains the result of applying a serialized R function
+ * `func` to each partition.
+ *
+ * @group func
+ */
--- End diff --
Submitted https://issues.apache.org/jira/browse/SPARK-14995
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61522881
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1126,66 @@ setMethod("summarize",
 agg(x, ...)
 })
+#' dapply
+#'
+#' Apply a function to each partition of a DataFrame.
+#'
+#' @param x A SparkDataFrame
+#' @param func A function to be applied to each partition of the SparkDataFrame.
+#' func should have only one parameter, to which a data.frame corresponds
+#' to each partition will be passed.
+#' The output of func should be a data.frame.
+#' @param schema The schema of the resulting DataFrame after the function is applied.
+#' It must match the output of func.
+#' @family SparkDataFrame functions
+#' @rdname dapply
+#' @name dapply
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame (sqlContext, iris)
+#' df1 <- dapply(df, function(x) { x }, schema(df))
+#' collect(df1)
+#'
+#' # filter and add a column
+#' df <- createDataFrame (
+#' sqlContext,
+#' list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+#' c("a", "b", "c"))
+#' schema <- structType(structField("a", "integer"), structField("b", "double"),
--- End diff --
OK, I will investigate it. I will submit a new PR for this or reuse https://issues.apache.org/jira/browse/SPARK-11046
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215600099 @davies, yes, those changes are deliberately kept for future PRs, such as applyCollect().
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61522642
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1981,6 +1982,23 @@ class Dataset[T] private[sql](
 }
 /**
+ * Returns a new [[DataFrame]] that contains the result of applying a serialized R function
+ * `func` to each partition.
+ *
+ * @group func
+ */
--- End diff --
Spark 2.0 is a good chance to add "since" to the SparkR API methods, but I think we should do it consistently for all methods at once. I will submit a new JIRA issue for it.
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215536504 LGTM overall. There are still a few changes that are not needed by this PR (for example, SERIALIZED_R_DATA_SCHEMA). Are these kept for future use?
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61487677
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -158,10 +158,15 @@ object EliminateSerialization extends Rule[LogicalPlan] {
 def apply(plan: LogicalPlan): LogicalPlan = plan transform {
 case d @ DeserializeToObject(_, _, s: SerializeFromObject)
 if d.outputObjectType == s.inputObjectType =>
- // Adds an extra Project here, to preserve the output expr id of `DeserializeToObject`.
- val objAttr = Alias(s.child.output.head, "obj")(exprId = d.output.head.exprId)
- Project(objAttr :: Nil, s.child)
-
+ // A workaround for SPARK-14803. Remove this after it is fixed.
+if (d.outputObjectType.isInstanceOf[ObjectType] &&
--- End diff --
indents
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61486972
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,15 @@ broadcastElap <- elapsedSecs()
 # as number of partitions to create.
 numPartitions <- SparkR:::readInt(inputCon)
+# If true, working for RDD
--- End diff --
This comment is misleading (`isDataFrame` is obvious)
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61486517
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1126,66 @@ setMethod("summarize",
 agg(x, ...)
 })
+#' dapply
+#'
+#' Apply a function to each partition of a DataFrame.
+#'
+#' @param x A SparkDataFrame
+#' @param func A function to be applied to each partition of the SparkDataFrame.
+#' func should have only one parameter, to which a data.frame corresponds
+#' to each partition will be passed.
+#' The output of func should be a data.frame.
+#' @param schema The schema of the resulting DataFrame after the function is applied.
+#' It must match the output of func.
+#' @family SparkDataFrame functions
+#' @rdname dapply
+#' @name dapply
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame (sqlContext, iris)
+#' df1 <- dapply(df, function(x) { x }, schema(df))
+#' collect(df1)
+#'
+#' # filter and add a column
+#' df <- createDataFrame (
+#' sqlContext,
+#' list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3")),
+#' c("a", "b", "c"))
+#' schema <- structType(structField("a", "integer"), structField("b", "double"),
--- End diff --
BTW, we already have a simpler (string-based) way to define a schema in Scala and Python; we may also add that to R.
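To make the string-based idea concrete, here is a minimal base-R sketch. It is only an illustration: `parseSchemaString` is a hypothetical helper, not a SparkR function, and the `"name type"` comma-separated syntax is an assumption rather than the format Scala or Python actually accept.

```r
# Hypothetical helper (not part of SparkR): split a string such as
# "a integer, b double" into a list of name/type pairs.
parseSchemaString <- function(spec) {
  # One entry per comma-separated field, with surrounding whitespace removed.
  fields <- strsplit(spec, ",", fixed = TRUE)[[1]]
  fields <- gsub("^\\s+|\\s+$", "", fields)
  lapply(fields, function(field) {
    parts <- strsplit(field, "[[:space:]]+")[[1]]
    if (length(parts) != 2L) {
      stop("expected '<name> <type>', got: ", field)
    }
    list(name = parts[1], type = parts[2])
  })
}

fields <- parseSchemaString("a integer, b double, c string")
# Each pair could then feed structField(name, type), the call used in the
# quoted example, to build the schema without spelling out every field.
```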
Github user NarineK commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r61377331
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -1981,6 +1982,23 @@ class Dataset[T] private[sql](
 }
 /**
+ * Returns a new [[DataFrame]] that contains the result of applying a serialized R function
+ * `func` to each partition.
+ *
+ * @group func
+ */
--- End diff --
Maybe we can add an @since tag to the comment?
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215303408
**[Test build #57201 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57201/consoleFull)** for PR 12493 at commit [`2264b57`](https://github.com/apache/spark/commit/2264b57a2d5f375eae6520b492a2152be259ccaa).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215303502 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215303503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57201/ Test PASSed.
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215287520 Cool, the R code LGTM. @davies / @rxin If one of you can take a final pass at the SQL changes, this should be good to go.
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215286212 **[Test build #57201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57201/consoleFull)** for PR 12493 at commit [`2264b57`](https://github.com/apache/spark/commit/2264b57a2d5f375eae6520b492a2152be259ccaa).
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215285083 @shivaram, I changed the code. Let's wait for the test results :)
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215284725 Yes, I tried it: "stringsAsFactors" must be FALSE, because our SerDe does not support factors for now. So I am changing the code as you proposed.
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215283979
FWIW I tried the 4 lines I wrote above and it works on my machine. The code in worker.R looks something like
```
...
+if (isDataFrame) {
+  if (deserializer == "row") {
+    # Transform the list of rows into a data.frame
+    oldOpt <- getOption("stringsAsFactors")
+    options(stringsAsFactors = FALSE)
+    data <- do.call(rbind.data.frame, data)
+    options(stringsAsFactors = oldOpt)
+    names(data) <- colNames
+  } else {
...
```
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215281615 Yeah adding a comment to revisit in future sounds good.
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215281354 and add a comment for future revisit
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215281271 I am not sure whether it is necessary to set "stringsAsFactors" to FALSE; I just added it for safety. Remove it for now?
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215280314
I think the best workaround is to do something like
```
oldOpt <- getOption("stringsAsFactors")
options(stringsAsFactors = FALSE)
do.call(rbind.data.frame, data)
options(stringsAsFactors = oldOpt)
```
i.e. set the global option before calling rbind and then reset it to the previous value
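The set-then-reset pattern can be tried outside Spark with plain R. The sketch below is not the worker.R code: `rowsToDataFrame` is a hypothetical helper, and using `on.exit()` to restore the option is an assumed variation that also covers the case where `rbind` fails part-way.

```r
# Hypothetical helper (not part of SparkR): bind a list of rows into a
# data.frame while keeping character columns as character vectors.
rowsToDataFrame <- function(rows, colNames) {
  # options() returns the previous values, so they can be restored verbatim.
  oldOpt <- options(stringsAsFactors = FALSE)
  # on.exit() restores the option even if the rbind call below fails.
  on.exit(options(oldOpt), add = TRUE)
  df <- do.call(rbind.data.frame, rows)
  names(df) <- colNames
  df
}

before <- getOption("stringsAsFactors")
rows <- list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3"))
df <- rowsToDataFrame(rows, c("a", "b", "c"))
str(df)
```

Scoping the change to a function keeps the global option untouched for any caller, which matters in a long-lived worker process.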
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215279975 Aha - I think the `stringsAsFactors` argument didn't exist before. From https://cran.r-project.org/src/base/NEWS

```
CHANGES IN R 3.2.4:
  The data.frame method of rbind() gains an optional argument stringsAsFactors (instead of only depending on getOption("stringsAsFactors")).
```
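One way to see which behaviour a given R installation has is to inspect the formal arguments of `rbind.data.frame` at run time. This is a minimal sketch; the variable names are made up, and the 3.2.4 threshold is taken from the NEWS excerpt quoted above rather than verified independently.

```r
# Does this R build's rbind.data.frame() declare a stringsAsFactors argument?
has_saf_arg <- "stringsAsFactors" %in% names(formals(rbind.data.frame))

# Version check based on the NEWS entry quoted in the thread (assumption)
at_least_3_2_4 <- getRversion() >= "3.2.4"

cat("rbind.data.frame has stringsAsFactors argument:", has_saf_arg, "\n")
cat("R version is 3.2.4 or later:", at_least_3_2_4, "\n")
```

Checking the formals directly does not depend on remembering the exact release in which the argument appeared.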
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215279387 I am using R 3.2.4. I just re-ran the test and it passed. OK, let me try some older versions
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215278557 Yeah the version on Jenkins is `R version 3.1.1 (2014-07-10)` and on my laptop is `R version 3.2.1 (2015-06-18)`. I can see the error on my laptop as well
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215277993 @shivaram, is the R version on Jenkins 3.1.1? seems I need to test with it
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215208501 @sun-rui I poked around this a little bit more today. It looks like what is happening is that somehow we are creating `factor` type objects when we have strings in our dataframe. I think the problem is in the line `data <- do.call(rbind.data.frame, c(data, stringsAsFactors = FALSE))` in worker.R -- I am not sure the `stringsAsFactors = F` is being passed correctly.
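To illustrate the symptom described above, the sketch below builds a data.frame whose string column has been turned into a `factor`, then converts factor columns back to character vectors. The helper `factors_to_character` is hypothetical and is not the fix discussed in this thread; it only shows a version-independent way to guarantee that no factor columns reach a serializer.

```r
# Force the old default behaviour explicitly so the example is reproducible
# on any R version: the string column "c" becomes a factor.
df <- data.frame(a = 1:3, c = c("1", "2", "3"), stringsAsFactors = TRUE)

# Hypothetical helper (not part of SparkR): replace every factor column
# with its character representation and leave other columns untouched.
factors_to_character <- function(d) {
  d[] <- lapply(d, function(col) if (is.factor(col)) as.character(col) else col)
  d
}

fixed <- factors_to_character(df)
print(vapply(df, class, character(1)))     # column classes before conversion
print(vapply(fixed, class, character(1)))  # column classes after conversion
```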
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215049539 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57110/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215049531 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215049395 **[Test build #57110 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57110/consoleFull)** for PR 12493 at commit [`b39466c`](https://github.com/apache/spark/commit/b39466cf3ed6fcf856d73d368b4bb67f9aabd1e7). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215017351 **[Test build #57110 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57110/consoleFull)** for PR 12493 at commit [`b39466c`](https://github.com/apache/spark/commit/b39466cf3ed6fcf856d73d368b4bb67f9aabd1e7).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215016426 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215010957 **[Test build #57096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57096/consoleFull)** for PR 12493 at commit [`b39466c`](https://github.com/apache/spark/commit/b39466cf3ed6fcf856d73d368b4bb67f9aabd1e7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215011127 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215011131 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57096/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214985926 **[Test build #57096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57096/consoleFull)** for PR 12493 at commit [`b39466c`](https://github.com/apache/spark/commit/b39466cf3ed6fcf856d73d368b4bb67f9aabd1e7).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214985318 @shivaram , rebased to master. SparkR unit tests passed on my machine.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214970063 @sun-rui Do the tests pass locally for you? The error in Jenkins doesn't look like a flaky test; it comes from the `dapply` test (error pasted below). Also I still see that the PR doesn't merge cleanly. Can you bring it up to date with master?

```
1. Error: dapply() on a DataFrame --
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1154.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1154.0 (TID 9753, localhost): org.apache.spark.SparkException: R computation failed with
 [1] 2
 [1] 3
 [1] 1
 [1] 2
 [1] 1
 [1] 3
 [1] 2
 [1] 2
 [1] 2
 [1] 2
 [1] 2
 [1] 2
 Unsupported type for serialization factor
 Calls: source ... serializeRow -> writeList -> writeObject -> writeType
 Execution halted
```
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214960152 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214960153 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57066/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214960044 **[Test build #57066 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57066/consoleFull)** for PR 12493 at commit [`a88c1db`](https://github.com/apache/spark/commit/a88c1dbda92791aec49e3fd122a9f7137939ae96). * This patch **fails SparkR unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214936020 **[Test build #57066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57066/consoleFull)** for PR 12493 at commit [`a88c1db`](https://github.com/apache/spark/commit/a88c1dbda92791aec49e3fd122a9f7137939ae96).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214935436 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214658265 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56983/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214658264 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214658254 **[Test build #56983 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56983/consoleFull)** for PR 12493 at commit [`a88c1db`](https://github.com/apache/spark/commit/a88c1dbda92791aec49e3fd122a9f7137939ae96). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214656686 **[Test build #56983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56983/consoleFull)** for PR 12493 at commit [`a88c1db`](https://github.com/apache/spark/commit/a88c1dbda92791aec49e3fd122a9f7137939ae96).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214585898 @shivaram, it may be related to the workaround for SPARK-14803, let me check it
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214553151 @sun-rui It looks like some Catalyst tests are failing repeatedly. Can you check whether it's related to this PR or not?
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214542583 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214542589 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56918/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214542146 **[Test build #56918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56918/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214508934 **[Test build #56918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56918/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214507799 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214507057 **[Test build #56909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56909/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214507478 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56909/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214507476 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214473135 i guess it just likes me more than you both. ;) i'll triple-check the whitelist and see if something broke there. as usual, there's nothing in the logs except benign stack traces and garbage (which @shivaram can attest to as he looked over my shoulder).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214472422 **[Test build #56909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56909/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214470976 jenkins, test this please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214470548 it's because the github pull request builder generally sucks. :) anyways, looking into it now.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214460886 @shaneknapp Could you check why this PR isn't being picked up by Jenkins?
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214411685 Jenkins retest this please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214269773 Jenkins, retest it please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214241525 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214241528 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56889/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214241002 **[Test build #56889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56889/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214201014 @shivaram, rebased to master
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214200933 **[Test build #56889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56889/consoleFull)** for PR 12493 at commit [`8aaa91d`](https://github.com/apache/spark/commit/8aaa91d18862247b1e0e530f030b6e059fe60e53).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214139743 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214139748 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56871/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214139137 **[Test build #56871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56871/consoleFull)** for PR 12493 at commit [`bf7cc3b`](https://github.com/apache/spark/commit/bf7cc3b66d837c9a8fd36432ea5295a15ce4bcf3).
 * This patch **fails SparkR unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214100961 **[Test build #56871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56871/consoleFull)** for PR 12493 at commit [`bf7cc3b`](https://github.com/apache/spark/commit/bf7cc3b66d837c9a8fd36432ea5295a15ce4bcf3).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60858977
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -100,7 +109,24 @@ if (isEmpty != 0) {
 # Timing reading input data for execution
 inputElap <- elapsedSecs()
-output <- computeFunc(partition, data)
+if (isDataFrame) {
+  if (deserializer == "row") {
+    # Transform the list of rows into a data.frame
+    data <- do.call(rbind.data.frame, c(data, stringsAsFactors = FALSE))
+    names(data) <- colNames
+  } else {
+    # Check to see if data is a valid data.frame
+    stopifnot(class(data) == "data.frame")
+  }
+  output <- computeFunc(data)
+  if (serializer == "row") {
+    # Transform the result data.frame back to a list of rows
+    output <- split(output, seq(nrow(output)))
+  }
--- End diff --
added assert
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214097354 @NarineK, it does not seem to be necessary, but it does not hurt, does it? cc @cloud-fan
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user NarineK commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-214041344 @sun-rui, it seems that there are some recent changes in the logical plans. Specifically, there is no ObjectOperator in ../logical/object.scala any more; there are ObjectConsumer and ObjectProducer instead. I'm a little confused. It seems that ObjectConsumer extends UnaryNode, and SerializeFromObject extends UnaryNode with ObjectConsumer. Is it necessary for SerializeFromObject to extend UnaryNode explicitly?
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213540482 @sun-rui Could you bring this up to date with master? I think Jenkins is not running because it's not merging cleanly.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213542906 Pending tests passing, this is overall looking fine to me. @davies / @rxin, it would be good if you could do one more pass to see if the SQL integration is fine.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60782517
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -100,7 +109,24 @@ if (isEmpty != 0) {
 # Timing reading input data for execution
 inputElap <- elapsedSecs()
-output <- computeFunc(partition, data)
+if (isDataFrame) {
+  if (deserializer == "row") {
+    # Transform the list of rows into a data.frame
+    data <- do.call(rbind.data.frame, c(data, stringsAsFactors = FALSE))
+    names(data) <- colNames
+  } else {
+    # Check to see if data is a valid data.frame
+    stopifnot(class(data) == "data.frame")
+  }
+  output <- computeFunc(data)
+  if (serializer == "row") {
+    # Transform the result data.frame back to a list of rows
+    output <- split(output, seq(nrow(output)))
+  }
--- End diff --
Can we assert that the serializer is `byte` in the else clause? That's what we expect it to be, right?
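For readers following the thread, here is a rough, standalone sketch of the branch being discussed, with the requested assertion in the else clause. It is not the code from the PR: `runPartition` and its argument list are hypothetical names made up for illustration, `is.data.frame()` stands in for the `class(data) == "data.frame"` check, and `seq_len()` replaces `seq()` because `seq(0)` yields `c(1, 0)` for an empty result.

```r
# Illustrative sketch only. runPartition is a hypothetical wrapper, not a
# SparkR function; it mirrors the shape of the DataFrame branch in worker.R.
runPartition <- function(data, colNames, computeFunc, deserializer, serializer) {
  if (deserializer == "row") {
    # Transform the list of rows into a data.frame
    data <- do.call(rbind.data.frame, c(data, stringsAsFactors = FALSE))
    names(data) <- colNames
  } else {
    # Otherwise the input is expected to be a data.frame already
    stopifnot(is.data.frame(data))
  }
  output <- computeFunc(data)
  if (serializer == "row") {
    # Transform the result data.frame back to a list of one-row data.frames
    split(output, seq_len(nrow(output)))
  } else {
    # The assertion asked for in the review: only "byte" is expected here
    stopifnot(serializer == "byte")
    output
  }
}

# Sample rows borrowed from the dapply() roxygen example earlier in the thread.
rows <- list(list(1L, 1, "1"), list(2L, 2, "2"), list(3L, 3, "3"))
res <- runPartition(rows, c("a", "b", "c"),
                    function(x) x[x$a > 1L, ], "row", "row")
length(res)
```

With the assertion in place, an unexpected serializer value stops the worker right away instead of silently passing the data.frame through.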
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213380195 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56686/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213380192 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213379654 **[Test build #56686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56686/consoleFull)** for PR 12493 at commit [`2b04ef2`](https://github.com/apache/spark/commit/2b04ef2b4122d75b53f8ef5c154f24a02daecafa).
 * This patch **fails SparkR unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213331807 Jenkins, retest it please
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213331130 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213331135 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56663/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213330411 **[Test build #56663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56663/consoleFull)** for PR 12493 at commit [`75dae85`](https://github.com/apache/spark/commit/75dae85b8abc243533e346b62820e9a852798c5c).
 * This patch **fails SparkR unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213324258 **[Test build #56686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56686/consoleFull)** for PR 12493 at commit [`2b04ef2`](https://github.com/apache/spark/commit/2b04ef2b4122d75b53f8ef5c154f24a02daecafa).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213315553 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56684/ Test FAILed.