Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213315545
**[Test build #56684 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56684/consoleFull)**
for PR 12493 at commit
[`481df69`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213315552
Build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213315207
**[Test build #56684 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56684/consoleFull)**
for PR 12493 at commit
[`481df69`](https://gi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213302454
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213302452
Build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213302449
**[Test build #56674 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56674/consoleFull)**
for PR 12493 at commit
[`76a6fd7`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213302236
**[Test build #56674 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56674/consoleFull)**
for PR 12493 at commit
[`76a6fd7`](https://gi
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60700305
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartitions <- Spa
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60700161
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-213283714
**[Test build #56663 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56663/consoleFull)**
for PR 12493 at commit
[`75dae85`](https://gi
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60695623
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1137,11 +1137,22 @@ setMethod("summarize",
#' @rdname dapply
#' @name dapply
#' @export
+#' @examples
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60695610
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60695589
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -1964,6 +1964,38 @@ test_that("Method str()", {
expect_equal(capture.output(utils:::str(iris
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60677797
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apac
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60677594
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartitions <-
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60677525
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
setGe
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60677343
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1137,11 +1137,22 @@ setMethod("summarize",
#' @rdname dapply
#' @name dapply
#' @export
+#' @exam
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60672376
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
se
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60647331
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
set
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60644001
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
s
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212914977
When I rebased this PR onto master, I found a bug in the Catalyst optimizer. I
submitted a PR for it: https://github.com/apache/spark/pull/12575. I have to
wait for it to be f
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60522215
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
se
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60521729
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user shivaram commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212542868
@sun-rui Regarding the unit tests, could it be related to the R version or
the version of testthat we are using on Jenkins?
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60452743
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
s
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60446033
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
set
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212398248
@rxin, I will implement dapplyCollect and collect() on DataFrame of
serialized R data in a follow-up PR.
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60394761
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,52 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60394552
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
se
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60394359
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
se
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60393187
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,52 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212393341
The test failure is weird:
{panel}
1. Failure (at test_sparkSQL.R#1973): dapply() on a DataFrame
--
expected is not identical to result. Diff
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212314329
Build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212314334
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212313646
**[Test build #56322 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56322/consoleFull)**
for PR 12493 at commit
[`80da663`](https://g
Github user NarineK commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60363417
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user NarineK commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60362770
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -1964,6 +1964,38 @@ test_that("Method str()", {
expect_equal(capture.output(utils:::str(iris
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212300476
BTW one observation: FWIW, I think the dapplyCollect method will be a lot
more useful, because that's the one that can be used for training models, etc.
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60361260
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,52 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60360979
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
set
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60360866
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -1964,6 +1964,38 @@ test_that("Method str()", {
expect_equal(capture.output(utils:::str(iris)
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60360191
--- Diff: R/pkg/R/generics.R ---
@@ -439,6 +439,10 @@ setGeneric("covar_samp", function(col1, col2)
{standardGeneric("covar_samp") })
#' @export
set
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60359916
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,52 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user davies commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60358763
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache So
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212288375
@davies should take a detailed look at this.
This looks pretty good based on my very very quick glance.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212276444
@shivaram, there is already a test case for where the schema is not
specified. Do you mean adding more?
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212267084
**[Test build #56322 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56322/consoleFull)**
for PR 12493 at commit
[`80da663`](https://gi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212257638
Build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212257639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212257637
**[Test build #56320 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56320/consoleFull)**
for PR 12493 at commit
[`480dec9`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212257631
**[Test build #56320 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56320/consoleFull)**
for PR 12493 at commit
[`480dec9`](https://gi
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60349837
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,41 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60349802
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60349778
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -100,7 +104,20 @@ if (isEmpty != 0) {
# Timing reading input data for execution
inputElap <- ela
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60349597
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,41 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60345680
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
---
@@ -23,12 +23,15 @@ import scala.util.matching.Regex
import org.ap
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60345317
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache S
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60344909
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartitions <- Spa
Github user shivaram commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-212139588
Thanks @sun-rui for the change. I did a first pass over it. It would be
good to add some more test cases for where the schema is not specified as well.
Also I
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60315638
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -100,7 +104,20 @@ if (isEmpty != 0) {
# Timing reading input data for execution
inputElap <- el
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60315429
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartitions <- Sp
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60315229
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,41 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#'
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60312062
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
---
@@ -23,12 +23,15 @@ import scala.util.matching.Regex
import or
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60311600
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apac
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60310848
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala
---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apac
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60309791
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,6 +84,10 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartitions <-
Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/12493#discussion_r60309189
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1125,6 +1125,41 @@ setMethod("summarize",
agg(x, ...)
})
+#' dapply
+#
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211829812
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211829787
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211829510
**[Test build #56209 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56209/consoleFull)**
for PR 12493 at commit
[`e6b67b0`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211780870
**[Test build #56209 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56209/consoleFull)**
for PR 12493 at commit
[`e6b67b0`](https://gi
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211779197
@rxin, @davies, @NarineK, @shivaram, please help review it so that it
can make it into Spark 2.0
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211776343
**[Test build #56206 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56206/consoleFull)**
for PR 12493 at commit
[`00a8c1c`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211776353
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211776358
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/12493#issuecomment-211775395
**[Test build #56206 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56206/consoleFull)**
for PR 12493 at commit
[`00a8c1c`](https://gi
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/12493
[SPARK-12919][SPARKR] Implement dapply() on DataFrame in SparkR.
## What changes were proposed in this pull request?
dapply() applies an R function on each partition of a DataFrame and retu