Github user sun-rui closed the pull request at:
https://github.com/apache/spark/pull/12575
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14783
@clarkfitzg, your patch is a bug fix rather than a performance improvement,
right? If so, since there is no performance regression according to your
benchmark, let's focus on the functionality.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14744
LGTM
Github user sun-rui closed the pull request at:
https://github.com/apache/spark/pull/14046
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14744
@felixcheung, I guess that the Spark conf is preferred over the env variable.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14744
@zjffdu, basically LGTM
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14744#discussion_r76067417
--- Diff: docs/configuration.md ---
@@ -1752,6 +1752,13 @@ showDF(properties, numRows = 200, truncate = FALSE)
Executable for executing R scripts in
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14775#discussion_r76008164
--- Diff: R/pkg/NAMESPACE ---
@@ -363,4 +363,9 @@ S3method(structField, jobj)
S3method(structType, jobj)
S3method(structType, structField
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14783#discussion_r76007826
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -36,7 +36,14 @@ compute <- function(mode, partition, serializer,
deserializer, key,
# available si
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14775#discussion_r76007335
--- Diff: R/pkg/NAMESPACE ---
@@ -363,4 +363,9 @@ S3method(structField, jobj)
S3method(structType, jobj)
S3method(structType, structField
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14783#discussion_r76005367
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -36,7 +36,14 @@ compute <- function(mode, partition, serializer,
deserializer, key,
# available si
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14783#discussion_r76000970
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -36,7 +36,14 @@ compute <- function(mode, partition, serializer,
deserializer, key,
# available si
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14783#discussion_r76000958
--- Diff: R/pkg/inst/tests/testthat/test_utils.R ---
@@ -183,4 +183,13 @@ test_that("overrideEnvs", {
expect_equal(config[["conf
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14783#discussion_r76000918
--- Diff: R/pkg/R/SQLContext.R ---
@@ -183,6 +183,8 @@ getDefaultSqlSource <- function() {
# TODO(davies): support sampling and infer type from
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
Does this API get only the Spark SQL configurations, or does it include SparkConf as well?
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
If SparkConf is needed in the future, instead of passing the whole Spark conf
to R via env variables, we can expose an API for accessing SparkConf in the R
backend, similar to the one in PySpark.
https
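A minimal sketch of what such an accessor could look like on the R side, using
SparkR's existing JVM-call helpers. The function name sparkConfGet is
hypothetical, and the internal handles used here are assumptions for
illustration, not an actual SparkR API:

    # Hypothetical sketch: read a SparkConf entry through the JVM backend,
    # similar to PySpark's SparkConf.get().
    sparkConfGet <- function(key, defaultValue = "") {
      # ".sparkRjsc" holds the JavaSparkContext handle in SparkR's internals
      sc <- get(".sparkRjsc", envir = SparkR:::.sparkREnv)
      conf <- SparkR:::callJMethod(sc, "getConf")   # JVM-side SparkConf
      SparkR:::callJMethod(conf, "get", key, defaultValue)
    }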
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
I think there may be a simpler solution. As in my comment on the JIRA, the
"EXISTING_SPARKR_BACKEND_PORT" env variable can be checked, instead of getting
the whole Spark conf from
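For context, the suggested check is small; roughly (a sketch of the idea, not
the PR's actual code):

    # When an R script is launched via spark-submit, the SparkR backend is
    # already running and exports its port, so no Spark download is needed.
    existingPort <- Sys.getenv("EXISTING_SPARKR_BACKEND_PORT", "")
    if (nchar(existingPort) > 0) {
      # connect to the running backend instead of locating/downloading Spark
    }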
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
@zjffdu, yes, there is no need to download Spark in yarn-client mode, provided
that spark-submit is called to launch an R script. I just want to verify that
your change works in this case.
But note if yarn
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
@zjffdu, does your change work when launching an R script in yarn-client
mode? It seems that it won't
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14639
This is not only about the correct cache dir under Mac OS; in yarn-cluster
mode, there should also be no downloading of Spark.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/11157
Has this change been documented in the Spark on Mesos guide?
Github user sun-rui closed the pull request at:
https://github.com/apache/spark/pull/14575
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14575
@mgummelt, @srowen
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/14575
[SPARK-16522][MESOS] Spark application throws exception on exit.
This is a backport of https://github.com/apache/spark/pull/14175 to the 2.0 branch.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14175
ok, will submit another PR for 2.0 branch
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r73823225
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala
---
@@ -396,6 +425,10 @@ class
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r73823074
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala
---
@@ -341,6 +344,32 @@ class
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r73823040
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala
---
@@ -341,6 +344,32 @@ class
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14175
rebased to master
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14175
@mgummelt, a regression test case has been added. Not sure whether it is the expected one.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14175
@mgummelt, will do it soon
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r72368797
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -552,7 +552,12 @@ private[spark
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14175
Sure, will add it
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14309
@cloud-fan, could you help to review it?
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r72176972
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -552,7 +552,12 @@ private[spark
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14309
Does this solution work together with the "table.column" access pattern?
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14309#discussion_r71982269
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
---
@@ -641,6 +641,10 @@ class DataFrameSuite extends QueryTest with
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r71982037
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -552,7 +552,12 @@ private[spark
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14258#discussion_r71728669
--- Diff: R/pkg/inst/extdata/spark_download.csv ---
@@ -0,0 +1,2 @@
+"url","default"
+"http://apache.osuosl.org",TRUE
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14258#discussion_r71726598
--- Diff: R/pkg/R/install.R ---
@@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14258#discussion_r71726450
--- Diff: R/pkg/R/install.R ---
@@ -0,0 +1,160 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14264
@rerngvit, I modified the title of SPARK-11977 to narrow its scope. You can
go for it.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14264
@rerngvit, sorry, I meant https://issues.apache.org/jira/browse/SPARK-11977.
If your PR can enable access to columns with "." in their names without
backticks, please first submit a PR
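For reference, the backtick behavior under discussion looks like this in
SparkR (a sketch; without backticks, "a.b" is resolved as field b of a struct
column a):

    # data.frame column literally named "a.b"
    df <- createDataFrame(data.frame(`a.b` = 1:3, check.names = FALSE))
    # Backticks make the whole dotted name a single identifier
    head(select(df, column("`a.b`")))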
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14264
@rerngvit, could you share the background on how this PR fixes the issue? I
see that https://issues.apache.org/jira/browse/SPARK-11976 is still open. Did
any other PR in Spark 2.0 make this possible?
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14243
LGTM
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r71284361
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -552,7 +552,9 @@ private[spark
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14243
Would it be better to add the SparkR installation check to the existing
disabled test?
ignore("correctly builds R packages included in a jar with --packages") {
...}
My expected solution
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
no, go ahead and submit one :)
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14243
Will this test always be run, whether or not the "sparkr" profile is
specified? In other words, does R need to be installed for all Spark tests to
pass?
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14175#discussion_r71095619
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -552,7 +552,9 @@ private[spark
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/14192
[SPARK-16509][SPARKR] Rename window.partitionBy and window.orderBy to
windowPartitionBy and windowOrderBy.
## What changes were proposed in this pull request?
Rename window.partitionBy and
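After the rename, usage looks roughly like this (a sketch against the SparkR
2.0 window-function API; column names are made up):

    # df is an existing SparkDataFrame with department and salary columns
    ws <- orderBy(windowPartitionBy("department"), "salary")
    ranked <- withColumn(df, "rank", over(rank(), ws))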
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/14175
[SPARK-16522][MESOS] Spark application throws exception on exit.
## What changes were proposed in this pull request?
Spark applications running on Mesos throw exception upon exit. For details
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/14046#discussion_r69501135
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -1258,10 +1258,12 @@ test_that("date functions on a DataFrame", {
df2 <- creat
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/14046
[SPARK-16366][SPARKR] Fix time comparison failures in SparkR unit tests.
## What changes were proposed in this pull request?
Fix time comparison failures in SparkR unit tests. For details
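A common way to de-flake such assertions (a sketch of the general technique,
not necessarily this PR's exact change; variable names are made up) is to
compare timestamps with a tolerance rather than for exact equality:

    # rTime was captured in R; sparkTime came back from a Spark SQL query
    expect_true(abs(as.numeric(difftime(sparkTime, rTime, units = "secs"))) < 2)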
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13760
no
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13975#discussion_r69089989
--- Diff: R/pkg/inst/worker/daemon.R ---
@@ -44,7 +44,7 @@ while (TRUE) {
if (inherits(p, "masterProcess")) {
clos
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13975
@shivaram, @davies
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/13975
[SPARK-16299][SPARKR] Capture errors from R workers in daemon.R to avoid
deletion of R session temporary directory.
## What changes were proposed in this pull request?
Capture errors from R
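The idea, sketched (the real daemon.R differs in detail): in the forked child,
run the worker body inside tryCatch so an R error is reported to stderr
instead of unwinding into the daemon's exit path, which would delete the
shared R session temporary directory:

    if (inherits(p, "masterProcess")) {  # forked child
      tryCatch({
        # 'script' stands in for the worker entry file (worker.R)
        source(script)
      }, error = function(e) {
        cat(paste0("Worker error: ", conditionMessage(e), "\n"), file = stderr())
      })
      # exit the child directly; do not fall back into daemon cleanup,
      # which would remove the session's temporary directory
      parallel:::mcexit(0L)
    }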
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13760
LGTM except for one minor comment
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13760#discussion_r67983701
--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -2236,12 +2236,15 @@ test_that("gapply() on a DataFrame", {
actual <
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67881287
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67880977
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67880749
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67880530
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67880209
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13660#discussion_r67880117
--- Diff: docs/sparkr.md ---
@@ -262,6 +262,83 @@ head(df)
{% endhighlight %}
+### Applying User-defined Function
+In SparkR, we
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13660
Can you add documentation for gapply() and gapplyCollect() together here? Or
will @NarineK do that in another PR?
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13752
I think spark.lapply() is a case that demonstrates the need for supporting
Dataset in SparkR. Removing the explicit sc parameter is quite helpful for
moving to Dataset internally in the future if
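For reference, the API under discussion reads like this once the explicit sc
parameter is gone (assuming the post-change signature spark.lapply(list, func)):

    sparkR.session()  # no SparkContext handle is threaded through by hand
    squares <- spark.lapply(1:10, function(x) x * x)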
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13760#discussion_r67799970
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1347,6 +1347,65 @@ setMethod("gapply",
gapply(grouped, fu
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13760#discussion_r67799929
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1347,6 +1347,65 @@ setMethod("gapply",
gapply(grouped, fu
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13760#discussion_r67799741
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1347,6 +1347,65 @@ setMethod("gapply",
gapply(grouped, fu
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13790
LGTM
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13763#discussion_r67610942
--- Diff: R/pkg/R/SQLContext.R ---
@@ -330,6 +330,30 @@ jsonRDD <- function(sqlContext, rdd, schema = NULL,
samplingRatio =
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13763#discussion_r67610931
--- Diff: R/pkg/R/SQLContext.R ---
@@ -330,6 +330,30 @@ jsonRDD <- function(sqlContext, rdd, schema = NULL,
samplingRatio =
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13763#discussion_r67610926
--- Diff: R/pkg/R/DataFrame.R ---
@@ -701,6 +701,33 @@ setMethod("write.json",
invisible(callJMethod(write, "
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13684
LGTM
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13635#discussion_r67385138
--- Diff: R/pkg/R/sparkR.R ---
@@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) {
#'}
sparkRHive.init <- function
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13635#discussion_r67384711
--- Diff: R/pkg/NAMESPACE ---
@@ -6,10 +6,15 @@ importFrom(methods, setGeneric, setMethod, setOldClass)
#useDynLib(SparkR, stringHashCode
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13684#discussion_r67287926
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1856,10 +1856,11 @@ setMethod("where",
#' the subset of columns.
#'
#'
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/13635
@shivaram, I will probably take a look at this tonight.
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
@shivaram, LGTM
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r67161527
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -79,75 +127,72 @@ if (numBroadcastVars > 0) {
# Timing broadcast
broadcastElap <- elaps
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
@NarineK, there is one comment left unaddressed
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66733080
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRunner.scala ---
@@ -40,7 +40,8 @@ private[spark] class RRunner[U](
broadcastVars: Array
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
@shivaram, I think we are reaching the final version :). It would be good if
you could do a detailed review of the examples and test cases.
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721674
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala ---
@@ -325,6 +330,71 @@ case class MapGroupsExec
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721643
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -79,75 +127,72 @@ if (numBroadcastVars > 0) {
# Timing broadcast
broadcastElap <- elaps
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721611
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -79,75 +127,72 @@ if (numBroadcastVars > 0) {
# Timing broadcast
broadcastElap <- elaps
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721551
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -27,6 +27,54 @@ elapsedSecs <- function() {
proc.time()[3]
}
+compute <- functio
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721485
--- Diff: R/pkg/R/group.R ---
@@ -142,3 +142,58 @@ createMethods <- function() {
}
createMethods()
+
+#'
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721436
--- Diff: R/pkg/R/group.R ---
@@ -142,3 +142,58 @@ createMethods <- function() {
}
createMethods()
+
+#'
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
yes, let's do it in a separate PR.
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721354
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -381,6 +385,50 @@ class RelationalGroupedDataset protected[sql
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r66721347
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala ---
@@ -381,6 +385,50 @@ class RelationalGroupedDataset protected[sql
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/13394#discussion_r66019351
--- Diff: R/pkg/R/functions.R ---
@@ -249,6 +249,10 @@ col <- function(x) {
#'
#' Returns a Column based on the give
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
I guess the byte array of the serialized R function is dumped. Let me find
out which commit caused this. I guess something like overriding toString may
solve this
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/12836
@NarineK, thanks for the hard work. Left some comments for you.
@shivaram, do we still have a time window for this to be in 2.0?
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r65563636
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -84,68 +136,51 @@ broadcastElap <- elapsedSecs()
# as number of partitions to create.
numPartiti
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r65563441
--- Diff: R/pkg/inst/worker/worker.R ---
@@ -27,6 +27,58 @@ elapsedSecs <- function() {
proc.time()[3]
}
+computeHelper <- fu
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/12836#discussion_r65563298
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1266,6 +1266,83 @@ setMethod("dapplyCollect",
ldf
})