GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/5613
[SPARK-6852][SPARKR] Accept numeric as numPartitions in SparkR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-6852
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/5655
[SPARK-6818][SPARKR] Support column deletion in SparkR DataFrame API.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-6818
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/5628
[SPARK-7033][SPARKR] Clean usage of split. Use partition instead where
applicable.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/5628#issuecomment-95774826
@shivaram, yeah, I forgot to do cleanup in the tests. Now the tests are
cleaned up. Please check.
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/5938
[SPARK-6812][SparkR] filter() on DataFrame does not work as expected.
According to the R manual:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/Startup.html,
if a function .First
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/5989#issuecomment-100116112
@rekhajoshm, thank you. As discussed in the JIRA issue, we can keep these
names as is. But we can improve the implementation of showDF() to print C-style
strings
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/6007
[SPARK-7482][SparkR] Rename some DataFrame API methods in SparkR to match
their counterparts in Scala.
You can merge this pull request into a Git repository by running:
$ git pull https
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6007#issuecomment-100420754
@shivaram, I think it is OK to rename APIs to ones that R users are
accustomed to, for example, read.df and write.df. It would be better if there
were more feedback from
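For example, the renamed methods would read naturally to R users (a sketch;
the file paths are hypothetical):
    df <- read.df(sqlContext, "examples/people.json", source = "json")
    write.df(df, path = "people.parquet", source = "parquet", mode = "overwrite")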
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/6183
[SPARK-7227][SPARKR] Support fillna / dropna in R DataFrame.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-7227
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6183#discussion_r30455256
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1431,3 +1431,119 @@ setMethod("describe",
sdf <- callJMethod(x@sdf, "describe", listToSeq(colList
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6183#discussion_r30455447
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -166,6 +166,24 @@ private[spark] object SerDe {
}
}
+ def
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6190#issuecomment-102560770
@hqzizania, is it possible to simplify the conversion to one place,
something like:
jrdd <- getJRDD(lapply(rdd, naToNull), "row")?
Another thought is, can we do
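A minimal sketch of such a helper (hypothetical; assumes each RDD element is
a list representing a row):
    # Map NA fields in a row to NULL so the JVM side receives
    # nulls rather than R's NA sentinel.
    naToNull <- function(row) {
      lapply(row, function(field) {
        if (length(field) == 1 && is.na(field)) NULL else field
      })
    }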
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6007#discussion_r30202774
--- Diff: R/pkg/R/generics.R ---
@@ -453,7 +453,11 @@ setGeneric("saveAsTable", function(df, tableName,
source, mode, ...) {
standardGeneric("saveAsTable"
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6007#discussion_r30202781
--- Diff: R/pkg/R/SQLContext.R ---
@@ -446,14 +446,15 @@ dropTempTable <- function(sqlCtx, tableName) {
#' @param source the name of external data source
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6007#issuecomment-101514500
rebased to master.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6190#issuecomment-108145372
@davies, do you have any comment on this issue?
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/6743
[SPARK-6797][SPARKR] Add support for YARN cluster mode.
This PR enables SparkR to dynamically ship the SparkR binary package to the
AM node in YARN cluster mode, so that it is no longer required
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-111738919
rebased
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-111315827
Yes, this assigns a symbolic link name. Thus we can refer to the shipped
package via the logical name instead of the specific archive file name.
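For context, this relies on the YARN distributed cache convention that a
fragment after '#' assigns the link name under which an archive is exposed,
e.g. (archive and script names hypothetical):
    $ spark-submit --master yarn-cluster --archives sparkr.zip#sparkr your_script.R
so code on the AM node can refer to ./sparkr regardless of the actual
archive file name.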
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-111943437
I originally planned to have a separate JIRA issue for adding shipping of
the SparkR package for RDD APIs. But if this is still required by the DataFrame
API, I can do
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/6605
[SPARK-8063][SPARKR] Spark master URL conflict between MASTER env variable
and --master command line option.
You can merge this pull request into a Git repository by running:
$ git pull
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/7152
[SPARK-7714][SPARKR] SparkR tests should use more specific expectations
than expect_true
1. Update the pattern 'expect_true(a == b)' to 'expect_equal(a, b)'.
2. Update the pattern 'expect_true
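For example, the intended rewrite pattern (test values hypothetical):
    # Before: a failing boolean assertion reports only "isn't true".
    expect_true(length(result) == 2)
    # After: expect_equal reports both the actual and expected values.
    expect_equal(length(result), 2)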
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-116698303
@tgravescs, I think the problem with shipping R itself is that the R executable
is platform-specific. Also it may require OS-specific installation before
running R
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-116734776
rebased
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-116710169
Added support for shipping the SparkR package for R workers required by the
RDD APIs. Tested createDataFrame() by creating a DataFrame from an R list.
Removed sparkRLibDir
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-117859704
@andrewor14,
I have tested this patch with a real YARN cluster.
For an R program, it can source other R files and call functions within
them, or it can
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33738526
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33737682
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33739229
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -339,6 +341,23 @@ object SparkSubmit
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7494#issuecomment-130662126
@CHOIJAEHONG1, @shivaram:
1. The R worker can be in any locale, because R can recognize UTF-8 and
preserve UTF-8 encoding when manipulating strings. The root cause
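A small illustration of the point, assuming standard R encoding semantics
(the example string is hypothetical):
    s <- "\uc548\ub155"            # a UTF-8 encoded Korean string
    Encoding(s)                    # "UTF-8"
    Encoding(paste0(s, "!"))       # still "UTF-8" after manipulation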
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8240#issuecomment-132065626
The RDD API is deliberately hidden in Spark 1.4. For discussion of exposing a
subset of the RDD API and a high-level parallel computation API, please refer to
https
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7419#issuecomment-131476840
@shivaram, fixed.
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/8276
[SPARK-10048][SPARKR] Support arbitrary nested Java array in serde.
This PR:
1. supports transferring arbitrary nested arrays from the JVM to the R side in SerDe;
2. based on 1, collect
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-132571243
@shivaram, I tried to support ArrayType by adding code like:
// Convert Seq[Any] to Array[Any]
val value =
if (obj.isInstanceOf[Seq[Any
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/8276#discussion_r37490241
--- Diff: R/pkg/R/DataFrame.R ---
@@ -628,18 +628,49 @@ setMethod("dim",
setMethod("collect",
signature(x = "DataFrame"
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8240#issuecomment-132843607
I think UDF() is important; I hope it can be available in the 1.6 release.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-132852075
will add test cases.
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/8276#discussion_r37490835
--- Diff: R/pkg/R/DataFrame.R ---
@@ -628,18 +628,49 @@ setMethod("dim",
setMethod("collect",
signature(x = "DataFrame"
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8240#issuecomment-132131289
Yes, the DataFrame API has lapply and lapplyPartition, but these are actually
RDD-like APIs, and their implementation is based on the corresponding RDD API
after converting
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-134217045
rebased to master
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/8276#discussion_r37371028
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -210,22 +213,31 @@ private[spark] object SerDe {
writeType(dos, "void"
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-132406893
@davies , that will be done in another PR.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-132478924
@shivaram, for now ArrayType in a DataFrame is still not supported, as
ArrayType's class is something like
scala.collection.mutable.WrappedArray$ofRef; it will be passed
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/8276#discussion_r37723615
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
---
@@ -98,27 +98,17 @@ private[r] object SQLUtils {
val bos = new
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-134066063
@shivaram, test cases for SerDe added. Currently the SerDe does not support
transferring a list of different element types from the R side to the JVM side.
Let's leave
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/8276#discussion_r37490860
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
---
@@ -98,27 +98,20 @@ private[r] object SQLUtils {
val bos = new
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-116892414
@shivaram, yes, I saw that function, but found it confusing that it does not
consider the YARN mode case.
@davies, it seems the unit tests for PySpark in YARN modes were
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539511
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539610
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala ---
@@ -390,9 +386,10 @@ private[r] object RRDD {
thread
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539384
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539628
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539591
--- Diff: core/src/main/scala/org/apache/spark/deploy/RRunner.scala ---
@@ -71,9 +71,10 @@ object RRunner {
val builder = new ProcessBuilder(Seq
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33539310
--- Diff: core/src/main/scala/org/apache/spark/api/r/RUtils.scala ---
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/6743#discussion_r33555215
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -339,6 +340,24 @@ object SparkSubmit
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-117086228
rebased
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-117372487
@davies, I will add a test case for YARN modes in a separate PR, which is
intended to update SparkR-related logic to align with SPARK-5479.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-117369661
@shivaram, the rLibDir parameter of sparkR.init() was intended for locating
the SparkR package on worker nodes back when SparkR was a separate project.
Since now
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7419#issuecomment-131324275
@shivaram, as you pointed out, I came up with a simpler fix. I realized that
simply creating an empty vector using vector() without a mode is OK when there
is no row
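In essence, the simpler fix amounts to something like this (a sketch; the
numRows and colMode names are hypothetical):
    col <- if (numRows > 0) {
      vector(mode = colMode, length = numRows)  # typed vector, filled as before
    } else {
      vector()  # defaults to mode "logical"; fine when there is nothing to read
    }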
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/7584#discussion_r35396836
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1384,7 +1384,7 @@ setMethod("saveAsTable",
"org.apache.spark.sql.parquet"
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/7584#discussion_r35392454
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1384,7 +1384,7 @@ setMethod("saveAsTable",
"org.apache.spark.sql.parquet"
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/7494#discussion_r35407059
--- Diff: R/pkg/R/deserialize.R ---
@@ -56,8 +56,10 @@ readTypedObject <- function(con, type) {
readString <- function(con) {
stringLen
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/7584#discussion_r35289630
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1384,7 +1384,7 @@ setMethod("saveAsTable",
"org.apache.spark.sql.parquet"
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/7584#discussion_r35288709
--- Diff: R/pkg/R/context.R ---
@@ -121,7 +121,7 @@ parallelize <- function(sc, coll, numSlices = 1) {
numSlices <- length(coll)
sliceLen
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-120893058
@shivaram, tests done. Also tested with YARN cluster, yarn-client, and
standalone modes, and createDataFrame() in YARN client mode.
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/7395
[SPARK-8808][SPARKR] Fix assignments in SparkR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sun-rui/spark SPARK-8808
Alternatively you can
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7356#issuecomment-121848726
@viirya,
1. Why pass the lower bound and upper bound in a vector instead of passing
them separately? They are passed separately in the Scala API, and a vector cannot hold
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7139#issuecomment-121884026
@brkyvz, more comments :)
3. Is it possible to run R-specific tests only when -Psparkr (the sparkr
profile) is specified? If the sparkr profile is specified when building
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7139#issuecomment-121880406
@brkyvz, could you give more explanation of the usage scenario that this
PR is expected to support?
1. This PR introduces a manifest keyword, a hybrid JAR
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7395#issuecomment-121436723
OK, I will clean them up together.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7139#issuecomment-122144505
@brkyvz, thanks for your explanation of this requirement from devs.
1. Is there any existing JIRA or discussion you can point me to so
that I can learn
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7494#issuecomment-122837016
I think readString() in deserialize.R should be updated accordingly. Could
you try:
string <- readBin(...)
Encoding(string) <- "UTF-8"
string <- enc2native
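Combined with the rawToChar() point in the next message, a hedged sketch of
the full readString() (assuming stringLen counts only the string bytes, and
readInt() is the existing deserialize.R helper):
    readString <- function(con) {
      stringLen <- readInt(con)
      raw <- readBin(con, raw(), stringLen, endian = "big")
      string <- rawToChar(raw)
      Encoding(string) <- "UTF-8"   # mark the bytes as UTF-8
      enc2native(string)            # convert to the worker's native encoding
    }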
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7494#issuecomment-122867464
Yeah, rawToChar() is needed. Does it work now?
Github user sun-rui closed the pull request at:
https://github.com/apache/spark/pull/6743
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/6743#issuecomment-121103339
@shivaram, closing the PR.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7494#issuecomment-123162043
Could you try adding a zero as done previously in writeString():
val utf8 = value.getBytes("UTF-8")
val len = utf8.length
out.writeInt(len + 1
GitHub user sun-rui opened a pull request:
https://github.com/apache/spark/pull/7419
[SPARK-8844][SPARKR] head/collect is broken in SparkR.
This is a WIP patch for SPARK-8844, for collecting reviews.
This bug is about reading an empty DataFrame. In readCol
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/7395#issuecomment-121478749
All reported issues of this kind are cleaned up.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/8276#issuecomment-134443211
The test passed on my machine; I don't know the reason. Anyway, I added Spark
context initialization to test_Serde to see if it can pass on Jenkins.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9218#issuecomment-150809533
I am not clear how, for both coltypes() and coltypes<-(), to represent
complex types as R types. Do you have an idea?
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9205#issuecomment-150119806
Is it possible for lint-r to skip # comments with code outside a function
body?
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9179#issuecomment-150106716
@shivaram, documentation updated.
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9192#discussion_r42711959
--- Diff: R/pkg/R/SQLContext.R ---
@@ -17,6 +17,34 @@
# SQLcontext.R: SQLContext-driven functions
+#' Temporary function to reroute old
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9193#issuecomment-150105010
Jenkins, retest this please
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9192#discussion_r42711903
--- Diff: R/pkg/R/SQLContext.R ---
@@ -17,6 +17,34 @@
# SQLcontext.R: SQLContext-driven functions
+#' Temporary function to reroute old
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9222#discussion_r42733892
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala
---
@@ -130,16 +130,17 @@ private[r] object SQLUtils {
}
def
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9222#issuecomment-150377869
dfToCols is meant to be called from the R side, so it makes sense to test it in
R. Having a test case in R not only tests the logic in Scala, but also
helps to test
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9192#discussion_r42726488
--- Diff: R/pkg/R/SQLContext.R ---
@@ -17,6 +17,34 @@
# SQLcontext.R: SQLContext-driven functions
+#' Temporary function to reroute old
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9222#issuecomment-150166755
Instead of creating a new test suite in Scala, you can add a new test case
in R, using
callJStatic to invoke "dfToCols" on the Scala side.
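A sketch of such a test case (assuming a DataFrame df created earlier in
the suite):
    cols <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                        "dfToCols", df@sdf)
    # Expect one column vector back per DataFrame column.
    expect_equal(length(cols), length(columns(df)))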
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9185#issuecomment-150162723
On the R side, there is a cache of created SQLContext/HiveContext objects, so R
won't call createSQLContext() a second time. See
https://github.com/apache/spark/blob/master/R/pkg
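The caching pattern is roughly the following (a simplified sketch; the
environment and key names are hypothetical):
    sparkRSQL.init <- function(jsc) {
      # Reuse the SQLContext created by an earlier call, if any.
      if (exists(".sparkRSQLsc", envir = .sparkREnv)) {
        return(get(".sparkRSQLsc", envir = .sparkREnv))
      }
      sqlContext <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                                "createSQLContext", jsc)
      assign(".sparkRSQLsc", sqlContext, envir = .sparkREnv)
      sqlContext
    }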
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9218#issuecomment-150164199
Could we support both names() and colnames()?
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9192#discussion_r42726208
--- Diff: R/pkg/R/jobj.R ---
@@ -77,6 +77,11 @@ print.jobj <- function(x, ...) {
cat("Java ref type", name, "id
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9222#issuecomment-151060081
@FRosner, I am not against a unit test :). My point is that it seems
unnecessary to add a Scala unit test for dfToCols, which is dedicated to SparkR
now, and its logic
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9193#issuecomment-151055969
@shivaram, [R lag
function](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lag.html) is
for time series objects, which is irrelevant to the lead/lag here
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9218#issuecomment-151399054
@felixcheung, type inference works for complex types in createDataFrame().
You can refer to the test case for "create DataFrame with complex types" in
test_
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9290#discussion_r43089490
--- Diff: R/pkg/R/sparkR.R ---
@@ -123,16 +123,30 @@ sparkR.init <- function(
uriSep <- ""
}
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9193#issuecomment-151350204
Yes, lag will be masked. As discussed before, sometimes this is
allowed, as I assume lag is not so commonly and frequently used ("dplyr" masks
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9193#issuecomment-151375443
@shivaram, for your suggestion that "we should make a list of functions
that we mask and are incompatible", I submitted
https://issues.apache.org/jira/br
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9196#issuecomment-151376631
Rebased to master
Github user sun-rui commented on a diff in the pull request:
https://github.com/apache/spark/pull/9196#discussion_r43243744
--- Diff: R/pkg/R/functions.R ---
@@ -2111,3 +2133,66 @@ setMethod("ntile",
jc <- callJStatic("org.apache.spark.sql.
Github user sun-rui commented on the pull request:
https://github.com/apache/spark/pull/9222#issuecomment-150803696
@FRosner, dfToCols is not a public API, and is now only a helper function
for SparkR. Could you add a private modifier for it?