[GitHub] spark pull request #13786: [SPARK-15294][R] Add `pivot` to SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13786#discussion_r67771049 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1397,6 +1397,26 @@ test_that("group by, agg functions", { unlink

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 LGTM. Thanks for this PR @dongjoon-hyun -- This is very useful to have going forward. Merging this to master, branch-2.0 --- If your project is set up for it, you can reply to this email

[GitHub] spark issue #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, program...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13751 LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13768: [SPARK-16053][R] Add `spark_partition_id` in SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13768 Thanks for the updates. LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 Ok. I think this is a reasonable proposal. I will take one more final pass on this PR today

[GitHub] spark issue #13782: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13782 Thanks - LGTM. Merging this to master and branch-2.0 -- (We can reverify this is #13734 I guess ?)

[GitHub] spark issue #13295: [SPARK-15294][SPARKR][MINOR] Add pivot functionality to ...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13295 @dongjoon-hyun @felixcheung -- I think @mhnatiuk is busy. If you have time it will be cool to submit another version of this PR as I think this is a useful function for R users.

[GitHub] spark pull request #13768: [SPARK-16053][R] Add `spark_partition_id` in Spar...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13768#discussion_r67748564 --- Diff: R/pkg/R/functions.R --- @@ -1179,6 +1179,27 @@ setMethod("soundex", column(jc) })

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 @felixcheung Is this PR still relevant ?

[GitHub] spark issue #13023: [SPARK-15177] [SparkR] [ML] SparkR 2.0 QA: New R APIs an...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13023 @mengxr @yanboliang Is this PR still active ? Just checking if this is something we should track for the 2.0 release

[GitHub] spark issue #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, program...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13751 Changes look pretty good to me. Thanks -- I just had a couple of minor comments. Also I think we should look at #13592 to make sure there are no other inconsistencies in how we describe

[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67745787 --- Diff: docs/sparkr.md --- @@ -158,20 +152,19 @@ write.df(people, path="people.parquet", source="parquet", mode="overwrite"

[GitHub] spark issue #13782: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13782 cc @dongjoon-hyun

[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67744683 --- Diff: R/pkg/R/schema.R --- @@ -90,13 +87,10 @@ print.structType <- function(x, ...) { #' @export #' @examples #'\dontrun{ -#'

[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67744654 --- Diff: R/pkg/R/schema.R --- @@ -29,11 +29,8 @@ #' @export #' @examples #'\dontrun{ -#' sc <- sparkR.init() -#' sql

[GitHub] spark issue #13752: [SPARK-16028][SPARKR] spark.lapply can work with active ...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13752 Thanks - merging this to master and branch-2.0 after jenkins passes

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 - Yeah we can make documentation changes after the RC as the doc updates are pushed separately to the Spark website. However I prefer to get R doc changes in before the RC as these are the ones

[GitHub] spark issue #13763: [SPARK-16051][R] Add `read.orc/write.orc` to SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13763 LGTM. Merging this to master and branch-2.0

[GitHub] spark pull request #13768: [SPARK-16053][R] Add `spark_partition_id` in Spar...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13768#discussion_r67740472 --- Diff: R/pkg/R/functions.R --- @@ -1179,6 +1179,27 @@ setMethod("soundex", column(jc) }) +#' spark_pa

[GitHub] spark issue #13753: [SPARK-16029][SPARKR] SparkR add dropTempView and deprec...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13753 Thanks @felixcheung @liancheng - LGTM. Merging this to master, branch-2.0 Are there any other catalog functions that have changed in Spark 2.0 that we also expose in SparkR

[GitHub] spark issue #13752: [SPARK-16028][SPARKR] spark.lapply can work with active ...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13752 LGTM. Minor comment about docs

[GitHub] spark pull request #13752: [SPARK-16028][SPARKR] spark.lapply can work with ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13752#discussion_r67739171 --- Diff: R/pkg/R/context.R --- @@ -252,17 +252,19 @@ setCheckpointDir <- function(sc, dirName) { #' } #' #' @rdname spark.lap

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13660 Yeah we can remove the duplication by having separate rd files or by just removing documentation for the overlapping arguments (I think in this case `x` and `func` are the same for `dapply

[GitHub] spark issue #13774: [SPARK-16059][R] Add `monotonically_increasing_id` funct...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13774 Thanks @dongjoon-hyun and @felixcheung -- Merging this to master and branch-2.0

[GitHub] spark issue #13760: [SPARK-16012][SparkR] GapplyCollect - applies a R functi...

2016-06-19 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13760 Thanks @NarineK -- cc @sun-rui for review

[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13635 LGTM. Merging this to master and branch-2.0

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 Yeah since the notes appear in a separate section I think it's better to be more explicit -- so `read.df since 1.6.0` will be good for these cases

[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13635 Thanks - Could you also bring this up to date with master branch ?

[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13584 @jkbradley Is this important for 2.0 ?

[GitHub] spark issue #13722: [SPARK-15925][SPARKR] R DataFrame add back registerTempT...

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13722 Thanks @felixcheung - LGTM. Merging this to master and branch-2.0

[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13635 @felixcheung Thanks for the update. The change looks pretty good to me. I think there are 2-3 follow up JIRAs I opened from the review that can have separate PRs. There was only one comment

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67585149 --- Diff: R/pkg/inst/profile/shell.R --- @@ -18,17 +18,17 @@ .First <- function() { home <- Sys.getenv("SPARK_HOME")

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67585088 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -156,7 +160,8 @@ test_that("sparkJars sparkPackages as comma-separated st

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67585014 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -47,31 +47,33 @@ test_that("Check masked functions", { test_that("repe

[GitHub] spark issue #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13721 Thanks @dongjoon-hyun - LGTM. Will merge once Jenkins passes

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67584715 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -156,7 +160,8 @@ test_that("sparkJars sparkPackages as comma-separated st

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67582965 --- Diff: R/pkg/R/DataFrame.R --- @@ -2908,3 +2908,39 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67582902 --- Diff: R/pkg/R/DataFrame.R --- @@ -2908,3 +2908,39 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-17 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67580826 --- Diff: R/pkg/R/SQLContext.R --- @@ -615,11 +619,12 @@ clearCache <- function() { #' @method dropTempTable default dropTempTable.defa

[GitHub] spark issue #13722: [SPARK-15925][SPARKR] R DataFrame add back registerTempT...

2016-06-16 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13722 LGTM. I just had a couple of points about the docs

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67457504 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2264,6 +2264,14 @@ test_that("createDataFrame sqlContext parameter backward compatib

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67457476 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2264,6 +2264,14 @@ test_that("createDataFrame sqlContext parameter backward compatib

[GitHub] spark pull request #13721: [SPARK-16005][R] Add `randomSplit` to SparkR

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13721#discussion_r67457426 --- Diff: R/pkg/R/DataFrame.R --- @@ -2884,3 +2884,39 @@ setMethod("write.jdbc", write <- callJMethod(write,

[GitHub] spark pull request #13722: [SPARK-15925][SPARKR] R DataFrame add back regist...

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13722#discussion_r67457292 --- Diff: R/pkg/R/DataFrame.R --- @@ -455,6 +455,17 @@ setMethod("createOrReplaceTempView", invisible(callJMe

[GitHub] spark pull request #13722: [SPARK-15925][SPARKR] R DataFrame add back regist...

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13722#discussion_r67457183 --- Diff: R/pkg/R/DataFrame.R --- @@ -455,6 +455,17 @@ setMethod("createOrReplaceTempView", invisible(callJMe

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-16 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13684 LGTM. Merging into master, branch-2.0

[GitHub] spark pull request #13714: [SPARK-15996][R] Fix R examples by removing depre...

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13714#discussion_r67438648 --- Diff: examples/src/main/r/data-manipulation.R --- @@ -75,8 +75,8 @@ destDF <- select(flightsDF, "dest", "cancelled")

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67412216 --- Diff: R/pkg/R/DataFrame.R --- @@ -1949,14 +1950,24 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark issue #13714: [SPARK-15996][R] Fix R examples by removing deprecated f...

2016-06-16 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13714 Merging this to master and branch-2.0

[GitHub] spark issue #13714: [SPARK-15996][R] Fix R examples by removing deprecated f...

2016-06-16 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13714 LGTM. Thanks for the fix @dongjoon-hyun

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67387830 --- Diff: R/pkg/R/sparkR.R --- @@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) { #'} sparkRHive.init <- function(jsc

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67387375 --- Diff: R/pkg/NAMESPACE --- @@ -6,10 +6,15 @@ importFrom(methods, setGeneric, setMethod, setOldClass) #useDynLib(SparkR, stringHashCode

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67387255 --- Diff: R/pkg/R/sparkR.R --- @@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) { #'} sparkRHive.init <- function(jsc

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67387218 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -18,27 +18,56 @@ package org.apache.spark.sql.api.r

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-16 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67383511 --- Diff: R/pkg/R/sparkR.R --- @@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) { #'} sparkRHive.init <- function(jsc

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 Merging this to master and branch-2.0

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r67266001 --- Diff: R/pkg/R/DataFrame.R --- @@ -1266,6 +1266,83 @@ setMethod("dapplyCollect", ldf })

[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13635 Thanks @felixcheung for the PR. Other than the naming issues, I think the code changes look pretty good to me. I think there are some more docs, programming guide changes we'll need to make but I

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67263062 --- Diff: R/pkg/R/sparkR.R --- @@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) { #'} sparkRHive.init <- function(jsc

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67262684 --- Diff: R/pkg/R/sparkR.R --- @@ -270,27 +291,97 @@ sparkRSQL.init <- function(jsc = NULL) { #'} sparkRHive.init <- function(jsc

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67262366 --- Diff: R/pkg/R/sparkR.R --- @@ -31,20 +31,27 @@ connExists <- function(env) { #' Stop the Spark context. #' #' Also terminates the back

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 @NarineK Thanks again for the updates to this PR and thanks @sun-rui for reviewing. The code changes LGTM -- the refactoring of worker.R is especially useful for readability. I just had

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r67261006 --- Diff: R/pkg/R/DataFrame.R --- @@ -1266,6 +1266,83 @@ setMethod("dapplyCollect", ldf })

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r67260862 --- Diff: R/pkg/R/DataFrame.R --- @@ -1266,6 +1266,83 @@ setMethod("dapplyCollect", ldf })

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13684 Thanks @dongjoon-hyun -- Also would be good if @sun-rui took a look

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67234728 --- Diff: R/pkg/R/generics.R --- @@ -462,12 +462,9 @@ setGeneric("describe", function(x, col, ...) { standardGeneric("describe"

[GitHub] spark issue #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13394 Yeah I think the approach used by @vectorijk is fine. We could have the title as `Model Predictions` instead of `predict` (this is what R uses when you do `?predict`)

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67233674 --- Diff: R/pkg/R/DataFrame.R --- @@ -1869,14 +1871,23 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67208482 --- Diff: R/pkg/R/DataFrame.R --- @@ -1859,7 +1859,7 @@ setMethod("where", #' @param colnames A character vector of column names. --

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67208357 --- Diff: R/pkg/R/DataFrame.R --- @@ -1869,6 +1869,7 @@ setMethod("where", #' path <- "path/to/file.json"

[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13684#discussion_r67205381 --- Diff: R/pkg/R/DataFrame.R --- @@ -1859,7 +1859,7 @@ setMethod("where", #' @param colnames A character vector of column names. --

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67204831 --- Diff: R/pkg/R/sparkR.R --- @@ -31,20 +31,27 @@ connExists <- function(env) { #' Stop the Spark context. #' #' Also terminates the back

[GitHub] spark issue #13636: [SPARK-15637][SPARK-15931][SPARKR] Fix R masked function...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13636 LGTM. Thanks all. I will merge this after Jenkins passes

[GitHub] spark pull request #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13635#discussion_r67202631 --- Diff: R/pkg/NAMESPACE --- @@ -6,10 +6,15 @@ importFrom(methods, setGeneric, setMethod, setOldClass) #useDynLib(SparkR, stringHashCode

[GitHub] spark pull request #13636: [SPARK-15637][SPARK-15931][SPARKR] Fix R masked f...

2016-06-14 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13636#discussion_r67025422 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -19,21 +19,25 @@ context("test functions in sparkR.R") test_that("

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13660 Thanks @vectorijk - I left some comments inline. cc @felixcheung

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67018458 --- Diff: docs/sparkr.md --- @@ -262,6 +262,67 @@ head(df) {% endhighlight %} +### Applying User-defined Function

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67018219 --- Diff: docs/sparkr.md --- @@ -262,6 +262,67 @@ head(df) {% endhighlight %} +### Applying User-defined Function

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-14 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67018163 --- Diff: docs/sparkr.md --- @@ -262,6 +262,67 @@ head(df) {% endhighlight %} +### Applying User-defined Function + --- End

[GitHub] spark issue #13636: [SPARK-15637][SPARK-15931][SPARKR] Remove R version chec...

2016-06-14 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13636 Thanks @felixcheung - @JoshRosen / @liancheng can you also test this PR with R 3.3.0 before we merge ?

[GitHub] spark pull request #13636: [SPARK-15637][SPARK-15931][SPARKR] Remove R versi...

2016-06-14 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13636#discussion_r67009490 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -19,21 +19,25 @@ context("test functions in sparkR.R") test_that("

[GitHub] spark issue #13636: [SPARK-15637][SPARKR] Remove R version check since maske...

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13636 @liancheng we can use this PR to also address https://issues.apache.org/jira/browse/SPARK-15931

[GitHub] spark issue #13644: [SPARK-15925][SQL][SPARKR] Replaces registerTempTable wi...

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13644 Merging this to master and branch-2.0

[GitHub] spark issue #13644: [SPARK-15925][SQL][SPARKR] Replaces registerTempTable wi...

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13644 Otherwise code change LGTM

[GitHub] spark issue #13644: [SPARK-15925][SQL][SPARKR] Replaces registerTempTable wi...

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13644 Thanks @liancheng - there is a style error because line length exceeded 100 chars. Also we'll need to note this breaking change in the programming guide (http://spark.apache.org/docs

[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13635 Thanks @felixcheung - I'll take a look at this today. cc @rxin

[GitHub] spark issue #13636: [SPARK-15637][SPARKR] Remove R version check since maske...

2016-06-13 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13636 LGTM. Just to confirm your local tests pass with R version > 3.2 ?

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-11 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 @rxin I think in this case we need access to grouping expression and DataFrame from within the RelationalGroupedDataset class. One solution could be to move the function `flatMapGroupsInR

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-11 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r66713462 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -381,6 +385,50 @@ class RelationalGroupedDataset protected[sql

[GitHub] spark issue #13610: Overriding stringArgs in MapPartitionsInR

2016-06-10 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13610 Thanks @NarineK - Change looks pretty good to me. Minor comment: Can you update the title to have `[SPARKR][SQL][SPARK-###]` at the beginning ?

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-10 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r66673292 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -286,6 +290,9 @@ case class FlatMapGroupsInR

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-10 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r66671272 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -286,6 +290,9 @@ case class FlatMapGroupsInR

[GitHub] spark issue #13508: [SPARK-15766][SparkR]:R should export is.nan

2016-06-10 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13508 Thanks @wangmiao1981 - LGTM. Merging this to master and branch-2.0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-10 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 Thanks @liancheng for clarification and @NarineK for implementing the override. I just had one minor comment. @sun-rui Can you take one final look ? Since we have not still cut RC1, we

[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-10 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r9908 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -286,6 +290,9 @@ case class FlatMapGroupsInR

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66363113 --- Diff: R/pkg/R/mllib.R --- @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) { invisibl

[GitHub] spark pull request #13394: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs a...

2016-06-08 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13394#discussion_r66299402 --- Diff: R/pkg/R/mllib.R --- @@ -197,11 +197,10 @@ print.summary.GeneralizedLinearRegressionModel <- function(x, ...) { invisibl

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-08 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 I think I found the commit which causes this problem - https://github.com/apache/spark/commit/6dde27404cb3d921d75dd6afca4b383f9df5976a added toString to include arrays and the output we get

[GitHub] spark issue #13476: [SPARK-15684][SparkR]Not mask startsWith and endsWith in...

2016-06-07 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13476 Merging this to master and branch-2.0

[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-06 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/12836 I don't know what could cause this - Do we have the beginning of the string ? My guess is `MapPartitions` or one of the nodes in the plan is calling `toString` on a byte Array that contains some R
