[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67744654 --- Diff: R/pkg/R/schema.R --- @@ -29,11 +29,8 @@ #' @export #' @examples #'\dontrun{ -#' sc <- spark

[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67744683 --- Diff: R/pkg/R/schema.R --- @@ -90,13 +87,10 @@ print.structType <- function(x, ...) { #' @export #' @examples #'\d

[GitHub] spark issue #13782: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13782 cc @dongjoon-hyun --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark pull request #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, ...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13751#discussion_r67745787 --- Diff: docs/sparkr.md --- @@ -158,20 +152,19 @@ write.df(people, path="people.parquet", source="parquet", mode="overwrite"

[GitHub] spark issue #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, program...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13751 Changes look pretty good to me. Thanks -- I just had a couple of minor comments. Also I think we should look at #13592 to make sure there are no other inconsistencies in how we describe

[GitHub] spark issue #13023: [SPARK-15177] [SparkR] [ML] SparkR 2.0 QA: New R APIs an...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13023 @mengxr @yanboliang Is this PR still active? Just checking if this is something we should track for the 2.0 release

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 @felixcheung Is this PR still relevant?

[GitHub] spark pull request #13768: [SPARK-16053][R] Add `spark_partition_id` in Spar...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13768#discussion_r67748564 --- Diff: R/pkg/R/functions.R --- @@ -1179,6 +1179,27 @@ setMethod("soundex", column(jc) }) +#

[GitHub] spark issue #13295: [SPARK-15294][SPARKR][MINOR] Add pivot functionality to ...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13295 @dongjoon-hyun @felixcheung -- I think @mhnatiuk is busy. If you have time it will be cool to submit another version of this PR as I think this is a useful function for R users.

[GitHub] spark issue #13782: [SPARKR] fix R roxygen2 doc for count on GroupedData

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13782 Thanks - LGTM. Merging this to master and branch-2.0 -- (We can reverify this in #13734, I guess?)

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 Ok. I think this is a reasonable proposal. I will take one more final pass on this PR today

[GitHub] spark issue #13768: [SPARK-16053][R] Add `spark_partition_id` in SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13768 Thanks for the updates. LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13751: [SPARK-15159][SPARKR] SparkSession roxygen2 doc, program...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13751 LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13734: [SPARK-14995][R] Add `since` tag in Roxygen documentatio...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13734 LGTM. Thanks for this PR @dongjoon-hyun -- This is very useful to have going forward. Merging this to master, branch-2.0

[GitHub] spark pull request #13786: [SPARK-15294][R] Add `pivot` to SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13786#discussion_r67771049 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1397,6 +1397,26 @@ test_that("group by, agg functions", { unlink

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67771773 --- Diff: docs/sparkr.md --- @@ -262,6 +262,79 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67772111 --- Diff: docs/sparkr.md --- @@ -262,6 +262,79 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark issue #13790: remove duplicated docs in dapply

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13790 LGTM -- @felixcheung @sun-rui let me know if you have any comments

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 cc @dongjoon-hyun

[GitHub] spark issue #13786: [SPARK-15294][R] Add `pivot` to SparkR

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13786 LGTM. Merging this to master, branch-2.0

[GitHub] spark pull request #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13109#discussion_r67806444 --- Diff: R/pkg/R/stats.R --- @@ -19,7 +19,8 @@ setOldClass("jobj") -#' crosstab +#' @title SparkDataFrame

[GitHub] spark issue #13798: [SPARK-16088][SPARKR][DOCS] Remove setJobGroup, clearJob...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13798 Hmm, can we document somewhere why we want to remove this functionality? Is this because the Spark context is no longer accessible?

[GitHub] spark issue #13798: [SPARKR][DOCS] R code doc cleanup

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13798 Yeah, let's do that in a separate PR with discussion? We don't lose much by leaving in a few extra functions that we deprecate / remove later. Thanks for the doc cleanup - I'l

[GitHub] spark issue #13799: [SPARK-15863][SQL][DOC][SPARKR] sql programming guide up...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13799 Thanks LGTM. Will wait for @liancheng to also take a look

[GitHub] spark pull request #13798: [SPARKR][DOCS] R code doc cleanup

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13798#discussion_r67810125 --- Diff: R/pkg/R/generics.R --- @@ -689,67 +689,67 @@ setGeneric("randomSplit", function(x, weights, seed) { standardGeneric

[GitHub] spark issue #13295: [SPARK-15294][SPARKR][MINOR] Add pivot functionality to ...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13295 @mhnatiuk Given that #13786 was merged, can you close this PR? Only the PR authors have permission to close a PR in the Spark project

[GitHub] spark pull request #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout...

2016-06-20 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13109#discussion_r67811551 --- Diff: R/pkg/R/stats.R --- @@ -134,9 +129,7 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 LGTM. This version looks good to me. Thanks for iterating on this. Will wait for Jenkins and then merge.

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 @mengxr Yes - this is true and in #13798 we are making a few more of the methods into individual Rd files. At a high level there is a tradition in R to group together similar methods (https

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 @mengxr @felixcheung Can we open a new issue of the form `Separate out rd files for SparkR functions`? We can then make a list there of everything that's sharing an rd file right now and see what

[GitHub] spark issue #13798: [SPARKR][DOCS] R code doc cleanup

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13798 Thanks for the update. This one looks fine to me now. @dongjoon-hyun Any other comments?

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 Jenkins, retest this please

[GitHub] spark issue #13798: [SPARKR][DOCS] R code doc cleanup

2016-06-20 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13798 Alright, I'm merging this to master and branch-2.0 so that it makes it to the RC. We can try and fix minor things going forward. Thanks @felixcheung -- This is a much needed cle

[GitHub] spark pull request #13801: [SPARK-15177.1] [R] make SparkR model params and ...

2016-06-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13801#discussion_r67818028 --- Diff: R/pkg/R/mllib.R --- @@ -298,17 +296,17 @@ setMethod("summary", signature(object = "NaiveBayesModel"), #'

[GitHub] spark issue #13801: [SPARK-15177.1] [R] make SparkR model params and default...

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13801 Changes look fine given what was a part of #13023

[GitHub] spark pull request #13801: [SPARK-15177.1] [R] make SparkR model params and ...

2016-06-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13801#discussion_r67818449 --- Diff: R/pkg/R/mllib.R --- @@ -298,17 +296,17 @@ setMethod("summary", signature(object = "NaiveBayesModel"), #'

[GitHub] spark issue #13109: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc layout for co...

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13109 Cool. Thanks - LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13790: [SPARK-16082][SparkR]remove duplicated docs in dapply

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13790 Thanks -- I should have noticed it before the merge, but missed it. I have resolved the JIRA and put in a link to the PR there, so I think it's all fine.

[GitHub] spark issue #13803: [SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 cc @mengxr

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67910379 --- Diff: docs/sparkr.md --- @@ -262,6 +262,83 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark issue #13803: [SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 @felixcheung Can you change the JIRA in the title to https://issues.apache.org/jira/browse/SPARK-16109 ? I created a sub-task for statfunctions

[GitHub] spark pull request #13803: [SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13803#discussion_r67911376 --- Diff: R/pkg/R/stats.R --- @@ -33,7 +32,7 @@ setOldClass("jobj") #' of `col2`. The name of the first column will be `$col1

[GitHub] spark issue #13803: [SPARK-16109][SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 Other than the comment about the `family` link the rest of the changes look good

[GitHub] spark issue #13805: [SPARK-16096][SPARKR] add union and deprecate unionAll

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13805 cc @liancheng Code changes look good to me

[GitHub] spark issue #13803: [SPARK-16109][SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 Thanks - rebuilt locally and the docs look good. Will merge after Jenkins passes

[GitHub] spark issue #13805: [SPARK-16096][SPARKR] add union and deprecate unionAll

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13805 Just wondering -- is there a list of SparkSQL deprecations for 2.0.0?

[GitHub] spark issue #13803: [SPARK-16109][SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 LGTM. Merging this

[GitHub] spark pull request #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13660#discussion_r67920626 --- Diff: docs/sparkr.md --- @@ -262,6 +262,83 @@ head(df) {% endhighlight %} +### Applying User-defined Function +In SparkR, we

[GitHub] spark issue #13803: [SPARK-16109][SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 Yeah - thanks for all the work in cleaning this up. One thing that I was wondering is whether we could add some style guide checks or contribution guide rules on how to maintain documentation. We can

[GitHub] spark issue #13805: [SPARK-16096][SPARKR] add union and deprecate unionAll

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13805 Alright - yeah, let's leave `explode` as is for now. LGTM. Merging this to master, branch-2.0

[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13584 cc @mengxr

[GitHub] spark issue #13820: [SPARK-16107] [R] group glm methods in documentation

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13820 I think the `#D` comes from using `dontrun` https://github.com/wch/r-source/blob/e5b21d0397c607883ff25cca379687b86933d730/src/library/tools/R/Rd2ex.R#L72 I don't see an easy way to disable

[GitHub] spark issue #13803: [SPARK-16109][SPARKR][DOC] R more doc fixes

2016-06-21 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13803 Irrespective of people reading the guide, it'll at least be useful to point out what is the expected behavior in a code review etc. But yeah automatic style checks would be really cool.

[GitHub] spark pull request #13838: [SPARK-16088][SPARKR] update setJobGroup, cancelJ...

2016-06-22 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13838#discussion_r68086736 --- Diff: R/pkg/R/sparkR.R --- @@ -392,47 +392,81 @@ sparkR.session <- function( #' Assigns a group ID to all the jobs started by this thread u

[GitHub] spark pull request #13760: [SPARK-16012][SparkR] gapplyCollect - applies a R...

2016-06-22 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13760#discussion_r68087556 --- Diff: R/pkg/R/group.R --- @@ -242,18 +235,73 @@ createMethods() setMethod("gapply", signature(x = "

[GitHub] spark pull request #13838: [SPARK-16088][SPARKR] update setJobGroup, cancelJ...

2016-06-22 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13838#discussion_r68089154 --- Diff: R/pkg/R/sparkR.R --- @@ -392,47 +392,81 @@ sparkR.session <- function( #' Assigns a group ID to all the jobs started by this thread u

[GitHub] spark pull request #13760: [SPARK-16012][SparkR] gapplyCollect - applies a R...

2016-06-22 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13760#discussion_r68090167 --- Diff: R/pkg/R/group.R --- @@ -199,17 +199,10 @@ createMethods() #' Applies a R function to each group in the input Groupe

[GitHub] spark issue #13660: [SPARK-15672][R][DOC] R programming guide update

2016-06-22 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13660 @felixcheung @jkbradley any more comments on this?

[GitHub] spark pull request #13760: [SPARK-16012][SparkR] implement gapplyCollect whi...

2016-06-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13760#discussion_r68268602 --- Diff: R/pkg/R/group.R --- @@ -199,17 +199,10 @@ createMethods() #' Applies a R function to each group in the input Groupe

[GitHub] spark issue #13838: [SPARK-16088][SPARKR] update setJobGroup, cancelJobGroup...

2016-06-23 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13838 LGTM. Merging this to master, branch-2.0

[GitHub] spark pull request #13839: [SPARK-16128][SQL] Allow setting length of charac...

2016-06-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13839#discussion_r68317336 --- Diff: R/pkg/R/DataFrame.R --- @@ -177,8 +177,8 @@ setMethod("isLocal", #' @param x A SparkDataFrame #' @param numRows T

[GitHub] spark pull request #13839: [SPARK-16128][SQL] Allow setting length of charac...

2016-06-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13839#discussion_r68317660 --- Diff: R/pkg/R/DataFrame.R --- @@ -194,7 +194,12 @@ setMethod("isLocal", setMethod("showDF", signature

[GitHub] spark pull request #13877: [SPARK-16142] [R] group naiveBayes method docs in...

2016-06-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13877#discussion_r68352034 --- Diff: R/pkg/R/mllib.R --- @@ -390,23 +376,41 @@ setMethod("predict", signature(object = "KMeansModel"), return(d

[GitHub] spark issue #13877: [SPARK-16142] [R] group naiveBayes method docs in a sing...

2016-06-23 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13877 The new document in the screenshot looks pretty good to me.

[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...

2016-06-24 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13839 Thanks @ScrapCodes -- R changes LGTM

[GitHub] spark pull request #13885: [SPARK-16184][SPARKR] conf API for SparkSession

2016-06-24 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13885#discussion_r68423467 --- Diff: R/pkg/NAMESPACE --- @@ -10,6 +10,7 @@ export("sparkR.session") export("sparkR.init") export("

[GitHub] spark pull request #13839: [SPARK-16128][SQL] Allow setting length of charac...

2016-06-24 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13839#discussion_r68433503 --- Diff: R/pkg/R/DataFrame.R --- @@ -194,7 +194,12 @@ setMethod("isLocal", setMethod("showDF", signature

[GitHub] spark pull request #13839: [SPARK-16128][SQL] Allow setting length of charac...

2016-06-24 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13839#discussion_r68480005 --- Diff: R/pkg/R/DataFrame.R --- @@ -194,7 +194,12 @@ setMethod("isLocal", setMethod("showDF", signature

[GitHub] spark issue #13760: [SPARK-16012][SparkR] implement gapplyCollect which will...

2016-06-24 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13760 @felixcheung Any other comments on this?

[GitHub] spark pull request #13885: [SPARK-16184][SPARKR] conf API for SparkSession

2016-06-25 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13885#discussion_r68491306 --- Diff: R/pkg/R/SQLContext.R --- @@ -110,11 +110,46 @@ infer_type <- function(x) { } } -getDefaultSqlSource <- fu

[GitHub] spark issue #13885: [SPARK-16184][SPARKR] conf API for SparkSession

2016-06-26 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/13885 LGTM. Thanks @felixcheung - Merging this into master and branch-2.0

[GitHub] spark pull request #13760: [SPARK-16012][SparkR] implement gapplyCollect whi...

2016-06-26 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13760#discussion_r68508656 --- Diff: R/pkg/R/group.R --- @@ -243,17 +236,73 @@ setMethod("gapply", signature(x = "GroupedData"), func

[GitHub] spark pull request: [SPARK-13389] [SparkR] SparkR support first/la...

2016-02-19 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/11267#issuecomment-186380904 Is this option used by base R / other R libraries like dplyr?

[GitHub] spark pull request: [SPARK-9325][SPARK-R] collect() head() and sho...

2016-02-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/11336#issuecomment-188367966 Looks like the last one was just a flaky test in mllib. Let's try again. Jenkins, retest this please

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/9192#issuecomment-221103064 Sorry I was busy last week and missed this -- but +1 to keeping backwards compatibility. BTW on that note will this also change the entry point in SparkR to be

[GitHub] spark pull request: [SPARK-12071][Doc] Document the behaviour of N...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13268#issuecomment-221125458 Jenkins, ok to test

[GitHub] spark pull request: [SPARK-12071][Doc] Document the behaviour of N...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13268#issuecomment-221131481 cc @felixcheung

[GitHub] spark pull request: [MINOR][SPARKR][DOC] Add a description for run...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13217#issuecomment-221132865 @HyukjinKwon - Thanks a lot for updating the documentation and for working on #13165 - I think there are a number of R users who use Windows and having SparkR work on

[GitHub] spark pull request: [SPARK-8603][SPARKR] Incorrect file separator ...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13165#issuecomment-221132935 Related to my comment on #13217 -- I will test this out on Windows using the new instructions.

[GitHub] spark pull request: [SPARK-15412][PySpark][SparkR][DOCS] Improve l...

2016-05-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13199#discussion_r64311380 --- Diff: docs/README.md --- @@ -20,8 +20,10 @@ installed. Also install the following libraries: $ sudo pip install Pygments # Following

[GitHub] spark pull request: [SPARK-15319][SPARKR][DOCS] Fix SparkR doc lay...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13109#issuecomment-221137833 Chiming in a little late here -- from my R usage, I've definitely seen two patterns commonly used - sharing the same function for multiple generics and

[GitHub] spark pull request: [SPARK-12922][SparkR][WIP] Implement gapply() ...

2016-05-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r64315819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -21,10 +21,12 @@ import scala.collection.JavaConverters

[GitHub] spark pull request: [SPARK-12922][SparkR][WIP] Implement gapply() ...

2016-05-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/12836#discussion_r64315787 --- Diff: core/src/main/scala/org/apache/spark/api/r/RRunner.scala --- @@ -149,12 +150,17 @@ private[spark] class RRunner[U

[GitHub] spark pull request: [SPARK-12922][SparkR][WIP] Implement gapply() ...

2016-05-23 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12836#issuecomment-221142351 @NarineK @sun-rui Thanks a lot for your work on this PR. I think the second option (of giving key and data.frame) is more intuitive / flexible as well. Would

[GitHub] spark pull request: [SPARK-15412][PySpark][SparkR][DOCS] Improve l...

2016-05-24 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13199#discussion_r64438148 --- Diff: docs/README.md --- @@ -20,8 +20,10 @@ installed. Also install the following libraries: $ sudo pip install Pygments # Following

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221462859 @wangmiao1981 Thanks for investigating this. Do you know why these are not failing in Jenkins though? (the subset test and the pipedRDD one) --- If your project is

[GitHub] spark pull request: [SPARK-15412][PySpark][SparkR][DOCS] Improve l...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13199#issuecomment-221475488 LGTM. Merging this to master and branch-2.0

[GitHub] spark pull request: [SPARK-12071][Doc] Document the behaviour of N...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13268#issuecomment-221475713 LGTM. Merging this to master and branch-2.0

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221476116 Hmm looks like `startsWith` and `endsWith` were added in R 3.3.0 - See http://www.r-statistics.com/2016/05/r-3-3-0-is-released/

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221476748 Continuing from my previous message, we can't add or remove `endsWith` and `startsWith` as we want to support all R versions from 3.1.0 onwards. We could get

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221477450 Ok - so that explains one of the problems. Does anybody know what the problem in `subset` is?

[GitHub] spark pull request: [SPARK-12922][SparkR][WIP] Implement gapply() ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12836#issuecomment-221478783 Let's keep it as `dapply` - The specific choice of applying on a partition as a data frame is built into its semantics. If we do build a single row UDF then we can

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13284#discussion_r64520757 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -24,10 +24,18 @@ test_that("Check masked functions", { func <- lapply(maske

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221483871 Ok - let's just fix the masking and the subset ones in this PR. Once we understand what the problem with `pipeRDD` is, we can fix it in a separate PR.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221663730 @wangmiao1981 Let's continue the pipeRDD debugging on the JIRA. This change LGTM for the subset and the masking tests. @felixcheung any other comments

[GitHub] spark pull request: [SPARK-8603][SPARKR] Incorrect file separator ...

2016-05-25 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/13165#discussion_r64628244 --- Diff: R/pkg/R/client.R --- @@ -60,6 +60,15 @@ generateSparkSubmitArgs <- function(args, sparkHome, jars, sparkSubmitOpts, pack combinedA

[GitHub] spark pull request: [SPARK-15294][SPARKR][MINOR] Add pivot functio...

2016-05-25 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13295#issuecomment-221670324 Thanks @mhnatiuk for opening this PR. Could we also add a unit test in `test_sparkSQL.R` for this?

[GitHub] spark pull request: [SPARK-15294][SPARKR][MINOR] Add pivot functio...

2016-05-25 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/13295#issuecomment-221670088 Jenkins, ok to test

[GitHub] spark pull request: [SPARK-12922][SparkR][WIP] Implement gapply() ...

2016-05-25 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12836#issuecomment-221669659 Hmm - what is the difference between `dapply_row` and a SQL row UDF? Anyway, this discussion probably belongs in a new JIRA and not in this PR.

[GitHub] spark pull request: [SPARK-10903] [SPARKR] R - Simplify SQLContext...

2016-05-25 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/9192#discussion_r64633209 --- Diff: R/pkg/R/SQLContext.R --- @@ -254,6 +301,7 @@ jsonFile <- function(sqlContext, path) { #' df <- jsonRDD(sqlC
