[GitHub] spark pull request #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17825 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
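For orientation, a hedged sketch of what the merged `alias` method for `SparkDataFrame` plausibly looks like, following the `callJMethod`/`dataFrame` pattern visible in the quoted diff context (the exact body and annotations are assumptions inferred from the thread, not the merged source):

```r
#' @rdname alias
#' @name alias
#' @aliases alias,SparkDataFrame-method
#' @family SparkDataFrame functions
#' @export
#' @note alias(SparkDataFrame) since 2.3.0
setMethod("alias",
          signature(object = "SparkDataFrame"),
          function(object, data) {
            # Delegate to Dataset.alias on the JVM side and wrap the result,
            # mirroring the hint() implementation shown in the diff context.
            stopifnot(is.character(data))
            sdf <- callJMethod(object@sdf, "alias", data)
            dataFrame(sdf)
          })
```

This is a sketch only; it requires an active SparkR session and the package's internal helpers to run.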
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r115152814 --- Diff: R/pkg/R/DataFrame.R --- @@ -3745,3 +3745,26 @@ setMethod("hint", jdf <- callJMethod(x@sdf, "hint", name, parameters) dataFrame(jdf) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @family SparkDataFrame functions +#' @rdname alias +#' @name alias +#' @examples --- End diff -- true, it's more for tracking it manually
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r115113723 --- Diff: R/pkg/R/DataFrame.R --- @@ -3745,3 +3745,26 @@ setMethod("hint", jdf <- callJMethod(x@sdf, "hint", name, parameters) dataFrame(jdf) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @family SparkDataFrame functions +#' @rdname alias +#' @name alias +#' @examples --- End diff -- Done, but do we actually need this? We don't use roxygen to maintain `NAMESPACE`, and (I believe I mentioned this before) we `@export` objects which are not really exported. Just saying...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r115113346 --- Diff: R/pkg/R/DataFrame.R --- @@ -3745,3 +3745,26 @@ setMethod("hint", jdf <- callJMethod(x@sdf, "hint", name, parameters) dataFrame(jdf) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @family SparkDataFrame functions +#' @rdname alias +#' @name alias +#' @examples --- End diff -- add `@export`
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r115113331 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- sigh, sadly I think you have captured all the constraints we are working with here. Let's get the 3 lines in the same order, from

```
#' Returns a new SparkDataFrame or Column with an alias set. Equivalent to SQL "AS" keyword.
#' @param object x a Column or a SparkDataFrame
#' @return a Column or a SparkDataFrame
```

to

```
#' Returns a new SparkDataFrame or Column with an alias set. Equivalent to SQL "AS" keyword.
#' @param object x a SparkDataFrame or Column
#' @return a SparkDataFrame or a Column
```
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r115085302 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- On the bright side, it looks like matching `@rdname` and `@aliases` like:

```r
#' alias
#'
#' @aliases alias,SparkDataFrame-method
#' @family SparkDataFrame functions
#' @rdname alias,SparkDataFrame-method
#' @name alias
...
```

and

```r
#' alias
#'
#' @aliases alias,Column-method
#' @family colum_func
#' @rdname alias,Column-method
#' @name alias
...
```

(I hope this is what you mean) indeed solves SPARK-18825. But it doesn't generate any docs for these two and makes the CRAN checker unhappy:

```
Undocumented S4 methods:
  generic 'alias' and siglist 'Column'
  generic 'alias' and siglist 'SparkDataFrame'
```

Docs for the generic are created but that doesn't help us here. Even if we bring `@examples` there, we still have to deal with CRAN. There is also my favorite, `\name must exist and be unique in Rd files`, which doesn't give us much room here, does it? I'm open to suggestions, but personally I am out of ideas. I've been digging through the `roxygen` docs, but between CRAN, S4 requirements, `roxygen` limitations and our own rules there is not much room left.
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114932485 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- that's true actually. if you think it's useful we could always have them in separate rd. I'm pretty sure `@rdname` needs to match `@aliases` to fix the multiple-link bug https://issues.apache.org/jira/browse/SPARK-18825; which means we can't have multiple functions in the same rd - each has to have its own.
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114931344 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. --- End diff -- I still believe that AS is applicable to both. Essentially what we do is:

```
SELECT column AS new_column FROM table
```

and

```
(SELECT * FROM table) AS new_table
```
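In SparkR terms, the two SQL forms map to calls roughly like the following (a sketch; `df` is assumed to be an existing SparkDataFrame with an `mpg` column, and the qualified-column pattern follows the examples elsewhere in this thread):

```r
# Column alias: SELECT mpg AS new_column FROM table
head(select(df, alias(df$mpg, "new_column")))

# SparkDataFrame alias: (SELECT * FROM table) AS new_table
df2 <- alias(df, "new_table")
head(select(df2, column("new_table.mpg")))
```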
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114931185 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- To be honest I find both equally confusing, so if you think that a single annotation is better, I am happy to oblige.
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114929845 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- that we did, at one point. I think the feedback was that we could have one line for the parameter (`object`), and the return value could say more; but then which line matches which input parameter type?
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114929528 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- Wouldn't it be better to annotate the actual implementations? To get something like this: ![image](https://cloud.githubusercontent.com/assets/1554276/25733425/295f465e-3159-11e7-87b7-d959c9bf3352.png)
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114928953 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. +#' +#' @name alias +#' @rdname alias +#' @param object x a Column or a SparkDataFrame +#' @param data new name to use --- End diff -- Shouldn't we have a `@return` here? Perhaps to say

```
Returns a new SparkDataFrame or Column with an alias set. For Column, equivalent to SQL "AS" keyword.
@return a new SparkDataFrame or Column
```
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114928655 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. --- End diff -- I guess we don't say `return a new Column` but more generally `return a Column`, and in other cases we say `return a new SparkDataFrame`, so I guess it's a difference in wording. I think what you propose is fine, though do you think it's confusing to say `Equivalent to SQL "AS" keyword.` because that makes sense only for Column and not the whole dataframe?
Github user zero323 closed the pull request at: https://github.com/apache/spark/pull/17825
GitHub user zero323 reopened a pull request: https://github.com/apache/spark/pull/17825 [SPARK-20550][SPARKR] R wrapper for Dataset.alias

## What changes were proposed in this pull request?

- Add SparkR wrapper for `Dataset.alias`.
- Adjust roxygen annotations for `functions.alias` (including example usage).

## How was this patch tested?

Unit tests, `check_cran.sh`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zero323/spark SPARK-20550

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17825.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17825

commit 944a3ec791a8f103093e24511e895a4ce60970d8 (zero323, 2017-05-01T08:59:24Z): Initial implementation
commit 5e9f8da45c432e0752e5e78556add33e0a6d0557 (zero323, 2017-05-01T22:27:11Z): Adjust argument annotations - Remove param annotations from dataframe.alias - Use generic annotations for column.alias
commit 73133f9442ad8317fb12b600221962bf47d8a95c (zero323, 2017-05-01T22:31:26Z): Add usage examples to column.alias
commit 848eeefc1f18c6aabaf65e6efed259a2fa5c19c3 (zero323, 2017-05-01T22:34:51Z): Remove return type annotation
commit 05c0781110b42a940e06cc31650449a8715e85c9 (zero323, 2017-05-02T02:00:13Z): Fix typo
commit 22d7cf661bb54a8f7f9c660e1d914802f1eb4153 (zero323, 2017-05-02T04:25:34Z): Move dontruns to their own lines
commit 22e1292557f1a5597cde6337267a099bbcdc07aa (zero323, 2017-05-02T04:27:11Z): Extend param description
commit 6bb3d914960d1cf63e582a7d732ca80ed321e9c5 (zero323, 2017-05-02T04:33:34Z): Add type annotations to since notes
commit b3c1a416a16a9d32649edda2b66fc9c3476358a5 (zero323, 2017-05-02T04:38:51Z): Attach alias test to select-with-column test case
commit 40fedcb8c41bc84deead205aad81e84c095045b5 (zero323, 2017-05-02T04:44:45Z): Extend description
commit 1e1ad443751fc3dc93487e5385cc934feb93f631 (zero323, 2017-05-03T00:25:15Z): Move alias documentation to generics
commit 2d5ace288f2443327696823c343c095f0d8d64ca (zero323, 2017-05-04T01:13:45Z): Add family annotation
commit 5fe5495580eb3852ea5092a34dc2334c0e45c9b7 (zero323, 2017-05-04T06:32:54Z): Check that stats::alias is not masked
commit 09f9ccaf5e66a400d26b4ab6d600d951305d5fd3 (zero323, 2017-05-04T07:04:52Z): Fix style
commit f1c74f338b8df865a5e8b9a6e281211aa27af7d3 (zero323, 2017-05-04T10:17:42Z): vim
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114925159 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. --- End diff -- How about:

```
#' Return a new Column or a SparkDataFrame with a name set. Equivalent to SQL "AS" keyword.
```

Is the `Column` new?
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114924076 --- Diff: R/pkg/R/generics.R --- @@ -387,6 +387,16 @@ setGeneric("value", function(bcast) { standardGeneric("value") }) #' @export setGeneric("agg", function (x, ...) { standardGeneric("agg") }) +#' alias +#' +#' Set a new name for a Column or a SparkDataFrame. Equivalent to SQL "AS" keyword. --- End diff -- right - I think again we should emphasize returning a new SparkDataFrame
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114714846 --- Diff: R/pkg/R/DataFrame.R --- @@ -3715,3 +3715,25 @@ setMethod("rollup", sgd <- callJMethod(x@sdf, "rollup", jcol) groupedData(sgd) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @rdname alias +#' @name alias +#' @examples --- End diff -- yes!
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114687096 --- Diff: R/pkg/R/DataFrame.R --- @@ -3715,3 +3715,25 @@ setMethod("rollup", sgd <- callJMethod(x@sdf, "rollup", jcol) groupedData(sgd) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @rdname alias +#' @name alias +#' @examples --- End diff -- In general it would be nice to sweep all the files to make them more consistent: capitalization, punctuation, examples, return values and such.
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114588642 --- Diff: R/pkg/R/DataFrame.R --- @@ -3715,3 +3715,25 @@ setMethod("rollup", sgd <- callJMethod(x@sdf, "rollup", jcol) groupedData(sgd) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @rdname alias +#' @name alias +#' @examples --- End diff -- add `@family SparkDataFrame functions` I think we should probably review all these `@family` at one point...
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114457011 --- Diff: R/pkg/R/column.R --- @@ -132,17 +132,24 @@ createMethods() #' alias #' -#' Set a new name for a column +#' Set a new name for an object. Equivalent to SQL "AS" keyword. --- End diff -- Moving to `generics.R` sounds good. "Column or SparkDataFrame" in place of "object" as well. Regarding "AS"... In SQL it can be used with both expressions and tables, so I deliberately didn't quantify this with `Column`. I am not sure if we really need to state that it returns a new object. Maybe _Return a new Column or SparkDataFrame with an alias. Equivalent to SQL "AS" keyword._? But it doesn't sound great.
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114366868 --- Diff: R/pkg/R/column.R --- @@ -132,17 +132,24 @@ createMethods() #' alias #' -#' Set a new name for a column +#' Set a new name for an object. Equivalent to SQL "AS" keyword. --- End diff -- Also, I think this doc block (description, param list specifically) should be moved to DataFrame.R or generics.R as mentioned before.
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114366327 --- Diff: R/pkg/R/column.R --- @@ -132,17 +132,24 @@ createMethods() #' alias #' -#' Set a new name for a column +#' Set a new name for an object. Equivalent to SQL "AS" keyword. --- End diff -- right, this is the Scala doc: for Column.alias, `Gives the column an alias` (which is not very concise); for Dataset.alias, `Returns a new Dataset with an alias set.` I think we need to say `Set a new name to return as a new object` or similar. Actually I think we should say "Column or SparkDataFrame" in place of "object" - what do you think? I think the `SQL "AS"` part is fine, but perhaps it will be clearer if we lead with "for Column, ..."?
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114245870 --- Diff: R/pkg/R/DataFrame.R --- @@ -3715,3 +3715,24 @@ setMethod("rollup", sgd <- callJMethod(x@sdf, "rollup", jcol) groupedData(sgd) }) + +#' alias +#' +#' @aliases alias,SparkDataFrame-method +#' @rdname alias +#' @name alias +#' @examples \dontrun{ --- End diff -- same here
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114245853 --- Diff: R/pkg/R/column.R --- @@ -132,16 +132,23 @@ createMethods() #' alias #' -#' Set a new name for a column +#' Set a new name for an object #' -#' @param object Column to rename +#' @param object object to rename #' @param data new name to use #' #' @rdname alias #' @name alias #' @aliases alias,Column-method #' @family colum_func #' @export +#' @examples \dontrun{ --- End diff -- I think generally we put `\dontrun` on the next line
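The layout being suggested, with `\dontrun{` on its own line rather than trailing the `@examples` tag, would look like this (the example body is taken from the alias examples quoted elsewhere in the thread):

```r
#' @examples
#' \dontrun{
#' df <- alias(createDataFrame(mtcars), "mtcars")
#' head(select(df, column("mtcars.mpg")))
#' }
```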
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114245818 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -2253,6 +2253,15 @@ test_that("mutate(), transform(), rename() and names()", { detach(airquality) }) +test_that("alias on SparkDataFrame", { + df <- alias(read.df(jsonPath, "json"), "table") --- End diff -- because we're trying to make a set of tests that makes sense for CRAN: https://github.com/apache/spark/pull/17817
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114245756

--- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R ---
@@ -2253,6 +2253,15 @@ test_that("mutate(), transform(), rename() and names()", {
   detach(airquality)
 })

+test_that("alias on SparkDataFrame", {
+  df <- alias(read.df(jsonPath, "json"), "table")

--- End diff --

Instead of adding a new test, could we add this to a test that already names things, so it reuses an existing df?
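A sketch of what folding the alias assertions into an existing test might look like, reusing a `df` already built there instead of re-reading the JSON source. This is hypothetical: the surrounding test name and the `df`/`age` column are assumed from context, and running it requires a live SparkSession, so no output is shown:

```r
test_that("mutate(), transform(), rename() and names()", {
  # ... existing assertions using df ...

  # Reuse the existing df rather than calling read.df again.
  df2 <- alias(df, "table")
  # Aliasing renames the Dataset, not its columns.
  expect_equal(columns(df2), columns(df))
  # The alias is visible to qualified column references.
  expect_is(select(df2, column("table.age")), "SparkDataFrame")
})
```

Reusing an existing df keeps the CRAN-facing test suite smaller and avoids another pass over the JSON fixture.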
[GitHub] spark pull request #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17825#discussion_r114245780

--- Diff: R/pkg/R/DataFrame.R ---
@@ -3715,3 +3715,24 @@ setMethod("rollup",
   sgd <- callJMethod(x@sdf, "rollup", jcol)
   groupedData(sgd)
 })
+
+#' alias
+#'
+#' @aliases alias,SparkDataFrame-method
+#' @rdname alias
+#' @name alias
+#' @examples \dontrun{
+#' df <- alias(createDataFrame(mtcars), "mtcars")
+#' avg_mpg <- alias(agg(groupBy(df, df$cyl), avg(df$mpg)), "avg_mpg")
+#'
+#' head(select(df, column("mtcars.mpg")))
+#' head(join(df, avg_mpg, column("mtcars.cyl") == column("avg_mpg.cyl")))
+#' }
+#' @note alias since 2.3.0

--- End diff --

Then we put the type in the `@note` for each overload: https://github.com/apache/spark/blob/master/R/pkg/R/mllib_classification.R#L121
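Following the mllib_classification.R pattern linked above, the per-overload notes would look something like the sketch below. The "since" versions are assumptions here (the SparkDataFrame version comes from this diff; the Column version is inferred from SparkR history), so treat them as illustrative:

```r
# On the Column method in column.R:
#' @note alias(Column) since 1.4.0

# On the SparkDataFrame method in DataFrame.R:
#' @note alias(SparkDataFrame) since 2.3.0
```

Including the type in each `@note` disambiguates the overloads on the shared `@rdname alias` help page.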
[GitHub] spark pull request #17825: [SPARK-20550][SPARKR] R wrapper for Dataset.alias
GitHub user zero323 opened a pull request: https://github.com/apache/spark/pull/17825

[SPARK-20550][SPARKR] R wrapper for Dataset.alias

## What changes were proposed in this pull request?

Add SparkR wrapper for `Dataset.alias`.

## How was this patch tested?

Unit tests, `check_cran.sh`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zero323/spark SPARK-20550

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17825.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #17825

commit 87560ddf680b3d197cc80806bca7f8cadfe277c3
Author: zero323
Date: 2017-05-01T08:59:24Z

    Initial implementation

commit 8e3d3be3715c4e79f20cfe30da10a428a4cde600
Author: zero323
Date: 2017-05-01T22:27:11Z

    Adjust argument annotations

    - Remove param annotations from dataframe.alias
    - Use generic annotations for column.alias

commit e281ec4cfe0724f079ad711be46b82e06bea20de
Author: zero323
Date: 2017-05-01T22:31:26Z

    Add usage examples to column.alias

commit b7d079b3601cabb86c18108e0eff6e5692a3640c
Author: zero323
Date: 2017-05-01T22:34:51Z

    Remove return type annotation