[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user shivaram commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65285481

    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --

    I think we should follow the example of existing R packages and use the long form as the title. For example, if you look at https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html, the title of the page is "Fitting Generalized Linear Models".

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65284357

    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --

    @shivaram Yes, I also noticed that the titles of other examples are not consistent. Which one should we use: a short description or just the name of the method?
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65268646

    --- Diff: R/pkg/R/stats.R ---
    @@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
     #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
     #'
     #' The result of this algorithm has the following deterministic bound:
    -#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
    -#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
    -#' *exact* rank of `x` is close to (p * N). More precisely,
    -#' floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
    -#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
    -#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
    -#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
    +#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
    --- End diff --

    @shivaram Looking at the doc page for statfunctions, a lot of functions are being mushed together. E.g., "col1" and "col2" appear under the "Arguments" section many times. What is the best way to separate these methods?
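The deterministic bound quoted in the diff, floor((p - err) * N) <= rank(x) <= ceil((p + err) * N), can be illustrated with a small worked example. The sketch below is not SparkR code and does not implement the Greenwald-Khanna algorithm; it only checks whether a candidate value satisfies the stated rank bound. The function names `rank_bounds` and `within_bound` are hypothetical, and the sketch assumes that rank(x) means the number of elements less than or equal to x.

```python
import math
from bisect import bisect_right


def rank_bounds(p, err, n):
    """Return the (lower, upper) rank limits from the quoted bound.

    lower = floor((p - err) * n), upper = ceil((p + err) * n)
    """
    lower = math.floor((p - err) * n)
    upper = math.ceil((p + err) * n)
    return lower, upper


def within_bound(values, x, p, err):
    """Check whether x is an acceptable approximate p-quantile of values.

    Assumption: rank(x) is the count of elements <= x in the data.
    """
    ordered = sorted(values)
    rank = bisect_right(ordered, x)  # number of elements <= x
    lower, upper = rank_bounds(p, err, len(ordered))
    return lower <= rank <= upper


if __name__ == "__main__":
    data = range(1, 101)  # N = 100, so rank(x) == x for this data
    print(rank_bounds(0.5, 0.125, 100))
    print(within_bound(data, 50, 0.5, 0.125))
    print(within_bound(data, 70, 0.5, 0.125))
```

With N = 100, p = 0.5 and err = 0.125, the limits are floor(37.5) = 37 and ceil(62.5) = 63, so any element whose exact rank lies between 37 and 63 would be an acceptable answer under the bound, while an element of rank 70 would not.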
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65267172

    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -2514,7 +2529,9 @@ setMethod("attach",
     #' environment. Then, the given expression is evaluated in this new
     #' environment.
     #'
    +#' @title with
    --- End diff --

    @shivaram Is this supposed to be a long-form title, or just the name of the method? Looking at other examples, it looks like it should be a short description.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/13394

    Thanks @vectorijk - I created https://issues.apache.org/jira/browse/SPARK-15672 for that
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394

    Test PASSed.
    Refer to this link for build results (access rights to CI server needed):
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59648/
    Test PASSed.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394

    **[Test build #59648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)** for PR 13394 at commit [`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13394

    Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13394

    **[Test build #59648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)** for PR 13394 at commit [`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user vectorijk commented on the pull request:

    https://github.com/apache/spark/pull/13394

    @shivaram For updating the programming guide, I'd love to do this in a separate PR.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65163283

    --- Diff: R/pkg/R/stats.R ---
    @@ -19,12 +19,11 @@ setOldClass("jobj")
    -#' crosstab
    -#'
     #' Computes a pair-wise frequency table of the given columns. Also known as a contingency
     #' table. The number of distinct values for each column should be less than 1e4. At most 1e6
     #' non-zero pair frequencies will be returned.
     #'
    +#' @title Statistic functions for SparkDataFrames
    --- End diff --

    I will remove the title here. Meanwhile, I will leave the link revisions here.
[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...
Github user vectorijk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13394#discussion_r65162041

    --- Diff: R/pkg/R/DataFrame.R ---
    @@ -1069,7 +1080,10 @@ setMethod("first",
     #'
     #' @param x A SparkDataFrame
     #'
    -#' @noRd
    +#' @family SparkDataFrame functions
    +#' @rdname toRDD
    --- End diff --

    OK, I will change this.