[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65285481
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2514,7 +2529,9 @@ setMethod("attach",
 #' environment. Then, the given expression is evaluated in this new
 #' environment.
 #'
+#' @title with
--- End diff --

I think we should follow the example of existing R packages and use the 
long form as the title. For example, if you look at 
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html, the title of 
the page is "Fitting Generalized Linear Models".
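
As a rough illustration of that convention in roxygen2 terms (the long-form wording below is an assumed example, not text taken from the PR), the title describes what the function does rather than repeating its name:

```r
# Short form, as in the diff above: the help page title is just the method name.
#' @title with

# Long form, in the style of base R help pages such as glm
# ("Fitting Generalized Linear Models"). The wording here is hypothetical.
#' @title Evaluate an R expression in an environment constructed from a SparkDataFrame
```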


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65284357
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2514,7 +2529,9 @@ setMethod("attach",
 #' environment. Then, the given expression is evaluated in this new
 #' environment.
 #'
+#' @title with
--- End diff --

@shivaram Yes, I also noticed that the titles of other examples are not consistent. 
Which one should we use: a short description or just the name of the method?





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65268646
  
--- Diff: R/pkg/R/stats.R ---
@@ -135,13 +136,13 @@ setMethod("freqItems", signature(x = "SparkDataFrame", cols = "character"),
 #' Calculates the approximate quantiles of a numerical column of a SparkDataFrame.
 #'
 #' The result of this algorithm has the following deterministic bound:
-#' If the SparkDataFrame has N elements and if we request the quantile at probability `p` up to
-#' error `err`, then the algorithm will return a sample `x` from the SparkDataFrame so that the
-#' *exact* rank of `x` is close to (p * N). More precisely,
-#'   floor((p - err) * N) <= rank(x) <= ceil((p + err) * N).
-#' This method implements a variation of the Greenwald-Khanna algorithm (with some speed
-#' optimizations). The algorithm was first present in [[http://dx.doi.org/10.1145/375663.375670
-#' Space-efficient Online Computation of Quantile Summaries]] by Greenwald and Khanna.
+#' If the SparkDataFrame has N elements and if we request the quantile at probability \strong{p} up
--- End diff --

@shivaram  Looking at the doc page for statfunctions, a lot of functions 
are being mushed together.  E.g., "col1" and "col2" appear under the 
"Arguments" section many times.  What is the best way to separate these methods?
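
One possible approach, sketched here as an assumption rather than what the PR ends up doing: roxygen2 merges every block that shares an `@rdname` into a single help page, so giving each function its own `@rdname` yields one page, and one "Arguments" section, per function. The topic names and parameter descriptions below are illustrative only.

```r
# Each block gets its own @rdname, so roxygen2 writes crosstab.Rd and cov.Rd
# separately instead of pooling all @param entries on one shared page.

#' Computes a pair-wise frequency table of the given columns
#'
#' @param x A SparkDataFrame
#' @param col1 name of the first column (illustrative description)
#' @param col2 name of the second column (illustrative description)
#' @rdname crosstab
#' @name crosstab
NULL

#' Calculates the sample covariance of two numerical columns
#'
#' @param x A SparkDataFrame
#' @param col1 name of the first column (illustrative description)
#' @param col2 name of the second column (illustrative description)
#' @rdname cov
#' @name cov
NULL
```

The trade-off is more help pages to maintain; `@family` tags can still cross-link related functions from each page's "See Also" section.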





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65267172
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2514,7 +2529,9 @@ setMethod("attach",
 #' environment. Then, the given expression is evaluated in this new
 #' environment.
 #'
+#' @title with
--- End diff --

@shivaram Is this supposed to be a long-form title, or just the name of the 
method?  Looking at other examples, it looks like it should be a short 
description.





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/13394
  
Thanks @vectorijk - I created 
https://issues.apache.org/jira/browse/SPARK-15672 for that





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13394
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59648/
Test PASSed.





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13394
  
**[Test build #59648 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)**
 for PR 13394 at commit 
[`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13394
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13394
  
**[Test build #59648 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59648/consoleFull)**
 for PR 13394 at commit 
[`294cadd`](https://github.com/apache/spark/commit/294cadda4f790dd0e6df18501de363ce9aad0071).





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on the pull request:

https://github.com/apache/spark/pull/13394
  
@shivaram For updating the programming guide, I'd love to do this in a 
separate PR.





[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65163283
  
--- Diff: R/pkg/R/stats.R ---
@@ -19,12 +19,11 @@
 
 setOldClass("jobj")
 
-#' crosstab
-#'
 #' Computes a pair-wise frequency table of the given columns. Also known as a contingency
 #' table. The number of distinct values for each column should be less than 1e4. At most 1e6
 #' non-zero pair frequencies will be returned.
 #'
+#' @title Statistic functions for SparkDataFrames
--- End diff --

I will remove the title here. Meanwhile, I will keep the revised links here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API ...

2016-05-31 Thread vectorijk
Github user vectorijk commented on a diff in the pull request:

https://github.com/apache/spark/pull/13394#discussion_r65162041
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1069,7 +1080,10 @@ setMethod("first",
 #'
 #' @param x A SparkDataFrame
 #'
-#' @noRd
+#' @family SparkDataFrame functions
+#' @rdname toRDD
--- End diff --

ok, I will change this.

