[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-05-24 Thread actuaryzhang
Github user actuaryzhang closed the pull request at:

https://github.com/apache/spark/pull/17161





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104306085
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -741,12 +724,12 @@ setMethod("coalesce",
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
+#' df <- createDataFrame(mtcars)
+#' newDF <- coalesce(df, 1L)
--- End diff --

We should probably not have `coalesce` in the example block for `repartition`.
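For context, a minimal sketch of how the `repartition` example might read with the `coalesce` line dropped (the `mtcars` columns used in the column-based calls are illustrative, not from the patch):

```
sparkR.session()
df <- createDataFrame(mtcars)
newDF <- repartition(df, 2L)                      # by target partition count
newDF <- repartition(df, numPartitions = 2L)      # same, with the named argument
newDF <- repartition(df, col = df$cyl, df$gear)   # by column expressions
```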





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104306095
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -548,10 +537,9 @@ setMethod("registerTempTable",
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' df <- read.df(path, "parquet")
-#' df2 <- read.df(path2, "parquet")
-#' createOrReplaceTempView(df, "table1")
-#' insertInto(df2, "table1", overwrite = TRUE)
+#' df <- limit(createDataFrame(faithful), 5)
--- End diff --

Why use `limit` here?
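For context, a minimal sketch of what `limit` does on a SparkDataFrame (an illustrative aside, not the example wording from the patch):

```
df <- createDataFrame(faithful)   # 272 rows
small <- limit(df, 5)             # new SparkDataFrame holding only the first 5 rows
count(small)                      # 5
```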





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104306091
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -741,12 +724,12 @@ setMethod("coalesce",
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
+#' df <- createDataFrame(mtcars)
+#' newDF <- coalesce(df, 1L)
 #' newDF <- repartition(df, 2L)
 #' newDF <- repartition(df, numPartitions = 2L)
-#' newDF <- repartition(df, col = df$"col1", df$"col2")
-#' newDF <- repartition(df, 3L, col = df$"col1", df$"col2")
+#' newDF <- repartition(df, col = df[[1]], df[[2]])
--- End diff --

Showing column references with `$name` in the example is important too.
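A small sketch contrasting the two column-reference styles under discussion, assuming the `mtcars`-based example above (the column choices are illustrative):

```
df <- createDataFrame(mtcars)
newDF <- repartition(df, col = df$cyl, df$gear)     # $name style the review asks to keep
newDF <- repartition(df, col = df[[2]], df[[10]])   # positional [[ style from the patch
```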





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104306047
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2805,10 +2779,9 @@ setMethod("except",
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
-#' write.df(df, "myfile", "parquet", "overwrite")
-#' saveDF(df, parquetPath2, "parquet", mode = saveMode, mergeSchema = mergeSchema)
+#' df <- createDataFrame(mtcars)
+#' write.df(df, tempfile(), "parquet", "overwrite")
--- End diff --

I think we should avoid having `tempfile()` as the output path in the example, as 
that might point users in the wrong direction: anything saved under `tempfile()` 
will disappear as soon as the R session ends.
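A hedged sketch of the alternative this comment points toward, writing to an explicit user-chosen path instead of `tempfile()` (the path below is a placeholder, not from the patch):

```
df <- createDataFrame(mtcars)
# output persists at a location the user chose, rather than a session-scoped temp file
write.df(df, path = "data/mtcars_parquet", source = "parquet", mode = "overwrite")
```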





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104306070
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1123,10 +1096,9 @@ setMethod("dim",
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
+#' df <- createDataFrame(mtcars)
 #' collected <- collect(df)
-#' firstName <- collected[[1]]$name
+#' collected[[1]]
--- End diff --

Right, that seems rather unnecessary. Any other ideas on how to show that it is a 
data.frame?
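One possible way the example could show that `collect` returns a base R data.frame, offered only as a sketch (not the wording that was settled on):

```
df <- createDataFrame(mtcars)
collected <- collect(df)
class(collected)    # "data.frame"
collected$mpg[1]    # ordinary data.frame column access on the collected result
```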





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104284980
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -92,8 +92,7 @@ dataFrame <- function(sdf, isCached = FALSE) {
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
+#' df <- createDataFrame(mtcars)
--- End diff --

`mtcars` is built into R, like `iris`.
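A minimal sketch illustrating the point: built-in datasets such as `mtcars` are available without any setup, so the example runs as written:

```
sparkR.session()
df <- createDataFrame(mtcars)   # mtcars ships with base R; no data() call or file path needed
head(df)
```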





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/17161#discussion_r104283930
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -92,8 +92,7 @@ dataFrame <- function(sdf, isCached = FALSE) {
 #' @examples
 #'\dontrun{
 #' sparkR.session()
-#' path <- "path/to/file.json"
-#' df <- read.json(path)
+#' df <- createDataFrame(mtcars)
--- End diff --

Should we define `mtcars` to run this example?





[GitHub] spark pull request #17161: [SPARK-19819][SparkR] Use concrete data in SparkR...

2017-03-04 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request:

https://github.com/apache/spark/pull/17161

[SPARK-19819][SparkR] Use concrete data in SparkR DataFrame examples

## What changes were proposed in this pull request?
Many examples for SparkDataFrame methods use:
```
path <- "path/to/file.json"
df <- read.json(path)
```
This is not directly runnable. This PR replaces such snippets with concrete data so 
that users can execute the examples directly.
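A sketch of the replacement pattern the PR proposes, using a built-in dataset so the documentation example is runnable as-is (the exact calls shown here are illustrative):

```
sparkR.session()
df <- createDataFrame(mtcars)   # concrete, built-in data instead of a placeholder path
printSchema(df)
head(df)
```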



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/actuaryzhang/spark sparkRDoc2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17161


commit 7e50acc4f270400aa00a728a9487ee1d5c1cc4fc
Author: actuaryzhang 
Date:   2017-03-04T08:45:56Z

update dataframe doc with examples

commit c4c14adf399bb7b5d272d574113edb7e9ee26d01
Author: actuaryzhang 
Date:   2017-03-04T08:53:27Z

update examples



