spark git commit: [SPARKR][DOC] fix typo in vignettes

2017-05-08 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/master 42cc6d13e -> 2fdaeb52b


[SPARKR][DOC] fix typo in vignettes

## What changes were proposed in this pull request?
Fix typo in vignettes

Author: Wayne Zhang 

Closes #17884 from actuaryzhang/typo.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2fdaeb52
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2fdaeb52
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2fdaeb52

Branch: refs/heads/master
Commit: 2fdaeb52bbe2ed1a9127ac72917286e505303c85
Parents: 42cc6d1
Author: Wayne Zhang 
Authored: Sun May 7 23:16:30 2017 -0700
Committer: Felix Cheung 
Committed: Sun May 7 23:16:30 2017 -0700

--
 R/pkg/vignettes/sparkr-vignettes.Rmd | 36 +++
 1 file changed, 18 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2fdaeb52/R/pkg/vignettes/sparkr-vignettes.Rmd
--
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index d38ec4f..49f4ab8 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -65,7 +65,7 @@ We can view the first few rows of the `SparkDataFrame` by `head` or `showDF` fun
 head(carsDF)
 ```
 
-Common data processing operations such as `filter`, `select` are supported on the `SparkDataFrame`.
+Common data processing operations such as `filter` and `select` are supported on the `SparkDataFrame`.
 ```{r}
 carsSubDF <- select(carsDF, "model", "mpg", "hp")
 carsSubDF <- filter(carsSubDF, carsSubDF$hp >= 200)
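The hunk ends before the chunk's closing fence; in the vignette this snippet typically finishes by previewing the filtered frame, along these lines:

```{r}
# Illustrative continuation of the chunk above: preview the filtered subset.
head(carsSubDF)
```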
@@ -379,7 +379,7 @@ out <- dapply(carsSubDF, function(x) { x <- cbind(x, x$mpg * 1.61) }, schema)
 head(collect(out))
 ```
 
-Like `dapply`, apply a function to each partition of a `SparkDataFrame` and collect the result back. The output of function should be a `data.frame`, but no schema is required in this case. Note that `dapplyCollect` can fail if the output of UDF run on all the partition cannot be pulled to the driver and fit in driver memory.
+Like `dapply`, `dapplyCollect` can apply a function to each partition of a `SparkDataFrame` and collect the result back. The output of the function should be a `data.frame`, but no schema is required in this case. Note that `dapplyCollect` can fail if the output of the UDF on all partitions cannot be pulled into the driver's memory.
 
 ```{r}
 out <- dapplyCollect(
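The chunk is cut off by the hunk boundary; a complete `dapplyCollect` call in the same style would look roughly like this (a sketch assuming the `carsSubDF` frame from the earlier `select`/`filter` example):

```{r}
# Sketch of a full dapplyCollect call: the function receives each partition
# as a local data.frame and must return a data.frame; no schema is needed.
out <- dapplyCollect(
  carsSubDF,
  function(x) {
    x <- cbind(x, "kmpg" = x$mpg * 1.61)  # add a kilometers-per-gallon column
  })
head(out, 3)
```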
@@ -405,7 +405,7 @@ result <- gapply(
 head(arrange(result, "max_mpg", decreasing = TRUE))
 ```
 
-Like gapply, `gapplyCollect` applies a function to each partition of a `SparkDataFrame` and collect the result back to R `data.frame`. The output of the function should be a `data.frame` but no schema is required in this case. Note that `gapplyCollect` can fail if the output of UDF run on all the partition cannot be pulled to the driver and fit in driver memory.
+Like `gapply`, `gapplyCollect` can apply a function to each partition of a `SparkDataFrame` and collect the result back to R `data.frame`. The output of the function should be a `data.frame` but no schema is required in this case. Note that `gapplyCollect` can fail if the output of the UDF on all partitions cannot be pulled into the driver's memory.
 
 ```{r}
 result <- gapplyCollect(
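Again the hunk boundary truncates the chunk; a complete `gapplyCollect` call would look roughly like this (a sketch grouping `carsDF` by cylinder count, matching the `max_mpg` column referenced in the `gapply` example above):

```{r}
# Sketch of a full gapplyCollect call: the function receives a grouping key
# and that group's rows, and returns one local data.frame per group.
result <- gapplyCollect(
  carsDF,
  "cyl",
  function(key, x) {
    y <- data.frame(key, max(x$mpg))
    colnames(y) <- c("cyl", "max_mpg")
    y
  })
head(result[order(result$max_mpg, decreasing = TRUE), ])
```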
@@ -458,20 +458,20 @@ options(ops)
 
 
 ### SQL Queries
-A `SparkDataFrame` can also be registered as a temporary view in Spark SQL and that allows you to run SQL queries over its data. The sql function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
+A `SparkDataFrame` can also be registered as a temporary view in Spark SQL so that one can run SQL queries over its data. The sql function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
 
 ```{r}
 people <- read.df(paste0(sparkR.conf("spark.home"),
  "/examples/src/main/resources/people.json"), "json")
 ```
 
-Register this SparkDataFrame as a temporary view.
+Register this `SparkDataFrame` as a temporary view.
 
 ```{r}
 createOrReplaceTempView(people, "people")
 ```
 
-SQL statements can be run by using the sql method.
+SQL statements can be run using the sql method.
 ```{r}
 teenagers <- sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
 head(teenagers)
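Since `sql` returns a `SparkDataFrame`, the query result can be composed with further Spark operations or pulled into a local `data.frame`, for example:

```{r}
# Bring the query result to the driver as a plain R data.frame.
localTeens <- collect(teenagers)
class(localTeens)  # "data.frame"
```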
@@ -780,7 +780,7 @@ head(predict(isoregModel, newDF))
 `spark.gbt` fits a [gradient-boosted tree](https://en.wikipedia.org/wiki/Gradient_boosting) classification or regression model on a `SparkDataFrame`.
 Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.
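A minimal fitting sketch in the vignette's style (the formula and parameters here are illustrative and reuse the `carsDF` frame from earlier; they are not taken from this patch):

```{r}
# Illustrative: fit a GBT regression of mpg on weight and horsepower,
# summarize the fitted model, and predict on the training frame.
gbtModel <- spark.gbt(carsDF, mpg ~ wt + hp, type = "regression", maxDepth = 2, maxIter = 5)
summary(gbtModel)
head(predict(gbtModel, carsDF))
# write.ml(gbtModel, "/tmp/gbtModel")  # persist; read.ml("/tmp/gbtModel") loads it back
```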

spark git commit: [SPARKR][DOC] fix typo in vignettes

2017-05-08 Thread felixcheung
Repository: spark
Updated Branches:
  refs/heads/branch-2.2 6c5b7e106 -> d8a5a0d34


[SPARKR][DOC] fix typo in vignettes

## What changes were proposed in this pull request?
Fix typo in vignettes

Author: Wayne Zhang 

Closes #17884 from actuaryzhang/typo.

(cherry picked from commit 2fdaeb52bbe2ed1a9127ac72917286e505303c85)
Signed-off-by: Felix Cheung 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d8a5a0d3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d8a5a0d3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d8a5a0d3

Branch: refs/heads/branch-2.2
Commit: d8a5a0d3420abbb911d8a80dc7165762eb08d779
Parents: 6c5b7e1
Author: Wayne Zhang 
Authored: Sun May 7 23:16:30 2017 -0700
Committer: Felix Cheung 
Committed: Sun May 7 23:16:44 2017 -0700

--
 R/pkg/vignettes/sparkr-vignettes.Rmd | 36 +++
 1 file changed, 18 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d8a5a0d3/R/pkg/vignettes/sparkr-vignettes.Rmd
--
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index b933c59..0f6d5c2 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -65,7 +65,7 @@ We can view the first few rows of the `SparkDataFrame` by `head` or `showDF` fun
 head(carsDF)
 ```
 
-Common data processing operations such as `filter`, `select` are supported on the `SparkDataFrame`.
+Common data processing operations such as `filter` and `select` are supported on the `SparkDataFrame`.
 ```{r}
 carsSubDF <- select(carsDF, "model", "mpg", "hp")
 carsSubDF <- filter(carsSubDF, carsSubDF$hp >= 200)
@@ -364,7 +364,7 @@ out <- dapply(carsSubDF, function(x) { x <- cbind(x, x$mpg * 1.61) }, schema)
 head(collect(out))
 ```
 
-Like `dapply`, apply a function to each partition of a `SparkDataFrame` and collect the result back. The output of function should be a `data.frame`, but no schema is required in this case. Note that `dapplyCollect` can fail if the output of UDF run on all the partition cannot be pulled to the driver and fit in driver memory.
+Like `dapply`, `dapplyCollect` can apply a function to each partition of a `SparkDataFrame` and collect the result back. The output of the function should be a `data.frame`, but no schema is required in this case. Note that `dapplyCollect` can fail if the output of the UDF on all partitions cannot be pulled into the driver's memory.
 
 ```{r}
 out <- dapplyCollect(
@@ -390,7 +390,7 @@ result <- gapply(
 head(arrange(result, "max_mpg", decreasing = TRUE))
 ```
 
-Like gapply, `gapplyCollect` applies a function to each partition of a `SparkDataFrame` and collect the result back to R `data.frame`. The output of the function should be a `data.frame` but no schema is required in this case. Note that `gapplyCollect` can fail if the output of UDF run on all the partition cannot be pulled to the driver and fit in driver memory.
+Like `gapply`, `gapplyCollect` can apply a function to each partition of a `SparkDataFrame` and collect the result back to R `data.frame`. The output of the function should be a `data.frame` but no schema is required in this case. Note that `gapplyCollect` can fail if the output of the UDF on all partitions cannot be pulled into the driver's memory.
 
 ```{r}
 result <- gapplyCollect(
@@ -443,20 +443,20 @@ options(ops)
 
 
 ### SQL Queries
-A `SparkDataFrame` can also be registered as a temporary view in Spark SQL and that allows you to run SQL queries over its data. The sql function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
+A `SparkDataFrame` can also be registered as a temporary view in Spark SQL so that one can run SQL queries over its data. The sql function enables applications to run SQL queries programmatically and returns the result as a `SparkDataFrame`.
 
 ```{r}
 people <- read.df(paste0(sparkR.conf("spark.home"),
  "/examples/src/main/resources/people.json"), "json")
 ```
 
-Register this SparkDataFrame as a temporary view.
+Register this `SparkDataFrame` as a temporary view.
 
 ```{r}
 createOrReplaceTempView(people, "people")
 ```
 
-SQL statements can be run by using the sql method.
+SQL statements can be run using the sql method.
 ```{r}
 teenagers <- sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
 head(teenagers)
@@ -765,7 +765,7 @@ head(predict(isoregModel, newDF))
 `spark.gbt` fits a [gradient-boosted tree](https://en.wikipedia.org/wiki/Gradient_boosting) classification or regression model on a `SparkDataFrame`.
 Users can call `summary` to get a summary of the fitted model, `predict` to make predictions, and `write.ml`/`read.ml` to save/load fitted models.