spark git commit: [SPARK-18060][ML] Avoid unnecessary computation for MLOR
Repository: spark
Updated Branches: refs/heads/branch-2.1 c2ebda443 -> 56859c029

[SPARK-18060][ML] Avoid unnecessary computation for MLOR

## What changes were proposed in this pull request?

Before this patch, the gradient updates for multinomial logistic regression were computed by an outer loop over the number of classes and an inner loop over the number of features. Inside the inner loop, we standardized the feature value (`value / featuresStd(index)`), which means we performed the computation `numFeatures * numClasses` times. We only need to perform that computation `numFeatures` times, however. If we re-order the inner and outer loops, we can avoid this, but then we lose sequential memory access. In this patch, we instead lay out the coefficients in column-major order while we train, so that we can avoid the extra computation and retain sequential memory access. We convert back to row-major order when we create the model.

## How was this patch tested?

This is an implementation detail only, so the original behavior should be maintained. All tests pass. I ran some performance tests to verify speedups. The results are below, and show significant speedups.

## Performance Tests

**Setup**

3-node bare-metal cluster, 120 cores total, 384 GB RAM total

**Results**

NOTE: `currentMasterTime` and `thisPatchTime` are times in seconds for a single iteration of L-BFGS or OWL-QN.

|   | numPoints | numFeatures | numClasses | regParam | elasticNetParam | currentMasterTime (sec) | thisPatchTime (sec) | pctSpeedup |
|---|-----------|-------------|------------|----------|-----------------|-------------------------|---------------------|------------|
| 0 | 1e+07     | 100         | 500        | 0.5      | 0               | 90                      | 18                  | 80         |
| 1 | 1e+08     | 100         | 50         | 0.5      | 0               | 90                      | 19                  | 78         |
| 2 | 1e+08     | 100         | 50         | 0.05     | 1               | 72                      | 19                  | 73         |
| 3 | 1e+06     | 100         | 5000       | 0.5      | 0               | 93                      | 53                  | 43         |
| 4 | 1e+07     | 100         | 5000       | 0.5      | 0               | 900                     | 390                 | 56         |
| 5 | 1e+08     | 100         | 500        | 0.5      | 0               | 840                     | 174                 | 79         |
| 6 | 1e+08     | 100         | 200        | 0.5      | 0               | 360                     | 72                  | 80         |
| 7 | 1e+08     | 1000        | 5          | 0.5      | 0               | 9                       | 3                   | 66         |

Author: sethah

Closes #15593 from sethah/MLOR_PERF_COL_MAJOR_COEF.
(cherry picked from commit 46b2550bcd3690a260b995fd4d024a73b92a0299)
Signed-off-by: DB Tsai

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/56859c02
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/56859c02
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/56859c02

Branch: refs/heads/branch-2.1
Commit: 56859c029476bc41b2d2e05043c119146b287bce
Parents: c2ebda4
Author: sethah
Authored: Sat Nov 12 01:38:26 2016 +
Committer: DB Tsai
Committed: Sat Nov 12 01:41:29 2016 +

 .../ml/classification/LogisticRegression.scala | 125 +++
 1 file changed, 74 insertions(+), 51 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/56859c02/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala

diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
index c465105..18b9b30 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
@@ -438,18 +438,14 @@ class LogisticRegression @Since("1.2.0") (
     val standardizationParam = $(standardization)
     def regParamL1Fun = (index: Int) => {
       // Remove the L1 penalization on the intercept
-      val isIntercept = $(fitIntercept) && ((index + 1) % numFeaturesPlusIntercept == 0)
+      val isIntercept = $(fitIntercept) && index >= numFeatures * numCoefficientSets
       if
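To make the layout trick concrete, here is a minimal sketch of the column-major update described above, under assumed names (`values`, `featuresStd`, `multiplier`, `gradient` are illustrative, not the exact identifiers in `LogisticAggregator`):

```scala
// Column-major layout: the entry for (class k, feature j) lives at
// index j * numClasses + k, so all classes of one feature are contiguous.
def addGradientColumnMajor(
    values: Array[Double],      // one instance's dense feature values
    featuresStd: Array[Double], // per-feature standard deviations
    multiplier: Array[Double],  // per-class multiplier for this instance
    gradient: Array[Double]     // length = values.length * multiplier.length
): Unit = {
  val numClasses = multiplier.length
  var j = 0
  while (j < values.length) {
    // The standardization division now runs numFeatures times in total,
    // not numFeatures * numClasses times.
    val v = if (featuresStd(j) != 0.0) values(j) / featuresStd(j) else 0.0
    var k = 0
    while (k < numClasses) {
      // The inner loop over classes walks contiguous memory.
      gradient(j * numClasses + k) += multiplier(k) * v
      k += 1
    }
    j += 1
  }
}
```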
spark git commit: [SPARK-17982][SQL][BACKPORT-2.0] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT
Repository: spark
Updated Branches: refs/heads/branch-2.0 99575e88f -> 80c1a1f30

[SPARK-17982][SQL][BACKPORT-2.0] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT

## What changes were proposed in this pull request?

Currently, `SQLBuilder` handles `LIMIT` by always adding `LIMIT` at the end of the generated subSQL. This causes `RuntimeException`s like the following. This PR always adds parentheses, except when `SubqueryAlias` is used together with `LIMIT`.

**Before**

```scala
scala> sql("CREATE TABLE tbl(id INT)")
scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2")
java.lang.RuntimeException: Failed to analyze the canonicalized SQL: ...
```

**After**

```scala
scala> sql("CREATE TABLE tbl(id INT)")
scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2")
scala> sql("SELECT id2 FROM v1")
res4: org.apache.spark.sql.DataFrame = [id2: int]
```

**Fixed cases in this PR**

The following two cases are the detailed query plans with problematic SQL generation.

1. `SELECT * FROM (SELECT id FROM tbl LIMIT 2)`

Note the **FROM SELECT** part of the generated SQL below. When we don't use '()' for LIMIT, this fails.

```scala
# Original logical plan:
Project [id#1]
+- GlobalLimit 2
   +- LocalLimit 2
      +- Project [id#1]
         +- MetastoreRelation default, tbl

# Canonicalized logical plan:
Project [gen_attr_0#1 AS id#4]
+- SubqueryAlias tbl
   +- Project [gen_attr_0#1]
      +- GlobalLimit 2
         +- LocalLimit 2
            +- Project [gen_attr_0#1]
               +- SubqueryAlias gen_subquery_0
                  +- Project [id#1 AS gen_attr_0#1]
                     +- SQLTable default, tbl, [id#1]

# Generated SQL:
SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM SELECT `gen_attr_0` FROM (SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2) AS tbl
```

2. `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))`

Note the **((~~~) AS gen_subquery_0 LIMIT 2)** below. When we use '()' for LIMIT on `SubqueryAlias`, this fails.

```scala
# Original logical plan:
Project [id#1]
+- Project [id#1]
   +- GlobalLimit 2
      +- LocalLimit 2
         +- MetastoreRelation default, tbl

# Canonicalized logical plan:
Project [gen_attr_0#1 AS id#4]
+- SubqueryAlias tbl
   +- Project [gen_attr_0#1]
      +- GlobalLimit 2
         +- LocalLimit 2
            +- SubqueryAlias gen_subquery_0
               +- Project [id#1 AS gen_attr_0#1]
                  +- SQLTable default, tbl, [id#1]

# Generated SQL:
SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM ((SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2)) AS tbl
```

## How was this patch tested?

Pass the Jenkins test with a newly added test case.

Author: Dongjoon Hyun

Closes #15856 from dongjoon-hyun/SPARK-17982-2.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80c1a1f3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80c1a1f3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80c1a1f3

Branch: refs/heads/branch-2.0
Commit: 80c1a1f30b156cfe9d69044eabef4a1de9be9cb2
Parents: 99575e8
Author: Dongjoon Hyun
Authored: Fri Nov 11 17:38:56 2016 -0800
Committer: gatorsmile
Committed: Fri Nov 11 17:38:56 2016 -0800

 .../scala/org/apache/spark/sql/catalyst/SQLBuilder.scala  |  7 ++-
 .../src/test/resources/sqlgen/generate_with_other_1.sql   |  2 +-
 .../src/test/resources/sqlgen/generate_with_other_2.sql   |  2 +-
 sql/hive/src/test/resources/sqlgen/limit.sql              |  4
 .../apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala | 10 ++
 5 files changed, 22 insertions(+), 3 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/80c1a1f3/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala
index 5e0263e..4b2cd73 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala
@@ -138,9 +138,14 @@ class SQLBuilder private (
     case g: Generate => generateToSQL(g)

-    case Limit(limitExpr, child) =>
+    // This prevents a pattern of `((...) AS gen_subquery_0 LIMIT 1)` which does not work.
+    // For example, `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))` makes this plan.
+    case Limit(limitExpr, child: SubqueryAlias) =>
       s"${toSQL(child)} LIMIT ${limitExpr.sql}"

+    case Limit(limitExpr, child) =>
+      s"(${toSQL(child)}
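The hunk above is cut off by the archive. As a hedged, self-contained sketch of the parenthesization rule it introduces (simplified plan types stand in for Catalyst's, so every name here is illustrative, not Spark's):

```scala
sealed trait Plan
case class Table(name: String) extends Plan
case class SubqueryAlias(alias: String, child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

def toSQL(plan: Plan): String = plan match {
  case Table(t)            => s"SELECT * FROM `$t`"
  case SubqueryAlias(a, c) => s"(${toSQL(c)}) AS $a"
  // A SubqueryAlias child already carries its own parentheses; adding
  // another pair would produce `((...) AS alias LIMIT n)`, which is invalid.
  case Limit(n, c: SubqueryAlias) => s"${toSQL(c)} LIMIT $n"
  // Any other child gets wrapped, so the LIMIT-ed query can be nested
  // inside an enclosing FROM clause.
  case Limit(n, c) => s"(${toSQL(c)} LIMIT $n)"
}
```

With this rule, `toSQL(Limit(2, Table("tbl")))` yields a parenthesized query that nests safely inside a FROM clause, while a sampled-subquery plan keeps its single pair of parentheses.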
spark git commit: [SPARK-18060][ML] Avoid unnecessary computation for MLOR
Repository: spark
Updated Branches: refs/heads/master ba23f768f -> 46b2550bc

[SPARK-18060][ML] Avoid unnecessary computation for MLOR

## What changes were proposed in this pull request?

Before this patch, the gradient updates for multinomial logistic regression were computed by an outer loop over the number of classes and an inner loop over the number of features. Inside the inner loop, we standardized the feature value (`value / featuresStd(index)`), which means we performed the computation `numFeatures * numClasses` times. We only need to perform that computation `numFeatures` times, however. If we re-order the inner and outer loops, we can avoid this, but then we lose sequential memory access. In this patch, we instead lay out the coefficients in column-major order while we train, so that we can avoid the extra computation and retain sequential memory access. We convert back to row-major order when we create the model.

## How was this patch tested?

This is an implementation detail only, so the original behavior should be maintained. All tests pass. I ran some performance tests to verify speedups. The results are below, and show significant speedups.

## Performance Tests

**Setup**

3-node bare-metal cluster, 120 cores total, 384 GB RAM total

**Results**

NOTE: `currentMasterTime` and `thisPatchTime` are times in seconds for a single iteration of L-BFGS or OWL-QN.

|   | numPoints | numFeatures | numClasses | regParam | elasticNetParam | currentMasterTime (sec) | thisPatchTime (sec) | pctSpeedup |
|---|-----------|-------------|------------|----------|-----------------|-------------------------|---------------------|------------|
| 0 | 1e+07     | 100         | 500        | 0.5      | 0               | 90                      | 18                  | 80         |
| 1 | 1e+08     | 100         | 50         | 0.5      | 0               | 90                      | 19                  | 78         |
| 2 | 1e+08     | 100         | 50         | 0.05     | 1               | 72                      | 19                  | 73         |
| 3 | 1e+06     | 100         | 5000       | 0.5      | 0               | 93                      | 53                  | 43         |
| 4 | 1e+07     | 100         | 5000       | 0.5      | 0               | 900                     | 390                 | 56         |
| 5 | 1e+08     | 100         | 500        | 0.5      | 0               | 840                     | 174                 | 79         |
| 6 | 1e+08     | 100         | 200        | 0.5      | 0               | 360                     | 72                  | 80         |
| 7 | 1e+08     | 1000        | 5          | 0.5      | 0               | 9                       | 3                   | 66         |

Author: sethah

Closes #15593 from sethah/MLOR_PERF_COL_MAJOR_COEF.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/46b2550b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/46b2550b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/46b2550b

Branch: refs/heads/master
Commit: 46b2550bcd3690a260b995fd4d024a73b92a0299
Parents: ba23f76
Author: sethah
Authored: Sat Nov 12 01:38:26 2016 +
Committer: DB Tsai
Committed: Sat Nov 12 01:38:26 2016 +

 .../ml/classification/LogisticRegression.scala | 125 +++
 1 file changed, 74 insertions(+), 51 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/46b2550b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala

diff --git a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
index c465105..18b9b30 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
@@ -438,18 +438,14 @@ class LogisticRegression @Since("1.2.0") (
     val standardizationParam = $(standardization)
     def regParamL1Fun = (index: Int) => {
       // Remove the L1 penalization on the intercept
-      val isIntercept = $(fitIntercept) && ((index + 1) % numFeaturesPlusIntercept == 0)
+      val isIntercept = $(fitIntercept) && index >= numFeatures * numCoefficientSets
       if (isIntercept) {
         0.0
       } else {
         if (standardizationParam) {
           regParamL1
spark git commit: [SPARK-18264][SPARKR] build vignettes with package, update vignettes for CRAN release build and add info on release
Repository: spark
Updated Branches: refs/heads/branch-2.1 87820da78 -> c2ebda443

[SPARK-18264][SPARKR] build vignettes with package, update vignettes for CRAN release build and add info on release

## What changes were proposed in this pull request?

Changes to DESCRIPTION to build vignettes. Changes the metadata for vignettes to generate the recommended format (which is less than about 10% of the previous size). Unfortunately it does not look as nice (before - left, after - right)

![image](https://cloud.githubusercontent.com/assets/8969467/20040492/b75883e6-a40d-11e6-9534-25cdd5d59a8b.png)
![image](https://cloud.githubusercontent.com/assets/8969467/20040490/a40f4d42-a40d-11e6-8c91-af00ddcbdad9.png)

Also add information on how to run the build/release to CRAN later.

## How was this patch tested?

Manually, and unit tests. shivaram We need this for branch-2.1

Author: Felix Cheung

Closes #15790 from felixcheung/rpkgvignettes.

(cherry picked from commit ba23f768f7419039df85530b84258ec31f0c22b4)
Signed-off-by: Shivaram Venkataraman

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c2ebda44
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c2ebda44
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c2ebda44

Branch: refs/heads/branch-2.1
Commit: c2ebda443b2678e554d859d866af53e2e94822f2
Parents: 87820da
Author: Felix Cheung
Authored: Fri Nov 11 15:49:55 2016 -0800
Committer: Shivaram Venkataraman
Committed: Fri Nov 11 15:50:03 2016 -0800

 R/CRAN_RELEASE.md                    | 91 +++
 R/README.md                          |  8 +-
 R/check-cran.sh                      | 33 +--
 R/create-docs.sh                     | 19 +--
 R/pkg/DESCRIPTION                    |  9 ++-
 R/pkg/vignettes/sparkr-vignettes.Rmd |  9 +--
 6 files changed, 134 insertions(+), 35 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/c2ebda44/R/CRAN_RELEASE.md

diff --git a/R/CRAN_RELEASE.md b/R/CRAN_RELEASE.md
new file mode 100644
index 000..bea8f9f
--- /dev/null
+++ b/R/CRAN_RELEASE.md
@@ -0,0 +1,91 @@
+# SparkR CRAN Release
+
+To release SparkR as a package to CRAN, we would use the `devtools` package. Please work with the
+`d...@spark.apache.org` community and R package maintainer on this.
+
+### Release
+
+First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.
+
+Note that while `check-cran.sh` is running `R CMD check`, it is doing so with `--no-manual --no-vignettes`, which skips a few vignette and PDF checks - therefore it is preferable to run `R CMD check` on the source package built manually before uploading a release.
+
+To upload a release, we would need to update the `cran-comments.md`. This should generally contain the results from running the `check-cran.sh` script, along with comments on the status of all `WARNING`s (there should not be any) and `NOTE`s. As a part of `check-cran.sh` and the release process, the vignettes are built - make sure `SPARK_HOME` is set and the Spark jars are accessible.
+
+Once everything is in place, run in R under the `SPARK_HOME/R` directory:
+
+```R
+paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::release(); .libPaths(paths)
+```
+
+For more information please refer to http://r-pkgs.had.co.nz/release.html#release-check
+
+### Testing: build package manually
+
+To build the package manually, such as to inspect the resulting `.tar.gz` file content, we would also use the `devtools` package.
+
+The source package is what gets released to CRAN.
+CRAN would then build platform-specific binary packages from the source package.
+
+#### Build source package
+
+To build a source package locally without releasing to CRAN, run in R under the `SPARK_HOME/R` directory:
+
+```R
+paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg"); .libPaths(paths)
+```
+
+(http://r-pkgs.had.co.nz/vignettes.html#vignette-workflow-2)
+
+Similarly, the source package is also created by `check-cran.sh` with `R CMD build pkg`.
+
+For example, this should be the content of the source package:
+
+```sh
+DESCRIPTION  R      inst   tests
+NAMESPACE    build  man    vignettes
+
+inst/doc/
+  sparkr-vignettes.html
+  sparkr-vignettes.Rmd
+  sparkr-vignettes.R
+
+build/
+  vignette.rds
+
+man/
+  *.Rd files...
+
+vignettes/
+  sparkr-vignettes.Rmd
+```
+
+#### Test source package
+
+To install, run this:
+
+```sh
+R CMD INSTALL
spark git commit: [SPARK-18264][SPARKR] build vignettes with package, update vignettes for CRAN release build and add info on release
Repository: spark
Updated Branches: refs/heads/master 6e95325fc -> ba23f768f

[SPARK-18264][SPARKR] build vignettes with package, update vignettes for CRAN release build and add info on release

## What changes were proposed in this pull request?

Changes to DESCRIPTION to build vignettes. Changes the metadata for vignettes to generate the recommended format (which is less than about 10% of the previous size). Unfortunately it does not look as nice (before - left, after - right)

![image](https://cloud.githubusercontent.com/assets/8969467/20040492/b75883e6-a40d-11e6-9534-25cdd5d59a8b.png)
![image](https://cloud.githubusercontent.com/assets/8969467/20040490/a40f4d42-a40d-11e6-8c91-af00ddcbdad9.png)

Also add information on how to run the build/release to CRAN later.

## How was this patch tested?

Manually, and unit tests. shivaram We need this for branch-2.1

Author: Felix Cheung

Closes #15790 from felixcheung/rpkgvignettes.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba23f768
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ba23f768
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ba23f768

Branch: refs/heads/master
Commit: ba23f768f7419039df85530b84258ec31f0c22b4
Parents: 6e95325
Author: Felix Cheung
Authored: Fri Nov 11 15:49:55 2016 -0800
Committer: Shivaram Venkataraman
Committed: Fri Nov 11 15:49:55 2016 -0800

 R/CRAN_RELEASE.md                    | 91 +++
 R/README.md                          |  8 +-
 R/check-cran.sh                      | 33 +--
 R/create-docs.sh                     | 19 +--
 R/pkg/DESCRIPTION                    |  9 ++-
 R/pkg/vignettes/sparkr-vignettes.Rmd |  9 +--
 6 files changed, 134 insertions(+), 35 deletions(-)

http://git-wip-us.apache.org/repos/asf/spark/blob/ba23f768/R/CRAN_RELEASE.md

diff --git a/R/CRAN_RELEASE.md b/R/CRAN_RELEASE.md
new file mode 100644
index 000..bea8f9f
--- /dev/null
+++ b/R/CRAN_RELEASE.md
@@ -0,0 +1,91 @@
+# SparkR CRAN Release
+
+To release SparkR as a package to CRAN, we would use the `devtools` package. Please work with the
+`d...@spark.apache.org` community and R package maintainer on this.
+
+### Release
+
+First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.
+
+Note that while `check-cran.sh` is running `R CMD check`, it is doing so with `--no-manual --no-vignettes`, which skips a few vignette and PDF checks - therefore it is preferable to run `R CMD check` on the source package built manually before uploading a release.
+
+To upload a release, we would need to update the `cran-comments.md`. This should generally contain the results from running the `check-cran.sh` script, along with comments on the status of all `WARNING`s (there should not be any) and `NOTE`s. As a part of `check-cran.sh` and the release process, the vignettes are built - make sure `SPARK_HOME` is set and the Spark jars are accessible.
+
+Once everything is in place, run in R under the `SPARK_HOME/R` directory:
+
+```R
+paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::release(); .libPaths(paths)
+```
+
+For more information please refer to http://r-pkgs.had.co.nz/release.html#release-check
+
+### Testing: build package manually
+
+To build the package manually, such as to inspect the resulting `.tar.gz` file content, we would also use the `devtools` package.
+
+The source package is what gets released to CRAN. CRAN would then build platform-specific binary packages from the source package.
+
+#### Build source package
+
+To build a source package locally without releasing to CRAN, run in R under the `SPARK_HOME/R` directory:
+
+```R
+paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg"); .libPaths(paths)
+```
+
+(http://r-pkgs.had.co.nz/vignettes.html#vignette-workflow-2)
+
+Similarly, the source package is also created by `check-cran.sh` with `R CMD build pkg`.
+
+For example, this should be the content of the source package:
+
+```sh
+DESCRIPTION  R      inst   tests
+NAMESPACE    build  man    vignettes
+
+inst/doc/
+  sparkr-vignettes.html
+  sparkr-vignettes.Rmd
+  sparkr-vignettes.R
+
+build/
+  vignette.rds
+
+man/
+  *.Rd files...
+
+vignettes/
+  sparkr-vignettes.Rmd
+```
+
+#### Test source package
+
+To install, run this:
+
+```sh
+R CMD INSTALL SparkR_2.1.0.tar.gz
+```
+
+with "2.1.0" replaced by the version of SparkR.
+
+This command installs SparkR to the default libPaths. Once that is done, you
svn commit: r16970 - /dev/spark/spark-2.0.2/ /release/spark/spark-2.0.2/
Author: rxin
Date: Fri Nov 11 22:51:28 2016
New Revision: 16970

Log:
Artifacts for Spark 2.0.2

Added:
    release/spark/spark-2.0.2/
      - copied from r16969, dev/spark/spark-2.0.2/
Removed:
    dev/spark/spark-2.0.2/
[26/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/tableToDF.html

diff --git a/site/docs/2.0.2/api/R/tableToDF.html b/site/docs/2.0.2/api/R/tableToDF.html
new file mode 100644
index 000..6a11caa
--- /dev/null
+++ b/site/docs/2.0.2/api/R/tableToDF.html

tableToDF {SparkR} -- R Documentation: Create a SparkDataFrame from a SparkSQL Table

Description: Returns the specified Table as a SparkDataFrame. The Table must have already been registered in the SparkSession.

Usage:

    tableToDF(tableName)

Arguments:

    tableName: the SparkSQL Table to convert to a SparkDataFrame.

Value: SparkDataFrame

Note: tableToDF since 2.0.0

Examples:

    ## Not run:
    ##D sparkR.session()
    ##D path <- "path/to/file.json"
    ##D df <- read.json(path)
    ##D createOrReplaceTempView(df, "table")
    ##D new_df <- tableToDF("table")
    ## End(Not run)

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/tables.html

diff --git a/site/docs/2.0.2/api/R/tables.html b/site/docs/2.0.2/api/R/tables.html
new file mode 100644
index 000..e486018
--- /dev/null
+++ b/site/docs/2.0.2/api/R/tables.html

tables {SparkR} -- R Documentation: Tables

Description: Returns a SparkDataFrame containing names of tables in the given database.

Usage:

    ## Default S3 method:
    tables(databaseName = NULL)

Arguments:

    databaseName: name of the database.

Value: a SparkDataFrame

Note: tables since 1.4.0

Examples:

    ## Not run:
    ##D sparkR.session()
    ##D tables("hive")
    ## End(Not run)

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/take.html

diff --git a/site/docs/2.0.2/api/R/take.html b/site/docs/2.0.2/api/R/take.html
new file mode 100644
index 000..b792543
--- /dev/null
+++ b/site/docs/2.0.2/api/R/take.html

take {SparkR} -- R Documentation: Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame

Description: Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame.

Usage:

    ## S4 method for signature 'SparkDataFrame,numeric'
    take(x, num)

Arguments:

    x: a SparkDataFrame.
    num: number of rows to take.

Note: take since 1.4.0

See Also: Other SparkDataFrame functions: $, [, [[, select, subset, agg/summarize, arrange/orderBy, as.data.frame, attach, cache, collect, colnames/columns/names, coltypes ... (cross-reference list truncated in the source message)
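For readers coming from the Scala side, rough equivalents of the three SparkR functions above (an illustrative sketch, not part of this commit; `spark` is an assumed local `SparkSession` and `"table"` an assumed registered view):

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

val spark = SparkSession.builder().appName("sketch").master("local[*]").getOrCreate()

// tableToDF("table"): look up a registered table or view as a DataFrame.
val df: DataFrame = spark.table("table")

// tables("default"): enumerate the tables in a database.
spark.catalog.listTables("default").show()

// take(df, 5): pull the first rows back to the driver as local objects.
val rows: Array[Row] = df.take(5)
```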
[21/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/allclasses-frame.html

diff --git a/site/docs/2.0.2/api/java/allclasses-frame.html b/site/docs/2.0.2/api/java/allclasses-frame.html
new file mode 100644
index 000..d3f2f4d
--- /dev/null
+++ b/site/docs/2.0.2/api/java/allclasses-frame.html

All Classes (Spark 2.0.2 JavaDoc): the generated all-classes navigation frame, an alphabetical index of every public class in the 2.0.2 Java API, from AbsoluteError, Accumulable, AccumulableInfo, AccumulableParam, Accumulator, AccumulatorV2, AFTSurvivalRegression, ALS, AnalysisException, ... through BinaryClassificationMetrics, BisectingKMeans, BLAS, BlockManagerMessages.*, BloomFilter, BoostingStrategy, Broadcast, Bucketizer, ByteType, CalendarIntervalType, Catalog, ChiSqSelector, ChiSqTest, ClassificationModel, CoarseGrainedClusterMessages.*, and onward (list truncated in the source message).
[44/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/dapply.html

diff --git a/site/docs/2.0.2/api/R/dapply.html b/site/docs/2.0.2/api/R/dapply.html
new file mode 100644
index 000..2bac687
--- /dev/null
+++ b/site/docs/2.0.2/api/R/dapply.html

dapply {SparkR} -- R Documentation: dapply

Description: Apply a function to each partition of a SparkDataFrame.

Usage:

    ## S4 method for signature 'SparkDataFrame,'function',structType'
    dapply(x, func, schema)

Arguments:

    x: a SparkDataFrame.
    func: a function to be applied to each partition of the SparkDataFrame. func should have only one parameter, to which an R data.frame corresponding to each partition will be passed. The output of func should be an R data.frame.
    schema: the schema of the resulting SparkDataFrame after the function is applied. It must match the output of func.

Note: dapply since 2.0.0

See Also: dapplyCollect; Other SparkDataFrame functions: $, [, [[, select, subset, agg/summarize, arrange/orderBy, as.data.frame, attach, cache, collect, colnames/columns/names, coltypes, count/nrow, createOrReplaceTempView, dapplyCollect, describe/summary, dim, distinct/unique, drop, dropDuplicates, dropna/fillna/na.omit, dtypes, except, explain, filter/where, first, gapply, gapplyCollect, groupBy/group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate/transform, ncol, persist, printSchema, randomSplit ... (cross-reference list truncated in the source message)
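`dapply`'s per-partition contract has a rough Scala analogue in `Dataset.mapPartitions` (an illustrative comparison, not part of this commit): the user function receives a whole partition at a time and returns the transformed partition.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val ds = Seq(1, 2, 3, 4).toDS()

// Like dapply's func, the lambda sees one partition at a time (here as an
// Iterator, in SparkR as a data.frame) and emits the transformed partition.
val doubled = ds.mapPartitions(iter => iter.map(_ * 2))
doubled.show()
```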
[47/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/avg.html

diff --git a/site/docs/2.0.2/api/R/avg.html b/site/docs/2.0.2/api/R/avg.html
new file mode 100644
index 000..b146502
--- /dev/null
+++ b/site/docs/2.0.2/api/R/avg.html

avg {SparkR} -- R Documentation: avg

Description: Aggregate function: returns the average of the values in a group.

Usage:

    ## S4 method for signature 'Column'
    avg(x)

    avg(x, ...)

Arguments:

    x: Column to compute on, or a GroupedData object.
    ...: additional argument(s) when x is a GroupedData object.

Note: avg since 1.4.0

See Also: Other agg_funcs: agg/summarize, countDistinct/n_distinct, count/n, first, kurtosis, last, max, mean, min, sd/stddev, skewness, stddev_pop, stddev_samp, sumDistinct, sum, var_pop, var_samp, var/variance

Examples:

    ## Not run: avg(df$c)

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/base64.html

diff --git a/site/docs/2.0.2/api/R/base64.html b/site/docs/2.0.2/api/R/base64.html
new file mode 100644
index 000..736a1fd
--- /dev/null
+++ b/site/docs/2.0.2/api/R/base64.html

base64 {SparkR} -- R Documentation: base64

Description: Computes the BASE64 encoding of a binary column and returns it as a string column. This is the reverse of unbase64.

Usage:

    ## S4 method for signature 'Column'
    base64(x)

Arguments:

    x: Column to compute on.

Note: base64 since 1.5.0

See Also: Other string_funcs: ascii, concat_ws, concat, decode, encode, format_number, format_string, initcap, instr, length, levenshtein, locate, lower, lpad, ltrim, regexp_extract, regexp_replace, reverse, rpad, rtrim, soundex, substring_index, translate, trim, unbase64, upper

Examples:

    ## Not run: base64(df$c)

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/between.html

diff --git a/site/docs/2.0.2/api/R/between.html b/site/docs/2.0.2/api/R/between.html
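The Scala counterparts of the two column functions above, as a small runnable sketch (illustrative, not part of this commit):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, base64, unbase64}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(("a", 1.0), ("b", 3.0)).toDF("k", "c")

// avg: aggregate mean of a numeric column (2.0 here).
df.agg(avg($"c")).show()

// base64 encodes a binary column as a string; unbase64 reverses it.
val bin = df.select($"k".cast("binary").as("b"))
bin.select(base64($"b").as("enc"), unbase64(base64($"b")).as("roundtrip")).show()
```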
[15/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/Accumulable.html

diff --git a/site/docs/2.0.2/api/java/org/apache/spark/Accumulable.html b/site/docs/2.0.2/api/java/org/apache/spark/Accumulable.html
new file mode 100644
index 000..742e046
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/Accumulable.html

Accumulable (Spark 2.0.2 JavaDoc)

org.apache.spark
Class Accumulable<R,T>

Object
  org.apache.spark.Accumulable<R,T>

All Implemented Interfaces: java.io.Serializable
Direct Known Subclasses: Accumulator
Deprecated. Use AccumulatorV2. Since 2.0.0.

public class Accumulable<R,T> extends Object implements java.io.Serializable

A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.

You must define how to add data, and how to merge two of these together. For some data types, such as a counter, these might be the same operation. In that case, you can use the simpler Accumulator. They won't always be the same, though -- e.g., imagine you are accumulating a set. You will add items to the set, and you will union two sets together.

Operations are not thread-safe.

param: id ID of this accumulator; for internal use only.
param: initialValue initial value of accumulator
param: param helper object defining how to add elements of type R and T
param: name human-readable name for use in Spark's web UI
param: countFailedValues whether to accumulate values from failed tasks. This is set to true for system and time metrics like serialization time or bytes spilled, and false for things with absolute values like number of input rows. This should be used for internal metrics only.

See Also: Serialized Form

Constructor Summary:

    Accumulable(R initialValue, AccumulableParam<R,T> param)  Deprecated.

Method Summary:

    void add(T term)            Deprecated. Add more data to this accumulator / accumulable.
    long id()                   Deprecated.
    R localValue()              Deprecated. Get the current value of this accumulator from within a task.
    void merge(R term)          Deprecated. Merge two accumulable objects together.
    scala.Option<String> name() Deprecated.
    void setValue(R newValue)   Deprecated. Set the accumulator's value.
    String toString()           Deprecated.
    R value()                   Deprecated. Access the accumulator's current value; only allowed on driver.
    R zero()                    Deprecated.

Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, wait

Method Detail:

    add(T term) -- Add more data to this accumulator / accumulable. Parameters: term - the data to add.

    merge(R term) -- Merge two accumulable objects together. Normally, a user will not want to use this version, but will instead call add. Parameters: term - the other R that will get merged with this.

    value() -- Access the accumulator's current value; only allowed on driver. Returns: (undocumented)

    localValue() -- Get the current value of this accumulator from within a task. This is NOT the global value of the accumulator. To get the global value after a completed operation on the dataset, call value. The typical use of this method is to directly mutate the local value, e.g., to add an element
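A brief, hypothetical usage sketch of this (deprecated) API, following the class doc's set example: elements are added on executors, and partial sets are unioned on merge.

```scala
import org.apache.spark.{AccumulableParam, SparkContext}

// How to fold in one element (addAccumulator) and how to combine two
// partial results (addInPlace), as required by AccumulableParam[R, T].
object StringSetParam extends AccumulableParam[Set[String], String] {
  def addAccumulator(acc: Set[String], elem: String): Set[String] = acc + elem
  def addInPlace(s1: Set[String], s2: Set[String]): Set[String] = s1 ++ s2
  def zero(initial: Set[String]): Set[String] = Set.empty
}

def distinctWords(sc: SparkContext): Set[String] = {
  val seen = sc.accumulable(Set.empty[String])(StringSetParam)
  sc.parallelize(Seq("a", "b", "a")).foreach(word => seen += word)
  seen.value // reading the global value is only allowed on the driver
}
```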
[18/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/index-all.html

diff --git a/site/docs/2.0.2/api/java/index-all.html b/site/docs/2.0.2/api/java/index-all.html
new file mode 100644
index 000..12e185e
--- /dev/null
+++ b/site/docs/2.0.2/api/java/index-all.html

Index (Spark 2.0.2 JavaDoc): the generated master index (roughly 45,000 lines) of all classes, methods, and fields, alphabetized $ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _. Entries begin with the $-prefixed Scala operator methods ($colon$bslash, $plus$plus, $minus$greater, and so on, on StructType, Decimal, RDDInfo, Accumulator, and the RDD family) and continue into A: abs(Column) in org.apache.spark.sql.functions ("Computes the absolute value"), AbsoluteError in org.apache.spark.mllib.tree.loss ("Class for absolute error loss calculation (for regression)"), accept/acceptIf/acceptMatch/acceptSeq in org.apache.spark.ml.feature.RFormulaParser, accId() in org.apache.spark.CleanAccum, Accumulable (Deprecated: "use AccumulatorV2. Since 2.0.0.") ... (index truncated in the source message).
[41/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/factorial.html

diff --git a/site/docs/2.0.2/api/R/factorial.html b/site/docs/2.0.2/api/R/factorial.html
new file mode 100644
index 000..b72dc6b
--- /dev/null
+++ b/site/docs/2.0.2/api/R/factorial.html

factorial {SparkR} -- R Documentation: factorial

Description: Computes the factorial of the given value.

Usage:

    ## S4 method for signature 'Column'
    factorial(x)

Arguments:

    x: Column to compute on.

Note: factorial since 1.5.0

See Also: Other math_funcs: acos, asin, atan, atan2, bin, bround, cbrt, ceil/ceiling, conv, corr, cos, cosh, cov/covar_pop/covar_samp, exp, expm1, floor, hex, hypot, log, log10, log1p, log2, pmod, rint, round, shiftLeft, shiftRight, shiftRightUnsigned, sign/signum, sin, sinh, sqrt, tan, tanh, toDegrees, toRadians, unhex

Examples:

    ## Not run: factorial(df$c)

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/filter.html

diff --git a/site/docs/2.0.2/api/R/filter.html b/site/docs/2.0.2/api/R/filter.html
new file mode 100644
index 000..eed100d
--- /dev/null
+++ b/site/docs/2.0.2/api/R/filter.html

filter {SparkR} -- R Documentation: Filter

Description: Filter the rows of a SparkDataFrame according to a given condition.

Usage:

    ## S4 method for signature 'SparkDataFrame,characterOrColumn'
    filter(x, condition)

    ## S4 method for signature 'SparkDataFrame,characterOrColumn'
    where(x, condition)

Arguments:

    x: a SparkDataFrame to be filtered.
    condition: the condition to filter on. This may be either a Column expression or a string containing a SQL statement.

Value: A SparkDataFrame containing only the rows that meet the condition.

Note: filter since 1.4.0; where since 1.4.0

See Also: Other SparkDataFrame functions: $, [, [[, select, subset, agg/summarize, arrange/orderBy, as.data.frame, attach, cache, collect, colnames ... (cross-reference list truncated in the source message)
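Scala counterparts for the two pages above (illustrative sketch, not part of this commit); note that `filter` takes either a `Column` or a SQL string, mirroring SparkR's `characterOrColumn` condition:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.factorial

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq(1, 2, 3, 4, 5).toDF("n")

df.select(factorial($"n")).show() // 1, 2, 6, 24, 120

df.filter($"n" > 2).show() // Column expression form
df.filter("n > 2").show()  // SQL string form
```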
[29/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/show.html
--
diff --git a/site/docs/2.0.2/api/R/show.html b/site/docs/2.0.2/api/R/show.html
new file mode 100644
index 000..e6d3735
--- /dev/null
+++ b/site/docs/2.0.2/api/R/show.html

R: show

show {SparkR} -- R Documentation

Description

Print class and type information of a Spark object.

Usage

## S4 method for signature 'SparkDataFrame'
show(object)

## S4 method for signature 'WindowSpec'
show(object)

## S4 method for signature 'Column'
show(object)

## S4 method for signature 'GroupedData'
show(object)

Arguments

object: a Spark object. Can be a SparkDataFrame, Column, GroupedData, or WindowSpec.

Note

show(SparkDataFrame) since 1.4.0
show(WindowSpec) since 2.0.0
show(Column) since 1.4.0
show(GroupedData) since 1.4.0

See Also

Other SparkDataFrame functions: $, [, [[, subset, select, agg, summarize, arrange, orderBy, as.data.frame, attach, cache, collect, colnames, coltypes, columns, names, count, nrow, createOrReplaceTempView, dapply, dapplyCollect, describe, summary, dim, distinct, unique, dropDuplicates, dropna, fillna, na.omit, drop, dtypes, except, explain, filter, where, first, gapply, gapplyCollect, groupBy, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, transform, ncol, persist, printSchema, randomSplit, rbind, ...
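For illustration, a minimal sketch of what show prints for a SparkDataFrame; assumes an active session created with sparkR.session(), and the printed form is approximate:

df <- createDataFrame(faithful)
show(df)
# SparkDataFrame[eruptions:double, waiting:double]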
[36/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/lower.html
--
diff --git a/site/docs/2.0.2/api/R/lower.html b/site/docs/2.0.2/api/R/lower.html
new file mode 100644
index 000..8122f5f
--- /dev/null
+++ b/site/docs/2.0.2/api/R/lower.html

R: lower

lower {SparkR} -- R Documentation

Description

Converts a string column to lower case.

Usage

## S4 method for signature 'Column'
lower(x)

lower(x)

Arguments

x: Column to compute on.

Note

lower since 1.4.0

See Also

Other string functions: ascii, base64, concat, concat_ws, decode, encode, format_number, format_string, initcap, instr, length, levenshtein, locate, lpad, ltrim, regexp_extract, regexp_replace, reverse, rpad, rtrim, soundex, substring_index, translate, trim, unbase64, upper

Examples

## Not run: lower(df$c)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/lpad.html
--
diff --git a/site/docs/2.0.2/api/R/lpad.html b/site/docs/2.0.2/api/R/lpad.html
new file mode 100644
index 000..ea0f899
--- /dev/null
+++ b/site/docs/2.0.2/api/R/lpad.html

R: lpad

lpad {SparkR} -- R Documentation

Description

Left-pad the string column with pad, up to a total width of len.

Usage

## S4 method for signature 'Column,numeric,character'
lpad(x, len, pad)

lpad(x, len, pad)

Arguments

x: the string Column to be left-padded.
len: maximum length of each output result.
pad: a character string to be padded with.

Note

lpad since 1.5.0

See Also

Other string functions: ascii, base64, concat, concat_ws, decode, encode, format_number, format_string, initcap, instr, length, levenshtein, locate, lower, ltrim, regexp_extract, regexp_replace, reverse, rpad, rtrim, soundex, substring_index, translate, trim, unbase64, upper

Examples

## Not run: lpad(df$c, 6, "#")

[Package SparkR version 2.0.2 Index]
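A small end-to-end sketch of the two string functions above; the column name c mirrors the pages' own examples, and an active SparkR session is assumed:

df <- createDataFrame(data.frame(c = c("Spark", "R")))
head(select(df, lower(df$c), lpad(df$c, 6, "#")))
# lower(c) -> "spark", "r"; lpad(c, 6, "#") -> "#Spark", "#####R"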
[16/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/lib/jquery.js
--
diff --git a/site/docs/2.0.2/api/java/lib/jquery.js b/site/docs/2.0.2/api/java/lib/jquery.js
new file mode 100644
index 000..bc3fbc8
--- /dev/null
+++ b/site/docs/2.0.2/api/java/lib/jquery.js

/*! jQuery v1.8.2 jquery.com | jquery.org/license */
[minified jQuery 1.8.2 source]
[10/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.html b/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.html
new file mode 100644
index 000..d6fed2d
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.html

InternalAccumulator (Spark 2.0.2 JavaDoc)

org.apache.spark
Class InternalAccumulator

Object
  org.apache.spark.InternalAccumulator

public class InternalAccumulator extends Object

A collection of fields and methods concerned with internal accumulators that represent task-level metrics.

Nested Class Summary

static class InternalAccumulator.input$
static class InternalAccumulator.output$
static class InternalAccumulator.shuffleRead$
static class InternalAccumulator.shuffleWrite$

Constructor Summary

InternalAccumulator()

Method Summary (each is public static String)

DISK_BYTES_SPILLED()
EXECUTOR_DESERIALIZE_TIME()
EXECUTOR_RUN_TIME()
INPUT_METRICS_PREFIX()
JVM_GC_TIME()
MEMORY_BYTES_SPILLED()
METRICS_PREFIX()
OUTPUT_METRICS_PREFIX()
PEAK_EXECUTION_MEMORY()
RESULT_SERIALIZATION_TIME()
RESULT_SIZE()
SHUFFLE_READ_METRICS_PREFIX()
SHUFFLE_WRITE_METRICS_PREFIX()
TEST_ACCUM()
UPDATED_BLOCK_STATUSES()

Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.input$.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.input$.html b/site/docs/2.0.2/api/java/org/apache/spark/InternalAccumulator.input$.html
new file
[39/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/head.html
--
diff --git a/site/docs/2.0.2/api/R/head.html b/site/docs/2.0.2/api/R/head.html
new file mode 100644
index 000..ec3431b
--- /dev/null
+++ b/site/docs/2.0.2/api/R/head.html

R: Head

head {SparkR} -- R Documentation

Head

Description

Return the first num rows of a SparkDataFrame as an R data.frame. If num is not specified, then head() returns the first 6 rows, as with an R data.frame.

Usage

## S4 method for signature 'SparkDataFrame'
head(x, num = 6L)

Arguments

x: a SparkDataFrame.
num: the number of rows to return. Default is 6.

Value

A data.frame.

Note

head since 1.4.0

See Also

Other SparkDataFrame functions: $, [, [[, subset, select, agg, summarize, arrange, orderBy, as.data.frame, attach, cache, collect, colnames, coltypes, columns, names, count, nrow, createOrReplaceTempView, dapply, dapplyCollect, describe, summary, dim, distinct, unique, dropDuplicates, dropna, fillna, na.omit, drop, dtypes, except, explain, filter, where, first, gapply, gapplyCollect, groupBy, group_by, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, transform, ncol, persist, printSchema, randomSplit, rbind, registerTempTable, rename, withColumnRenamed, ...
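For illustration, a minimal sketch assuming an active SparkR session:

df <- createDataFrame(faithful)
head(df)          # first 6 rows, returned as a local R data.frame
head(df, num = 2) # first 2 rows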
[33/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/printSchema.html
--
diff --git a/site/docs/2.0.2/api/R/printSchema.html b/site/docs/2.0.2/api/R/printSchema.html
new file mode 100644
index 000..b8846b6
--- /dev/null
+++ b/site/docs/2.0.2/api/R/printSchema.html

R: Print Schema of a SparkDataFrame

printSchema {SparkR} -- R Documentation

Print Schema of a SparkDataFrame

Description

Prints out the schema in tree format.

Usage

## S4 method for signature 'SparkDataFrame'
printSchema(x)

printSchema(x)

Arguments

x: a SparkDataFrame.

Note

printSchema since 1.4.0

See Also

Other SparkDataFrame functions: $, [, [[, subset, select, agg, summarize, arrange, orderBy, as.data.frame, attach, cache, collect, colnames, coltypes, columns, names, count, nrow, createOrReplaceTempView, dapply, dapplyCollect, describe, summary, dim, distinct, unique, dropDuplicates, dropna, fillna, na.omit, drop, dtypes, except, explain, filter, where, first, gapply, gapplyCollect, groupBy, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, transform, ncol, persist, randomSplit, rbind, registerTempTable, rename, withColumnRenamed, repartition, ...
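For illustration, a minimal sketch assuming an active SparkR session; the tree output shown is approximate:

df <- createDataFrame(faithful)
printSchema(df)
# root
#  |-- eruptions: double (nullable = true)
#  |-- waiting: double (nullable = true)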
[50/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/README.md
--
diff --git a/site/docs/2.0.2/README.md b/site/docs/2.0.2/README.md
new file mode 100644
index 000..ffd3b57
--- /dev/null
+++ b/site/docs/2.0.2/README.md

Welcome to the Spark documentation!

This readme will walk you through navigating and building the Spark documentation, which is included here with the Spark source code. You can also find documentation specific to release versions of Spark at http://spark.apache.org/documentation.html.

Read on to learn more about viewing documentation in plain text (i.e., markdown) or building the documentation yourself. Why build it yourself? So that you have the docs that correspond to whichever version of Spark you currently have checked out of revision control.

## Prerequisites

The Spark documentation build uses a number of tools to build HTML docs and API docs in Scala, Python and R.

You need to have [Ruby](https://www.ruby-lang.org/en/documentation/installation/) and [Python](https://docs.python.org/2/using/unix.html#getting-and-installing-the-latest-version-of-python) installed. Also install the following libraries:

```sh
$ sudo gem install jekyll jekyll-redirect-from pygments.rb
$ sudo pip install Pygments
# Following is needed only for generating API docs
$ sudo pip install sphinx pypandoc
$ sudo Rscript -e 'install.packages(c("knitr", "devtools", "roxygen2", "testthat", "rmarkdown"), repos="http://cran.stat.ucla.edu/")'
```

(Note: If you are on a system with both Ruby 1.9 and Ruby 2.0, you may need to replace gem with gem2.0.)

## Generating the Documentation HTML

We include the Spark documentation as part of the source (as opposed to using a hosted wiki, such as the github wiki, as the definitive documentation) to enable the documentation to evolve along with the source code and be captured by revision control (currently git). This way the code automatically includes the version of the documentation that is relevant regardless of which version or release you have checked out or downloaded.

In this directory you will find text files formatted using Markdown, with an ".md" suffix. You can read those text files directly if you want. Start with index.md.

Execute `jekyll build` from the `docs/` directory to compile the site. Compiling the site with Jekyll will create a directory called `_site` containing index.html as well as the rest of the compiled files.

```sh
$ cd docs
$ jekyll build
```

You can modify the default Jekyll build as follows:

```sh
# Skip generating API docs (which takes a while)
$ SKIP_API=1 jekyll build

# Serve content locally on port 4000
$ jekyll serve --watch

# Build the site with extra features used on the live page
$ PRODUCTION=1 jekyll build
```

## API Docs (Scaladoc, Sphinx, roxygen2)

You can build just the Spark scaladoc by running `build/sbt unidoc` from the SPARK_PROJECT_ROOT directory.

Similarly, you can build just the PySpark docs by running `make html` from the SPARK_PROJECT_ROOT/python/docs directory. Documentation is only generated for classes that are listed as public in `__init__.py`. The SparkR docs can be built by running SPARK_PROJECT_ROOT/R/create-docs.sh.

When you run `jekyll` in the `docs` directory, it will also copy over the scaladoc for the various Spark subprojects into the `docs` directory (and then also into the `_site` directory). We use a jekyll plugin to run `build/sbt unidoc` before building the site, so if you haven't run it (recently) it may take some time as it generates all of the scaladoc. The jekyll plugin also generates the PySpark docs using [Sphinx](http://sphinx-doc.org/).

NOTE: To skip the step of building and copying over the Scala, Python, R API docs, run `SKIP_API=1 jekyll`.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api.html
--
diff --git a/site/docs/2.0.2/api.html b/site/docs/2.0.2/api.html
new file mode 100644
index 000..731bd07
--- /dev/null
+++ b/site/docs/2.0.2/api.html

Spark API Documentation - Spark 2.0.2 Documentation
[27/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/structField.html
--
diff --git a/site/docs/2.0.2/api/R/structField.html b/site/docs/2.0.2/api/R/structField.html
new file mode 100644
index 000..6325141
--- /dev/null
+++ b/site/docs/2.0.2/api/R/structField.html

R: structField

structField {SparkR} -- R Documentation

Description

Create a structField object that contains the metadata for a single field in a schema.

Usage

structField(x, ...)

## S3 method for class 'jobj'
structField(x, ...)

## S3 method for class 'character'
structField(x, type, nullable = TRUE, ...)

Arguments

x: the name of the field.
...: additional argument(s) passed to the method.
type: the data type of the field.
nullable: a logical vector indicating whether or not the field is nullable.

Value

A structField object.

Note

structField since 1.4.0

Examples

## Not run:
##D field1 <- structField("a", "integer")
##D field2 <- structField("c", "string")
##D field3 <- structField("avg", "double")
##D schema <- structType(field1, field2, field3)
##D df1 <- gapply(df, list("a", "c"),
##D   function(key, x) { y <- data.frame(key, mean(x$b), stringsAsFactors = FALSE) },
##D   schema)
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/structType.html
--
diff --git a/site/docs/2.0.2/api/R/structType.html b/site/docs/2.0.2/api/R/structType.html
new file mode 100644
index 000..d068b59
--- /dev/null
+++ b/site/docs/2.0.2/api/R/structType.html

R: structType

structType {SparkR} -- R Documentation

Description

Create a structType object that contains the metadata for a SparkDataFrame. Intended for use with createDataFrame and toDF.

Usage

structType(x, ...)

## S3 method for class 'jobj'
structType(x, ...)

## S3 method for class 'structField'
structType(x, ...)

Arguments

x: a structField object (created with the structField() function).
...: additional structField objects.

Value

a structType object.

Note

structType since 1.4.0

Examples

## Not run:
##D schema <- structType(structField("a", "integer"), structField("c", "string"),
##D   structField("avg", "double"))
##D df1 <- gapply(df, list("a", "c"),
##D   function(key, x) { y <- data.frame(key, mean(x$b), stringsAsFactors = FALSE) },
##D   schema)
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/subset.html
--
diff --git a/site/docs/2.0.2/api/R/subset.html b/site/docs/2.0.2/api/R/subset.html
new file mode 100644
index 000..987e20b
--- /dev/null
+++ b/site/docs/2.0.2/api/R/subset.html

R: Subset

[[ {SparkR} -- R Documentation

Subset

Description

Return subsets of SparkDataFrame according to given conditions.

Usage

## S4 method for signature 'SparkDataFrame,numericOrcharacter'
x[[i]]

## S4 method for signature 'SparkDataFrame'
x[i, j, ..., drop = F]

## S4 method for signature 'SparkDataFrame'
subset(x, subset, select, drop = F, ...)

subset(x, ...)

Arguments

x: a SparkDataFrame.
i, subset: (Optional) a logical expression to filter on rows.
j, select: expression for the single Column or a list of columns to select from the SparkDataFrame.
...: currently not used.
drop: if TRUE, a Column will be returned if the resulting dataset has only one column. Otherwise, a SparkDataFrame will always be returned.

Value

A new SparkDataFrame containing only the rows that meet the condition with selected columns.

Note

[[ since 1.4.0
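For illustration, a minimal sketch of the two equivalent subset forms documented above; assumes an active SparkR session:

df <- createDataFrame(mtcars)
six_cyl  <- subset(df, df$cyl == 6, select = c("mpg", "hp"))
six_cyl2 <- df[df$cyl == 6, c("mpg", "hp")]   # equivalent bracket form
head(six_cyl)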
[46/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/collect.html
--
diff --git a/site/docs/2.0.2/api/R/collect.html b/site/docs/2.0.2/api/R/collect.html
new file mode 100644
index 000..61090a7
--- /dev/null
+++ b/site/docs/2.0.2/api/R/collect.html

R: Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.

collect {SparkR} -- R Documentation

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.

Description

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.

Usage

## S4 method for signature 'SparkDataFrame'
collect(x, stringsAsFactors = FALSE)

collect(x, ...)

Arguments

x: a SparkDataFrame.
stringsAsFactors: (Optional) a logical indicating whether or not string columns should be converted to factors. FALSE by default.
...: further arguments to be passed to or from other methods.

Note

collect since 1.4.0

See Also

Other SparkDataFrame functions: $, [, [[, subset, select, agg, summarize, arrange, orderBy, as.data.frame, attach, cache, colnames, coltypes, columns, names, count, nrow, createOrReplaceTempView, dapply, dapplyCollect, describe, summary, dim, distinct, unique, dropDuplicates, dropna, fillna, na.omit, drop, dtypes, except, explain, filter, where, first, gapply, gapplyCollect, groupBy, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, transform, ncol, persist, printSchema, randomSplit, ...
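For illustration, a minimal sketch assuming an active SparkR session; note that collect pulls every row to the driver, so it is best suited to small results:

df <- createDataFrame(faithful)
local_df <- collect(df)   # all rows brought back as a local data.frame
class(local_df)           # "data.frame"
nrow(local_df)            # 272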
[20/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/allclasses-noframe.html
--
diff --git a/site/docs/2.0.2/api/java/allclasses-noframe.html b/site/docs/2.0.2/api/java/allclasses-noframe.html
new file mode 100644
index 000..05ef78b
--- /dev/null
+++ b/site/docs/2.0.2/api/java/allclasses-noframe.html

All Classes (Spark 2.0.2 JavaDoc)

[Alphabetical navigation index linking every public class in the Spark 2.0.2 Java API; this partial diff covers AbsoluteError through CoarseGrainedClusterMessages.RegisterExecutorFailed.]
[28/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/spark.lapply.html
--
diff --git a/site/docs/2.0.2/api/R/spark.lapply.html b/site/docs/2.0.2/api/R/spark.lapply.html
new file mode 100644
index 000..f337327
--- /dev/null
+++ b/site/docs/2.0.2/api/R/spark.lapply.html

R: Run a function over a list of elements, distributing the computations with Spark

spark.lapply {SparkR} -- R Documentation

Run a function over a list of elements, distributing the computations with Spark

Description

Run a function over a list of elements, distributing the computations with Spark. Applies a function in a manner that is similar to doParallel or lapply to elements of a list. The computations are distributed using Spark. It is conceptually the same as the following code: lapply(list, func)

Usage

spark.lapply(list, func)

Arguments

list: the list of elements.
func: a function that takes one argument.

Details

Known limitations:

- variable scoping and capture: compared to R's rich support for variable resolutions, the distributed nature of SparkR limits how variables are resolved at runtime. All the variables that are available through lexical scoping are embedded in the closure of the function and available as read-only variables within the function. The environment variables should be stored into temporary variables outside the function, and not directly accessed within the function.

- loading external packages: in order to use a package, you need to load it inside the closure. For example, if you rely on the MASS module, here is how you would use it (see the fuller sketch after this page):

train <- function(hyperparam) {
  library(MASS)
  model <- lm.ridge("y ~ x+z", data, lambda = hyperparam)
  model
}

Value

a list of results (the exact type being determined by the function).

Note

spark.lapply since 2.0.0

Examples

## Not run:
##D sparkR.session()
##D doubled <- spark.lapply(1:10, function(x) { 2 * x })
## End(Not run)

[Package SparkR version 2.0.2 Index]
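A slightly larger sketch of the package-loading pattern above, distributing a small hyperparameter sweep; assumes an active SparkR session, and lm.ridge's GCV component is used only as a convenient scalar result:

sparkR.session()
lambdas <- c(0.01, 0.1, 1)
gcv <- spark.lapply(lambdas, function(lambda) {
  library(MASS)   # packages must be loaded inside the closure
  m <- lm.ridge(Sepal.Length ~ Sepal.Width, iris, lambda = lambda)
  m$GCV           # return a small, serializable result per element
})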
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/spark.naiveBayes.html
--
diff --git a/site/docs/2.0.2/api/R/spark.naiveBayes.html b/site/docs/2.0.2/api/R/spark.naiveBayes.html
new file mode 100644
index 000..b4d60c2
--- /dev/null
+++ b/site/docs/2.0.2/api/R/spark.naiveBayes.html

R: Naive Bayes Models

spark.naiveBayes {SparkR} -- R Documentation

Naive Bayes Models

Description

spark.naiveBayes fits a Bernoulli naive Bayes model against a SparkDataFrame. Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write.ml/read.ml to save/load fitted models. Only categorical data is supported.

Usage

spark.naiveBayes(data, formula, ...)

## S4 method for signature 'NaiveBayesModel'
predict(object, newData)

## S4 method for signature 'NaiveBayesModel'
summary(object, ...)

## S4 method for signature 'SparkDataFrame,formula'
spark.naiveBayes(data, formula, smoothing = 1, ...)

## S4 method for signature 'NaiveBayesModel,character'
write.ml(object, path, overwrite = FALSE)

Arguments

data: a SparkDataFrame of observations and labels for model fitting.
formula: a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.
...: additional argument(s) passed to the method. Currently only smoothing.
object: a naive Bayes model fitted by spark.naiveBayes.
newData: a SparkDataFrame for testing.
smoothing: smoothing parameter.
path: the directory where the model is saved.
overwrite: overwrites or not if the output path already exists. Default is FALSE, which means throw an exception if the output path exists.

Value

predict returns a SparkDataFrame containing predicted labels in a column named prediction.

summary returns a list containing apriori, the label distribution, and tables, conditional probabilities given the target label.

spark.naiveBayes returns a fitted naive Bayes model.

Note

predict(NaiveBayesModel) since 2.0.0
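A minimal end-to-end sketch of the calls documented above, using R's built-in Titanic contingency table as convenient categorical data; assumes an active SparkR session:

t <- as.data.frame(Titanic)
df <- createDataFrame(t[t$Freq > 0, -5])    # expand to rows, drop the Freq count column
model <- spark.naiveBayes(df, Survived ~ Class + Sex + Age, smoothing = 1.0)
summary(model)                              # apriori and conditional probability tables
head(predict(model, df))                    # predictions in column "prediction"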
[49/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/00frame_toc.html
--
diff --git a/site/docs/2.0.2/api/R/00frame_toc.html b/site/docs/2.0.2/api/R/00frame_toc.html
new file mode 100644
index 000..c35b36d
--- /dev/null
+++ b/site/docs/2.0.2/api/R/00frame_toc.html

R Documentation of SparkR

[Navigation frame: an alphabetical table of contents linking every SparkR help topic, from AFTSurvivalRegressionModel-class through year. Generated with knitr 1.14.]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/AFTSurvivalRegressionModel-class.html
[34/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/nrow.html
--
diff --git a/site/docs/2.0.2/api/R/nrow.html b/site/docs/2.0.2/api/R/nrow.html
new file mode 100644
index 000..2626e03
--- /dev/null
+++ b/site/docs/2.0.2/api/R/nrow.html

R: Returns the number of rows in a SparkDataFrame

nrow {SparkR} -- R Documentation

Returns the number of rows in a SparkDataFrame

Description

Returns the number of rows in a SparkDataFrame.

Usage

## S4 method for signature 'SparkDataFrame'
count(x)

## S4 method for signature 'SparkDataFrame'
nrow(x)

Arguments

x: a SparkDataFrame.

Note

count since 1.4.0
nrow since 1.5.0

See Also

Other SparkDataFrame functions: $, [, [[, subset, select, agg, summarize, arrange, orderBy, as.data.frame, attach, cache, collect, colnames, coltypes, columns, names, createOrReplaceTempView, dapply, dapplyCollect, describe, summary, dim, distinct, unique, dropDuplicates, dropna, fillna, na.omit, drop, dtypes, except, explain, filter, where, first, gapply, gapplyCollect, groupBy, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, transform, ncol, persist, printSchema, randomSplit, rbind, registerTempTable, rename, withColumnRenamed, ...
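For illustration, a minimal sketch assuming an active SparkR session:

df <- createDataFrame(faithful)
count(df)   # 272
nrow(df)    # identical to count(df)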
[48/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/arrange.html

diff --git a/site/docs/2.0.2/api/R/arrange.html b/site/docs/2.0.2/api/R/arrange.html
new file mode 100644
index 000..e5ac48a
--- /dev/null
+++ b/site/docs/2.0.2/api/R/arrange.html
@@ -0,0 +1,287 @@

arrange {SparkR}  R Documentation

Arrange Rows by Variables

Description

Sort a SparkDataFrame by the specified column(s).

Usage

## S4 method for signature 'SparkDataFrame,Column'
arrange(x, col, ...)

## S4 method for signature 'SparkDataFrame,character'
arrange(x, col, ..., decreasing = FALSE)

## S4 method for signature 'SparkDataFrame,characterOrColumn'
orderBy(x, col, ...)

arrange(x, col, ...)

Arguments

x: a SparkDataFrame to be sorted.

col: a character or Column object indicating the fields to sort on.

...: additional sorting fields.

decreasing: a logical argument indicating sorting order for columns when a character vector is specified for col.

Value

A SparkDataFrame where all elements are sorted.

Note

arrange(SparkDataFrame, Column) since 1.4.0

arrange(SparkDataFrame, character) since 1.4.0

orderBy(SparkDataFrame, characterOrColumn) since 1.4.0

See Also

Other SparkDataFrame functions: select, subset, agg, summarize, as.data.frame, attach, cache, collect, colnames, coltypes, count, createOrReplaceTempView, dapply, dapplyCollect, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapply, gapplyCollect, groupBy, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, ncol, persist, and others.
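A quick usage sketch may help here; it is not part of the generated page, and df / mtcars are illustrative:

## Not run:
##D df <- createDataFrame(mtcars)
##D head(arrange(df, df$mpg))                    # sort ascending by a Column
##D head(arrange(df, "mpg", decreasing = TRUE))  # sort descending by column name
##D head(orderBy(df, "mpg"))                     # orderBy is equivalent
## End(Not run)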
[40/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/gapply.html

diff --git a/site/docs/2.0.2/api/R/gapply.html b/site/docs/2.0.2/api/R/gapply.html
new file mode 100644
index 000..03d3587
--- /dev/null
+++ b/site/docs/2.0.2/api/R/gapply.html
@@ -0,0 +1,348 @@

gapply {SparkR}  R Documentation

gapply

Description

Groups the SparkDataFrame using the specified columns and applies the R function to each group.

Usage

## S4 method for signature 'SparkDataFrame'
gapply(x, cols, func, schema)

gapply(x, ...)

## S4 method for signature 'GroupedData'
gapply(x, func, schema)

Arguments

x: a SparkDataFrame or GroupedData.

cols: grouping columns.

func: a function to be applied to each group partition specified by the grouping columns of the SparkDataFrame. The function func takes two arguments: a key (the grouping columns) and a local R data.frame for that group. The output of func is a local R data.frame.

schema: the schema of the resulting SparkDataFrame after the function is applied. It must match the output of func, and has to be defined for each output column with the preferred output column name and corresponding data type.

...: additional argument(s) passed to the method.

Value

A SparkDataFrame.

Note

gapply(SparkDataFrame) since 2.0.0

gapply(GroupedData) since 2.0.0

See Also

gapplyCollect

Other SparkDataFrame functions: select, subset, agg, summarize, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, count, createOrReplaceTempView, dapply, dapplyCollect, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapplyCollect, groupBy, head, histogram, insertInto, intersect, isLocal, join, limit, and others.
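The description is easier to follow with a short sketch; the data, grouping column, and schema below are illustrative, not from the original page:

## Not run:
##D df <- createDataFrame(list(list(1L, 1), list(1L, 2), list(3L, 3)), c("a", "b"))
##D # schema of the result: the key column plus one aggregated column
##D schema <- structType(structField("a", "integer"), structField("maxb", "double"))
##D # func receives the grouping key and a local R data.frame for that group
##D result <- gapply(df, "a",
##D                  function(key, x) { data.frame(key, max(x$b)) },
##D                  schema)
##D collect(result)
## End(Not run)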
[32/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/read.orc.html

diff --git a/site/docs/2.0.2/api/R/read.orc.html b/site/docs/2.0.2/api/R/read.orc.html
new file mode 100644
index 000..e67fa6a
--- /dev/null
+++ b/site/docs/2.0.2/api/R/read.orc.html
@@ -0,0 +1,46 @@

read.orc {SparkR}  R Documentation

Create a SparkDataFrame from an ORC file.

Description

Loads an ORC file, returning the result as a SparkDataFrame.

Usage

read.orc(path)

Arguments

path: Path of file to read.

Value

SparkDataFrame

Note

read.orc since 2.0.0

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/read.parquet.html

diff --git a/site/docs/2.0.2/api/R/read.parquet.html b/site/docs/2.0.2/api/R/read.parquet.html
new file mode 100644
index 000..0d42bcd
--- /dev/null
+++ b/site/docs/2.0.2/api/R/read.parquet.html
@@ -0,0 +1,56 @@

read.parquet {SparkR}  R Documentation

Create a SparkDataFrame from a Parquet file.

Description

Loads a Parquet file, returning the result as a SparkDataFrame.

Usage

## Default S3 method:
read.parquet(path)

## Default S3 method:
parquetFile(...)

Arguments

path: path of file to read. A vector of multiple paths is allowed.

...: argument(s) passed to the method.

Value

SparkDataFrame

Note

read.parquet since 1.6.0

parquetFile since 1.4.0

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/read.text.html

diff --git a/site/docs/2.0.2/api/R/read.text.html b/site/docs/2.0.2/api/R/read.text.html
new file mode 100644
index 000..2b6d8ca
--- /dev/null
+++ b/site/docs/2.0.2/api/R/read.text.html
@@ -0,0 +1,71 @@

read.text {SparkR}  R Documentation

Create a SparkDataFrame from a text file.

Description

Loads text files and returns a SparkDataFrame whose schema starts with a string column named "value", followed by partitioned columns if there are any.

Usage

## Default S3 method:
read.text(path)

Arguments

path: Path of file to read. A vector of multiple paths is allowed.

Details

Each line in the text file is a new row in the resulting SparkDataFrame.

Value

SparkDataFrame

Note

read.text since 1.6.1

Examples

## Not run:
##D sparkR.session()
##D path <- "path/to/file.txt"
##D df <- read.text(path)
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/regexp_extract.html

diff --git a/site/docs/2.0.2/api/R/regexp_extract.html b/site/docs/2.0.2/api/R/regexp_extract.html
new file mode 100644
index 000..375ceb0
--- /dev/null
+++ b/site/docs/2.0.2/api/R/regexp_extract.html
@@ -0,0 +1,122 @@

regexp_extract {SparkR}  R Documentation

regexp_extract

Description

Extract a specific group, identified by index idx and matched by a Java regex, from the specified string column. If the regex did not match, or the specified group did not match, an empty string is returned.

Usage

## S4 method for signature 'Column,character,numeric'
regexp_extract(x, pattern, idx)

regexp_extract(x, pattern, idx)

Arguments

x: a string Column.

pattern: a regular expression.

idx: a group index.

Note

regexp_extract since 1.5.0

See Also

Other string functions: ascii, base64, concat_ws, concat, decode, encode, format_number, format_string, and others.
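A minimal sketch of regexp_extract; the column df$c and the pattern are illustrative:

## Not run:
##D # pull the first digit group out of strings like "100-200"
##D head(select(df, regexp_extract(df$c, "(\\d+)-(\\d+)", 1)))
## End(Not run)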
[17/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/index.html

diff --git a/site/docs/2.0.2/api/java/index.html b/site/docs/2.0.2/api/java/index.html
new file mode 100644
index 000..f0b9c05
--- /dev/null
+++ b/site/docs/2.0.2/api/java/index.html
@@ -0,0 +1,74 @@

Spark 2.0.2 JavaDoc (frameset entry page; its inline script routes ?page.html query strings to the class frame)

targetPage = "" + window.location.search;
if (targetPage != "" && targetPage != "undefined")
    targetPage = targetPage.substring(1);
if (targetPage.indexOf(":") != -1 || (targetPage != "" && !validURL(targetPage)))
    targetPage = "undefined";

// Accept only targets that look like generated javadoc pages ending in ".html".
function validURL(url) {
    try {
        url = decodeURIComponent(url);
    }
    catch (error) {
        return false;
    }
    var pos = url.indexOf(".html");
    if (pos == -1 || pos != url.length - 5)
        return false;
    var allowNumber = false;
    var allowSep = false;
    var seenDot = false;
    for (var i = 0; i < url.length - 5; i++) {
        var ch = url.charAt(i);
        if ('a' <= ch && ch <= 'z' ||
            'A' <= ch && ch <= 'Z' ||
            ch == '$' ||
            ch == '_' ||
            ch.charCodeAt(0) > 127) {
            allowNumber = true;
            allowSep = true;
        } else if ('0' <= ch && ch <= '9' || ch == '-') {
            if (!allowNumber)
                return false;
        } else if (ch == '/' || ch == '.') {
            if (!allowSep)
                return false;
            allowNumber = false;
            allowSep = false;
            if (ch == '.')
                seenDot = true;
            if (ch == '/' && seenDot)
                return false;
        } else {
            return false;
        }
    }
    return true;
}
function loadFrames() {
    if (targetPage != "" && targetPage != "undefined")
        top.classFrame.location = top.targetPage;
}

Frame Alert: This document is designed to be viewed using the frames feature. If you see this message, you are using a non-frame-capable web client. Link to Non-frame version.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/lib/api-javadocs.js

diff --git a/site/docs/2.0.2/api/java/lib/api-javadocs.js b/site/docs/2.0.2/api/java/lib/api-javadocs.js
new file mode 100644
index 000..ead13d6
--- /dev/null
+++ b/site/docs/2.0.2/api/java/lib/api-javadocs.js
@@ -0,0 +1,60 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/* Dynamically injected post-processing code for the API docs */

$(document).ready(function() {
    // Replace ":: Tag ::" markers in doc blocks with badge labels
    addBadges(":: AlphaComponent ::", 'Alpha Component');
    addBadges(":: DeveloperApi ::", 'Developer API');
    addBadges(":: Experimental ::", 'Experimental');
});

function addBadges(tag, html) {
    var tags = $(".block:contains(" + tag + ")")

    // Remove identifier tags
    tags.each(function(index) {
        var oldHTML = $(this).html();
        var newHTML = oldHTML.replace(tag, "");
        $(this).html(newHTML);
    });

    // Add html badge tags
    tags.each(function(index) {
        if ($(this).parent().is('td.colLast')) {
            $(this).parent().prepend(html);
        } else if ($(this).parent('li.blockList')
                          .parent('ul.blockList')
                          .parent('div.description')
                          .parent().is('div.contentContainer')) {
            var contentContainer = $(this).parent('li.blockList')
                .parent('ul.blockList')
                .parent('div.description')
                .parent('div.contentContainer')
            var header = contentContainer.prev('div.header');
            if (header.length > 0) {
                header.prepend(html);
            } else {
[23/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/write.jdbc.html

diff --git a/site/docs/2.0.2/api/R/write.jdbc.html b/site/docs/2.0.2/api/R/write.jdbc.html
new file mode 100644
index 000..d357087
--- /dev/null
+++ b/site/docs/2.0.2/api/R/write.jdbc.html
@@ -0,0 +1,299 @@

write.jdbc {SparkR}  R Documentation

Save the content of SparkDataFrame to an external database table via JDBC.

Description

Save the content of the SparkDataFrame to an external database table via JDBC. Additional JDBC database connection properties can be set (...)

Usage

## S4 method for signature 'SparkDataFrame,character,character'
write.jdbc(x, url, tableName, mode = "error", ...)

write.jdbc(x, url, tableName, mode = "error", ...)

Arguments

x: a SparkDataFrame.

url: JDBC database url of the form jdbc:subprotocol:subname.

tableName: the name of the table in the external database.

mode: one of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default).

...: additional JDBC database connection properties.

Details

mode is used to specify the behavior of the save operation when data already exists in the data source. There are four modes:

  append: contents of this SparkDataFrame are expected to be appended to existing data.

  overwrite: existing data is expected to be overwritten by the contents of this SparkDataFrame.

  error: an exception is expected to be thrown.

  ignore: the save operation is expected not to save the contents of the SparkDataFrame and not to change the existing data.

Note

write.jdbc since 2.0.0

See Also

Other SparkDataFrame functions: select, subset, agg, summarize, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, count, createOrReplaceTempView, dapply, dapplyCollect, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapply, gapplyCollect, groupBy, and others.
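For reference, a hedged sketch of a write.jdbc call; the URL, table name, and credentials are placeholders:

## Not run:
##D sparkR.session()
##D jdbcUrl <- "jdbc:mysql://localhost:3306/databasename"
##D write.jdbc(df, jdbcUrl, "table", mode = "overwrite",
##D            user = "username", password = "password")
## End(Not run)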
[35/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/mutate.html

diff --git a/site/docs/2.0.2/api/R/mutate.html b/site/docs/2.0.2/api/R/mutate.html
new file mode 100644
index 000..76e5ba6
--- /dev/null
+++ b/site/docs/2.0.2/api/R/mutate.html
@@ -0,0 +1,285 @@

mutate {SparkR}  R Documentation

Mutate

Description

Return a new SparkDataFrame with the specified columns added or replaced.

Usage

## S4 method for signature 'SparkDataFrame'
mutate(.data, ...)

## S4 method for signature 'SparkDataFrame'
transform(`_data`, ...)

mutate(.data, ...)

transform(`_data`, ...)

Arguments

.data: a SparkDataFrame.

...: additional column argument(s), each in the form name = col.

_data: a SparkDataFrame.

Value

A new SparkDataFrame with the new columns added or replaced.

Note

mutate since 1.4.0

transform since 1.5.0

See Also

rename, withColumn

Other SparkDataFrame functions: select, subset, agg, summarize, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, count, createOrReplaceTempView, dapply, dapplyCollect, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapply, gapplyCollect, groupBy, head, histogram, insertInto, intersect, isLocal, join, limit, merge, ncol, persist, printSchema, randomSplit, rbind, registerTempTable, and others.
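A short sketch showing both documented spellings; df and the derived columns are illustrative:

## Not run:
##D df <- createDataFrame(mtcars)
##D head(mutate(df, kmpl = df$mpg * 0.425))    # add a derived column
##D head(transform(df, disp2 = df$disp * 2))   # transform is the 1.5.0 alias
## End(Not run)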
[08/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/RangePartitioner.html

diff --git a/site/docs/2.0.2/api/java/org/apache/spark/RangePartitioner.html b/site/docs/2.0.2/api/java/org/apache/spark/RangePartitioner.html
new file mode 100644
index 000..21e8fd1
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/RangePartitioner.html
@@ -0,0 +1,390 @@

RangePartitioner (Spark 2.0.2 JavaDoc)

org.apache.spark
Class RangePartitioner<K,V>

Object
  org.apache.spark.Partitioner
    org.apache.spark.RangePartitioner<K,V>

All Implemented Interfaces: java.io.Serializable

public class RangePartitioner<K,V> extends Partitioner

A Partitioner that partitions sortable records by range into roughly equal ranges. The ranges are determined by sampling the content of the RDD passed in.

Note that the actual number of partitions created by the RangePartitioner might not be the same as the partitions parameter, in the case where the number of sampled records is less than the value of partitions.

See Also: Serialized Form

Constructor Summary

RangePartitioner(int partitions,
                 RDD<? extends scala.Product2<K,V>> rdd,
                 boolean ascending,
                 scala.math.Ordering<K> evidence$1,
                 scala.reflect.ClassTag<K> evidence$2)

Method Summary

static <K> Object determineBounds(scala.collection.mutable.ArrayBuffer<scala.Tuple2<K,Object>> candidates, int partitions, scala.math.Ordering<K> evidence$4, scala.reflect.ClassTag<K> evidence$5)
    Determines the bounds for range partitioning from candidates with weights indicating how many items each represents.

boolean equals(Object other)

int getPartition(Object key)

int hashCode()

int numPartitions()

static <K> scala.Tuple2<Object,scala.Tuple3<Object,Object,Object>[]> sketch(RDD<K> rdd, int sampleSizePerPartition, scala.reflect.ClassTag<K> evidence$3)
    Sketches the input RDD via reservoir sampling on each partition.

Method Detail

sketch

public static <K> scala.Tuple2<Object,scala.Tuple3<Object,Object,Object>[]> sketch(RDD<K> rdd, int sampleSizePerPartition, scala.reflect.ClassTag<K> evidence$3)

Sketches the input RDD via reservoir sampling on each partition.

Parameters: rdd - the input RDD to sketch; sampleSizePerPartition - max sample size per partition; evidence$3 - (undocumented)
Returns: (total number of items, an array of (partitionId, number of items, sample))

determineBounds

public static <K> Object determineBounds(scala.collection.mutable.ArrayBuffer<scala.Tuple2<K,Object>> candidates, int partitions, scala.math.Ordering<K> evidence$4, scala.reflect.ClassTag<K> evidence$5)

Determines the bounds for range partitioning from candidates with weights indicating how many items each represents. Usually this is 1 over the probability used to sample this candidate.

Parameters: candidates - unordered candidates with weights; partitions - number of partitions; evidence$4 - (undocumented); evidence$5 - (undocumented)
Returns: selected bounds

numPartitions

public int numPartitions()
Specified by: numPartitions in class Partitioner

getPartition

public int getPartition(Object key)
[05/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/SparkEnv.html

diff --git a/site/docs/2.0.2/api/java/org/apache/spark/SparkEnv.html b/site/docs/2.0.2/api/java/org/apache/spark/SparkEnv.html
new file mode 100644
index 000..9bcf96c
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/SparkEnv.html
@@ -0,0 +1,474 @@

SparkEnv (Spark 2.0.2 JavaDoc)

org.apache.spark
Class SparkEnv

Object
  org.apache.spark.SparkEnv

public class SparkEnv extends Object

:: DeveloperApi ::
Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, RpcEnv, block manager, map output tracker, etc. Currently Spark code finds the SparkEnv through a global variable, so all the threads can access the same SparkEnv. It can be accessed by SparkEnv.get (e.g. after creating a SparkContext).

NOTE: This is not intended for external use. This is exposed for Shark and may be made private in a future release.

Constructor Summary

SparkEnv(String executorId,
         org.apache.spark.rpc.RpcEnv rpcEnv,
         Serializer serializer,
         Serializer closureSerializer,
         org.apache.spark.serializer.SerializerManager serializerManager,
         org.apache.spark.MapOutputTracker mapOutputTracker,
         org.apache.spark.shuffle.ShuffleManager shuffleManager,
         org.apache.spark.broadcast.BroadcastManager broadcastManager,
         org.apache.spark.storage.BlockManager blockManager,
         org.apache.spark.SecurityManager securityManager,
         org.apache.spark.metrics.MetricsSystem metricsSystem,
         org.apache.spark.memory.MemoryManager memoryManager,
         org.apache.spark.scheduler.OutputCommitCoordinator outputCommitCoordinator,
         SparkConf conf)

Method Summary

static SparkEnv get()
    Returns the SparkEnv.

static void set(SparkEnv e)

org.apache.spark.storage.BlockManager blockManager()
org.apache.spark.broadcast.BroadcastManager broadcastManager()
Serializer closureSerializer()
SparkConf conf()
String executorId()
org.apache.spark.MapOutputTracker mapOutputTracker()
org.apache.spark.memory.MemoryManager memoryManager()
org.apache.spark.metrics.MetricsSystem metricsSystem()
org.apache.spark.scheduler.OutputCommitCoordinator outputCommitCoordinator()
org.apache.spark.SecurityManager securityManager()
Serializer serializer()
org.apache.spark.serializer.SerializerManager serializerManager()
org.apache.spark.shuffle.ShuffleManager shuffleManager()
[43/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/dim.html

diff --git a/site/docs/2.0.2/api/R/dim.html b/site/docs/2.0.2/api/R/dim.html
new file mode 100644
index 000..1227bc6
--- /dev/null
+++ b/site/docs/2.0.2/api/R/dim.html
@@ -0,0 +1,256 @@

dim {SparkR}  R Documentation

Returns the dimensions of SparkDataFrame

Description

Returns the dimensions (number of rows and columns) of a SparkDataFrame.

Usage

## S4 method for signature 'SparkDataFrame'
dim(x)

Arguments

x: a SparkDataFrame.

Note

dim since 1.5.0

See Also

Other SparkDataFrame functions: select, subset, agg, summarize, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, count, createOrReplaceTempView, dapply, dapplyCollect, describe, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, gapply, gapplyCollect, groupBy, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, ncol, persist, printSchema, randomSplit, rbind, registerTempTable, rename, repartition, and others.
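A one-line sketch, not from the original page; mtcars is illustrative and yields 32 rows and 11 columns:

## Not run:
##D df <- createDataFrame(mtcars)
##D dim(df)   # c(32, 11)
## End(Not run)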
[31/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/round.html

diff --git a/site/docs/2.0.2/api/R/round.html b/site/docs/2.0.2/api/R/round.html
new file mode 100644
index 000..fd2fb17
--- /dev/null
+++ b/site/docs/2.0.2/api/R/round.html
@@ -0,0 +1,120 @@

round {SparkR}  R Documentation

round

Description

Returns the value of the column rounded to 0 decimal places using HALF_UP rounding mode.

Usage

## S4 method for signature 'Column'
round(x)

Arguments

x: Column to compute on.

Note

round since 1.5.0

See Also

Other math functions: acos, asin, atan, atan2, bin, bround, cbrt, ceil, ceiling, conv, corr, cos, cosh, cov, covar_pop, covar_samp, exp, expm1, factorial, floor, hex, hypot, log, log10, log1p, log2, pmod, rint, shiftLeft, shiftRight, shiftRightUnsigned, sign, signum, sin, sinh, sqrt, tan, tanh, toDegrees, toRadians, unhex.

Examples

## Not run: round(df$c)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/row_number.html

diff --git a/site/docs/2.0.2/api/R/row_number.html b/site/docs/2.0.2/api/R/row_number.html
new file mode 100644
index 000..d3fd4fb
--- /dev/null
+++ b/site/docs/2.0.2/api/R/row_number.html
@@ -0,0 +1,86 @@

row_number {SparkR}  R Documentation

row_number

Description

Window function: returns a sequential number starting at 1 within a window partition.

Usage

## S4 method for signature 'missing'
row_number()

row_number(x = "missing")

Arguments

x: empty. Should be used with no argument.

Details

This is equivalent to the ROW_NUMBER function in SQL.

Note

row_number since 1.6.0

See Also

Other window functions: cume_dist, dense_rank, lag, lead, ntile, percent_rank, rank.

Examples

## Not run:
##D df <- createDataFrame(mtcars)
##D ws <- orderBy(windowPartitionBy("am"), "hp")
##D out <- select(df, over(row_number(), ws), df$hp, df$am)
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/rowsBetween.html

diff --git a/site/docs/2.0.2/api/R/rowsBetween.html b/site/docs/2.0.2/api/R/rowsBetween.html
new file mode 100644
index 000..571df1a
--- /dev/null
+++ b/site/docs/2.0.2/api/R/rowsBetween.html
@@ -0,0 +1,94 @@

R: rowsBetween
[25/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/unhex.html

diff --git a/site/docs/2.0.2/api/R/unhex.html b/site/docs/2.0.2/api/R/unhex.html
new file mode 100644
index 000..208ff4c
--- /dev/null
+++ b/site/docs/2.0.2/api/R/unhex.html
@@ -0,0 +1,122 @@

unhex {SparkR}  R Documentation

unhex

Description

Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts it to the byte representation of the number.

Usage

## S4 method for signature 'Column'
unhex(x)

unhex(x)

Arguments

x: Column to compute on.

Note

unhex since 1.5.0

See Also

Other math functions: acos, asin, atan, atan2, bin, bround, cbrt, ceil, ceiling, conv, corr, cos, cosh, cov, covar_pop, covar_samp, exp, expm1, factorial, floor, hex, hypot, log, log10, log1p, log2, pmod, rint, round, shiftLeft, shiftRight, shiftRightUnsigned, sign, signum, sin, sinh, sqrt, tan, tanh, toDegrees, toRadians.

Examples

## Not run: unhex(df$c)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/union.html

diff --git a/site/docs/2.0.2/api/R/union.html b/site/docs/2.0.2/api/R/union.html
new file mode 100644
index 000..8e92124
--- /dev/null
+++ b/site/docs/2.0.2/api/R/union.html
@@ -0,0 +1,280 @@

union {SparkR}  R Documentation

Return a new SparkDataFrame containing the union of rows

Description

Return a new SparkDataFrame containing the union of rows in this SparkDataFrame and another SparkDataFrame. This is equivalent to UNION ALL in SQL. Note that this does not remove duplicate rows across the two SparkDataFrames.

unionAll is deprecated; use union instead.

Usage

## S4 method for signature 'SparkDataFrame,SparkDataFrame'
union(x, y)

## S4 method for signature 'SparkDataFrame,SparkDataFrame'
unionAll(x, y)

union(x, y)

unionAll(x, y)

Arguments

x: A SparkDataFrame.

y: A SparkDataFrame.

Value

A SparkDataFrame containing the result of the union.

Note

union since 2.0.0

unionAll since 1.4.0

See Also

rbind

Other SparkDataFrame functions: select, subset, agg, summarize, arrange, as.data.frame, and others.
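A sketch of the UNION ALL behavior described above; the two frames are illustrative:

## Not run:
##D df1 <- createDataFrame(data.frame(x = 1:3))
##D df2 <- createDataFrame(data.frame(x = 2:4))
##D count(union(df1, df2))   # 6: duplicate rows across the inputs are kept
## End(Not run)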
[06/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/SparkContext.html

diff --git a/site/docs/2.0.2/api/java/org/apache/spark/SparkContext.html b/site/docs/2.0.2/api/java/org/apache/spark/SparkContext.html
new file mode 100644
index 000..09f31f9
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/SparkContext.html
@@ -0,0 +1,2467 @@

SparkContext (Spark 2.0.2 JavaDoc)

org.apache.spark
Class SparkContext

Object
  org.apache.spark.SparkContext

public class SparkContext extends Object

Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster.

Only one SparkContext may be active per JVM. You must stop() the active SparkContext before creating a new one. This limitation may eventually be removed; see SPARK-2243 for more details.

param: config a Spark Config object describing the application configuration. Any settings in this config override the default configs as well as system properties.

Constructor Summary

SparkContext()
    Create a SparkContext that loads settings from system properties (for instance, when launching with ./bin/spark-submit).

SparkContext(SparkConf config)

SparkContext(String master, String appName, SparkConf conf)
    Alternative constructor that allows setting common Spark properties directly.

SparkContext(String master, String appName, String sparkHome, scala.collection.Seq<String> jars, scala.collection.Map<String,String> environment)
    Alternative constructor that allows setting common Spark properties directly.

Method Summary

accumulable(R initialValue, AccumulableParam<R,T> param)
accumulable(R initialValue, String name, AccumulableParam<R,T> param)
accumulableCollection(R initialValue, ...)
accumulator(T initialValue, AccumulatorParam<T> param)
accumulator(T initialValue, String name, AccumulatorParam<T> param)
    Deprecated. Use AccumulatorV2. Since 2.0.0.

void addFile(String path)
void addFile(String path, boolean recursive)
    Add a file to be downloaded with this Spark job on every node.

void addJar(String path)
    Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.

void addSparkListener(org.apache.spark.scheduler.SparkListenerInterface listener)
    :: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.

scala.Option<String> applicationAttemptId()

String applicationId()
    A unique identifier for the Spark application.

String appName()

RDD<scala.Tuple2<String,PortableDataStream>> binaryFiles(String path, int minPartitions)
    Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file (useful for binary data).

RDD<byte[]> binaryRecords(String path, int recordLength, org.apache.hadoop.conf.Configuration conf)
    Load data from a flat binary file, assuming the length of each record is constant.

Broadcast<T> broadcast(T value, scala.reflect.ClassTag<T> evidence$11)
    Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.

void cancelAllJobs()
    Cancel all jobs that have been scheduled or are running.
[45/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/corr.html

diff --git a/site/docs/2.0.2/api/R/corr.html b/site/docs/2.0.2/api/R/corr.html
new file mode 100644
index 000..9f058a4
--- /dev/null
+++ b/site/docs/2.0.2/api/R/corr.html
@@ -0,0 +1,177 @@

corr {SparkR}  R Documentation

corr

Description

Computes the Pearson Correlation Coefficient for two Columns.

Calculates the correlation of two columns of a SparkDataFrame. Currently only supports the Pearson Correlation Coefficient. For Spearman Correlation, consider using RDD methods found in MLlib's Statistics.

Usage

## S4 method for signature 'Column'
corr(x, col2)

corr(x, ...)

## S4 method for signature 'SparkDataFrame'
corr(x, colName1, colName2, method = "pearson")

Arguments

x: a Column or a SparkDataFrame.

col2: a (second) Column.

...: additional argument(s). If x is a Column, a Column should be provided. If x is a SparkDataFrame, two column names should be provided.

colName1: the name of the first column.

colName2: the name of the second column.

method: optional. A character specifying the method for calculating the correlation. Only "pearson" is allowed now.

Value

The Pearson Correlation Coefficient as a Double.

Note

corr since 1.6.0

See Also

Other math functions: acos, asin, atan, atan2, bin, bround, cbrt, ceil, ceiling, conv, cos, cosh, cov, covar_pop, covar_samp, exp, expm1, factorial, floor, hex, hypot, log, log10, log1p, log2, pmod, rint, round, shiftLeft, shiftRight, shiftRightUnsigned, sign, signum, sin, sinh, sqrt, tan, tanh, toDegrees, toRadians, unhex.

Other stat functions: approxQuantile, cov, covar_samp, crosstab, freqItems, sampleBy.

Examples

## Not run: corr(df$c, df$d)
## Not run:
##D df <- read.json("/path/to/file.json")
##D corr <- corr(df, "title", "gender")
##D corr <- corr(df, "title", "gender", method = "pearson")
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/cos.html

diff --git a/site/docs/2.0.2/api/R/cos.html b/site/docs/2.0.2/api/R/cos.html
new file mode 100644
index 000..64090a4
--- /dev/null
+++ b/site/docs/2.0.2/api/R/cos.html
@@ -0,0 +1,120 @@

cos {SparkR}  R Documentation

cos

Description

Computes the cosine of the given value.

Usage

## S4 method for signature 'Column'
cos(x)

Arguments

x: Column to compute on.

Note

cos since 1.5.0

See Also

Other math functions: acos, and others.
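A one-line sketch in the style of the other math-function pages; df$c is illustrative:

## Not run: head(select(df, cos(df$c)))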
[07/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/SparkConf.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/SparkConf.html b/site/docs/2.0.2/api/java/org/apache/spark/SparkConf.html
new file mode 100644
index 000..8fb676c
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/SparkConf.html
@@ -0,0 +1,1124 @@

SparkConf (Spark 2.0.2 JavaDoc)

org.apache.spark
Class SparkConf

Object
  org.apache.spark.SparkConf

All Implemented Interfaces: Cloneable

public class SparkConf extends Object implements scala.Cloneable

Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.

Most of the time, you would create a SparkConf object with new SparkConf(), which will load
values from any spark.* Java system properties set in your application as well. In this case,
parameters you set directly on the SparkConf object take priority over system properties.

For unit tests, you can also call new SparkConf(false) to skip loading external settings and
get the same configuration no matter what the system properties are.

All setter methods in this class support chaining. For example, you can write
new SparkConf().setMaster("local").setAppName("My app").

Note that once a SparkConf object is passed to Spark, it is cloned and can no longer be modified
by the user. Spark does not support modifying the configuration at runtime.

param: loadDefaults whether to also load values from Java system properties

Constructor Summary

SparkConf() -- Create a SparkConf that loads defaults from system properties and the classpath
SparkConf(boolean loadDefaults)

Method Summary

SparkConf clone() -- Copy this object
boolean contains(String key) -- Does the configuration contain a given parameter?
String get(String key) -- Get a parameter; throws a NoSuchElementException if it's not set
String get(String key, String defaultValue) -- Get a parameter, falling back to a default if not set
scala.Tuple2<String,String>[] getAll() -- Get all parameters as a list of pairs
String getAppId() -- Returns the Spark application id, valid in the Driver after TaskScheduler
    registration and from the start in the Executor.
scala.collection.immutable.Map<Object,String> getAvroSchema() -- Gets all the avro schemas in
    the configuration used in the generic Avro record serializer
boolean getBoolean(String key, boolean defaultValue) -- Get a parameter as a boolean, falling back to a default if not set
static scala.Option<String> getDeprecatedConfig(String key, SparkConf conf) -- Looks for
    available deprecated keys for the given config option, and return the first value available.
double getDouble(String key, double defaultValue) -- Get a parameter as a double, falling back to a default if not set
scala.collection.Seq<scala.Tuple2<String,String>> getExecutorEnv() -- Get all executor environment variables set on this SparkConf
int getInt(String key, int defaultValue) -- Get a parameter as an integer, falling back to a default if not set
long getLong(String key, long defaultValue) -- Get a parameter as a long, falling back to a default if not set
scala.Option<String> getOption(String key) -- Get a parameter as an Option
long getSizeAsBytes(String key) -- Get a size parameter as bytes; throws a NoSuchElementException if it's not set.
long getSizeAsBytes(String key, long defaultValue) -- Get a size parameter as bytes, falling back to a default if not set.
long getSizeAsBytes(String key, String defaultValue) -- Get a size parameter as bytes, falling back to a default if not set.
long getSizeAsGb(String key) -- Get a size parameter as Gibibytes; throws a NoSuchElementException if it's not set.
long getSizeAsGb(String key, String defaultValue) -- Get a size parameter as Gibibytes, falling back to a default if not set.
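Since the class comment above leans on the chaining behavior, here is a minimal, self-contained sketch of the documented usage (the configuration keys and values are arbitrary examples, not from the commit):

import org.apache.spark.SparkConf

// Every setter returns this SparkConf, so calls chain as described above.
val conf = new SparkConf()                 // also loads spark.* system properties
  .setMaster("local[2]")
  .setAppName("My app")
  .set("spark.ui.enabled", "false")

// Typed getters fall back to a default when the key is unset.
val cores = conf.getInt("spark.executor.cores", 1)

// For unit tests: skip loading external settings entirely.
val testConf = new SparkConf(false).setAppName("test")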
[14/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.IntAccumulatorParam$.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.IntAccumulatorParam$.html b/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.IntAccumulatorParam$.html
new file mode 100644
index 000..c77f232
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.IntAccumulatorParam$.html
@@ -0,0 +1,361 @@

AccumulatorParam.IntAccumulatorParam$ (Spark 2.0.2 JavaDoc)

org.apache.spark
Class AccumulatorParam.IntAccumulatorParam$

Object
  org.apache.spark.AccumulatorParam.IntAccumulatorParam$

All Implemented Interfaces: java.io.Serializable, AccumulableParam<Object,Object>, AccumulatorParam<Object>
Enclosing interface: AccumulatorParam<T>

Deprecated. use AccumulatorV2. Since 2.0.0.

public static class AccumulatorParam.IntAccumulatorParam$ extends Object implements AccumulatorParam<Object>
See Also: Serialized Form

Nested classes/interfaces inherited from interface org.apache.spark.AccumulatorParam:
AccumulatorParam.DoubleAccumulatorParam$, AccumulatorParam.FloatAccumulatorParam$,
AccumulatorParam.IntAccumulatorParam$, AccumulatorParam.LongAccumulatorParam$,
AccumulatorParam.StringAccumulatorParam$

Field Summary

static AccumulatorParam.IntAccumulatorParam$ MODULE$ -- Deprecated. Static reference to the
    singleton instance of this Scala object.

Constructor Summary

AccumulatorParam.IntAccumulatorParam$() -- Deprecated.

Method Summary

int addInPlace(int t1, int t2) -- Deprecated.
int zero(int initialValue) -- Deprecated.

Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait
Methods inherited from interface org.apache.spark.AccumulatorParam: addAccumulator
Methods inherited from interface org.apache.spark.AccumulableParam: addInPlace, zero

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.LongAccumulatorParam$.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.LongAccumulatorParam$.html b/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.LongAccumulatorParam$.html
new file mode 100644
index 000..d39926f
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/AccumulatorParam.LongAccumulatorParam$.html
@@ -0,0 +1,361 @@

AccumulatorParam.LongAccumulatorParam$ (Spark 2.0.2 JavaDoc)

org.apache.spark
Class ExecutorRemoved

Object
  org.apache.spark.ExecutorRemoved

All Implemented Interfaces: java.io.Serializable, scala.Equals, scala.Product

public class ExecutorRemoved extends Object implements scala.Product, scala.Serializable
See Also: Serialized Form

Constructor Summary

ExecutorRemoved(String executorId)

Method Summary

abstract static boolean canEqual(Object that)
abstract static boolean equals(Object that)
String executorId()
abstract static int productArity()
abstract static Object productElement(int n)
static scala.collection.Iterator<Object> productIterator()
static String productPrefix()

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/ExpireDeadHosts.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/ExpireDeadHosts.html b/site/docs/2.0.2/api/java/org/apache/spark/ExpireDeadHosts.html
new file mode 100644
index 000..492fd65
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/ExpireDeadHosts.html
@@ -0,0 +1,319 @@

ExpireDeadHosts (Spark 2.0.2 JavaDoc)

org.apache.spark
Class ExpireDeadHosts

Object
  org.apache.spark.ExpireDeadHosts

public class ExpireDeadHosts extends Object

Constructor
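All of the AccumulatorParam singletons above carry the same deprecation note pointing at AccumulatorV2, so a short migration sketch may help. It assumes a locally created SparkContext and is not taken from the commit itself:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("acc-sketch"))

// Deprecated 1.x style: implicitly resolves AccumulatorParam.IntAccumulatorParam$.
val oldCounter = sc.accumulator(0)          // emits a deprecation warning on 2.x

// 2.0+ replacement: one of the built-in AccumulatorV2 implementations.
val counter = sc.longAccumulator("counter")
sc.parallelize(1 to 100).foreach(_ => counter.add(1L))
println(counter.value)                      // 100, read back on the driver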
[01/51] [partial] spark-website git commit: Add docs for 2.0.2.
Repository: spark-website
Updated Branches: refs/heads/asf-site b9aa4c3ee -> 0bd363165

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/UnknownReason.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/UnknownReason.html b/site/docs/2.0.2/api/java/org/apache/spark/UnknownReason.html
new file mode 100644
index 000..5197a6b
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/UnknownReason.html
@@ -0,0 +1,348 @@

UnknownReason (Spark 2.0.2 JavaDoc)

org.apache.spark
Class UnknownReason

Object
  org.apache.spark.UnknownReason

public class UnknownReason extends Object

:: DeveloperApi ::
We don't know why the task ended -- for example, because of a ClassNotFound exception when
deserializing the task result.

Constructor Summary

UnknownReason()

Method Summary

abstract static boolean canEqual(Object that)
static boolean countTowardsTaskFailures()
abstract static boolean equals(Object that)
abstract static int productArity()
abstract static Object productElement(int n)
static scala.collection.Iterator<Object> productIterator()
static String productPrefix()
static String toErrorString()
[02/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/TaskKilled.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/TaskKilled.html b/site/docs/2.0.2/api/java/org/apache/spark/TaskKilled.html
new file mode 100644
index 000..976143b
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/TaskKilled.html
@@ -0,0 +1,347 @@

TaskKilled (Spark 2.0.2 JavaDoc)

org.apache.spark
Class TaskKilled

Object
  org.apache.spark.TaskKilled

public class TaskKilled extends Object

:: DeveloperApi ::
Task was killed intentionally and needs to be rescheduled.

Constructor Summary

TaskKilled()

Method Summary

abstract static boolean canEqual(Object that)
static boolean countTowardsTaskFailures()
abstract static boolean equals(Object that)
abstract static int productArity()
abstract static Object productElement(int n)
static scala.collection.Iterator<Object> productIterator()
static String productPrefix()
static String toErrorString()

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/TaskKilledException.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/TaskKilledException.html b/site/docs/2.0.2/api/java/org/apache/spark/TaskKilledException.html
new file mode 100644
index 000..6f9a0b7
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/TaskKilledException.html
@@ -0,0 +1,255 @@

TaskKilledException (Spark 2.0.2 JavaDoc)

org.apache.spark
Class TaskKilledException

Object
  Throwable
    Exception
      RuntimeException
        org.apache.spark.TaskKilledException

All Implemented Interfaces: java.io.Serializable

public class TaskKilledException extends RuntimeException

:: DeveloperApi ::
Exception thrown when a task is explicitly killed
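The task-end reasons documented in this and the previous message are most commonly consumed from a SparkListener. The listener below is a hedged sketch (the class name and log messages are invented) that uses only the documented types:

import org.apache.spark.{SparkContext, Success, TaskFailedReason, TaskKilled, UnknownReason}
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Logs why each task ended, matching on the TaskEndReason values above.
// TaskKilled and UnknownReason are case objects in 2.0.x, so they can be
// matched directly; other failures fall through to TaskFailedReason.
class TaskEndLogger extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = taskEnd.reason match {
    case Success             => ()  // normal completion
    case TaskKilled          => println(s"task ${taskEnd.taskInfo.taskId} killed; will be rescheduled")
    case UnknownReason       => println(s"task ${taskEnd.taskInfo.taskId} ended for an unknown reason")
    case f: TaskFailedReason => println(s"task failed: ${f.toErrorString}")
    case _                   => ()
  }
}

def register(sc: SparkContext): Unit = sc.addSparkListener(new TaskEndLogger)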
[04/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfo.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfo.html b/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfo.html
new file mode 100644
index 000..bc6e486
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfo.html
@@ -0,0 +1,243 @@

SparkJobInfo (Spark 2.0.2 JavaDoc)

org.apache.spark
Interface SparkJobInfo

All Superinterfaces: java.io.Serializable
All Known Implementing Classes: SparkJobInfoImpl

public interface SparkJobInfo extends java.io.Serializable

Exposes information about Spark Jobs.

This interface is not designed to be implemented outside of Spark. We may add additional methods
which may break binary compatibility with outside implementations.

Method Summary

int jobId()
int[] stageIds()
JobExecutionStatus status()

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfoImpl.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfoImpl.html b/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfoImpl.html
new file mode 100644
index 000..d95f90b
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/SparkJobInfoImpl.html
@@ -0,0 +1,302 @@

SparkJobInfoImpl (Spark 2.0.2 JavaDoc)

org.apache.spark
Class SparkJobInfoImpl

Object
  org.apache.spark.SparkJobInfoImpl

All Implemented Interfaces: java.io.Serializable, SparkJobInfo

public class SparkJobInfoImpl extends Object implements SparkJobInfo
See Also: Serialized Form

Constructor Summary

SparkJobInfoImpl(int jobId, int[] stageIds, JobExecutionStatus status)

Method Summary

int jobId() -- Specified by: jobId in interface SparkJobInfo
int[] stageIds() -- Specified by: stageIds in interface SparkJobInfo
JobExecutionStatus status() -- Specified by: status in interface SparkJobInfo
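In application code, SparkJobInfo is normally obtained from SparkContext's status tracker rather than constructed directly. A minimal sketch, assuming an existing SparkContext and a known job id:

import org.apache.spark.{JobExecutionStatus, SparkContext}

// Polls a job's status through the status tracker, which returns the
// SparkJobInfo interface documented above (None if the job is unknown
// or its information has already been garbage-collected).
def isJobDone(sc: SparkContext, jobId: Int): Boolean =
  sc.statusTracker.getJobInfo(jobId) match {
    case Some(info) => info.status() == JobExecutionStatus.SUCCEEDED ||
                       info.status() == JobExecutionStatus.FAILED
    case None       => false
  }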
[24/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/window.html
--
diff --git a/site/docs/2.0.2/api/R/window.html b/site/docs/2.0.2/api/R/window.html
new file mode 100644
index 000..01536c1
--- /dev/null
+++ b/site/docs/2.0.2/api/R/window.html
@@ -0,0 +1,163 @@

window {SparkR} -- R Documentation

Description

Bucketize rows into one or more time windows given a timestamp specifying column. Window
starts are inclusive but the window ends are exclusive, e.g. 12:05 will be in the window
[12:05,12:10) but not in [12:00,12:05). Windows can support microsecond precision. Windows in
the order of months are not supported.

Usage

## S4 method for signature 'Column'
window(x, windowDuration, slideDuration = NULL,
       startTime = NULL)

window(x, ...)

Arguments

x: a time Column. Must be of TimestampType.
windowDuration: a string specifying the width of the window, e.g. '1 second',
    '1 day 12 hours', '2 minutes'. Valid interval strings are 'week', 'day', 'hour',
    'minute', 'second', 'millisecond', 'microsecond'. Note that the duration is a fixed
    length of time, and does not vary over time according to a calendar. For example,
    '1 day' always means 86,400,000 milliseconds, not a calendar day.
slideDuration: a string specifying the sliding interval of the window. Same format as
    windowDuration. A new window will be generated every slideDuration. Must be less than
    or equal to the windowDuration. This duration is likewise absolute, and does not vary
    according to a calendar.
startTime: the offset with respect to 1970-01-01 00:00:00 UTC with which to start window
    intervals. For example, in order to have hourly tumbling windows that start 15 minutes
    past the hour, e.g. 12:15-13:15, 13:15-14:15..., provide startTime as "15 minutes".
...: further arguments to be passed to or from other methods.

Value

An output column of struct called 'window' by default with the nested columns 'start'
and 'end'.

Note

window since 2.0.0

See Also

Other datetime_funcs: add_months, date_add, date_format, date_sub, datediff, dayofmonth,
dayofyear, from_unixtime, from_utc_timestamp, hour, last_day, minute, months_between, month,
next_day, quarter, second, to_date, to_utc_timestamp, unix_timestamp, weekofyear, year

Examples

## Not run:
##D # One minute windows every 15 seconds 10 seconds after the minute, e.g. 09:00:10-09:01:10,
##D # 09:00:25-09:01:25, 09:00:40-09:01:40, ...
##D window(df$time, "1 minute", "15 seconds", "10 seconds")
##D
##D # One minute tumbling windows 15 seconds after the minute, e.g. 09:00:15-09:01:15,
##D # 09:01:15-09:02:15...
##D window(df$time, "1 minute", startTime = "15 seconds")
##D
##D # Thirty-second windows every 10 seconds, e.g. 09:00:00-09:00:30, 09:00:10-09:00:40, ...
##D window(df$time, "30 seconds", "10 seconds")
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/windowOrderBy.html
--
diff --git a/site/docs/2.0.2/api/R/windowOrderBy.html b/site/docs/2.0.2/api/R/windowOrderBy.html
new file mode 100644
index 000..b0cb39e
--- /dev/null
+++ b/site/docs/2.0.2/api/R/windowOrderBy.html
@@ -0,0 +1,72 @@

R: windowOrderBy
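The same windowing function exists in org.apache.spark.sql.functions. This Scala sketch mirrors the R examples above; the toy events table and its column names are invented for illustration:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, window}

val spark = SparkSession.builder().master("local[*]").appName("window-sketch").getOrCreate()
import spark.implicits._

// A toy events table with a TimestampType column, as required above.
val events = Seq(("2016-11-12 09:00:05", 1.0), ("2016-11-12 09:00:20", 3.0))
  .toDF("time", "value")
  .selectExpr("cast(time as timestamp) as time", "value")

// One-minute windows sliding every 15 seconds; the result carries a struct
// column named "window" with nested "start" and "end" fields.
events.groupBy(window($"time", "1 minute", "15 seconds"))
  .agg(avg($"value"))
  .show(false)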
[38/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/install.spark.html
--
diff --git a/site/docs/2.0.2/api/R/install.spark.html b/site/docs/2.0.2/api/R/install.spark.html
new file mode 100644
index 000..5657727
--- /dev/null
+++ b/site/docs/2.0.2/api/R/install.spark.html
@@ -0,0 +1,119 @@

install.spark {SparkR} -- R Documentation

Download and Install Apache Spark to a Local Directory

Description

install.spark downloads and installs Spark to a local directory if it is not found. The Spark
version we use is the same as the SparkR version. Users can specify a desired Hadoop version,
the remote mirror site, and the directory where the package is installed locally.

Usage

install.spark(hadoopVersion = "2.7", mirrorUrl = NULL, localDir = NULL,
              overwrite = FALSE)

Arguments

hadoopVersion: version of Hadoop to install. Default is "2.7". It can take other version
    numbers in the format of x.y where x and y are integers. If hadoopVersion = "without",
    a Hadoop-free build is installed. See
    http://spark.apache.org/docs/latest/hadoop-provided.html (Hadoop Free Build) for more
    information. Other patched version names can also be used, e.g. "cdh4".
mirrorUrl: base URL of the repositories to use. The directory layout should follow Apache
    mirrors (http://www.apache.org/dyn/closer.lua/spark/).
localDir: a local directory where Spark is installed. The directory contains version-specific
    folders of Spark packages. Default is the path to the cache directory:
      - Mac OS X: ~/Library/Caches/spark
      - Unix: $XDG_CACHE_HOME if defined, otherwise ~/.cache/spark
      - Windows: %LOCALAPPDATA%\spark\spark\Cache
overwrite: if TRUE, download and overwrite the existing tar file in localDir and force
    re-install Spark (in case the local directory or file is corrupted).

Details

The full URL of the remote file is inferred from mirrorUrl and hadoopVersion. mirrorUrl
specifies the remote path to a Spark folder. It is followed by a subfolder named after the
Spark version (that corresponds to SparkR), and then the tar filename. The filename is
composed of four parts, i.e. [Spark version]-bin-[Hadoop version].tgz. For example, the full
path for a Spark 2.0.0 package for Hadoop 2.7 from http://apache.osuosl.org has path:
http://apache.osuosl.org/spark/spark-2.0.0/spark-2.0.0-bin-hadoop2.7.tgz.
For hadoopVersion = "without", [Hadoop version] in the filename is then without-hadoop.

Value

install.spark returns the local directory where Spark is found or installed.

Note

install.spark since 2.1.0

See Also

See available Hadoop versions: Apache Spark (http://spark.apache.org/downloads.html)

Examples

## Not run:
##D install.spark()
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/instr.html
--
diff --git a/site/docs/2.0.2/api/R/instr.html b/site/docs/2.0.2/api/R/instr.html
new file mode 100644
index 000..c0483ad
--- /dev/null
+++ b/site/docs/2.0.2/api/R/instr.html
@@ -0,0 +1,126 @@

instr {SparkR} -- R Documentation

Description

Locate the position of the first occurrence of substr column in the given string.
Returns null if either of the arguments are null.

Usage

## S4 method for signature 'Column,character'
instr(y, x)

instr(y, x)

Arguments

y: column to check.
x: substring to check.

Details

NOTE: The position is not zero based, but a 1-based index; returns 0 if substr
could not be found in str.

Note

instr since 1.5.0

See Also

Other string_funcs: ascii, base64, concat_ws, concat, decode, encode, format_number,
format_string, initcap, length,
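The 1-based/0-when-absent convention described above is easy to misread, so a small Scala sketch may help (the toy data and column name are invented):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.instr

val spark = SparkSession.builder().master("local[*]").appName("instr-sketch").getOrCreate()
import spark.implicits._

val df = Seq("SparkR", "R", "Spark").toDF("s")

// 1-based index of the first occurrence of the substring; 0 if not found.
df.select($"s", instr($"s", "ark").alias("pos")).show()
// "SparkR" -> 3, "R" -> 0, "Spark" -> 3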
[09/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/JobExecutionStatus.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/JobExecutionStatus.html b/site/docs/2.0.2/api/java/org/apache/spark/JobExecutionStatus.html
new file mode 100644
index 000..cb8b512
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/JobExecutionStatus.html
@@ -0,0 +1,354 @@

JobExecutionStatus (Spark 2.0.2 JavaDoc)

org.apache.spark
Enum JobExecutionStatus

Object
  Enum<JobExecutionStatus>
    org.apache.spark.JobExecutionStatus

All Implemented Interfaces: java.io.Serializable, Comparable<JobExecutionStatus>

public enum JobExecutionStatus extends Enum<JobExecutionStatus>

Enum Constant Summary

FAILED
RUNNING
SUCCEEDED
UNKNOWN

Method Summary

static JobExecutionStatus fromString(String str)
static JobExecutionStatus valueOf(String name) -- Returns the enum constant of this type with
    the specified name.
static JobExecutionStatus[] values() -- Returns an array containing the constants of this enum
    type, in the order they are declared.

Method Detail

values() may be used to iterate over the constants as follows:

  for (JobExecutionStatus c : JobExecutionStatus.values())
      System.out.println(c);

valueOf(String name) returns the enum constant with the specified name. The string must match
exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace
characters are not permitted.)
Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/JobSubmitter.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/JobSubmitter.html b/site/docs/2.0.2/api/java/org/apache/spark/JobSubmitter.html
new file mode 100644
index 000..5681e58
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/JobSubmitter.html
@@ -0,0 +1,221 @@

JobSubmitter (Spark 2.0.2 JavaDoc)
[19/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/constant-values.html
--
diff --git a/site/docs/2.0.2/api/java/constant-values.html b/site/docs/2.0.2/api/java/constant-values.html
new file mode 100644
index 000..ee81714
--- /dev/null
+++ b/site/docs/2.0.2/api/java/constant-values.html
@@ -0,0 +1,233 @@

Constant Field Values (Spark 2.0.2 JavaDoc)

Contents: org.apache.*

org.apache.spark.launcher.SparkLauncher

public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConectionTimeout"
public static final String CHILD_PROCESS_LOGGER_NAME = "spark.launcher.childProcLoggerName"
public static final String DEPLOY_MODE = "spark.submit.deployMode"
public static final String DRIVER_EXTRA_CLASSPATH = "spark.driver.extraClassPath"
public static final String DRIVER_EXTRA_JAVA_OPTIONS = "spark.driver.extraJavaOptions"
public static final String DRIVER_EXTRA_LIBRARY_PATH = "spark.driver.extraLibraryPath"
public static final String DRIVER_MEMORY = "spark.driver.memory"
public static final String EXECUTOR_CORES = "spark.executor.cores"
public static final String EXECUTOR_EXTRA_CLASSPATH = "spark.executor.extraClassPath"
public static final String EXECUTOR_EXTRA_JAVA_OPTIONS = "spark.executor.extraJavaOptions"
public static final String EXECUTOR_EXTRA_LIBRARY_PATH = "spark.executor.extraLibraryPath"
public static final String EXECUTOR_MEMORY = "spark.executor.memory"
public static final String NO_RESOURCE = "spark-internal"
public static final String SPARK_MASTER = "spark.master"

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/deprecated-list.html
--
diff --git a/site/docs/2.0.2/api/java/deprecated-list.html b/site/docs/2.0.2/api/java/deprecated-list.html
new file mode 100644
index 000..3f4dacc
--- /dev/null
+++ b/site/docs/2.0.2/api/java/deprecated-list.html
@@ -0,0 +1,577 @@

Deprecated List (Spark 2.0.2 JavaDoc)

Deprecated API
Contents: Deprecated Interfaces, Deprecated Classes, Deprecated Methods, Deprecated Constructors

Deprecated Interfaces

org.apache.spark.AccumulableParam -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam -- use AccumulatorV2. Since 2.0.0.

Deprecated Classes

org.apache.spark.Accumulable -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.Accumulator -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$ -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam.FloatAccumulatorParam$ -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam.IntAccumulatorParam$ -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam.LongAccumulatorParam$ -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.AccumulatorParam.StringAccumulatorParam$ -- use AccumulatorV2. Since 2.0.0.
org.apache.spark.sql.hive.HiveContext -- Use SparkSession.builder.enableHiveSupport instead. Since 2.0.0.
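The SparkLauncher constants above are meant to be passed to setConf. The following sketch shows that usage; all paths and class names are placeholders, not values from this commit:

import org.apache.spark.launcher.SparkLauncher

val app: Process = new SparkLauncher()
  .setAppResource("/path/to/app.jar")      // placeholder
  .setMainClass("com.example.MyApp")       // placeholder
  .setMaster("local[*]")
  .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
  .setConf(SparkLauncher.EXECUTOR_CORES, "2")
  .launch()

app.waitFor()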
[12/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/java/org/apache/spark/ComplexFutureAction.html
--
diff --git a/site/docs/2.0.2/api/java/org/apache/spark/ComplexFutureAction.html b/site/docs/2.0.2/api/java/org/apache/spark/ComplexFutureAction.html
new file mode 100644
index 000..598387d
--- /dev/null
+++ b/site/docs/2.0.2/api/java/org/apache/spark/ComplexFutureAction.html
@@ -0,0 +1,485 @@

ComplexFutureAction (Spark 2.0.2 JavaDoc)

org.apache.spark
Class ComplexFutureAction<T>

Object
  org.apache.spark.ComplexFutureAction<T>

All Implemented Interfaces: FutureAction<T>, scala.concurrent.Awaitable<T>, scala.concurrent.Future<T>

public class ComplexFutureAction<T> extends Object implements FutureAction<T>

A FutureAction for actions that could trigger multiple Spark jobs. Examples include take,
takeSample. Cancellation works by setting the cancelled flag to true and cancelling any pending
jobs.

Constructor Summary

ComplexFutureAction(scala.Function1<JobSubmitter,scala.concurrent.Future<T>> run)

Method Summary

void cancel() -- Cancels the execution of this action.
boolean isCancelled() -- Returns whether the action has been cancelled.
boolean isCompleted() -- Returns whether the action has already been completed with a value or
    an exception.
scala.collection.Seq<Object> jobIds() -- Returns the job IDs run by the underlying async operation.
<U> void onComplete(scala.Function1<scala.util.Try<T>,U> func, scala.concurrent.ExecutionContext executor)
    -- When this action is completed, either through an exception, or a value, applies the
    provided function.
ComplexFutureAction<T> ready(scala.concurrent.duration.Duration atMost, scala.concurrent.CanAwait permit)
    -- Blocks until this action completes. atMost is the maximum wait time, which may be
    negative (no waiting is done), Duration.Inf for unbounded waiting, or a finite positive
    duration; returns this FutureAction. Throws InterruptedException and
    java.util.concurrent.TimeoutException.
T result(scala.concurrent.duration.Duration atMost, scala.concurrent.CanAwait permit)
    -- Awaits and returns the result (of type T) of this action.
scala.Option<scala.util.Try<T>> value() -- The value of this Future.

Methods inherited from interface org.apache.spark.FutureAction: get
Methods inherited from interface scala.concurrent.Future: andThen, collect, failed, fallbackTo,
filter, flatMap, foreach, map, mapTo, onFailure, onSuccess, recover, recoverWith, transform,
withFilter, zip
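In application code a ComplexFutureAction is not constructed directly; it backs the multi-job async actions such as takeAsync. A hedged sketch, assuming an existing SparkContext:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}
import org.apache.spark.{FutureAction, SparkContext}

def firstTen(sc: SparkContext): Unit = {
  val rdd = sc.parallelize(1 to 1000000)

  // takeAsync may launch several jobs, which is exactly the case
  // ComplexFutureAction exists for.
  val action: FutureAction[Seq[Int]] = rdd.takeAsync(10)

  action.onComplete {
    case Success(values) => println(s"got ${values.length} elements")
    case Failure(e)      => println(s"action failed or was cancelled: $e")
  }

  // Cancellation sets the cancelled flag and cancels any pending jobs:
  // action.cancel()
}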
[30/51] [partial] spark-website git commit: Add docs for 2.0.2.
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/sd.html
--
diff --git a/site/docs/2.0.2/api/R/sd.html b/site/docs/2.0.2/api/R/sd.html
new file mode 100644
index 000..38f67ec
--- /dev/null
+++ b/site/docs/2.0.2/api/R/sd.html
@@ -0,0 +1,121 @@

sd {SparkR} -- R Documentation

Description

Aggregate function: alias for stddev_samp

Usage

## S4 method for signature 'Column'
sd(x)

## S4 method for signature 'Column'
stddev(x)

sd(x, na.rm = FALSE)

stddev(x)

Arguments

x: Column to compute on.
na.rm: currently not used.

Note

sd since 1.6.0
stddev since 1.6.0

See Also

stddev_pop, stddev_samp

Other agg_funcs: agg, summarize, avg, countDistinct, n_distinct, count, n, first, kurtosis,
last, max, mean, min, skewness, stddev_pop, stddev_samp, sumDistinct, sum, var_pop, var_samp,
var, variance

Examples

## Not run:
##D stddev(df$c)
##D select(df, stddev(df$age))
##D agg(df, sd(df$age))
## End(Not run)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/second.html
--
diff --git a/site/docs/2.0.2/api/R/second.html b/site/docs/2.0.2/api/R/second.html
new file mode 100644
index 000..92dc854
--- /dev/null
+++ b/site/docs/2.0.2/api/R/second.html
@@ -0,0 +1,112 @@

second {SparkR} -- R Documentation

Description

Extracts the seconds as an integer from a given date/timestamp/string.

Usage

## S4 method for signature 'Column'
second(x)

second(x)

Arguments

x: Column to compute on.

Note

second since 1.5.0

See Also

Other datetime_funcs: add_months, date_add, date_format, date_sub, datediff, dayofmonth,
dayofyear, from_unixtime, from_utc_timestamp, hour, last_day, minute, months_between, month,
next_day, quarter, to_date, to_utc_timestamp, unix_timestamp, weekofyear, window, year

Examples

## Not run: second(df$c)

[Package SparkR version 2.0.2 Index]

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0bd36316/site/docs/2.0.2/api/R/select.html
--
diff --git a/site/docs/2.0.2/api/R/select.html b/site/docs/2.0.2/api/R/select.html
new file mode 100644
index
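The alias relationship called out in the sd note above also holds for the Scala functions. A toy sketch (session setup and data are invented for illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{stddev, stddev_pop, stddev_samp}

val spark = SparkSession.builder().master("local[*]").appName("stddev-sketch").getOrCreate()
import spark.implicits._

val df = Seq(18, 21, 25, 30).toDF("age")

// stddev is an alias for the sample standard deviation (stddev_samp),
// matching the R-side sd/stddev documented above.
df.agg(stddev($"age"), stddev_samp($"age"), stddev_pop($"age")).show()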
spark git commit: [SPARK-18387][SQL] Add serialization to checkEvaluation.
Repository: spark
Updated Branches: refs/heads/branch-2.1 465e4b40b -> 87820da78

[SPARK-18387][SQL] Add serialization to checkEvaluation.

## What changes were proposed in this pull request?

This removes the serialization test from RegexpExpressionsSuite and replaces it by serializing all expressions in checkEvaluation. This also fixes math constant expressions by making LeafMathExpression Serializable and fixes NumberFormat values that are null or invalid after serialization.

## How was this patch tested?

This patch is to tests.

Author: Ryan Blue

Closes #15847 from rdblue/SPARK-18387-fix-serializable-expressions.

(cherry picked from commit 6e95325fc3726d260054bd6e7c0717b3c139917e)
Signed-off-by: Reynold Xin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/87820da7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/87820da7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/87820da7

Branch: refs/heads/branch-2.1
Commit: 87820da782fd2d08078227a2ce5c363c3e1cb0f0
Parents: 465e4b4
Author: Ryan Blue
Authored: Fri Nov 11 13:52:10 2016 -0800
Committer: Reynold Xin
Committed: Fri Nov 11 13:52:18 2016 -0800
--
 .../catalyst/expressions/mathExpressions.scala  |  2 +-
 .../expressions/stringExpressions.scala         | 44 +++-
 .../expressions/ExpressionEvalHelper.scala      | 15 ---
 .../expressions/RegexpExpressionsSuite.scala    | 16 +--
 4 files changed, 36 insertions(+), 41 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/87820da7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
index a60494a..65273a7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
@@ -36,7 +36,7 @@ import org.apache.spark.unsafe.types.UTF8String
  * @param name The short name of the function
  */
 abstract class LeafMathExpression(c: Double, name: String)
-  extends LeafExpression with CodegenFallback {
+  extends LeafExpression with CodegenFallback with Serializable {

   override def dataType: DataType = DoubleType
   override def foldable: Boolean = true

http://git-wip-us.apache.org/repos/asf/spark/blob/87820da7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index 5f533fe..e74ef9a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -1431,18 +1431,20 @@ case class FormatNumber(x: Expression, d: Expression)

   // Associated with the pattern, for the last d value, and we will update the
   // pattern (DecimalFormat) once the new coming d value differ with the last one.
+  // This is an Option to distinguish between 0 (numberFormat is valid) and uninitialized after
+  // serialization (numberFormat has not been updated for dValue = 0).
   @transient
-  private var lastDValue: Int = -100
+  private var lastDValue: Option[Int] = None

   // A cached DecimalFormat, for performance concern, we will change it
   // only if the d value changed.
   @transient
-  private val pattern: StringBuffer = new StringBuffer()
+  private lazy val pattern: StringBuffer = new StringBuffer()

   // SPARK-13515: US Locale configures the DecimalFormat object to use a dot ('.')
   // as a decimal separator.
   @transient
-  private val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US))
+  private lazy val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US))

   override protected def nullSafeEval(xObject: Any, dObject: Any): Any = {
     val dValue = dObject.asInstanceOf[Int]
@@ -1450,24 +1452,28 @@ case class FormatNumber(x: Expression, d: Expression)
       return null
     }

-    if (dValue != lastDValue) {
-      // construct a new DecimalFormat only if a new dValue
-      pattern.delete(0, pattern.length)
-      pattern.append("#,###,###,###,###,###,##0")
-
-      // decimal place
-      if (dValue > 0) {
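The key mechanics of this fix are the switch from @transient val to @transient lazy val (the field is rebuilt on first use after deserialization instead of staying null) and from an Int sentinel to Option[Int] (so a deserialized, uninitialized cache can never collide with a legitimate dValue such as 0). The standalone sketch below illustrates the pattern only; the Formatter class and its members are invented for illustration and are not Spark's FormatNumber:

import java.io._

class Formatter extends Serializable {
  // A @transient var comes back as null after deserialization (not None),
  // so the cache check below must fall through in that case, as the patch does.
  @transient private var lastWidth: Option[Int] = None
  // A @transient lazy val is simply re-created on first access after
  // deserialization, so it can never be observed as null.
  @transient private lazy val pattern: StringBuffer = new StringBuffer()

  def format(value: Double, width: Int): String = {
    lastWidth match {
      case Some(w) if w == width => // cached pattern is still valid
      case _ =>                     // first call, new width, or field nulled by deserialization
        pattern.setLength(0)
        pattern.append("%." + width + "f")
        lastWidth = Some(width)
    }
    pattern.toString.format(value)
  }
}

// Round-trip through Java serialization, as checkEvaluation now does for
// every expression under test.
val original = new Formatter
original.format(1.0, 2)                       // initialize the transient state
val bytes = new ByteArrayOutputStream()
new ObjectOutputStream(bytes).writeObject(original)
val copy = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  .readObject().asInstanceOf[Formatter]
println(copy.format(3.14159, 2))              // "3.14", no NullPointerException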
spark git commit: [SPARK-18387][SQL] Add serialization to checkEvaluation.
Repository: spark
Updated Branches: refs/heads/branch-2.0 6e7310590 -> 99575e88f

[SPARK-18387][SQL] Add serialization to checkEvaluation.

## What changes were proposed in this pull request?

This removes the serialization test from RegexpExpressionsSuite and replaces it by serializing all expressions in checkEvaluation. This also fixes math constant expressions by making LeafMathExpression Serializable and fixes NumberFormat values that are null or invalid after serialization.

## How was this patch tested?

This patch is to tests.

Author: Ryan Blue

Closes #15847 from rdblue/SPARK-18387-fix-serializable-expressions.

(cherry picked from commit 6e95325fc3726d260054bd6e7c0717b3c139917e)
Signed-off-by: Reynold Xin

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/99575e88
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/99575e88
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/99575e88

Branch: refs/heads/branch-2.0
Commit: 99575e88fd711c3fc25e8e6f00bbc8d1491feed6
Parents: 6e73105
Author: Ryan Blue
Authored: Fri Nov 11 13:52:10 2016 -0800
Committer: Reynold Xin
Committed: Fri Nov 11 13:52:28 2016 -0800
--
 .../catalyst/expressions/mathExpressions.scala  |  2 +-
 .../expressions/stringExpressions.scala         | 44 +++-
 .../expressions/ExpressionEvalHelper.scala      | 15 ---
 .../expressions/RegexpExpressionsSuite.scala    | 16 +--
 4 files changed, 36 insertions(+), 41 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/99575e88/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
index 5152265..591e1e5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
@@ -36,7 +36,7 @@ import org.apache.spark.unsafe.types.UTF8String
  * @param name The short name of the function
  */
 abstract class LeafMathExpression(c: Double, name: String)
-  extends LeafExpression with CodegenFallback {
+  extends LeafExpression with CodegenFallback with Serializable {

   override def dataType: DataType = DoubleType
   override def foldable: Boolean = true

http://git-wip-us.apache.org/repos/asf/spark/blob/99575e88/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index 61549c9..004c74d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -1236,18 +1236,20 @@ case class FormatNumber(x: Expression, d: Expression)

   // Associated with the pattern, for the last d value, and we will update the
   // pattern (DecimalFormat) once the new coming d value differ with the last one.
+  // This is an Option to distinguish between 0 (numberFormat is valid) and uninitialized after
+  // serialization (numberFormat has not been updated for dValue = 0).
   @transient
-  private var lastDValue: Int = -100
+  private var lastDValue: Option[Int] = None

   // A cached DecimalFormat, for performance concern, we will change it
   // only if the d value changed.
   @transient
-  private val pattern: StringBuffer = new StringBuffer()
+  private lazy val pattern: StringBuffer = new StringBuffer()

   // SPARK-13515: US Locale configures the DecimalFormat object to use a dot ('.')
   // as a decimal separator.
   @transient
-  private val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US))
+  private lazy val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US))

   override protected def nullSafeEval(xObject: Any, dObject: Any): Any = {
     val dValue = dObject.asInstanceOf[Int]
@@ -1255,24 +1257,28 @@ case class FormatNumber(x: Expression, d: Expression)
       return null
     }

-    if (dValue != lastDValue) {
-      // construct a new DecimalFormat only if a new dValue
-      pattern.delete(0, pattern.length)
-      pattern.append("#,###,###,###,###,###,##0")
-
-      // decimal place
-      if (dValue > 0) {
spark git commit: [SPARK-18387][SQL] Add serialization to checkEvaluation.
Repository: spark Updated Branches: refs/heads/master d42bb7cc4 -> 6e95325fc [SPARK-18387][SQL] Add serialization to checkEvaluation. ## What changes were proposed in this pull request? This removes the serialization test from RegexpExpressionsSuite and replaces it by serializing all expressions in checkEvaluation. This also fixes math constant expressions by making LeafMathExpression Serializable and fixes NumberFormat values that are null or invalid after serialization. ## How was this patch tested? This patch only changes tests. Author: Ryan Blue Closes #15847 from rdblue/SPARK-18387-fix-serializable-expressions. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6e95325f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6e95325f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6e95325f Branch: refs/heads/master Commit: 6e95325fc3726d260054bd6e7c0717b3c139917e Parents: d42bb7c Author: Ryan Blue Authored: Fri Nov 11 13:52:10 2016 -0800 Committer: Reynold Xin Committed: Fri Nov 11 13:52:10 2016 -0800 -- .../catalyst/expressions/mathExpressions.scala | 2 +- .../expressions/stringExpressions.scala | 44 +++- .../expressions/ExpressionEvalHelper.scala | 15 --- .../expressions/RegexpExpressionsSuite.scala| 16 +-- 4 files changed, 36 insertions(+), 41 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/6e95325f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala index a60494a..65273a7 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala @@ -36,7 +36,7 @@ import org.apache.spark.unsafe.types.UTF8String * @param name The short name of the function */ abstract class LeafMathExpression(c: Double, name: String) - extends LeafExpression with CodegenFallback { + extends LeafExpression with CodegenFallback with Serializable { override def dataType: DataType = DoubleType override def foldable: Boolean = true http://git-wip-us.apache.org/repos/asf/spark/blob/6e95325f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala -- diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala index 5f533fe..e74ef9a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala @@ -1431,18 +1431,20 @@ case class FormatNumber(x: Expression, d: Expression) // Associated with the pattern, for the last d value, and we will update the // pattern (DecimalFormat) once the new coming d value differ with the last one. + // This is an Option to distinguish between 0 (numberFormat is valid) and uninitialized after + // serialization (numberFormat has not been updated for dValue = 0). @transient - private var lastDValue: Int = -100 + private var lastDValue: Option[Int] = None // A cached DecimalFormat, for performance concern, we will change it // only if the d value changed. 
@transient - private val pattern: StringBuffer = new StringBuffer() + private lazy val pattern: StringBuffer = new StringBuffer() // SPARK-13515: US Locale configures the DecimalFormat object to use a dot ('.') // as a decimal separator. @transient - private val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US)) + private lazy val numberFormat = new DecimalFormat("", new DecimalFormatSymbols(Locale.US)) override protected def nullSafeEval(xObject: Any, dObject: Any): Any = { val dValue = dObject.asInstanceOf[Int] @@ -1450,24 +1452,28 @@ case class FormatNumber(x: Expression, d: Expression) return null } -if (dValue != lastDValue) { - // construct a new DecimalFormat only if a new dValue - pattern.delete(0, pattern.length) - pattern.append("#,###,###,###,###,###,##0") - - // decimal place - if (dValue > 0) { -pattern.append(".") - -var i = 0 -while (i < dValue) { - i += 1 - pattern.append("0") +
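[Editor's note] The core of the FormatNumber fix is that `@transient` cached state is lost on Java serialization, so it must be rebuilt lazily, and a sentinel such as `-100` cannot distinguish "never initialized" from a real cached `d` value the way `Option[Int]` can. Below is a minimal, self-contained Scala sketch of the same caching pattern; it is a simplified stand-in, not the actual `FormatNumber` expression class.

```scala
import java.text.{DecimalFormat, DecimalFormatSymbols}
import java.util.Locale

// Sketch of the patch's caching pattern: transient state is declared lazy so
// it is rebuilt after deserialization, and Option[Int] distinguishes "never
// initialized" from a cached d value of 0.
class NumberFormatter extends Serializable {
  @transient private var lastDValue: Option[Int] = None
  @transient private lazy val pattern: StringBuffer = new StringBuffer()
  @transient private lazy val numberFormat =
    new DecimalFormat("", new DecimalFormatSymbols(Locale.US))

  def format(x: Double, d: Int): String = {
    if (!lastDValue.contains(d)) {
      // Rebuild the DecimalFormat pattern only when d actually changes.
      pattern.delete(0, pattern.length)
      pattern.append("#,###,###,###,###,###,##0")
      if (d > 0) {
        pattern.append(".")
        (1 to d).foreach(_ => pattern.append("0"))
      }
      lastDValue = Some(d)
      numberFormat.applyLocalizedPattern(pattern.toString)
    }
    numberFormat.format(x)
  }
}
```

For example, `new NumberFormatter().format(12345.678, 2)` yields `12,345.68`, and a deserialized copy lazily rebuilds `pattern` and `numberFormat` on first use instead of failing on null fields.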
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.0.2-rc2 [deleted] a6abe1ee2
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.0.2-rc3 [deleted] 584354eaa
[spark] Git Push Summary
Repository: spark Updated Tags: refs/tags/v2.0.2 [created] 584354eaa
spark git commit: [SPARK-17982][SQL] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT
Repository: spark Updated Branches: refs/heads/master a531fe1a8 -> d42bb7cc4 [SPARK-17982][SQL] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT ## What changes were proposed in this pull request? Currently, `SQLBuilder` handles `LIMIT` by always adding `LIMIT` at the end of the generated subSQL. This causes `RuntimeException`s like the following. This PR always adds parentheses, except when `SubqueryAlias` is used together with `LIMIT`. **Before** ``` scala scala> sql("CREATE TABLE tbl(id INT)") scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2") java.lang.RuntimeException: Failed to analyze the canonicalized SQL: ... ``` **After** ``` scala scala> sql("CREATE TABLE tbl(id INT)") scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2") scala> sql("SELECT id2 FROM v1") res4: org.apache.spark.sql.DataFrame = [id2: int] ``` **Fixed cases in this PR** The following two cases show the detailed query plans with problematic SQL generation. 1. `SELECT * FROM (SELECT id FROM tbl LIMIT 2)` Please note the **FROM SELECT** part of the generated SQL below. When we don't use '()' for LIMIT, this fails. ```scala # Original logical plan: Project [id#1] +- GlobalLimit 2 +- LocalLimit 2 +- Project [id#1] +- MetastoreRelation default, tbl # Canonicalized logical plan: Project [gen_attr_0#1 AS id#4] +- SubqueryAlias tbl +- Project [gen_attr_0#1] +- GlobalLimit 2 +- LocalLimit 2 +- Project [gen_attr_0#1] +- SubqueryAlias gen_subquery_0 +- Project [id#1 AS gen_attr_0#1] +- SQLTable default, tbl, [id#1] # Generated SQL: SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM SELECT `gen_attr_0` FROM (SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2) AS tbl ``` 2. `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))` Please note the **((~~~) AS gen_subquery_0 LIMIT 2)** part of the generated SQL below. When we use '()' for LIMIT on a `SubqueryAlias`, this fails. ```scala # Original logical plan: Project [id#1] +- Project [id#1] +- GlobalLimit 2 +- LocalLimit 2 +- MetastoreRelation default, tbl # Canonicalized logical plan: Project [gen_attr_0#1 AS id#4] +- SubqueryAlias tbl +- Project [gen_attr_0#1] +- GlobalLimit 2 +- LocalLimit 2 +- SubqueryAlias gen_subquery_0 +- Project [id#1 AS gen_attr_0#1] +- SQLTable default, tbl, [id#1] # Generated SQL: SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM ((SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2)) AS tbl ``` ## How was this patch tested? Pass the Jenkins test with a newly added test case. Author: Dongjoon Hyun Closes #15546 from dongjoon-hyun/SPARK-17982. 
Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d42bb7cc Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d42bb7cc Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d42bb7cc Branch: refs/heads/master Commit: d42bb7cc4e32c173769bd7da5b9b5eafb510860c Parents: a531fe1 Author: Dongjoon Hyun Authored: Fri Nov 11 13:28:18 2016 -0800 Committer: gatorsmile Committed: Fri Nov 11 13:28:18 2016 -0800 -- .../scala/org/apache/spark/sql/catalyst/SQLBuilder.scala | 7 ++- .../src/test/resources/sqlgen/generate_with_other_1.sql | 2 +- .../src/test/resources/sqlgen/generate_with_other_2.sql | 2 +- sql/hive/src/test/resources/sqlgen/limit.sql | 4 .../apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala | 10 ++ 5 files changed, 22 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/d42bb7cc/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala index 6f821f8..3804542 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala @@ -138,9 +138,14 @@ class SQLBuilder private ( case g: Generate => generateToSQL(g) -case Limit(limitExpr, child) => +// This prevents a pattern of `((...) AS gen_subquery_0 LIMIT 1)` which does not work. +// For example, `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))` makes this plan. +case Limit(limitExpr, child: SubqueryAlias) => s"${toSQL(child)} LIMIT ${limitExpr.sql}" +case Limit(limitExpr, child) => + s"(${toSQL(child)} LIMIT ${limitExpr.sql})" +
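[Editor's note] The fix relies on Scala's pattern matching trying cases in order: the more specific `Limit` case with a `SubqueryAlias` child is listed first, so only LIMITs over other plans get wrapped in parentheses. Here is a minimal sketch of that case-ordering idea, using simplified stand-in plan nodes rather than Spark's real `LogicalPlan` classes.

```scala
// Simplified stand-in plan nodes to illustrate the case ordering in the
// patch; these are not Spark's real classes.
sealed trait Plan
case class Table(name: String) extends Plan
case class SubqueryAlias(alias: String, child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

def toSQL(plan: Plan): String = plan match {
  // More specific case first: a LIMIT directly over a subquery alias must
  // not be parenthesized, or `((...) AS alias LIMIT n)` fails to parse.
  case Limit(n, child: SubqueryAlias) => s"${toSQL(child)} LIMIT $n"
  // Generic case: parenthesize so the LIMIT binds to this subquery only.
  case Limit(n, child) => s"(${toSQL(child)} LIMIT $n)"
  case SubqueryAlias(alias, child) => s"(${toSQL(child)}) AS $alias"
  case Table(name) => s"SELECT * FROM $name"
}

// toSQL(Limit(2, Table("tbl")))
//   => "(SELECT * FROM tbl LIMIT 2)"
// toSQL(Limit(2, SubqueryAlias("gen_subquery_0", Table("tbl"))))
//   => "(SELECT * FROM tbl) AS gen_subquery_0 LIMIT 2"
```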
spark git commit: [SPARK-17982][SQL] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT
Repository: spark Updated Branches: refs/heads/branch-2.1 00c9c7d96 -> 465e4b40b [SPARK-17982][SQL] SQLBuilder should wrap the generated SQL with parenthesis for LIMIT ## What changes were proposed in this pull request? Currently, `SQLBuilder` handles `LIMIT` by always adding `LIMIT` at the end of the generated subSQL. This causes `RuntimeException`s like the following. This PR always adds parentheses, except when `SubqueryAlias` is used together with `LIMIT`. **Before** ``` scala scala> sql("CREATE TABLE tbl(id INT)") scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2") java.lang.RuntimeException: Failed to analyze the canonicalized SQL: ... ``` **After** ``` scala scala> sql("CREATE TABLE tbl(id INT)") scala> sql("CREATE VIEW v1(id2) AS SELECT id FROM tbl LIMIT 2") scala> sql("SELECT id2 FROM v1") res4: org.apache.spark.sql.DataFrame = [id2: int] ``` **Fixed cases in this PR** The following two cases show the detailed query plans with problematic SQL generation. 1. `SELECT * FROM (SELECT id FROM tbl LIMIT 2)` Please note the **FROM SELECT** part of the generated SQL below. When we don't use '()' for LIMIT, this fails. ```scala # Original logical plan: Project [id#1] +- GlobalLimit 2 +- LocalLimit 2 +- Project [id#1] +- MetastoreRelation default, tbl # Canonicalized logical plan: Project [gen_attr_0#1 AS id#4] +- SubqueryAlias tbl +- Project [gen_attr_0#1] +- GlobalLimit 2 +- LocalLimit 2 +- Project [gen_attr_0#1] +- SubqueryAlias gen_subquery_0 +- Project [id#1 AS gen_attr_0#1] +- SQLTable default, tbl, [id#1] # Generated SQL: SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM SELECT `gen_attr_0` FROM (SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2) AS tbl ``` 2. `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))` Please note the **((~~~) AS gen_subquery_0 LIMIT 2)** part of the generated SQL below. When we use '()' for LIMIT on a `SubqueryAlias`, this fails. ```scala # Original logical plan: Project [id#1] +- Project [id#1] +- GlobalLimit 2 +- LocalLimit 2 +- MetastoreRelation default, tbl # Canonicalized logical plan: Project [gen_attr_0#1 AS id#4] +- SubqueryAlias tbl +- Project [gen_attr_0#1] +- GlobalLimit 2 +- LocalLimit 2 +- SubqueryAlias gen_subquery_0 +- Project [id#1 AS gen_attr_0#1] +- SQLTable default, tbl, [id#1] # Generated SQL: SELECT `gen_attr_0` AS `id` FROM (SELECT `gen_attr_0` FROM ((SELECT `id` AS `gen_attr_0` FROM `default`.`tbl`) AS gen_subquery_0 LIMIT 2)) AS tbl ``` ## How was this patch tested? Pass the Jenkins test with a newly added test case. Author: Dongjoon Hyun Closes #15546 from dongjoon-hyun/SPARK-17982. 
(cherry picked from commit d42bb7cc4e32c173769bd7da5b9b5eafb510860c) Signed-off-by: gatorsmile Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/465e4b40 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/465e4b40 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/465e4b40 Branch: refs/heads/branch-2.1 Commit: 465e4b40b3b7760bfcd0f03a14b805029ed599f1 Parents: 00c9c7d Author: Dongjoon Hyun Authored: Fri Nov 11 13:28:18 2016 -0800 Committer: gatorsmile Committed: Fri Nov 11 13:28:34 2016 -0800 -- .../scala/org/apache/spark/sql/catalyst/SQLBuilder.scala | 7 ++- .../src/test/resources/sqlgen/generate_with_other_1.sql | 2 +- .../src/test/resources/sqlgen/generate_with_other_2.sql | 2 +- sql/hive/src/test/resources/sqlgen/limit.sql | 4 .../apache/spark/sql/catalyst/LogicalPlanToSQLSuite.scala | 10 ++ 5 files changed, 22 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/465e4b40/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala index 6f821f8..3804542 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala @@ -138,9 +138,14 @@ class SQLBuilder private ( case g: Generate => generateToSQL(g) -case Limit(limitExpr, child) => +// This prevents a pattern of `((...) AS gen_subquery_0 LIMIT 1)` which does not work. +// For example, `SELECT * FROM (SELECT id FROM tbl TABLESAMPLE (2 ROWS))` makes this plan. +case Limit(limitExpr, child: SubqueryAlias) =>
spark git commit: [SPARK-17843][WEB UI] Indicate event logs pending for processing on history server UI
Repository: spark Updated Branches: refs/heads/branch-2.1 51dca6143 -> 00c9c7d96 [SPARK-17843][WEB UI] Indicate event logs pending for processing on history server UI ## What changes were proposed in this pull request? The History Server UI's application listing now displays information on event logs that are currently being processed, so a user knows that, until this processing completes, an application may not appear in the listing. When there are no event logs under process, the application list page has a "Last Updated" date-time at the top indicating the date-time of the last _completed_ scan of the event logs. The value is displayed to the user in his/her local time zone. ## How was this patch tested? All unit tests pass. In particular, all the suites under org.apache.spark.deploy.history.* were run to test the changes. Screenshots: - Very first startup - Pending logs - no logs processed yet: https://cloud.githubusercontent.com/assets/12079825/19640981/b8d2a96a-99fc-11e6-9b1f-2d736fe90e48.png - Very first startup - Pending logs - some logs processed: https://cloud.githubusercontent.com/assets/12079825/19641087/3f8e3bae-99fd-11e6-9ef1-e0e70d71d8ef.png - Last updated - No currently pending logs: https://cloud.githubusercontent.com/assets/12079825/19443100/4d13946c-94a9-11e6-8ee2-c442729bb206.png - Last updated - With some currently pending logs: https://cloud.githubusercontent.com/assets/12079825/19640903/7323ba3a-99fc-11e6-8359-6a45753dbb28.png - No applications found and No currently pending logs: https://cloud.githubusercontent.com/assets/12079825/19641364/03a2cb04-99fe-11e6-87d6-d09587fc6201.png Author: Vinayak Closes #15410 from vijoshi/SAAS-608_master. (cherry picked from commit a531fe1a82ec515314f2db2e2305283fef24067f) Signed-off-by: Tom Graves Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/00c9c7d9 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/00c9c7d9 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/00c9c7d9 Branch: refs/heads/branch-2.1 Commit: 00c9c7d96489778dfe38a36675d3162bf8844880 Parents: 51dca61 Author: Vinayak Authored: Fri Nov 11 12:54:16 2016 -0600 Committer: Tom Graves Committed: Fri Nov 11 12:54:42 2016 -0600 -- .../spark/ui/static/historypage-common.js | 24 .../history/ApplicationHistoryProvider.scala| 24 .../deploy/history/FsHistoryProvider.scala | 59 ++-- .../spark/deploy/history/HistoryPage.scala | 19 +++ .../spark/deploy/history/HistoryServer.scala| 8 +++ 5 files changed, 116 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/00c9c7d9/core/src/main/resources/org/apache/spark/ui/static/historypage-common.js -- diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage-common.js b/core/src/main/resources/org/apache/spark/ui/static/historypage-common.js new file mode 100644 index 000..55d540d --- /dev/null +++ b/core/src/main/resources/org/apache/spark/ui/static/historypage-common.js @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +$(document).ready(function() { +if ($('#last-updated').length) { + var lastUpdatedMillis = Number($('#last-updated').text()); + var updatedDate = new Date(lastUpdatedMillis); + $('#last-updated').text(updatedDate.toLocaleDateString()+", "+updatedDate.toLocaleTimeString()) +} +}); http://git-wip-us.apache.org/repos/asf/spark/blob/00c9c7d9/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala -- diff --git a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala index 06530ff..d7d8280 100644 ---
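[Editor's note] Conceptually, the provider tracks two pieces of state for the page: how many event logs are still being processed, and when the last scan *completed*. The page then renders the completed-scan timestamp in the browser's local time zone (the JavaScript above). Below is a hedged Scala sketch of that server-side bookkeeping; the names are illustrative, not necessarily the exact methods this patch adds to `ApplicationHistoryProvider`.

```scala
import java.util.concurrent.atomic.{AtomicInteger, AtomicLong}

// Illustrative sketch of the bookkeeping the patch describes: a count of
// event logs under process, and a "last updated" timestamp that is only
// advanced when a full scan completes.
class EventLogScanner {
  private val lastScanEndTime = new AtomicLong(0L)
  private val logsUnderProcess = new AtomicInteger(0)

  def scan(pendingLogs: Seq[String])(process: String => Unit): Unit = {
    logsUnderProcess.set(pendingLogs.size)
    pendingLogs.foreach { log =>
      process(log)                     // replay one event log
      logsUnderProcess.decrementAndGet()
    }
    // Only a *completed* scan updates the "Last Updated" value shown at the
    // top of the application list page.
    lastScanEndTime.set(System.currentTimeMillis())
  }

  def getLastUpdatedTime(): Long = lastScanEndTime.get()
  def getEventLogsUnderProcess(): Int = logsUnderProcess.get()
}
```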
spark git commit: [SPARK-13331] AES support for over-the-wire encryption
Repository: spark Updated Branches: refs/heads/master 5ddf69470 -> 4f15d94cf [SPARK-13331] AES support for over-the-wire encryption ## What changes were proposed in this pull request? The DIGEST-MD5 mechanism is used for SASL authentication and secure communication. DIGEST-MD5 supports the 3DES, DES, and RC4 ciphers; however, 3DES, DES, and RC4 are relatively slow. AES provides better performance and security by design, and is a replacement for 3DES according to NIST. Apache Commons Crypto is a cryptographic library optimized with AES-NI; this patch employs Apache Commons Crypto as the enc/dec backend for SASL authentication and the secure channel, to improve Spark RPC. ## How was this patch tested? Unit tests and integration tests. Author: Junjie Chen Closes #15172 from cjjnjust/shuffle_rpc_encrypt. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4f15d94c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4f15d94c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4f15d94c Branch: refs/heads/master Commit: 4f15d94cfec86130f8dab28ae2e228ded8124020 Parents: 5ddf694 Author: Junjie Chen Authored: Fri Nov 11 10:37:58 2016 -0800 Committer: Marcelo Vanzin Committed: Fri Nov 11 10:37:58 2016 -0800 -- common/network-common/pom.xml | 4 + .../spark/network/sasl/SaslClientBootstrap.java | 23 +- .../spark/network/sasl/SaslRpcHandler.java | 101 +-- .../spark/network/sasl/aes/AesCipher.java | 294 +++ .../network/sasl/aes/AesConfigMessage.java | 101 +++ .../network/util/ByteArrayReadableChannel.java | 62 .../spark/network/util/TransportConf.java | 22 ++ .../spark/network/sasl/SparkSaslSuite.java | 93 +- docs/configuration.md | 26 ++ 9 files changed, 689 insertions(+), 37 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/4f15d94c/common/network-common/pom.xml -- diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index fcefe64..ca99fa8 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -76,6 +76,10 @@ guava compile + + org.apache.commons + commons-crypto + http://git-wip-us.apache.org/repos/asf/spark/blob/4f15d94c/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java -- diff --git a/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java b/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java index 9e5c616..a1bb453 100644 --- a/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java +++ b/common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java @@ -30,6 +30,8 @@ import org.slf4j.LoggerFactory; import org.apache.spark.network.client.TransportClient; import org.apache.spark.network.client.TransportClientBootstrap; +import org.apache.spark.network.sasl.aes.AesCipher; +import org.apache.spark.network.sasl.aes.AesConfigMessage; import org.apache.spark.network.util.JavaUtils; import org.apache.spark.network.util.TransportConf; @@ -88,9 +90,26 @@ public class SaslClientBootstrap implements TransportClientBootstrap { throw new RuntimeException( new SaslException("Encryption requests by negotiated non-encrypted connection.")); } -SaslEncryption.addToChannel(channel, saslClient, conf.maxSaslEncryptedBlockSize()); + +if (conf.aesEncryptionEnabled()) { + // Generate a request config message to send to server. 
+ AesConfigMessage configMessage = AesCipher.createConfigMessage(conf); + ByteBuffer buf = configMessage.encodeMessage(); + + // Encrypted the config message. + byte[] toEncrypt = JavaUtils.bufferToArray(buf); + ByteBuffer encrypted = ByteBuffer.wrap(saslClient.wrap(toEncrypt, 0, toEncrypt.length)); + + client.sendRpcSync(encrypted, conf.saslRTTimeoutMs()); + AesCipher cipher = new AesCipher(configMessage, conf); + logger.info("Enabling AES cipher for client channel {}", client); + cipher.addToChannel(channel); + saslClient.dispose(); +} else { + SaslEncryption.addToChannel(channel, saslClient, conf.maxSaslEncryptedBlockSize()); +} saslClient = null; -logger.debug("Channel {} configured for SASL encryption.", client); +logger.debug("Channel {} configured for encryption.",
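[Editor's note] The client-side negotiation shown in the diff has a clear shape: the AES configuration message is encrypted under the already-established SASL session (so the AES session key never crosses the wire in the clear), sent synchronously, and only then does the AES cipher take over the channel and the SASL client get disposed. Below is a compact Scala sketch of that flow, with stand-in interfaces for the real Java types (`SparkSaslClient`, `TransportClient`, `AesCipher`); it is a sketch of the sequence, not the actual implementation.

```scala
import java.nio.ByteBuffer

// Stand-in interfaces; the real types live in Spark's network-common module.
trait SaslClient {
  def wrap(bytes: Array[Byte], off: Int, len: Int): Array[Byte]
  def dispose(): Unit
}
trait RpcClient {
  def sendRpcSync(msg: ByteBuffer, timeoutMs: Long): ByteBuffer
}
trait ChannelCipher {
  def addToChannel(): Unit
}

def negotiateAes(
    sasl: SaslClient,
    client: RpcClient,
    configMsg: Array[Byte],
    newCipher: () => ChannelCipher,
    timeoutMs: Long): Unit = {
  // 1. Encrypt the AES config message under the negotiated SASL session.
  val encrypted = ByteBuffer.wrap(sasl.wrap(configMsg, 0, configMsg.length))
  // 2. Ship it to the server and wait for acknowledgement.
  client.sendRpcSync(encrypted, timeoutMs)
  // 3. AES now handles the channel, so the SASL client can be disposed.
  newCipher().addToChannel()
  sasl.dispose()
}
```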