spark git commit: [SPARK-18396][HISTORYSERVER] "Duration" column makes search result confused, maybe we should make it unsearchable

2016-11-14 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 d554c02f4 -> 518dc1e1e


[SPARK-18396][HISTORYSERVER] "Duration" column makes search result confused, maybe we should make it unsearchable

## What changes were proposed in this pull request?

When we search data in the History Server, the page checks whether any column contains 
the search string. Duration is rendered as a long value in the table, so a simple 
search string like "003" or "111" also matches rows whose duration happens to contain 
"003" or "111", which makes little sense to users.
We cannot simply convert the long value to a readable format like "1 h" or "3.2 min", 
because the raw value is also used for sorting. A better way to handle this is to 
exclude the "Duration" column from searching.

## How was this patch tested?

Manual tests.

Before("local-1478225166651" pass the filter because its duration in long 
value, which is "257244245" contains search string "244"):
![before](https://cloud.githubusercontent.com/assets/5276001/20203166/f851ffc6-a7ff-11e6-8fe6-91a90ca92b23.jpg)

After:
![after](https://cloud.githubusercontent.com/assets/5276001/20178646/2129fbb0-a78d-11e6-9edb-39f885ce3ed0.jpg)

Author: WangTaoTheTonic 

Closes #15838 from WangTaoTheTonic/duration.

(cherry picked from commit 637a0bb88f74712001f32a53ff66fd0b8cb67e4a)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/518dc1e1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/518dc1e1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/518dc1e1

Branch: refs/heads/branch-2.1
Commit: 518dc1e1e63a8955b16a3f2ca7592264fd637ae6
Parents: d554c02
Author: WangTaoTheTonic 
Authored: Mon Nov 14 12:22:36 2016 +0100
Committer: Sean Owen 
Committed: Mon Nov 14 12:22:47 2016 +0100

--
 core/src/main/resources/org/apache/spark/ui/static/historypage.js | 3 +++
 1 file changed, 3 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/518dc1e1/core/src/main/resources/org/apache/spark/ui/static/historypage.js
--
diff --git a/core/src/main/resources/org/apache/spark/ui/static/historypage.js 
b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
index 6c0ec8d..8fd9186 100644
--- a/core/src/main/resources/org/apache/spark/ui/static/historypage.js
+++ b/core/src/main/resources/org/apache/spark/ui/static/historypage.js
@@ -139,6 +139,9 @@ $(document).ready(function() {
 {name: 'eighth'},
 {name: 'ninth'},
 ],
+"columnDefs": [
+{"searchable": false, "targets": [5]}
+],
 "autoWidth": false,
 "order": [[ 4, "desc" ]]
 };


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18427][DOC] Update docs of mllib.KMeans

2016-11-15 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 a0125fd68 -> 0762c0ceb


[SPARK-18427][DOC] Update docs of mllib.KMeans

## What changes were proposed in this pull request?
1. Remove `runs` from the docs of mllib.KMeans
2. Add a note for `k` according to comments in the sources (a usage sketch follows below)

## How was this patch tested?
Existing tests
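
For context, a minimal Scala sketch of the usage the updated documentation describes: `runs` is not set at all, and the model returned for `k` clusters may legitimately contain fewer than `k` centers. The object name and input path below are illustrative placeholders, not part of this commit.

```scala
// Sketch only: train spark.mllib KMeans without the deprecated `runs` parameter.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object KMeansParamsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("KMeansParamsSketch"))
    val data = sc.textFile("data/mllib/kmeans_data.txt")  // assumed sample input
      .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
      .cache()

    val model = new KMeans()
      .setK(2)                                        // desired clusters; result may have fewer
      .setMaxIterations(10)                           // maxIterations
      .setInitializationMode(KMeans.K_MEANS_PARALLEL) // or KMeans.RANDOM
      .run(data)

    println(s"Requested k=2, model has ${model.clusterCenters.length} center(s)")
    sc.stop()
  }
}
```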

Author: Zheng RuiFeng 

Closes #15873 from zhengruifeng/update_doc_mllib_kmeans.

(cherry picked from commit 33be4da5391b884191c405ffbce7d382ea8a2f66)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0762c0ce
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0762c0ce
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0762c0ce

Branch: refs/heads/branch-2.1
Commit: 0762c0cebe66f806b138420baa562787fd0cf375
Parents: a0125fd
Author: Zheng RuiFeng 
Authored: Tue Nov 15 15:44:50 2016 +0100
Committer: Sean Owen 
Committed: Tue Nov 15 15:45:26 2016 +0100

--
 docs/mllib-clustering.md  | 6 ++
 examples/src/main/python/mllib/k_means_example.py | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0762c0ce/docs/mllib-clustering.md
--
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index d5f6ae3..8990e95 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -24,13 +24,11 @@ variant of the 
[k-means++](http://en.wikipedia.org/wiki/K-means%2B%2B) method
 called [kmeans||](http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf).
 The implementation in `spark.mllib` has the following parameters:
 
-* *k* is the number of desired clusters.
+* *k* is the number of desired clusters. Note that it is possible for fewer 
than k clusters to be returned, for example, if there are fewer than k distinct 
points to cluster.
 * *maxIterations* is the maximum number of iterations to run.
 * *initializationMode* specifies either random initialization or
 initialization via k-means\|\|.
-* *runs* is the number of times to run the k-means algorithm (k-means is not
-guaranteed to find a globally optimal solution, and when run multiple times on
-a given dataset, the algorithm returns the best clustering result).
+* *runs* This param has no effect since Spark 2.0.0.
 * *initializationSteps* determines the number of steps in the k-means\|\| 
algorithm.
 * *epsilon* determines the distance threshold within which we consider k-means 
to have converged.
 * *initialModel* is an optional set of cluster centers used for 
initialization. If this parameter is supplied, only one run is performed.

http://git-wip-us.apache.org/repos/asf/spark/blob/0762c0ce/examples/src/main/python/mllib/k_means_example.py
--
diff --git a/examples/src/main/python/mllib/k_means_example.py 
b/examples/src/main/python/mllib/k_means_example.py
index 5c397e6..d6058f4 100644
--- a/examples/src/main/python/mllib/k_means_example.py
+++ b/examples/src/main/python/mllib/k_means_example.py
@@ -36,8 +36,7 @@ if __name__ == "__main__":
 parsedData = data.map(lambda line: array([float(x) for x in line.split(' 
')]))
 
 # Build the model (cluster the data)
-clusters = KMeans.train(parsedData, 2, maxIterations=10,
-runs=10, initializationMode="random")
+clusters = KMeans.train(parsedData, 2, maxIterations=10, 
initializationMode="random")
 
 # Evaluate clustering by computing Within Set Sum of Squared Errors
 def error(point):


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18427][DOC] Update docs of mllib.KMeans

2016-11-15 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master d89bfc923 -> 33be4da53


[SPARK-18427][DOC] Update docs of mllib.KMeans

## What changes were proposed in this pull request?
1. Remove `runs` from the docs of mllib.KMeans
2. Add a note for `k` according to comments in the sources

## How was this patch tested?
Existing tests

Author: Zheng RuiFeng 

Closes #15873 from zhengruifeng/update_doc_mllib_kmeans.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/33be4da5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/33be4da5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/33be4da5

Branch: refs/heads/master
Commit: 33be4da5391b884191c405ffbce7d382ea8a2f66
Parents: d89bfc9
Author: Zheng RuiFeng 
Authored: Tue Nov 15 15:44:50 2016 +0100
Committer: Sean Owen 
Committed: Tue Nov 15 15:44:50 2016 +0100

--
 docs/mllib-clustering.md  | 6 ++
 examples/src/main/python/mllib/k_means_example.py | 3 +--
 2 files changed, 3 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/33be4da5/docs/mllib-clustering.md
--
diff --git a/docs/mllib-clustering.md b/docs/mllib-clustering.md
index d5f6ae3..8990e95 100644
--- a/docs/mllib-clustering.md
+++ b/docs/mllib-clustering.md
@@ -24,13 +24,11 @@ variant of the 
[k-means++](http://en.wikipedia.org/wiki/K-means%2B%2B) method
 called [kmeans||](http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf).
 The implementation in `spark.mllib` has the following parameters:
 
-* *k* is the number of desired clusters.
+* *k* is the number of desired clusters. Note that it is possible for fewer 
than k clusters to be returned, for example, if there are fewer than k distinct 
points to cluster.
 * *maxIterations* is the maximum number of iterations to run.
 * *initializationMode* specifies either random initialization or
 initialization via k-means\|\|.
-* *runs* is the number of times to run the k-means algorithm (k-means is not
-guaranteed to find a globally optimal solution, and when run multiple times on
-a given dataset, the algorithm returns the best clustering result).
+* *runs* This param has no effect since Spark 2.0.0.
 * *initializationSteps* determines the number of steps in the k-means\|\| 
algorithm.
 * *epsilon* determines the distance threshold within which we consider k-means 
to have converged.
 * *initialModel* is an optional set of cluster centers used for 
initialization. If this parameter is supplied, only one run is performed.

http://git-wip-us.apache.org/repos/asf/spark/blob/33be4da5/examples/src/main/python/mllib/k_means_example.py
--
diff --git a/examples/src/main/python/mllib/k_means_example.py 
b/examples/src/main/python/mllib/k_means_example.py
index 5c397e6..d6058f4 100644
--- a/examples/src/main/python/mllib/k_means_example.py
+++ b/examples/src/main/python/mllib/k_means_example.py
@@ -36,8 +36,7 @@ if __name__ == "__main__":
 parsedData = data.map(lambda line: array([float(x) for x in line.split(' 
')]))
 
 # Build the model (cluster the data)
-clusters = KMeans.train(parsedData, 2, maxIterations=10,
-runs=10, initializationMode="random")
+clusters = KMeans.train(parsedData, 2, maxIterations=10, 
initializationMode="random")
 
 # Evaluate clustering by computing Within Set Sum of Squared Errors
 def error(point):


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[1/3] spark-website git commit: Use site.baseurl, not site.url, to work with Jekyll 3.3. Require Jekyll 3.3. Again commit HTML consistent with Jekyll 3.3 output. Fix date problem with news posts that

2016-11-15 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 4e10a1ac1 -> d82e37220


http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/site/releases/spark-release-1-1-0.html
--
diff --git a/site/releases/spark-release-1-1-0.html 
b/site/releases/spark-release-1-1-0.html
index bf0f9e2..0df9f0d 100644
--- a/site/releases/spark-release-1-1-0.html
+++ b/site/releases/spark-release-1-1-0.html
@@ -197,7 +197,7 @@
 Spark SQL adds a number of new features and performance improvements in 
this release. A http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#running-the-thrift-jdbc-server";>JDBC/ODBC
 server allows users to connect to SparkSQL from many different 
applications and provides shared access to cached tables. A new module provides 
http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets";>support
 for loading JSON data directly into Spark’s SchemaRDD format, including 
automatic schema inference. Spark SQL introduces http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#other-configuration-options";>dynamic
 bytecode generation in this release, a technique which significantly 
speeds up execution for queries that perform complex expression evaluation.  
This release also adds support for registering Python, Scala, and Java lambda 
functions as UDFs, which can then be called directly in SQL. Spark 1.1 adds a 
public
 types API to allow users to create SchemaRDD’s from custom data sources. 
Finally, many optimizations have been added to the native Parquet support as 
well as throughout the engine.
 
 MLlib
-MLlib adds several new algorithms and optimizations in this release. 1.1 
introduces a https://issues.apache.org/jira/browse/SPARK-2359";>new 
library of statistical packages which provides exploratory analytic 
functions. These include stratified sampling, correlations, chi-squared tests 
and support for creating random datasets. This release adds utilities for 
feature extraction (https://issues.apache.org/jira/browse/SPARK-2510";>Word2Vec and https://issues.apache.org/jira/browse/SPARK-2511";>TF-IDF) and feature 
transformation (https://issues.apache.org/jira/browse/SPARK-2272";>normalization and 
standard scaling). Also new are support for https://issues.apache.org/jira/browse/SPARK-1553";>nonnegative matrix 
factorization and https://issues.apache.org/jira/browse/SPARK-1782";>SVD via Lanczos. 
The decision tree algorithm has been https://issues.apache.org/jira/browse/SPARK-2478";>added in Python and 
Java<
 /a>. A tree aggregation primitive has been added to help optimize many 
existing algorithms. Performance improves across the board in MLlib 1.1, with 
improvements of around 2-3X for many algorithms and up to 5X for large scale 
decision tree problems. 
+MLlib adds several new algorithms and optimizations in this release. 1.1 
introduces a https://issues.apache.org/jira/browse/SPARK-2359";>new 
library of statistical packages which provides exploratory analytic 
functions. These include stratified sampling, correlations, chi-squared tests 
and support for creating random datasets. This release adds utilities for 
feature extraction (https://issues.apache.org/jira/browse/SPARK-2510";>Word2Vec and https://issues.apache.org/jira/browse/SPARK-2511";>TF-IDF) and feature 
transformation (https://issues.apache.org/jira/browse/SPARK-2272";>normalization and 
standard scaling). Also new are support for https://issues.apache.org/jira/browse/SPARK-1553";>nonnegative matrix 
factorization and https://issues.apache.org/jira/browse/SPARK-1782";>SVD via Lanczos. 
The decision tree algorithm has been https://issues.apache.org/jira/browse/SPARK-2478";>added in Python and 
Java<
 /a>. A tree aggregation primitive has been added to help optimize many 
existing algorithms. Performance improves across the board in MLlib 1.1, with 
improvements of around 2-3X for many algorithms and up to 5X for large scale 
decision tree problems.
 
 GraphX and Spark Streaming
 Spark streaming adds a new data source https://issues.apache.org/jira/browse/SPARK-1981";>Amazon Kinesis. For 
the Apache Flume, a new mode is supported which https://issues.apache.org/jira/browse/SPARK-1729";>pulls data from 
Flume, simplifying deployment and providing high availability. The first of 
a set of https://issues.apache.org/jira/browse/SPARK-2438";>streaming 
machine learning algorithms is introduced with streaming linear regression. 
Finally, https://issues.apache.org/jira/browse/SPARK-1341";>rate 
limiting has been added for streaming inputs. GraphX adds https://issues.apache.org/jira/browse/SPARK-1991";>custom storage levels 
for vertices and edges along with https://issues.apache.org/jira/browse/SPARK-2748";>improved numerical 
precision across the board. Finally, GraphX adds a new label propagation 
algorithm.
@@ -215,7 +215,7 @@
 
 
   The default value of spark.io.compression.codec is now 
snappy for improved m

[3/3] spark-website git commit: Use site.baseurl, not site.url, to work with Jekyll 3.3. Require Jekyll 3.3. Again commit HTML consistent with Jekyll 3.3 output. Fix date problem with news posts that

2016-11-15 Thread srowen
Use site.baseurl, not site.url, to work with Jekyll 3.3. Require Jekyll 3.3. 
Again commit HTML consistent with Jekyll 3.3 output. Fix date problem with news 
posts that set date: by removing date:.


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/d82e3722
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/d82e3722
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/d82e3722

Branch: refs/heads/asf-site
Commit: d82e3722043aa2c2c2d5af6d1e68f16a83101d73
Parents: 4e10a1a
Author: Sean Owen 
Authored: Fri Nov 11 19:56:10 2016 +
Committer: Sean Owen 
Committed: Tue Nov 15 17:56:22 2016 +0100

--
 README.md   | 32 ++
 _layouts/global.html| 62 ++--
 _layouts/post.html  |  2 +-
 community.md|  2 +-
 documentation.md| 58 +-
 downloads.md|  2 +-
 faq.md  |  6 +-
 graphx/index.md | 16 ++---
 index.md| 30 +-
 mllib/index.md  | 22 +++
 .../2012-10-15-spark-version-0-6-0-released.md  |  2 +-
 ...2012-11-22-spark-0-6-1-and-0-5-2-released.md |  2 +-
 news/_posts/2013-02-07-spark-0-6-2-released.md  |  2 +-
 news/_posts/2013-02-27-spark-0-7-0-released.md  |  2 +-
 .../2013-04-16-spark-screencasts-published.md   | 12 ++--
 news/_posts/2013-06-02-spark-0-7-2-released.md  |  2 +-
 news/_posts/2013-07-16-spark-0-7-3-released.md  |  2 +-
 ...3-08-27-fourth-spark-screencast-published.md |  2 +-
 news/_posts/2013-09-25-spark-0-8-0-released.md  |  2 +-
 news/_posts/2013-12-19-spark-0-8-1-released.md  |  2 +-
 news/_posts/2014-02-02-spark-0-9-0-released.md  |  6 +-
 news/_posts/2014-02-27-spark-becomes-tlp.md |  2 +-
 news/_posts/2014-04-09-spark-0-9-1-released.md  |  6 +-
 news/_posts/2014-05-30-spark-1-0-0-released.md  |  4 +-
 news/_posts/2014-07-11-spark-1-0-1-released.md  |  4 +-
 news/_posts/2014-07-23-spark-0-9-2-released.md  |  6 +-
 news/_posts/2014-08-05-spark-1-0-2-released.md  |  4 +-
 news/_posts/2014-09-11-spark-1-1-0-released.md  |  4 +-
 news/_posts/2014-11-26-spark-1-1-1-released.md  |  4 +-
 news/_posts/2014-12-18-spark-1-2-0-released.md  |  4 +-
 news/_posts/2015-02-09-spark-1-2-1-released.md  |  4 +-
 news/_posts/2015-03-13-spark-1-3-0-released.md  |  4 +-
 news/_posts/2015-04-17-spark-1-2-2-released.md  |  4 +-
 ...2015-05-15-one-month-to-spark-summit-2015.md |  1 -
 news/_posts/2015-05-15-spark-summit-europe.md   |  1 -
 news/_posts/2015-06-11-spark-1-4-0-released.md  |  4 +-
 news/_posts/2015-07-15-spark-1-4-1-released.md  |  4 +-
 news/_posts/2015-09-09-spark-1-5-0-released.md  |  4 +-
 news/_posts/2015-10-02-spark-1-5-1-released.md  |  4 +-
 news/_posts/2015-11-09-spark-1-5-2-released.md  |  4 +-
 news/_posts/2016-01-04-spark-1-6-0-released.md  |  6 +-
 news/_posts/2016-03-09-spark-1-6-1-released.md  |  4 +-
 news/_posts/2016-06-25-spark-1-6-2-released.md  |  4 +-
 news/_posts/2016-07-26-spark-2-0-0-released.md  |  2 +-
 news/_posts/2016-10-03-spark-2-0-1-released.md  |  2 +-
 news/_posts/2016-11-07-spark-1-6-3-released.md  |  4 +-
 news/_posts/2016-11-14-spark-2-0-2-released.md  |  4 +-
 .../_posts/2013-09-25-spark-release-0-8-0.md|  2 +-
 .../_posts/2013-12-19-spark-release-0-8-1.md|  4 +-
 .../_posts/2014-02-02-spark-release-0-9-0.md| 18 +++---
 .../_posts/2014-05-30-spark-release-1-0-0.md| 14 ++---
 .../_posts/2014-09-11-spark-release-1-1-0.md|  2 +-
 .../_posts/2014-11-26-spark-release-1-1-1.md|  2 +-
 .../_posts/2014-12-18-spark-release-1-2-0.md|  2 +-
 .../_posts/2015-02-09-spark-release-1-2-1.md|  2 +-
 .../_posts/2015-03-13-spark-release-1-3-0.md|  2 +-
 .../_posts/2015-04-17-spark-release-1-2-2.md|  2 +-
 .../_posts/2015-04-17-spark-release-1-3-1.md|  2 +-
 .../_posts/2015-06-11-spark-release-1-4-0.md|  2 +-
 .../_posts/2015-07-15-spark-release-1-4-1.md|  2 +-
 .../_posts/2015-09-09-spark-release-1-5-0.md|  2 +-
 .../2013-04-10-1-first-steps-with-spark.md  |  4 +-
 ...2013-04-11-2-spark-documentation-overview.md |  4 +-
 .../2013-04-16-3-transformations-and-caching.md |  4 +-
 .../2013-08-26-4-a-standalone-job-in-spark.md   |  2 +-
 site/documentation.html |  5 +-
 site/news/index.html| 33 +--
 site/news/spark-0-9-1-released.html |  2 +-
 site/news/spark-0-9-2-released.html |  2 +-
 site/news/spark-1-1-0-released.html |  2 +-
 site/news/spark-1-2-2-released.html |  2 +-
 site/news/spark-and-shark-in-the-news.html  |  2 +-
 .../spark-summit-east-2015-videos-post

[2/3] spark-website git commit: Use site.baseurl, not site.url, to work with Jekyll 3.3. Require Jekyll 3.3. Again commit HTML consistent with Jekyll 3.3 output. Fix date problem with news posts that

2016-11-15 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/news/_posts/2015-10-02-spark-1-5-1-released.md
--
diff --git a/news/_posts/2015-10-02-spark-1-5-1-released.md 
b/news/_posts/2015-10-02-spark-1-5-1-released.md
index f525cbf..d098de6 100644
--- a/news/_posts/2015-10-02-spark-1-5-1-released.md
+++ b/news/_posts/2015-10-02-spark-1-5-1-released.md
@@ -11,6 +11,6 @@ meta:
   _edit_last: '4'
   _wpas_done_all: '1'
 ---
-We are happy to announce the availability of Spark 1.5.1! This maintenance release includes fixes across several 
areas of Spark, including the DataFrame API, Spark Streaming, PySpark, R, Spark 
SQL, and MLlib.
+We are happy to announce the availability of Spark 1.5.1! This maintenance release includes fixes across several 
areas of Spark, including the DataFrame API, Spark Streaming, PySpark, R, Spark 
SQL, and MLlib.
 
-Visit the release notes to read about the new features, or download the release today.
+Visit the release notes to read about the new features, 
or download the release today.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/news/_posts/2015-11-09-spark-1-5-2-released.md
--
diff --git a/news/_posts/2015-11-09-spark-1-5-2-released.md 
b/news/_posts/2015-11-09-spark-1-5-2-released.md
index 21696c5..fbc6c71 100644
--- a/news/_posts/2015-11-09-spark-1-5-2-released.md
+++ b/news/_posts/2015-11-09-spark-1-5-2-released.md
@@ -11,6 +11,6 @@ meta:
   _edit_last: '4'
   _wpas_done_all: '1'
 ---
-We are happy to announce the availability of Spark 1.5.2! This maintenance release includes fixes across several 
areas of Spark, including the DataFrame API, Spark Streaming, PySpark, R, Spark 
SQL, and MLlib.
+We are happy to announce the availability of Spark 1.5.2! This maintenance release includes fixes across several 
areas of Spark, including the DataFrame API, Spark Streaming, PySpark, R, Spark 
SQL, and MLlib.
 
-Visit the release notes to read about the new features, or download the release today.
+Visit the release notes to read about the new features, 
or download the release today.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/news/_posts/2016-01-04-spark-1-6-0-released.md
--
diff --git a/news/_posts/2016-01-04-spark-1-6-0-released.md 
b/news/_posts/2016-01-04-spark-1-6-0-released.md
index 4e47772..b399ade 100644
--- a/news/_posts/2016-01-04-spark-1-6-0-released.md
+++ b/news/_posts/2016-01-04-spark-1-6-0-released.md
@@ -12,9 +12,9 @@ meta:
   _wpas_done_all: '1'
 ---
 We are happy to announce the availability of 
-Spark 1.6.0! 
+Spark 1.6.0! 
 Spark 1.6.0 is the seventh release on the API-compatible 1.X line. 
 With this release the Spark community continues to grow, with contributions 
from 248 developers!
 
-Visit the release notes 
-to read about the new features, or download the release today.
+Visit the release notes 
+to read about the new features, or download the release today.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/news/_posts/2016-03-09-spark-1-6-1-released.md
--
diff --git a/news/_posts/2016-03-09-spark-1-6-1-released.md 
b/news/_posts/2016-03-09-spark-1-6-1-released.md
index adc2735..6e15537 100644
--- a/news/_posts/2016-03-09-spark-1-6-1-released.md
+++ b/news/_posts/2016-03-09-spark-1-6-1-released.md
@@ -11,6 +11,6 @@ meta:
   _edit_last: '4'
   _wpas_done_all: '1'
 ---
-We are happy to announce the availability of Spark 1.6.1! This maintenance release includes fixes across several 
areas of Spark, including signficant updates to the experimental Dataset API.
+We are happy to announce the availability of Spark 1.6.1! This maintenance release includes fixes across several 
areas of Spark, including signficant updates to the experimental Dataset API.
 
-Visit the release notes to read about the new features, or download the release today.
+Visit the release notes to read about the new features, 
or download the release today.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d82e3722/news/_posts/2016-06-25-spark-1-6-2-released.md
--
diff --git a/news/_posts/2016-06-25-spark-1-6-2-released.md 
b/news/_posts/2016-06-25-spark-1-6-2-released.md
index d3d2beb..3c9bbf3 100644
--- a/news/_posts/2016-06-25-spark-1-6-2-released.md
+++ b/news/_posts/2016-06-25-spark-1-6-2-released.md
@@ -11,6 +11,6 @@ meta:
   _edit_last: '4'
   _wpas_done_all: '1'
 ---
-We are happy to announce the availability of Spark 1.6.2! This maintenance release includes fixes across several 
areas of Spark.
+We are happy to announce the availability of Spark 1.6.2! This maintenance release includes fixes across several 
areas of Spark.
 
-Visit the release notes to read about the

[2/2] spark-website git commit: Fix broken link to bootstrap JS

2016-11-15 Thread srowen
Fix broken link to bootstrap JS


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/c693f2a7
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/c693f2a7
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/c693f2a7

Branch: refs/heads/asf-site
Commit: c693f2a7d4489019e2391b0955fee786cff5ee81
Parents: d82e372
Author: Sean Owen 
Authored: Fri Nov 11 19:57:47 2016 +
Committer: Sean Owen 
Committed: Tue Nov 15 19:25:10 2016 +0100

--
 _layouts/global.html| 2 +-
 site/community.html | 2 +-
 site/documentation.html | 2 +-
 site/downloads.html | 2 +-
 site/examples.html  | 2 +-
 site/faq.html   | 2 +-
 site/graphx/index.html  | 2 +-
 site/index.html | 2 +-
 site/mailing-lists.html | 2 +-
 site/mllib/index.html   | 2 +-
 site/news/amp-camp-2013-registration-ope.html   | 2 +-
 site/news/announcing-the-first-spark-summit.html| 2 +-
 site/news/fourth-spark-screencast-published.html| 2 +-
 site/news/index.html| 2 +-
 site/news/nsdi-paper.html   | 2 +-
 site/news/one-month-to-spark-summit-2015.html   | 2 +-
 site/news/proposals-open-for-spark-summit-east.html | 2 +-
 site/news/registration-open-for-spark-summit-east.html  | 2 +-
 site/news/run-spark-and-shark-on-amazon-emr.html| 2 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   | 2 +-
 site/news/spark-0-6-2-released.html | 2 +-
 site/news/spark-0-7-0-released.html | 2 +-
 site/news/spark-0-7-2-released.html | 2 +-
 site/news/spark-0-7-3-released.html | 2 +-
 site/news/spark-0-8-0-released.html | 2 +-
 site/news/spark-0-8-1-released.html | 2 +-
 site/news/spark-0-9-0-released.html | 2 +-
 site/news/spark-0-9-1-released.html | 2 +-
 site/news/spark-0-9-2-released.html | 2 +-
 site/news/spark-1-0-0-released.html | 2 +-
 site/news/spark-1-0-1-released.html | 2 +-
 site/news/spark-1-0-2-released.html | 2 +-
 site/news/spark-1-1-0-released.html | 2 +-
 site/news/spark-1-1-1-released.html | 2 +-
 site/news/spark-1-2-0-released.html | 2 +-
 site/news/spark-1-2-1-released.html | 2 +-
 site/news/spark-1-2-2-released.html | 2 +-
 site/news/spark-1-3-0-released.html | 2 +-
 site/news/spark-1-4-0-released.html | 2 +-
 site/news/spark-1-4-1-released.html | 2 +-
 site/news/spark-1-5-0-released.html | 2 +-
 site/news/spark-1-5-1-released.html | 2 +-
 site/news/spark-1-5-2-released.html | 2 +-
 site/news/spark-1-6-0-released.html | 2 +-
 site/news/spark-1-6-1-released.html | 2 +-
 site/news/spark-1-6-2-released.html | 2 +-
 site/news/spark-1-6-3-released.html | 2 +-
 site/news/spark-2-0-0-released.html | 2 +-
 site/news/spark-2-0-1-released.html | 2 +-
 site/news/spark-2-0-2-released.html | 2 +-
 site/news/spark-2.0.0-preview.html  | 2 +-
 site/news/spark-accepted-into-apache-incubator.html | 2 +-
 site/news/spark-and-shark-in-the-news.html  | 2 +-
 site/news/spark-becomes-tlp.html| 2 +-
 site/news/spark-featured-in-wired.html  | 2 +-
 site/news/spark-mailing-lists-moving-to-apache.html | 2 +-
 site/news/spark-meetups.html| 2 +-
 site/news/spark-screencasts-published.html  | 2 +-
 site/news/spark-summit-2013-is-a-wrap.html  | 2 +-
 site/news/spark-summit-2014-videos-posted.html  | 2 +-
 site/news/spark-summit-2015-videos-posted.html  | 2 +-
 site/news/spark-summit-agenda-posted.html   | 2 +-
 site/news/spark-summit-east-2015-videos-posted.html | 2 +-
 site/news/spark-summit-east-2016-cfp-closing.html   | 2 +-
 site/news/spark-summit-east-age

[1/2] spark-website git commit: Fix broken link to bootstrap JS

2016-11-15 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site d82e37220 -> c693f2a7d


http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-1-5-2.html
--
diff --git a/site/releases/spark-release-1-5-2.html 
b/site/releases/spark-release-1-5-2.html
index 114ce5e..f21f9df 100644
--- a/site/releases/spark-release-1-5-2.html
+++ b/site/releases/spark-release-1-5-2.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-1-6-0.html
--
diff --git a/site/releases/spark-release-1-6-0.html 
b/site/releases/spark-release-1-6-0.html
index 6dcac58..530c28c 100644
--- a/site/releases/spark-release-1-6-0.html
+++ b/site/releases/spark-release-1-6-0.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-1-6-1.html
--
diff --git a/site/releases/spark-release-1-6-1.html 
b/site/releases/spark-release-1-6-1.html
index 339168f..23d4ab4 100644
--- a/site/releases/spark-release-1-6-1.html
+++ b/site/releases/spark-release-1-6-1.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-1-6-2.html
--
diff --git a/site/releases/spark-release-1-6-2.html 
b/site/releases/spark-release-1-6-2.html
index ff8e793..dfbbaf9 100644
--- a/site/releases/spark-release-1-6-2.html
+++ b/site/releases/spark-release-1-6-2.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-1-6-3.html
--
diff --git a/site/releases/spark-release-1-6-3.html 
b/site/releases/spark-release-1-6-3.html
index fc376d3..1650b15 100644
--- a/site/releases/spark-release-1-6-3.html
+++ b/site/releases/spark-release-1-6-3.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-2-0-0.html
--
diff --git a/site/releases/spark-release-2-0-0.html 
b/site/releases/spark-release-2-0-0.html
index d859bb0..e75aae0 100644
--- a/site/releases/spark-release-2-0-0.html
+++ b/site/releases/spark-release-2-0-0.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-2-0-1.html
--
diff --git a/site/releases/spark-release-2-0-1.html 
b/site/releases/spark-release-2-0-1.html
index 61967e5..83df432 100644
--- a/site/releases/spark-release-2-0-1.html
+++ b/site/releases/spark-release-2-0-1.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/releases/spark-release-2-0-2.html
--
diff --git a/site/releases/spark-release-2-0-2.html 
b/site/releases/spark-release-2-0-2.html
index c546a66..a97d849 100644
--- a/site/releases/spark-release-2-0-2.html
+++ b/site/releases/spark-release-2-0-2.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/research.html
--
diff --git a/site/research.html b/site/research.html
index e739466..8833d3e 100644
--- a/site/research.html
+++ b/site/research.html
@@ -54,7 +54,7 @@
 
 
 https://code.jquery.com/jquery.js";>
-
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
 
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/c693f2a7/site/screencasts/1-first-steps-with-spark.html
---

spark git commit: [DOC][MINOR] Kafka doc: breakup into lines

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 74f5c2176 -> 3e01f1282


[DOC][MINOR] Kafka doc: breakup into lines

## Before

![before](https://cloud.githubusercontent.com/assets/15843379/20340231/99b039fe-ac1b-11e6-9ba9-b44582427459.png)

## After

![after](https://cloud.githubusercontent.com/assets/15843379/20340236/9d5796e2-ac1b-11e6-92bb-6da40ba1a383.png)

Author: Liwei Lin 

Closes #15903 from lw-lin/kafka-doc-lines.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3e01f128
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3e01f128
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3e01f128

Branch: refs/heads/master
Commit: 3e01f128284993f39463c0ccd902b774f57cce76
Parents: 74f5c21
Author: Liwei Lin 
Authored: Wed Nov 16 09:51:59 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 09:51:59 2016 +

--
 docs/structured-streaming-kafka-integration.md | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/3e01f128/docs/structured-streaming-kafka-integration.md
--
diff --git a/docs/structured-streaming-kafka-integration.md 
b/docs/structured-streaming-kafka-integration.md
index c4c9fb3..2458bb5 100644
--- a/docs/structured-streaming-kafka-integration.md
+++ b/docs/structured-streaming-kafka-integration.md
@@ -240,6 +240,7 @@ Kafka's own configurations can be set via 
`DataStreamReader.option` with `kafka.
 [Kafka consumer config 
docs](http://kafka.apache.org/documentation.html#newconsumerconfigs).
 
 Note that the following Kafka params cannot be set and the Kafka source will 
throw an exception:
+
 - **group.id**: Kafka source will create a unique group id for each query 
automatically.
 - **auto.offset.reset**: Set the source option `startingOffsets` to specify
  where to start instead. Structured Streaming manages which offsets are 
consumed internally, rather 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [DOC][MINOR] Kafka doc: breakup into lines

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 b18c5a9b9 -> 4567db9da


[DOC][MINOR] Kafka doc: breakup into lines

## Before

![before](https://cloud.githubusercontent.com/assets/15843379/20340231/99b039fe-ac1b-11e6-9ba9-b44582427459.png)

## After

![after](https://cloud.githubusercontent.com/assets/15843379/20340236/9d5796e2-ac1b-11e6-92bb-6da40ba1a383.png)

Author: Liwei Lin 

Closes #15903 from lw-lin/kafka-doc-lines.

(cherry picked from commit 3e01f128284993f39463c0ccd902b774f57cce76)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4567db9d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4567db9d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4567db9d

Branch: refs/heads/branch-2.1
Commit: 4567db9da47f0830e952614393d6105f4f5587a0
Parents: b18c5a9
Author: Liwei Lin 
Authored: Wed Nov 16 09:51:59 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 09:52:09 2016 +

--
 docs/structured-streaming-kafka-integration.md | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4567db9d/docs/structured-streaming-kafka-integration.md
--
diff --git a/docs/structured-streaming-kafka-integration.md 
b/docs/structured-streaming-kafka-integration.md
index c4c9fb3..2458bb5 100644
--- a/docs/structured-streaming-kafka-integration.md
+++ b/docs/structured-streaming-kafka-integration.md
@@ -240,6 +240,7 @@ Kafka's own configurations can be set via 
`DataStreamReader.option` with `kafka.
 [Kafka consumer config 
docs](http://kafka.apache.org/documentation.html#newconsumerconfigs).
 
 Note that the following Kafka params cannot be set and the Kafka source will 
throw an exception:
+
 - **group.id**: Kafka source will create a unique group id for each query 
automatically.
 - **auto.offset.reset**: Set the source option `startingOffsets` to specify
  where to start instead. Structured Streaming manages which offsets are 
consumed internally, rather 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 3e01f1282 -> 43a26899e


[SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

## What changes were proposed in this pull request?

Avoid NPE in KinesisRecordProcessor when shutdown happens without successful 
init
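
For readers skimming the diff below, here is a generic Scala sketch of the defensive pattern the fix applies: tolerate a shutdown callback that fires before initialization ever assigned the shard id. The class and method names are illustrative, not the actual KCL interfaces used by Spark.

```scala
// Sketch only: guard a shutdown callback against never-initialized state.
class GuardedProcessorSketch {
  @volatile private var shardId: String = _   // stays null if initialize() never ran

  def initialize(id: String): Unit = { shardId = id }

  def shutdown(reason: String): Unit = {
    if (shardId == null) {
      // Initialization never completed for this worker; log and skip cleanup
      // instead of dereferencing null.
      println(s"shutdown($reason) called before initialize; nothing to clean up")
    } else {
      println(s"removing checkpointer for shard $shardId (reason: $reason)")
    }
  }
}
```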

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15882 from srowen/SPARK-18400.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/43a26899
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/43a26899
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/43a26899

Branch: refs/heads/master
Commit: 43a26899e5dd2364297eaf8985bd68367e4735a7
Parents: 3e01f12
Author: Sean Owen 
Authored: Wed Nov 16 10:16:36 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:16:36 2016 +

--
 .../kinesis/KinesisRecordProcessor.scala| 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/43a26899/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
--
diff --git 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
index 80e0cce..a0ccd08 100644
--- 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
+++ 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
@@ -27,7 +27,6 @@ import 
com.amazonaws.services.kinesis.clientlibrary.types.ShutdownReason
 import com.amazonaws.services.kinesis.model.Record
 
 import org.apache.spark.internal.Logging
-import org.apache.spark.streaming.Duration
 
 /**
  * Kinesis-specific implementation of the Kinesis Client Library (KCL) 
IRecordProcessor.
@@ -102,27 +101,32 @@ private[kinesis] class 
KinesisRecordProcessor[T](receiver: KinesisReceiver[T], w
* @param checkpointer used to perform a Kinesis checkpoint for 
ShutdownReason.TERMINATE
* @param reason for shutdown (ShutdownReason.TERMINATE or 
ShutdownReason.ZOMBIE)
*/
-  override def shutdown(checkpointer: IRecordProcessorCheckpointer, reason: 
ShutdownReason) {
+  override def shutdown(
+  checkpointer: IRecordProcessorCheckpointer,
+  reason: ShutdownReason): Unit = {
 logInfo(s"Shutdown:  Shutting down workerId $workerId with reason $reason")
-reason match {
-  /*
-   * TERMINATE Use Case.  Checkpoint.
-   * Checkpoint to indicate that all records from the shard have been 
drained and processed.
-   * It's now OK to read from the new shards that resulted from a 
resharding event.
-   */
-  case ShutdownReason.TERMINATE =>
-receiver.removeCheckpointer(shardId, checkpointer)
+// null if not initialized before shutdown:
+if (shardId == null) {
+  logWarning(s"No shardId for workerId $workerId?")
+} else {
+  reason match {
+/*
+ * TERMINATE Use Case.  Checkpoint.
+ * Checkpoint to indicate that all records from the shard have been 
drained and processed.
+ * It's now OK to read from the new shards that resulted from a 
resharding event.
+ */
+case ShutdownReason.TERMINATE => receiver.removeCheckpointer(shardId, 
checkpointer)
 
-  /*
-   * ZOMBIE Use Case or Unknown reason.  NoOp.
-   * No checkpoint because other workers may have taken over and already 
started processing
-   *the same records.
-   * This may lead to records being processed more than once.
-   */
-  case _ =>
-receiver.removeCheckpointer(shardId, null) // return null so that we 
don't checkpoint
+/*
+ * ZOMBIE Use Case or Unknown reason.  NoOp.
+ * No checkpoint because other workers may have taken over and already 
started processing
+ *the same records.
+ * This may lead to records being processed more than once.
+ * Return null so that we don't checkpoint
+ */
+case _ => receiver.removeCheckpointer(shardId, null)
+  }
 }
-
   }
 }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 4567db9da -> a94659cee


[SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

## What changes were proposed in this pull request?

Avoid NPE in KinesisRecordProcessor when shutdown happens without successful 
init

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15882 from srowen/SPARK-18400.

(cherry picked from commit 43a26899e5dd2364297eaf8985bd68367e4735a7)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a94659ce
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a94659ce
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a94659ce

Branch: refs/heads/branch-2.1
Commit: a94659ceeb339a93f72bad3ed059bd2cdfca4df9
Parents: 4567db9
Author: Sean Owen 
Authored: Wed Nov 16 10:16:36 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:16:45 2016 +

--
 .../kinesis/KinesisRecordProcessor.scala| 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a94659ce/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
--
diff --git 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
index 80e0cce..a0ccd08 100644
--- 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
+++ 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
@@ -27,7 +27,6 @@ import 
com.amazonaws.services.kinesis.clientlibrary.types.ShutdownReason
 import com.amazonaws.services.kinesis.model.Record
 
 import org.apache.spark.internal.Logging
-import org.apache.spark.streaming.Duration
 
 /**
  * Kinesis-specific implementation of the Kinesis Client Library (KCL) 
IRecordProcessor.
@@ -102,27 +101,32 @@ private[kinesis] class 
KinesisRecordProcessor[T](receiver: KinesisReceiver[T], w
* @param checkpointer used to perform a Kinesis checkpoint for 
ShutdownReason.TERMINATE
* @param reason for shutdown (ShutdownReason.TERMINATE or 
ShutdownReason.ZOMBIE)
*/
-  override def shutdown(checkpointer: IRecordProcessorCheckpointer, reason: 
ShutdownReason) {
+  override def shutdown(
+  checkpointer: IRecordProcessorCheckpointer,
+  reason: ShutdownReason): Unit = {
 logInfo(s"Shutdown:  Shutting down workerId $workerId with reason $reason")
-reason match {
-  /*
-   * TERMINATE Use Case.  Checkpoint.
-   * Checkpoint to indicate that all records from the shard have been 
drained and processed.
-   * It's now OK to read from the new shards that resulted from a 
resharding event.
-   */
-  case ShutdownReason.TERMINATE =>
-receiver.removeCheckpointer(shardId, checkpointer)
+// null if not initialized before shutdown:
+if (shardId == null) {
+  logWarning(s"No shardId for workerId $workerId?")
+} else {
+  reason match {
+/*
+ * TERMINATE Use Case.  Checkpoint.
+ * Checkpoint to indicate that all records from the shard have been 
drained and processed.
+ * It's now OK to read from the new shards that resulted from a 
resharding event.
+ */
+case ShutdownReason.TERMINATE => receiver.removeCheckpointer(shardId, 
checkpointer)
 
-  /*
-   * ZOMBIE Use Case or Unknown reason.  NoOp.
-   * No checkpoint because other workers may have taken over and already 
started processing
-   *the same records.
-   * This may lead to records being processed more than once.
-   */
-  case _ =>
-receiver.removeCheckpointer(shardId, null) // return null so that we 
don't checkpoint
+/*
+ * ZOMBIE Use Case or Unknown reason.  NoOp.
+ * No checkpoint because other workers may have taken over and already 
started processing
+ *the same records.
+ * This may lead to records being processed more than once.
+ * Return null so that we don't checkpoint
+ */
+case _ => receiver.removeCheckpointer(shardId, null)
+  }
 }
-
   }
 }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 8d55886aa -> 4f3f09696


[SPARK-18400][STREAMING] NPE when resharding Kinesis Stream

## What changes were proposed in this pull request?

Avoid NPE in KinesisRecordProcessor when shutdown happens without successful 
init

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15882 from srowen/SPARK-18400.

(cherry picked from commit 43a26899e5dd2364297eaf8985bd68367e4735a7)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4f3f0969
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4f3f0969
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4f3f0969

Branch: refs/heads/branch-2.0
Commit: 4f3f09696ea12b631e1db8d00baf363292c5f3e3
Parents: 8d55886
Author: Sean Owen 
Authored: Wed Nov 16 10:16:36 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:16:58 2016 +

--
 .../kinesis/KinesisRecordProcessor.scala| 42 +++-
 1 file changed, 23 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4f3f0969/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
--
diff --git 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
index 80e0cce..a0ccd08 100644
--- 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
+++ 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
@@ -27,7 +27,6 @@ import 
com.amazonaws.services.kinesis.clientlibrary.types.ShutdownReason
 import com.amazonaws.services.kinesis.model.Record
 
 import org.apache.spark.internal.Logging
-import org.apache.spark.streaming.Duration
 
 /**
  * Kinesis-specific implementation of the Kinesis Client Library (KCL) 
IRecordProcessor.
@@ -102,27 +101,32 @@ private[kinesis] class 
KinesisRecordProcessor[T](receiver: KinesisReceiver[T], w
* @param checkpointer used to perform a Kinesis checkpoint for 
ShutdownReason.TERMINATE
* @param reason for shutdown (ShutdownReason.TERMINATE or 
ShutdownReason.ZOMBIE)
*/
-  override def shutdown(checkpointer: IRecordProcessorCheckpointer, reason: 
ShutdownReason) {
+  override def shutdown(
+  checkpointer: IRecordProcessorCheckpointer,
+  reason: ShutdownReason): Unit = {
 logInfo(s"Shutdown:  Shutting down workerId $workerId with reason $reason")
-reason match {
-  /*
-   * TERMINATE Use Case.  Checkpoint.
-   * Checkpoint to indicate that all records from the shard have been 
drained and processed.
-   * It's now OK to read from the new shards that resulted from a 
resharding event.
-   */
-  case ShutdownReason.TERMINATE =>
-receiver.removeCheckpointer(shardId, checkpointer)
+// null if not initialized before shutdown:
+if (shardId == null) {
+  logWarning(s"No shardId for workerId $workerId?")
+} else {
+  reason match {
+/*
+ * TERMINATE Use Case.  Checkpoint.
+ * Checkpoint to indicate that all records from the shard have been 
drained and processed.
+ * It's now OK to read from the new shards that resulted from a 
resharding event.
+ */
+case ShutdownReason.TERMINATE => receiver.removeCheckpointer(shardId, 
checkpointer)
 
-  /*
-   * ZOMBIE Use Case or Unknown reason.  NoOp.
-   * No checkpoint because other workers may have taken over and already 
started processing
-   *the same records.
-   * This may lead to records being processed more than once.
-   */
-  case _ =>
-receiver.removeCheckpointer(shardId, null) // return null so that we 
don't checkpoint
+/*
+ * ZOMBIE Use Case or Unknown reason.  NoOp.
+ * No checkpoint because other workers may have taken over and already 
started processing
+ *the same records.
+ * This may lead to records being processed more than once.
+ * Return null so that we don't checkpoint
+ */
+case _ => receiver.removeCheckpointer(shardId, null)
+  }
 }
-
   }
 }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18410][STREAMING] Add structured kafka example

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 43a26899e -> e6145772e


[SPARK-18410][STREAMING] Add structured kafka example

## What changes were proposed in this pull request?

This PR provides Structured Streaming Kafka wordcount examples in Java, Python, and Scala.
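
As a quick orientation before the full Java example in the diff, here is a condensed Scala sketch of the same structured-streaming Kafka wordcount. The broker and topic values are placeholders, and the `spark-sql-kafka-0-10` package is assumed to be on the classpath.

```scala
// Sketch only: running word count over a Kafka topic with Structured Streaming.
import org.apache.spark.sql.SparkSession

object StructuredKafkaWordCountSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("StructuredKafkaWordCountSketch").getOrCreate()
    import spark.implicits._

    // Read each Kafka record's value as a string
    val lines = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")  // placeholder broker
      .option("subscribe", "topic1")                     // placeholder topic
      .load()
      .selectExpr("CAST(value AS STRING)")
      .as[String]

    // Split lines into words and keep a running count per word
    val wordCounts = lines.flatMap(_.split(" ")).groupBy("value").count()

    // Emit the full counts table to the console after each trigger
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```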

## How was this patch tested?

Author: uncleGen 

Closes #15849 from uncleGen/SPARK-18410.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e6145772
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e6145772
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e6145772

Branch: refs/heads/master
Commit: e6145772eda8d6d3727605e80a7c2f182c801003
Parents: 43a2689
Author: uncleGen 
Authored: Wed Nov 16 10:19:10 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:19:10 2016 +

--
 .../streaming/JavaStructuredKafkaWordCount.java | 96 
 .../sql/streaming/structured_kafka_wordcount.py | 90 ++
 .../streaming/StructuredKafkaWordCount.scala| 85 +
 3 files changed, 271 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e6145772/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
--
diff --git 
a/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
 
b/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
new file mode 100644
index 000..0f45cfe
--- /dev/null
+++ 
b/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.sql.streaming;
+
+import org.apache.spark.api.java.function.FlatMapFunction;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.sql.streaming.StreamingQuery;
+
+import java.util.Arrays;
+import java.util.Iterator;
+
+/**
+ * Consumes messages from one or more topics in Kafka and does wordcount.
+ * Usage: JavaStructuredKafkaWordCount <bootstrap-servers> <subscribe-type> <topics>
+ *   <bootstrap-servers> The Kafka "bootstrap.servers" configuration. A
+ *   comma-separated list of host:port.
+ *   <subscribe-type> There are three kinds of type, i.e. 'assign', 'subscribe',
+ *   'subscribePattern'.
+ *   |- <assign> Specific TopicPartitions to consume. Json string
+ *   |  {"topicA":[0,1],"topicB":[2,4]}.
+ *   |- <subscribe> The topic list to subscribe. A comma-separated list of
+ *   |  topics.
+ *   |- <subscribePattern> The pattern used to subscribe to topic(s).
+ *   |  Java regex string.
+ *   |- Only one of "assign, "subscribe" or "subscribePattern" options can be
+ *   |  specified for Kafka source.
+ *   <topics> Different value format depends on the value of 'subscribe-type'.
+ *
+ * Example:
+ *`$ bin/run-example \
+ *  sql.streaming.JavaStructuredKafkaWordCount host1:port1,host2:port2 \
+ *  subscribe topic1,topic2`
+ */
+public final class JavaStructuredKafkaWordCount {
+
+  public static void main(String[] args) throws Exception {
+if (args.length < 3) {
+  System.err.println("Usage: JavaStructuredKafkaWordCount <bootstrap-servers> " +
+    "<subscribe-type> <topics>");
+  System.exit(1);
+}
+
+String bootstrapServers = args[0];
+String subscribeType = args[1];
+String topics = args[2];
+
+SparkSession spark = SparkSession
+  .builder()
+  .appName("JavaStructuredKafkaWordCount")
+  .getOrCreate();
+
+// Create DataSet representing the stream of input lines from kafka
+Dataset<String> lines = spark
+  .readStream()
+  .format("kafka")
+  .option("kafka.bootstrap.servers", bootstrapServers)
+  .option(subscribeType, topics)
+  .load()
+  .selectExpr("CAST(value AS STRING)")
+  .as(Encoders.STRING());
+
+// Generate running word count
+Dataset<Row> wordCounts = lines.flatMap(new FlatMapFunction<String, String>() {
+  @Override
+  public Iterator<String> call(String x) {
+return Arrays.asList(x.split(" ")).iterator();
+

spark git commit: [SPARK-18410][STREAMING] Add structured kafka example

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 a94659cee -> 6b2301b89


[SPARK-18410][STREAMING] Add structured kafka example

## What changes were proposed in this pull request?

This PR provides Structured Streaming Kafka wordcount examples in Java, Python, and Scala.

## How was this patch tested?

Author: uncleGen 

Closes #15849 from uncleGen/SPARK-18410.

(cherry picked from commit e6145772eda8d6d3727605e80a7c2f182c801003)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6b2301b8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6b2301b8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6b2301b8

Branch: refs/heads/branch-2.1
Commit: 6b2301b89bf5a89bd2b8a3d85c9c05a490be2ddb
Parents: a94659c
Author: uncleGen 
Authored: Wed Nov 16 10:19:10 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:19:18 2016 +

--
 .../streaming/JavaStructuredKafkaWordCount.java | 96 
 .../sql/streaming/structured_kafka_wordcount.py | 90 ++
 .../streaming/StructuredKafkaWordCount.scala| 85 +
 3 files changed, 271 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6b2301b8/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
--
diff --git 
a/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
 
b/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
new file mode 100644
index 000..0f45cfe
--- /dev/null
+++ 
b/examples/src/main/java/org/apache/spark/examples/sql/streaming/JavaStructuredKafkaWordCount.java
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.examples.sql.streaming;
+
+import org.apache.spark.api.java.function.FlatMapFunction;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.Row;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.sql.streaming.StreamingQuery;
+
+import java.util.Arrays;
+import java.util.Iterator;
+
+/**
+ * Consumes messages from one or more topics in Kafka and does wordcount.
+ * Usage: JavaStructuredKafkaWordCount <bootstrap-servers> <subscribe-type> <topics>
+ *   <bootstrap-servers> The Kafka "bootstrap.servers" configuration. A
+ *   comma-separated list of host:port.
+ *   <subscribe-type> There are three kinds of type, i.e. 'assign', 'subscribe',
+ *   'subscribePattern'.
+ *   |- <assign> Specific TopicPartitions to consume. Json string
+ *   |  {"topicA":[0,1],"topicB":[2,4]}.
+ *   |- <subscribe> The topic list to subscribe. A comma-separated list of
+ *   |  topics.
+ *   |- <subscribePattern> The pattern used to subscribe to topic(s).
+ *   |  Java regex string.
+ *   |- Only one of "assign", "subscribe" or "subscribePattern" options can be
+ *   |  specified for Kafka source.
+ *   <topics> Different value format depends on the value of 'subscribe-type'.
+ *
+ * Example:
+ *`$ bin/run-example \
+ *  sql.streaming.JavaStructuredKafkaWordCount host1:port1,host2:port2 \
+ *  subscribe topic1,topic2`
+ */
+public final class JavaStructuredKafkaWordCount {
+
+  public static void main(String[] args) throws Exception {
+if (args.length < 3) {
+  System.err.println("Usage: JavaStructuredKafkaWordCount <bootstrap-servers> " +
+    "<subscribe-type> <topics>");
+  System.exit(1);
+}
+
+String bootstrapServers = args[0];
+String subscribeType = args[1];
+String topics = args[2];
+
+SparkSession spark = SparkSession
+  .builder()
+  .appName("JavaStructuredKafkaWordCount")
+  .getOrCreate();
+
+// Create DataSet representing the stream of input lines from kafka
+Dataset<String> lines = spark
+  .readStream()
+  .format("kafka")
+  .option("kafka.bootstrap.servers", bootstrapServers)
+  .option(subscribeType, topics)
+  .load()
+  .selectExpr("CAST(value AS STRING)")
+  .as(Encoders.STRING());
+
+// Generate running word count
+Dataset<Row> wordCounts = lines.flatMap(new FlatMapFunction<String, String>() {
+  @Ov
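
The Java listing above is cut off here by the archive. As a reference point, a minimal sketch of how the remaining word-count and output steps typically look, written against the Scala counterpart this commit also adds (StructuredKafkaWordCount.scala); the object name and argument handling are illustrative, and the spark-sql-kafka-0-10 package is assumed to be on the classpath:

```
import org.apache.spark.sql.SparkSession

object StructuredKafkaWordCountSketch {
  def main(args: Array[String]): Unit = {
    // bootstrap servers, subscribe type and topics, as in the usage string above
    val Array(bootstrapServers, subscribeType, topics) = args

    val spark = SparkSession.builder
      .appName("StructuredKafkaWordCount")
      .getOrCreate()
    import spark.implicits._

    // Read each Kafka record's value as a string
    val lines = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      .option(subscribeType, topics)
      .load()
      .selectExpr("CAST(value AS STRING)")
      .as[String]

    // Split lines into words and keep a running count per word
    val wordCounts = lines.flatMap(_.split(" ")).groupBy("value").count()

    // Print the complete counts table to the console after every trigger
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```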

spark git commit: [MINOR][DOC] Fix typos in the 'configuration', 'monitoring' and 'sql-programming-guide' documentation

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master e6145772e -> 241e04bc0


[MINOR][DOC] Fix typos in the 'configuration', 'monitoring' and 
'sql-programming-guide' documentation

## What changes were proposed in this pull request?

Fix typos in the 'configuration', 'monitoring' and 'sql-programming-guide' 
documentation.

## How was this patch tested?
Manually.

Author: Weiqing Yang 

Closes #15886 from weiqingy/fixTypo.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/241e04bc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/241e04bc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/241e04bc

Branch: refs/heads/master
Commit: 241e04bc03efb1379622c0c84299e617512973ac
Parents: e614577
Author: Weiqing Yang 
Authored: Wed Nov 16 10:34:56 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:34:56 2016 +

--
 docs/configuration.md | 2 +-
 docs/monitoring.md| 2 +-
 docs/sql-programming-guide.md | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/241e04bc/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index ea99592..c021a37 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1951,7 +1951,7 @@ showDF(properties, numRows = 200, truncate = FALSE)
   spark.r.heartBeatInterval
   100
   
-Interval for heartbeats sents from SparkR backend to R process to prevent 
connection timeout.
+Interval for heartbeats sent from SparkR backend to R process to prevent 
connection timeout.
   
 
 

http://git-wip-us.apache.org/repos/asf/spark/blob/241e04bc/docs/monitoring.md
--
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 5bc5e18..2eef456 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -41,7 +41,7 @@ directory must be supplied in the 
`spark.history.fs.logDirectory` configuration
 and should contain sub-directories that each represents an application's event 
logs.
 
 The spark jobs themselves must be configured to log events, and to log them to 
the same shared,
-writeable directory. For example, if the server was configured with a log 
directory of
+writable directory. For example, if the server was configured with a log 
directory of
 `hdfs://namenode/shared/spark-logs`, then the client-side options would be:
 
 ```

http://git-wip-us.apache.org/repos/asf/spark/blob/241e04bc/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index b9be7a7..ba3e55f 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -222,9 +222,9 @@ The `sql` function enables applications to run SQL queries 
programmatically and
 
 ## Global Temporary View
 
-Temporay views in Spark SQL are session-scoped and will disappear if the 
session that creates it
+Temporary views in Spark SQL are session-scoped and will disappear if the 
session that creates it
 terminates. If you want to have a temporary view that is shared among all 
sessions and keep alive
-until the Spark application terminiates, you can create a global temporary 
view. Global temporary
+until the Spark application terminates, you can create a global temporary 
view. Global temporary
 view is tied to a system preserved database `global_temp`, and we must use the 
qualified name to
 refer it, e.g. `SELECT * FROM global_temp.view1`.
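
As a quick illustration of the view semantics described above, a minimal sketch assuming Spark 2.1+, where global temporary views are available; the view names are illustrative:

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("GlobalTempViewExample").master("local[*]").getOrCreate()
val df = spark.range(5).toDF("id")

// Session-scoped: visible only in this session, dropped when it terminates
df.createOrReplaceTempView("view1")

// Global: shared across sessions, kept until the application terminates,
// and always qualified with the `global_temp` database
df.createGlobalTempView("gview1")
spark.newSession().sql("SELECT * FROM global_temp.gview1").show()
```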
 
@@ -1029,7 +1029,7 @@ following command:
 bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars 
postgresql-9.4.1207.jar
 {% endhighlight %}
 
-Tables from the remote database can be loaded as a DataFrame or Spark SQL 
Temporary table using
+Tables from the remote database can be loaded as a DataFrame or Spark SQL 
temporary view using
 the Data Sources API. Users can specify the JDBC connection properties in the 
data source options.
 user and password are normally provided as 
connection properties for
 logging into the data sources. In addition to the connection properties, Spark 
also supports
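
A minimal sketch of the JDBC load described in the quoted guide text; the connection URL, table name and credentials are placeholders, and the JDBC driver jar is assumed to be on the classpath as in the spark-shell command above:

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("JdbcLoadExample").getOrCreate()

val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://dbhost:5432/mydb")   // placeholder URL
  .option("dbtable", "public.my_table")                  // placeholder table
  .option("user", "username")
  .option("password", "password")
  .load()

// Expose the table to SQL as a temporary view, as the guide describes
jdbcDF.createOrReplaceTempView("my_table")
spark.sql("SELECT COUNT(*) FROM my_table").show()
```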


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [MINOR][DOC] Fix typos in the 'configuration', 'monitoring' and 'sql-programming-guide' documentation

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 6b2301b89 -> 820847008


[MINOR][DOC] Fix typos in the 'configuration', 'monitoring' and 
'sql-programming-guide' documentation

## What changes were proposed in this pull request?

Fix typos in the 'configuration', 'monitoring' and 'sql-programming-guide' 
documentation.

## How was this patch tested?
Manually.

Author: Weiqing Yang 

Closes #15886 from weiqingy/fixTypo.

(cherry picked from commit 241e04bc03efb1379622c0c84299e617512973ac)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/82084700
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/82084700
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/82084700

Branch: refs/heads/branch-2.1
Commit: 8208470084153f0be6818f66309f63dcdcb16519
Parents: 6b2301b
Author: Weiqing Yang 
Authored: Wed Nov 16 10:34:56 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:35:05 2016 +

--
 docs/configuration.md | 2 +-
 docs/monitoring.md| 2 +-
 docs/sql-programming-guide.md | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/82084700/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index d0acd94..e0c6613 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1916,7 +1916,7 @@ showDF(properties, numRows = 200, truncate = FALSE)
   spark.r.heartBeatInterval
   100
   
-Interval for heartbeats sents from SparkR backend to R process to prevent 
connection timeout.
+Interval for heartbeats sent from SparkR backend to R process to prevent 
connection timeout.
   
 
 

http://git-wip-us.apache.org/repos/asf/spark/blob/82084700/docs/monitoring.md
--
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 5bc5e18..2eef456 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -41,7 +41,7 @@ directory must be supplied in the 
`spark.history.fs.logDirectory` configuration
 and should contain sub-directories that each represents an application's event 
logs.
 
 The spark jobs themselves must be configured to log events, and to log them to 
the same shared,
-writeable directory. For example, if the server was configured with a log 
directory of
+writable directory. For example, if the server was configured with a log 
directory of
 `hdfs://namenode/shared/spark-logs`, then the client-side options would be:
 
 ```

http://git-wip-us.apache.org/repos/asf/spark/blob/82084700/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index b9be7a7..ba3e55f 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -222,9 +222,9 @@ The `sql` function enables applications to run SQL queries 
programmatically and
 
 ## Global Temporary View
 
-Temporay views in Spark SQL are session-scoped and will disappear if the 
session that creates it
+Temporary views in Spark SQL are session-scoped and will disappear if the 
session that creates it
 terminates. If you want to have a temporary view that is shared among all 
sessions and keep alive
-until the Spark application terminiates, you can create a global temporary 
view. Global temporary
+until the Spark application terminates, you can create a global temporary 
view. Global temporary
 view is tied to a system preserved database `global_temp`, and we must use the 
qualified name to
 refer it, e.g. `SELECT * FROM global_temp.view1`.
 
@@ -1029,7 +1029,7 @@ following command:
 bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars 
postgresql-9.4.1207.jar
 {% endhighlight %}
 
-Tables from the remote database can be loaded as a DataFrame or Spark SQL 
Temporary table using
+Tables from the remote database can be loaded as a DataFrame or Spark SQL 
temporary view using
 the Data Sources API. Users can specify the JDBC connection properties in the 
data source options.
 user and password are normally provided as 
connection properties for
 logging into the data sources. In addition to the connection properties, Spark 
also supports


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18446][ML][DOCS] Add links to API docs for ML algos

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master c68f1a38a -> a75e3fe92


[SPARK-18446][ML][DOCS] Add links to API docs for ML algos

## What changes were proposed in this pull request?
Add links to API docs for ML algos
## How was this patch tested?
Manual checking for the API links

Author: Zheng RuiFeng 

Closes #15890 from zhengruifeng/algo_link.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a75e3fe9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a75e3fe9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a75e3fe9

Branch: refs/heads/master
Commit: a75e3fe923372c56bc1b2f4baeaaf5868ad28341
Parents: c68f1a3
Author: Zheng RuiFeng 
Authored: Wed Nov 16 10:53:23 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:53:23 2016 +

--
 docs/ml-classification-regression.md | 39 +++
 docs/ml-pipeline.md  | 25 
 docs/ml-tuning.md| 17 ++
 3 files changed, 81 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a75e3fe9/docs/ml-classification-regression.md
--
diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index b10793d..1aacc3e 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -55,14 +55,23 @@ $\alpha$ and `regParam` corresponds to $\lambda$.
 
 
 
+
+More details on parameters can be found in the [Scala API 
documentation](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression).
+
 {% include_example 
scala/org/apache/spark/examples/ml/LogisticRegressionWithElasticNetExample.scala
 %}
 
 
 
+
+More details on parameters can be found in the [Java API 
documentation](api/java/org/apache/spark/ml/classification/LogisticRegression.html).
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaLogisticRegressionWithElasticNetExample.java
 %}
 
 
 
+
+More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegression).
+
 {% include_example python/ml/logistic_regression_with_elastic_net.py %}
 
 
@@ -289,14 +298,23 @@ MLPC employs backpropagation for learning the model. We 
use the logistic loss fu
 
 
 
+
+Refer to the [Scala API 
docs](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier)
 for more details.
+
 {% include_example 
scala/org/apache/spark/examples/ml/MultilayerPerceptronClassifierExample.scala 
%}
 
 
 
+
+Refer to the [Java API 
docs](api/java/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.html)
 for more details.
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaMultilayerPerceptronClassifierExample.java
 %}
 
 
 
+
+Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.classification.MultilayerPerceptronClassifier)
 for more details.
+
 {% include_example python/ml/multilayer_perceptron_classification.py %}
 
 
@@ -392,15 +410,24 @@ regression model and extracting model summary statistics.
 
 
 
+
+More details on parameters can be found in the [Scala API 
documentation](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression).
+
 {% include_example 
scala/org/apache/spark/examples/ml/LinearRegressionWithElasticNetExample.scala 
%}
 
 
 
+
+More details on parameters can be found in the [Java API 
documentation](api/java/org/apache/spark/ml/regression/LinearRegression.html).
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaLinearRegressionWithElasticNetExample.java
 %}
 
 
 
 
+
+More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.regression.LinearRegression).
+
 {% include_example python/ml/linear_regression_with_elastic_net.py %}
 
 
@@ -519,18 +546,21 @@ function and extracting model summary statistics.
 
 
 
+
 Refer to the [Scala API 
docs](api/scala/index.html#org.apache.spark.ml.regression.GeneralizedLinearRegression)
 for more details.
 
 {% include_example 
scala/org/apache/spark/examples/ml/GeneralizedLinearRegressionExample.scala %}
 
 
 
+
 Refer to the [Java API 
docs](api/java/org/apache/spark/ml/regression/GeneralizedLinearRegression.html) 
for more details.
 
 {% include_example 
java/org/apache/spark/examples/ml/JavaGeneralizedLinearRegressionExample.java %}
 
 
 
+
 Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.regression.GeneralizedLinearRegression)
 for more details.
 
 {% include_example python/ml/generalized_linear_regression_example.py %}
@@ -705,14 +735,23 @@ The implementation matches the result from R's survival 
function
 
 
 
+
+Refer to the [Scala API 
docs](api/scala/index.html#org.a

spark git commit: [SPARK-18446][ML][DOCS] Add links to API docs for ML algos

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 6b6eb4e52 -> 416bc3dd3


[SPARK-18446][ML][DOCS] Add links to API docs for ML algos

## What changes were proposed in this pull request?
Add links to API docs for ML algos
## How was this patch tested?
Manual checking for the API links

Author: Zheng RuiFeng 

Closes #15890 from zhengruifeng/algo_link.

(cherry picked from commit a75e3fe923372c56bc1b2f4baeaaf5868ad28341)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/416bc3dd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/416bc3dd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/416bc3dd

Branch: refs/heads/branch-2.1
Commit: 416bc3dd3db7f7ae2cc7b3ffe395decd0c5b73f9
Parents: 6b6eb4e
Author: Zheng RuiFeng 
Authored: Wed Nov 16 10:53:23 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 10:53:32 2016 +

--
 docs/ml-classification-regression.md | 39 +++
 docs/ml-pipeline.md  | 25 
 docs/ml-tuning.md| 17 ++
 3 files changed, 81 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/416bc3dd/docs/ml-classification-regression.md
--
diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index bb2e404..cb2ccbf 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -55,14 +55,23 @@ $\alpha$ and `regParam` corresponds to $\lambda$.
 
 
 
+
+More details on parameters can be found in the [Scala API 
documentation](api/scala/index.html#org.apache.spark.ml.classification.LogisticRegression).
+
 {% include_example 
scala/org/apache/spark/examples/ml/LogisticRegressionWithElasticNetExample.scala
 %}
 
 
 
+
+More details on parameters can be found in the [Java API 
documentation](api/java/org/apache/spark/ml/classification/LogisticRegression.html).
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaLogisticRegressionWithElasticNetExample.java
 %}
 
 
 
+
+More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.classification.LogisticRegression).
+
 {% include_example python/ml/logistic_regression_with_elastic_net.py %}
 
 
@@ -289,14 +298,23 @@ MLPC employs backpropagation for learning the model. We 
use the logistic loss fu
 
 
 
+
+Refer to the [Scala API 
docs](api/scala/index.html#org.apache.spark.ml.classification.MultilayerPerceptronClassifier)
 for more details.
+
 {% include_example 
scala/org/apache/spark/examples/ml/MultilayerPerceptronClassifierExample.scala 
%}
 
 
 
+
+Refer to the [Java API 
docs](api/java/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.html)
 for more details.
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaMultilayerPerceptronClassifierExample.java
 %}
 
 
 
+
+Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.classification.MultilayerPerceptronClassifier)
 for more details.
+
 {% include_example python/ml/multilayer_perceptron_classification.py %}
 
 
@@ -392,15 +410,24 @@ regression model and extracting model summary statistics.
 
 
 
+
+More details on parameters can be found in the [Scala API 
documentation](api/scala/index.html#org.apache.spark.ml.regression.LinearRegression).
+
 {% include_example 
scala/org/apache/spark/examples/ml/LinearRegressionWithElasticNetExample.scala 
%}
 
 
 
+
+More details on parameters can be found in the [Java API 
documentation](api/java/org/apache/spark/ml/regression/LinearRegression.html).
+
 {% include_example 
java/org/apache/spark/examples/ml/JavaLinearRegressionWithElasticNetExample.java
 %}
 
 
 
 
+
+More details on parameters can be found in the [Python API 
documentation](api/python/pyspark.ml.html#pyspark.ml.regression.LinearRegression).
+
 {% include_example python/ml/linear_regression_with_elastic_net.py %}
 
 
@@ -519,18 +546,21 @@ function and extracting model summary statistics.
 
 
 
+
 Refer to the [Scala API 
docs](api/scala/index.html#org.apache.spark.ml.regression.GeneralizedLinearRegression)
 for more details.
 
 {% include_example 
scala/org/apache/spark/examples/ml/GeneralizedLinearRegressionExample.scala %}
 
 
 
+
 Refer to the [Java API 
docs](api/java/org/apache/spark/ml/regression/GeneralizedLinearRegression.html) 
for more details.
 
 {% include_example 
java/org/apache/spark/examples/ml/JavaGeneralizedLinearRegressionExample.java %}
 
 
 
+
 Refer to the [Python API 
docs](api/python/pyspark.ml.html#pyspark.ml.regression.GeneralizedLinearRegression)
 for more details.
 
 {% include_example python/ml/generalized_linear_regression_example.py %}
@@ -705,14 +735,23 @@ The implementation matches t

spark git commit: [SPARK-18420][BUILD] Fix the errors caused by lint check in Java

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master a75e3fe92 -> 7569cf6cb


[SPARK-18420][BUILD] Fix the errors caused by lint check in Java

## What changes were proposed in this pull request?

A small fix for the errors reported by the lint check in Java:

- Clear unused objects and `UnusedImports`.
- Add comments around the method `finalize` of `NioBufferedFileInputStream` to turn off checkstyle.
- Cut the line which is longer than 100 characters into two lines.

## How was this patch tested?
Travis CI.
```
$ build/mvn -T 4 -q -DskipTests -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive 
-Phive-thriftserver install
$ dev/lint-java
```
Before:
```
Checkstyle checks failed at following occurrences:
[ERROR] src/main/java/org/apache/spark/network/util/TransportConf.java:[21,8] 
(imports) UnusedImports: Unused import - 
org.apache.commons.crypto.cipher.CryptoCipherFactory.
[ERROR] src/test/java/org/apache/spark/network/sasl/SparkSaslSuite.java:[516,5] 
(modifier) RedundantModifier: Redundant 'public' modifier.
[ERROR] src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java:[133] 
(coding) NoFinalizer: Avoid using finalizer method.
[ERROR] 
src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeMapData.java:[71] 
(sizes) LineLength: Line is longer than 100 characters (found 113).
[ERROR] 
src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java:[112]
 (sizes) LineLength: Line is longer than 100 characters (found 110).
[ERROR] 
src/test/java/org/apache/spark/sql/catalyst/expressions/HiveHasherSuite.java:[31,17]
 (modifier) ModifierOrder: 'static' modifier out of order with the JLS 
suggestions.
[ERROR]src/main/java/org/apache/spark/examples/ml/JavaLogisticRegressionWithElasticNetExample.java:[64]
 (sizes) LineLength: Line is longer than 100 characters (found 103).
[ERROR] 
src/main/java/org/apache/spark/examples/ml/JavaInteractionExample.java:[22,8] 
(imports) UnusedImports: Unused import - org.apache.spark.ml.linalg.Vectors.
[ERROR] 
src/main/java/org/apache/spark/examples/ml/JavaInteractionExample.java:[51] 
(regexp) RegexpSingleline: No trailing whitespace allowed.
```

After:
```
$ build/mvn -T 4 -q -DskipTests -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive 
-Phive-thriftserver install
$ dev/lint-java
Using `mvn` from path: 
/home/travis/build/ConeyLiu/spark/build/apache-maven-3.3.9/bin/mvn
Checkstyle checks passed.
```

Author: Xianyang Liu 

Closes #15865 from ConeyLiu/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7569cf6c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7569cf6c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7569cf6c

Branch: refs/heads/master
Commit: 7569cf6cb85bda7d0e76d3e75e286d4796e77e08
Parents: a75e3fe
Author: Xianyang Liu 
Authored: Wed Nov 16 11:59:00 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 11:59:00 2016 +

--
 .../org/apache/spark/network/util/TransportConf.java |  1 -
 .../apache/spark/network/sasl/SparkSaslSuite.java|  2 +-
 .../apache/spark/io/NioBufferedFileInputStream.java  |  2 ++
 dev/checkstyle.xml   | 15 +++
 .../spark/examples/ml/JavaInteractionExample.java|  3 +--
 .../JavaLogisticRegressionWithElasticNetExample.java |  4 ++--
 .../sql/catalyst/expressions/UnsafeArrayData.java|  3 ++-
 .../sql/catalyst/expressions/UnsafeMapData.java  |  3 ++-
 .../sql/catalyst/expressions/HiveHasherSuite.java|  1 -
 9 files changed, 25 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7569cf6c/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
--
diff --git 
a/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
 
b/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
index d0d0728..012bb09 100644
--- 
a/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
+++ 
b/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java
@@ -18,7 +18,6 @@
 package org.apache.spark.network.util;
 
 import com.google.common.primitives.Ints;
-import org.apache.commons.crypto.cipher.CryptoCipherFactory;
 
 /**
  * A central location that tracks all the settings we expose to users.

http://git-wip-us.apache.org/repos/asf/spark/blob/7569cf6c/common/network-common/src/test/java/org/apache/spark/network/sasl/SparkSaslSuite.java
--
diff --git 
a/common/network-common/src/test/java/org/apache/spark/network/sasl/SparkSaslSuite.java
 
b/common/network-common/src/test/java/org/apache/spark/network/sasl/SparkSaslSuite.java
index 4e6146c..ef2ab34 100644
--- 
a/co

spark git commit: [SPARK-18420][BUILD] Fix the errors caused by lint check in Java

2016-11-16 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 416bc3dd3 -> b0ae87123


[SPARK-18420][BUILD] Fix the errors caused by lint check in Java

A small fix for the errors reported by the lint check in Java:

- Clear unused objects and `UnusedImports`.
- Add comments around the method `finalize` of `NioBufferedFileInputStream` to turn off checkstyle.
- Cut the line which is longer than 100 characters into two lines.

Travis CI.
```
$ build/mvn -T 4 -q -DskipTests -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive 
-Phive-thriftserver install
$ dev/lint-java
```
Before:
```
Checkstyle checks failed at following occurrences:
[ERROR] src/main/java/org/apache/spark/network/util/TransportConf.java:[21,8] 
(imports) UnusedImports: Unused import - 
org.apache.commons.crypto.cipher.CryptoCipherFactory.
[ERROR] src/test/java/org/apache/spark/network/sasl/SparkSaslSuite.java:[516,5] 
(modifier) RedundantModifier: Redundant 'public' modifier.
[ERROR] src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java:[133] 
(coding) NoFinalizer: Avoid using finalizer method.
[ERROR] 
src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeMapData.java:[71] 
(sizes) LineLength: Line is longer than 100 characters (found 113).
[ERROR] 
src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java:[112]
 (sizes) LineLength: Line is longer than 100 characters (found 110).
[ERROR] 
src/test/java/org/apache/spark/sql/catalyst/expressions/HiveHasherSuite.java:[31,17]
 (modifier) ModifierOrder: 'static' modifier out of order with the JLS 
suggestions.
[ERROR]src/main/java/org/apache/spark/examples/ml/JavaLogisticRegressionWithElasticNetExample.java:[64]
 (sizes) LineLength: Line is longer than 100 characters (found 103).
[ERROR] 
src/main/java/org/apache/spark/examples/ml/JavaInteractionExample.java:[22,8] 
(imports) UnusedImports: Unused import - org.apache.spark.ml.linalg.Vectors.
[ERROR] 
src/main/java/org/apache/spark/examples/ml/JavaInteractionExample.java:[51] 
(regexp) RegexpSingleline: No trailing whitespace allowed.
```

After:
```
$ build/mvn -T 4 -q -DskipTests -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive 
-Phive-thriftserver install
$ dev/lint-java
Using `mvn` from path: 
/home/travis/build/ConeyLiu/spark/build/apache-maven-3.3.9/bin/mvn
Checkstyle checks passed.
```

Author: Xianyang Liu 

Closes #15865 from ConeyLiu/master.

(cherry picked from commit 7569cf6cb85bda7d0e76d3e75e286d4796e77e08)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b0ae8712
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b0ae8712
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b0ae8712

Branch: refs/heads/branch-2.1
Commit: b0ae8712358fc8c07aa5efe4d0bd337e7e452078
Parents: 416bc3d
Author: Xianyang Liu 
Authored: Wed Nov 16 11:59:00 2016 +
Committer: Sean Owen 
Committed: Wed Nov 16 12:45:57 2016 +

--
 .../apache/spark/io/NioBufferedFileInputStream.java  |  2 ++
 dev/checkstyle.xml   | 15 +++
 .../spark/examples/ml/JavaInteractionExample.java|  3 +--
 .../JavaLogisticRegressionWithElasticNetExample.java |  4 ++--
 .../sql/catalyst/expressions/UnsafeArrayData.java|  3 ++-
 .../sql/catalyst/expressions/UnsafeMapData.java  |  3 ++-
 .../sql/catalyst/expressions/HiveHasherSuite.java|  1 -
 7 files changed, 24 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b0ae8712/core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java
--
diff --git 
a/core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java 
b/core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java
index f6d1288..ea5f1a9 100644
--- a/core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java
+++ b/core/src/main/java/org/apache/spark/io/NioBufferedFileInputStream.java
@@ -130,8 +130,10 @@ public final class NioBufferedFileInputStream extends 
InputStream {
 StorageUtils.dispose(byteBuffer);
   }
 
+  //checkstyle.off: NoFinalizer
   @Override
   protected void finalize() throws IOException {
 close();
   }
+  //checkstyle.on: NoFinalizer
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b0ae8712/dev/checkstyle.xml
--
diff --git a/dev/checkstyle.xml b/dev/checkstyle.xml
index 3de6aa9..92c5251 100644
--- a/dev/checkstyle.xml
+++ b/dev/checkstyle.xml
@@ -52,6 +52,20 @@
   
 
 
+
+
+
+
+
+
+
 
 
 
@@ -168,5 +182,6 @@
 
 
 
+
 
 

http://git-wip-us.apache.org/repos/asf/spark/blob/b0ae8712/examples/src/main/java/org/apache/spark/examples/ml/JavaInteracti

spark git commit: [YARN][DOC] Remove non-Yarn specific configurations from running-on-yarn.md

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 014fceee0 -> 2ee4fc889


[YARN][DOC] Remove non-Yarn specific configurations from running-on-yarn.md

## What changes were proposed in this pull request?

Remove `spark.driver.memory`, `spark.executor.memory`,  `spark.driver.cores`, 
and `spark.executor.cores` from `running-on-yarn.md` as they are not 
Yarn-specific, and they are also defined in `configuration.md`.

## How was this patch tested?
Build passed & manually checked.

Author: Weiqing Yang 

Closes #15869 from weiqingy/yarnDoc.

(cherry picked from commit a3cac7bd86a6fe8e9b42da1bf580aaeb59378304)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2ee4fc88
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2ee4fc88
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2ee4fc88

Branch: refs/heads/branch-2.1
Commit: 2ee4fc8891be53b2fae43faa5cd09ade32173bba
Parents: 014fcee
Author: Weiqing Yang 
Authored: Thu Nov 17 11:13:22 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 11:13:30 2016 +

--
 docs/running-on-yarn.md | 36 
 1 file changed, 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2ee4fc88/docs/running-on-yarn.md
--
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index fe0221c..4d1fafc 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -118,28 +118,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.driver.memory
-  1g
-  
-Amount of memory to use for the driver process, i.e. where SparkContext is 
initialized.
-(e.g. 1g, 2g).
-
-Note: In client mode, this config must not be set through 
the SparkConf
-directly in your application, because the driver JVM has already started 
at that point.
-Instead, please set this through the --driver-memory command 
line option
-or in your default properties file.
-  
-
-
-  spark.driver.cores
-  1
-  
-Number of cores used by the driver in YARN cluster mode.
-Since the driver is run in the same JVM as the YARN Application Master in 
cluster mode, this also controls the cores used by the YARN Application Master.
-In client mode, use spark.yarn.am.cores to control the number 
of cores used by the YARN Application Master instead.
-  
-
-
   spark.yarn.am.cores
   1
   
@@ -234,13 +212,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.executor.cores
-  1 in YARN mode, all the available cores on the worker in standalone 
mode.
-  
-The number of cores to use on each executor. For YARN and standalone mode 
only.
-  
-
-
  spark.executor.instances
   2
   
@@ -248,13 +219,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.executor.memory
-  1g
-  
-Amount of memory to use per executor process (e.g. 2g, 
8g).
-  
-
-
  spark.yarn.executor.memoryOverhead
   executorMemory * 0.10, with minimum of 384 
   


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [YARN][DOC] Remove non-Yarn specific configurations from running-on-yarn.md

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 07b3f045c -> a3cac7bd8


[YARN][DOC] Remove non-Yarn specific configurations from running-on-yarn.md

## What changes were proposed in this pull request?

Remove `spark.driver.memory`, `spark.executor.memory`,  `spark.driver.cores`, 
and `spark.executor.cores` from `running-on-yarn.md` as they are not 
Yarn-specific, and they are also defined in `configuration.md`.

## How was this patch tested?
Build passed & manually checked.

Author: Weiqing Yang 

Closes #15869 from weiqingy/yarnDoc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a3cac7bd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a3cac7bd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a3cac7bd

Branch: refs/heads/master
Commit: a3cac7bd86a6fe8e9b42da1bf580aaeb59378304
Parents: 07b3f04
Author: Weiqing Yang 
Authored: Thu Nov 17 11:13:22 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 11:13:22 2016 +

--
 docs/running-on-yarn.md | 36 
 1 file changed, 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a3cac7bd/docs/running-on-yarn.md
--
diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index fe0221c..4d1fafc 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -118,28 +118,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.driver.memory
-  1g
-  
-Amount of memory to use for the driver process, i.e. where SparkContext is 
initialized.
-(e.g. 1g, 2g).
-
-Note: In client mode, this config must not be set through 
the SparkConf
-directly in your application, because the driver JVM has already started 
at that point.
-Instead, please set this through the --driver-memory command 
line option
-or in your default properties file.
-  
-
-
-  spark.driver.cores
-  1
-  
-Number of cores used by the driver in YARN cluster mode.
-Since the driver is run in the same JVM as the YARN Application Master in 
cluster mode, this also controls the cores used by the YARN Application Master.
-In client mode, use spark.yarn.am.cores to control the number 
of cores used by the YARN Application Master instead.
-  
-
-
   spark.yarn.am.cores
   1
   
@@ -234,13 +212,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.executor.cores
-  1 in YARN mode, all the available cores on the worker in standalone 
mode.
-  
-The number of cores to use on each executor. For YARN and standalone mode 
only.
-  
-
-
  spark.executor.instances
   2
   
@@ -248,13 +219,6 @@ To use a custom metrics.properties for the application 
master and executors, upd
   
 
 
-  spark.executor.memory
-  1g
-  
-Amount of memory to use per executor process (e.g. 2g, 
8g).
-  
-
-
  spark.yarn.executor.memoryOverhead
   executorMemory * 0.10, with minimum of 384 
   


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18365][DOCS] Improve Sample Method Documentation

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master a3cac7bd8 -> 49b6f456a


[SPARK-18365][DOCS] Improve Sample Method Documentation

## What changes were proposed in this pull request?

I found the documentation for the sample method to be confusing; this adds more clarification across all languages (a short illustrative sketch follows the checklist below).

- [x] Scala
- [x] Python
- [x] R
- [x] RDD Scala
- [ ] RDD Python with SEED
- [X] RDD Java
- [x] RDD Java with SEED
- [x] RDD Python
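
As a quick illustration of the behaviour the added notes call out — the sampled size only approximates `fraction` of the input — a minimal sketch assuming a local SparkSession; the numbers shown are illustrative:

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("SampleDocSketch").master("local[*]").getOrCreate()
val rdd = spark.sparkContext.parallelize(1 to 1000)

// fraction = 0.1 gives an *expected* size of ~100 elements,
// but the exact count varies from run to run and with the seed.
val sampled = rdd.sample(withReplacement = false, fraction = 0.1, seed = 42L)
println(s"sampled count = ${sampled.count()}")   // close to, but usually not exactly, 100

spark.stop()
```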

## How was this patch tested?

NA

Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.

Author: anabranch 
Author: Bill Chambers 

Closes #15815 from anabranch/SPARK-18365.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/49b6f456
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/49b6f456
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/49b6f456

Branch: refs/heads/master
Commit: 49b6f456aca350e9e2c170782aa5cc75e7822680
Parents: a3cac7b
Author: anabranch 
Authored: Thu Nov 17 11:34:55 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 11:34:55 2016 +

--
 R/pkg/R/DataFrame.R   |  4 +++-
 .../main/scala/org/apache/spark/api/java/JavaRDD.scala|  8 ++--
 core/src/main/scala/org/apache/spark/rdd/RDD.scala|  3 +++
 python/pyspark/rdd.py |  5 +
 python/pyspark/sql/dataframe.py   |  5 +
 .../src/main/scala/org/apache/spark/sql/Dataset.scala | 10 --
 6 files changed, 30 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/49b6f456/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 1cf9b38..4e3d97b 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -936,7 +936,9 @@ setMethod("unique",
 
 #' Sample
 #'
-#' Return a sampled subset of this SparkDataFrame using a random seed.
+#' Return a sampled subset of this SparkDataFrame using a random seed. 
+#' Note: this is not guaranteed to provide exactly the fraction specified
+#' of the total count of the given SparkDataFrame.
 #'
 #' @param x A SparkDataFrame
 #' @param withReplacement Sampling with replacement or not

http://git-wip-us.apache.org/repos/asf/spark/blob/49b6f456/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
index 20d6c93..d67cff6 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
@@ -98,7 +98,9 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: 
ClassTag[T])
   def repartition(numPartitions: Int): JavaRDD[T] = 
rdd.repartition(numPartitions)
 
   /**
-   * Return a sampled subset of this RDD.
+   * Return a sampled subset of this RDD with a random seed.
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
*
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's 
size
@@ -109,7 +111,9 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: 
ClassTag[T])
 sample(withReplacement, fraction, Utils.random.nextLong)
 
   /**
-   * Return a sampled subset of this RDD.
+   * Return a sampled subset of this RDD, with a user-supplied seed.
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
*
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's 
size

http://git-wip-us.apache.org/repos/asf/spark/blob/49b6f456/core/src/main/scala/org/apache/spark/rdd/RDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala 
b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index e018af3..cded899 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -466,6 +466,9 @@ abstract class RDD[T: ClassTag](
   /**
* Return a sampled subset of this RDD.
*
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
+   *
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's 
size
*  without 

spark git commit: [SPARK-18365][DOCS] Improve Sample Method Documentation

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 2ee4fc889 -> 4fcecb4cf


[SPARK-18365][DOCS] Improve Sample Method Documentation

## What changes were proposed in this pull request?

I found the documentation for the sample method to be confusing; this adds more clarification across all languages.

- [x] Scala
- [x] Python
- [x] R
- [x] RDD Scala
- [ ] RDD Python with SEED
- [X] RDD Java
- [x] RDD Java with SEED
- [x] RDD Python

## How was this patch tested?

NA

Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.

Author: anabranch 
Author: Bill Chambers 

Closes #15815 from anabranch/SPARK-18365.

(cherry picked from commit 49b6f456aca350e9e2c170782aa5cc75e7822680)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4fcecb4c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4fcecb4c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4fcecb4c

Branch: refs/heads/branch-2.1
Commit: 4fcecb4cf081fba0345f1939420ca1d9f6de720c
Parents: 2ee4fc8
Author: anabranch 
Authored: Thu Nov 17 11:34:55 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 11:35:04 2016 +

--
 R/pkg/R/DataFrame.R   |  4 +++-
 .../main/scala/org/apache/spark/api/java/JavaRDD.scala|  8 ++--
 core/src/main/scala/org/apache/spark/rdd/RDD.scala|  3 +++
 python/pyspark/rdd.py |  5 +
 python/pyspark/sql/dataframe.py   |  5 +
 .../src/main/scala/org/apache/spark/sql/Dataset.scala | 10 --
 6 files changed, 30 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4fcecb4c/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 1cf9b38..4e3d97b 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -936,7 +936,9 @@ setMethod("unique",
 
 #' Sample
 #'
-#' Return a sampled subset of this SparkDataFrame using a random seed.
+#' Return a sampled subset of this SparkDataFrame using a random seed. 
+#' Note: this is not guaranteed to provide exactly the fraction specified
+#' of the total count of the given SparkDataFrame.
 #'
 #' @param x A SparkDataFrame
 #' @param withReplacement Sampling with replacement or not

http://git-wip-us.apache.org/repos/asf/spark/blob/4fcecb4c/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala 
b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
index 20d6c93..d67cff6 100644
--- a/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
+++ b/core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala
@@ -98,7 +98,9 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: 
ClassTag[T])
   def repartition(numPartitions: Int): JavaRDD[T] = 
rdd.repartition(numPartitions)
 
   /**
-   * Return a sampled subset of this RDD.
+   * Return a sampled subset of this RDD with a random seed.
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
*
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's 
size
@@ -109,7 +111,9 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag: 
ClassTag[T])
 sample(withReplacement, fraction, Utils.random.nextLong)
 
   /**
-   * Return a sampled subset of this RDD.
+   * Return a sampled subset of this RDD, with a user-supplied seed.
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
*
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled out)
* @param fraction expected size of the sample as a fraction of this RDD's 
size

http://git-wip-us.apache.org/repos/asf/spark/blob/4fcecb4c/core/src/main/scala/org/apache/spark/rdd/RDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala 
b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index e018af3..cded899 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -466,6 +466,9 @@ abstract class RDD[T: ClassTag](
   /**
* Return a sampled subset of this RDD.
*
+   * Note: this is NOT guaranteed to provide exactly the fraction of the count
+   * of the given [[RDD]].
+   *
* @param withReplacement can elements be sampled multiple times (replaced 
when sampled 

spark git commit: [SPARK-17462][MLLIB]use VersionUtils to parse Spark version strings

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 49b6f456a -> de77c6775


[SPARK-17462][MLLIB]use VersionUtils to parse Spark version strings

## What changes were proposed in this pull request?

Several places in MLlib use custom regexes or other approaches to parse Spark 
versions.
Those should be fixed to use the VersionUtils. This PR replaces custom regexes 
with
VersionUtils to get Spark version numbers.
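
A minimal sketch of the utility the PR switches to. `VersionUtils` is an internal (`private[spark]`) helper, so the sketch pretends to live under the `org.apache.spark` package, as the MLlib loaders in the diff below do; only `majorVersion` is shown because that is the method the diff uses, and the version string is illustrative:

```
// Sketch only: VersionUtils is package-private to org.apache.spark,
// so this file must sit under that package to compile.
package org.apache.spark.sketch

import org.apache.spark.util.VersionUtils.majorVersion

object VersionParsingSketch {
  def main(args: Array[String]): Unit = {
    val sparkVersion = "2.1.0"                 // e.g. metadata.sparkVersion
    val major = majorVersion(sparkVersion)     // 2
    // Version-dependent branching, as in the KMeansModel / PCAModel loaders
    val useNewPersistenceFormat = major >= 2
    println(s"major=$major, useNewPersistenceFormat=$useNewPersistenceFormat")
  }
}
```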
## How was this patch tested?

Existing tests.

Signed-off-by: VinceShieh vincent.xieintel.com

Author: VinceShieh 

Closes #15055 from VinceShieh/SPARK-17462.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/de77c677
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/de77c677
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/de77c677

Branch: refs/heads/master
Commit: de77c67750dc868d75d6af173c3820b75a9fe4b7
Parents: 49b6f45
Author: VinceShieh 
Authored: Thu Nov 17 13:37:42 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 13:37:42 2016 +

--
 .../src/main/scala/org/apache/spark/ml/clustering/KMeans.scala | 6 ++
 mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala | 6 ++
 2 files changed, 4 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/de77c677/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala 
b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
index a0d481b..26505b4 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
@@ -33,6 +33,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import org.apache.spark.sql.functions.{col, udf}
 import org.apache.spark.sql.types.{IntegerType, StructType}
+import org.apache.spark.util.VersionUtils.majorVersion
 
 /**
  * Common params for KMeans and KMeansModel
@@ -232,10 +233,7 @@ object KMeansModel extends MLReadable[KMeansModel] {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
   val dataPath = new Path(path, "data").toString
 
-  val versionRegex = "([0-9]+)\\.(.+)".r
-  val versionRegex(major, _) = metadata.sparkVersion
-
-  val clusterCenters = if (major.toInt >= 2) {
+  val clusterCenters = if (majorVersion(metadata.sparkVersion) >= 2) {
 val data: Dataset[Data] = sparkSession.read.parquet(dataPath).as[Data]
 
data.collect().sortBy(_.clusterIdx).map(_.clusterCenter).map(OldVectors.fromML)
   } else {

http://git-wip-us.apache.org/repos/asf/spark/blob/de77c677/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
index 444006f..1e49352 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
@@ -34,6 +34,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql._
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.types.{StructField, StructType}
+import org.apache.spark.util.VersionUtils.majorVersion
 
 /**
  * Params for [[PCA]] and [[PCAModel]].
@@ -204,11 +205,8 @@ object PCAModel extends MLReadable[PCAModel] {
 override def load(path: String): PCAModel = {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
 
-  val versionRegex = "([0-9]+)\\.(.+)".r
-  val versionRegex(major, _) = metadata.sparkVersion
-
   val dataPath = new Path(path, "data").toString
-  val model = if (major.toInt >= 2) {
+  val model = if (majorVersion(metadata.sparkVersion) >= 2) {
 val Row(pc: DenseMatrix, explainedVariance: DenseVector) =
   sparkSession.read.parquet(dataPath)
 .select("pc", "explainedVariance")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-17462][MLLIB]use VersionUtils to parse Spark version strings

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 4fcecb4cf -> 42777b1b3


[SPARK-17462][MLLIB]use VersionUtils to parse Spark version strings

## What changes were proposed in this pull request?

Several places in MLlib use custom regexes or other approaches to parse Spark 
versions.
Those should be fixed to use the VersionUtils. This PR replaces custom regexes 
with
VersionUtils to get Spark version numbers.
## How was this patch tested?

Existing tests.

Signed-off-by: VinceShieh vincent.xieintel.com

Author: VinceShieh 

Closes #15055 from VinceShieh/SPARK-17462.

(cherry picked from commit de77c67750dc868d75d6af173c3820b75a9fe4b7)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/42777b1b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/42777b1b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/42777b1b

Branch: refs/heads/branch-2.1
Commit: 42777b1b3c10d3945494e27f1dedd43f2f836361
Parents: 4fcecb4
Author: VinceShieh 
Authored: Thu Nov 17 13:37:42 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 13:37:53 2016 +

--
 .../src/main/scala/org/apache/spark/ml/clustering/KMeans.scala | 6 ++
 mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala | 6 ++
 2 files changed, 4 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/42777b1b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala 
b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
index a0d481b..26505b4 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
@@ -33,6 +33,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import org.apache.spark.sql.functions.{col, udf}
 import org.apache.spark.sql.types.{IntegerType, StructType}
+import org.apache.spark.util.VersionUtils.majorVersion
 
 /**
  * Common params for KMeans and KMeansModel
@@ -232,10 +233,7 @@ object KMeansModel extends MLReadable[KMeansModel] {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
   val dataPath = new Path(path, "data").toString
 
-  val versionRegex = "([0-9]+)\\.(.+)".r
-  val versionRegex(major, _) = metadata.sparkVersion
-
-  val clusterCenters = if (major.toInt >= 2) {
+  val clusterCenters = if (majorVersion(metadata.sparkVersion) >= 2) {
 val data: Dataset[Data] = sparkSession.read.parquet(dataPath).as[Data]
 
data.collect().sortBy(_.clusterIdx).map(_.clusterCenter).map(OldVectors.fromML)
   } else {

http://git-wip-us.apache.org/repos/asf/spark/blob/42777b1b/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
index 444006f..1e49352 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala
@@ -34,6 +34,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql._
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.types.{StructField, StructType}
+import org.apache.spark.util.VersionUtils.majorVersion
 
 /**
  * Params for [[PCA]] and [[PCAModel]].
@@ -204,11 +205,8 @@ object PCAModel extends MLReadable[PCAModel] {
 override def load(path: String): PCAModel = {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
 
-  val versionRegex = "([0-9]+)\\.(.+)".r
-  val versionRegex(major, _) = metadata.sparkVersion
-
   val dataPath = new Path(path, "data").toString
-  val model = if (major.toInt >= 2) {
+  val model = if (majorVersion(metadata.sparkVersion) >= 2) {
 val Row(pc: DenseMatrix, explainedVariance: DenseVector) =
   sparkSession.read.parquet(dataPath)
 .select("pc", "explainedVariance")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18480][DOCS] Fix wrong links for ML guide docs

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master de77c6775 -> cdaf4ce9f


[SPARK-18480][DOCS] Fix wrong links for ML guide docs

## What changes were proposed in this pull request?
1, There are two `[Graph.partitionBy]` in `graphx-programming-guide.md`, the 
first one had no effect.
2, `DataFrame`, `Transformer`, `Pipeline` and `Parameter`  in `ml-pipeline.md` 
were linked to `ml-guide.html` by mistake.
3, `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessible, because 
class `PythonMLLibAPI` is private.
4, Other link updates.
## How was this patch tested?
 manual tests

Author: Zheng RuiFeng 

Closes #15912 from zhengruifeng/md_fix.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cdaf4ce9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cdaf4ce9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cdaf4ce9

Branch: refs/heads/master
Commit: cdaf4ce9fe58c4606be8aa2a5c3756d30545c850
Parents: de77c67
Author: Zheng RuiFeng 
Authored: Thu Nov 17 13:40:16 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 13:40:16 2016 +

--
 docs/graphx-programming-guide.md|  1 -
 docs/ml-classification-regression.md|  4 ++--
 docs/ml-features.md |  2 +-
 docs/ml-pipeline.md | 12 ++--
 docs/mllib-linear-methods.md|  4 +---
 .../main/scala/org/apache/spark/ml/feature/LSH.scala|  2 +-
 .../spark/ml/tree/impl/GradientBoostedTrees.scala   |  8 
 .../org/apache/spark/ml/tree/impl/RandomForest.scala|  8 
 8 files changed, 19 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/cdaf4ce9/docs/graphx-programming-guide.md
--
diff --git a/docs/graphx-programming-guide.md b/docs/graphx-programming-guide.md
index 1097cf1..e271b28 100644
--- a/docs/graphx-programming-guide.md
+++ b/docs/graphx-programming-guide.md
@@ -36,7 +36,6 @@ description: GraphX graph processing library guide for Spark 
SPARK_VERSION_SHORT
 [Graph.fromEdgeTuples]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@fromEdgeTuples[VD](RDD[(VertexId,VertexId)],VD,Option[PartitionStrategy])(ClassTag[VD]):Graph[VD,Int]
 [Graph.fromEdges]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@fromEdges[VD,ED](RDD[Edge[ED]],VD)(ClassTag[VD],ClassTag[ED]):Graph[VD,ED]
 [PartitionStrategy]: 
api/scala/index.html#org.apache.spark.graphx.PartitionStrategy
-[Graph.partitionBy]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@partitionBy(partitionStrategy:org.apache.spark.graphx.PartitionStrategy):org.apache.spark.graphx.Graph[VD,ED]
 [PageRank]: api/scala/index.html#org.apache.spark.graphx.lib.PageRank$
 [ConnectedComponents]: 
api/scala/index.html#org.apache.spark.graphx.lib.ConnectedComponents$
 [TriangleCount]: 
api/scala/index.html#org.apache.spark.graphx.lib.TriangleCount$

http://git-wip-us.apache.org/repos/asf/spark/blob/cdaf4ce9/docs/ml-classification-regression.md
--
diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index 1aacc3e..43cc79b 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -984,7 +984,7 @@ Random forests combine many decision trees in order to 
reduce the risk of overfi
 The `spark.ml` implementation supports random forests for binary and 
multiclass classification and for regression,
 using both continuous and categorical features.
 
-For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on random forests](mllib-ensembles.html).
+For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on random forests](mllib-ensembles.html#random-forests).
 
 ### Inputs and Outputs
 
@@ -1065,7 +1065,7 @@ GBTs iteratively train decision trees in order to 
minimize a loss function.
 The `spark.ml` implementation supports GBTs for binary classification and for 
regression,
 using both continuous and categorical features.
 
-For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on GBTs](mllib-ensembles.html).
+For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on GBTs](mllib-ensembles.html#gradient-boosted-trees-gbts).
 
 ### Inputs and Outputs
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdaf4ce9/docs/ml-features.md
--
diff --git a/docs/ml-features.md b/docs/ml-features.md
index 19ec574..d2f036f 100644
--- a/docs/ml-features.md
+++ b/docs/ml-features.md
@@ -710,7 +710,7 @@ for mor

spark git commit: [SPARK-18480][DOCS] Fix wrong links for ML guide docs

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 42777b1b3 -> 536a21593


[SPARK-18480][DOCS] Fix wrong links for ML guide docs

## What changes were proposed in this pull request?
1, There are two `[Graph.partitionBy]` entries in `graphx-programming-guide.md`; 
the first one had no effect.
2, `DataFrame`, `Transformer`, `Pipeline` and `Parameter`  in `ml-pipeline.md` 
were linked to `ml-guide.html` by mistake.
3, `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessible, because 
class `PythonMLLibAPI` is private.
4, Other link updates.
## How was this patch tested?
 manual tests

Author: Zheng RuiFeng 

Closes #15912 from zhengruifeng/md_fix.

(cherry picked from commit cdaf4ce9fe58c4606be8aa2a5c3756d30545c850)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/536a2159
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/536a2159
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/536a2159

Branch: refs/heads/branch-2.1
Commit: 536a2159393c82d414cc46797c8bfd958f453d33
Parents: 42777b1
Author: Zheng RuiFeng 
Authored: Thu Nov 17 13:40:16 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 13:40:24 2016 +

--
 docs/graphx-programming-guide.md|  1 -
 docs/ml-classification-regression.md|  4 ++--
 docs/ml-features.md |  2 +-
 docs/ml-pipeline.md | 12 ++--
 docs/mllib-linear-methods.md|  4 +---
 .../main/scala/org/apache/spark/ml/feature/LSH.scala|  2 +-
 .../spark/ml/tree/impl/GradientBoostedTrees.scala   |  8 
 .../org/apache/spark/ml/tree/impl/RandomForest.scala|  8 
 8 files changed, 19 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/536a2159/docs/graphx-programming-guide.md
--
diff --git a/docs/graphx-programming-guide.md b/docs/graphx-programming-guide.md
index 1097cf1..e271b28 100644
--- a/docs/graphx-programming-guide.md
+++ b/docs/graphx-programming-guide.md
@@ -36,7 +36,6 @@ description: GraphX graph processing library guide for Spark 
SPARK_VERSION_SHORT
 [Graph.fromEdgeTuples]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@fromEdgeTuples[VD](RDD[(VertexId,VertexId)],VD,Option[PartitionStrategy])(ClassTag[VD]):Graph[VD,Int]
 [Graph.fromEdges]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@fromEdges[VD,ED](RDD[Edge[ED]],VD)(ClassTag[VD],ClassTag[ED]):Graph[VD,ED]
 [PartitionStrategy]: 
api/scala/index.html#org.apache.spark.graphx.PartitionStrategy
-[Graph.partitionBy]: 
api/scala/index.html#org.apache.spark.graphx.Graph$@partitionBy(partitionStrategy:org.apache.spark.graphx.PartitionStrategy):org.apache.spark.graphx.Graph[VD,ED]
 [PageRank]: api/scala/index.html#org.apache.spark.graphx.lib.PageRank$
 [ConnectedComponents]: 
api/scala/index.html#org.apache.spark.graphx.lib.ConnectedComponents$
 [TriangleCount]: 
api/scala/index.html#org.apache.spark.graphx.lib.TriangleCount$

http://git-wip-us.apache.org/repos/asf/spark/blob/536a2159/docs/ml-classification-regression.md
--
diff --git a/docs/ml-classification-regression.md 
b/docs/ml-classification-regression.md
index cb2ccbf..c72c01f 100644
--- a/docs/ml-classification-regression.md
+++ b/docs/ml-classification-regression.md
@@ -984,7 +984,7 @@ Random forests combine many decision trees in order to 
reduce the risk of overfi
 The `spark.ml` implementation supports random forests for binary and 
multiclass classification and for regression,
 using both continuous and categorical features.
 
-For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on random forests](mllib-ensembles.html).
+For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on random forests](mllib-ensembles.html#random-forests).
 
 ### Inputs and Outputs
 
@@ -1065,7 +1065,7 @@ GBTs iteratively train decision trees in order to 
minimize a loss function.
 The `spark.ml` implementation supports GBTs for binary classification and for 
regression,
 using both continuous and categorical features.
 
-For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on GBTs](mllib-ensembles.html).
+For more information on the algorithm itself, please see the [`spark.mllib` 
documentation on GBTs](mllib-ensembles.html#gradient-boosted-trees-gbts).
 
 ### Inputs and Outputs
 

http://git-wip-us.apache.org/repos/asf/spark/blob/536a2159/docs/ml-features.md
--
diff --git a/docs/ml-features.md b/docs/ml-features.md
index

spark git commit: [SPARK-18490][SQL] duplication nodename extrainfo for ShuffleExchange

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master cdaf4ce9f -> b0aa1aa1a


[SPARK-18490][SQL] duplication nodename extrainfo for ShuffleExchange

## What changes were proposed in this pull request?

   In ShuffleExchange, the nodeName's extraInfo is the same whether 
exchangeCoordinator.isEstimated is true or false.

This PR merges the two cases into one.

Author: root 

Closes #15920 from windpiger/DupNodeNameShuffleExchange.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b0aa1aa1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b0aa1aa1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b0aa1aa1

Branch: refs/heads/master
Commit: b0aa1aa1af6c513a6a881eaea96abdd2b480ef98
Parents: cdaf4ce
Author: root 
Authored: Thu Nov 17 17:04:19 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 17:04:19 2016 +

--
 .../apache/spark/sql/execution/exchange/ShuffleExchange.scala| 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b0aa1aa1/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
index 7a4a251..125a493 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
@@ -45,9 +45,7 @@ case class ShuffleExchange(
 
   override def nodeName: String = {
 val extraInfo = coordinator match {
-  case Some(exchangeCoordinator) if exchangeCoordinator.isEstimated =>
-s"(coordinator id: ${System.identityHashCode(coordinator)})"
-  case Some(exchangeCoordinator) if !exchangeCoordinator.isEstimated =>
+  case Some(exchangeCoordinator) =>
 s"(coordinator id: ${System.identityHashCode(coordinator)})"
   case None => ""
 }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18490][SQL] duplication nodename extrainfo for ShuffleExchange

2016-11-17 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 536a21593 -> 978798880


[SPARK-18490][SQL] duplication nodename extrainfo for ShuffleExchange

## What changes were proposed in this pull request?

   In ShuffleExchange, the nodeName's extraInfo is the same whether 
exchangeCoordinator.isEstimated is true or false.

This PR merges the two cases into one.

Author: root 

Closes #15920 from windpiger/DupNodeNameShuffleExchange.

(cherry picked from commit b0aa1aa1af6c513a6a881eaea96abdd2b480ef98)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/97879888
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/97879888
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/97879888

Branch: refs/heads/branch-2.1
Commit: 978798880c0b1e6a15e8a342847e1ff4d83a5ac0
Parents: 536a215
Author: root 
Authored: Thu Nov 17 17:04:19 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 17:04:38 2016 +

--
 .../apache/spark/sql/execution/exchange/ShuffleExchange.scala| 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/97879888/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
index 7a4a251..125a493 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchange.scala
@@ -45,9 +45,7 @@ case class ShuffleExchange(
 
   override def nodeName: String = {
 val extraInfo = coordinator match {
-  case Some(exchangeCoordinator) if exchangeCoordinator.isEstimated =>
-s"(coordinator id: ${System.identityHashCode(coordinator)})"
-  case Some(exchangeCoordinator) if !exchangeCoordinator.isEstimated =>
+  case Some(exchangeCoordinator) =>
 s"(coordinator id: ${System.identityHashCode(coordinator)})"
   case None => ""
 }


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark-website git commit: Expand guidance on SO and mailing lists, per discussion

2016-11-18 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 8781cd3c4 -> 80a543b56


Expand guidance on SO and mailing lists, per discussion


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/80a543b5
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/80a543b5
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/80a543b5

Branch: refs/heads/asf-site
Commit: 80a543b56076af07e75b42bb72347710be46b437
Parents: 8781cd3
Author: Sean Owen 
Authored: Wed Nov 16 13:55:08 2016 +
Committer: Sean Owen 
Committed: Thu Nov 17 11:02:10 2016 +

--
 community.md| 48 ++---
 site/community.html | 61 +---
 2 files changed, 103 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/80a543b5/community.md
--
diff --git a/community.md b/community.md
index 480d05a..c4f83a5 100644
--- a/community.md
+++ b/community.md
@@ -9,10 +9,37 @@ navigation:
 
 Apache Spark Community
 
+
+Have Questions?
+
+StackOverflow
+
+For usage questions and help (e.g. how to use this Spark API), it is 
recommended you use the 
+StackOverflow tag http://stackoverflow.com/questions/tagged/apache-spark";>`apache-spark`
 
+as it is an active forum for Spark users' questions and answers.
+
+Some quick tips when using StackOverflow:
+
+- Prior to submitting questions, please:
+  - Search StackOverflow's 
+  http://stackoverflow.com/questions/tagged/apache-spark";>`apache-spark`
 tag to see if 
+  your question has already been answered
+  - Search the nabble archive for
+  http://apache-spark-user-list.1001560.n3.nabble.com/";>us...@spark.apache.org
 
+- Please follow the StackOverflow http://stackoverflow.com/help/how-to-ask";>code of conduct  
+- Always use the `apache-spark` tag when asking questions
+- Please also use a secondary tag to specify components so subject matter 
experts can more easily find them.
+ Examples include: `pyspark`, `spark-dataframe`, `spark-streaming`, `spark-r`, 
`spark-mllib`, 
+  `spark-ml`, `spark-graphx`, `spark-graphframes`, `spark-tensorframes`, etc. 
+- Please do not cross-post between StackOverflow and the mailing lists
+- No jobs, sales, or solicitation is permitted on StackOverflow
+
 
-Mailing Lists
+Mailing Lists
+
+For broader questions, opinion-based discussions, requests for external resources, 
+debugging help, bug reports, contributing to the project, and usage scenarios, it is 
recommended you use the u...@spark.apache.org mailing list.
 
-Get help using Spark or contribute to the project on our mailing lists:
 
   
 http://apache-spark-user-list.1001560.n3.nabble.com";>u...@spark.apache.org
 is for usage questions, help, and announcements.
@@ -28,7 +55,22 @@ navigation:
   
 
 
-The StackOverflow tag http://stackoverflow.com/questions/tagged/apache-spark";>apache-spark
 is an unofficial but active forum for Spark users' questions and answers.
+Some quick tips when using email:
+
+- Prior to submitting questions, please:
+  - Search StackOverflow at http://stackoverflow.com/questions/tagged/apache-spark";>`apache-spark`
 
+  to see if your question has already been answered
+  - Search the nabble archive for
+  http://apache-spark-user-list.1001560.n3.nabble.com/";>us...@spark.apache.org
 
+- Tagging the subject line of your email will help you get a faster response, 
e.g. 
+`[Spark SQL]: Does Spark SQL support LEFT SEMI JOIN?`
+- Tags may help identify a topic by:
+  - Component: Spark Core, Spark SQL, ML, MLlib, GraphFrames, GraphX, 
TensorFrames, etc
+  - Level: Beginner, Intermediate, Advanced
+  - Scenario: Debug, How-to
+- For error logs or long code examples, please use https://gist.github.com/";>GitHub gist 
+and include only a few lines of the pertinent code / log within the email.
+- No jobs, sales, or solicitation is permitted on the Apache Spark mailing 
lists.
 
 
 Events and Meetups

http://git-wip-us.apache.org/repos/asf/spark-website/blob/80a543b5/site/community.html
--
diff --git a/site/community.html b/site/community.html
index 79ab192..fe4aa35 100644
--- a/site/community.html
+++ b/site/community.html
@@ -185,10 +185,42 @@
   
 Apache Spark Community
 
+
+Have Questions?
+
+StackOverflow
+
+For usage questions and help (e.g. how to use this Spark API), it is 
recommended you use the 
+StackOverflow tag http://stackoverflow.com/questions/tagged/apache-spark";>apache-spark
 
+as it is an active forum for Spark users’ questions and answers.
+
+Some quick tips when using StackOverflow:
+
+
+  Prior to submitting questions, please:
+
+  Search StackOverflow’s 
+http://stackoverflow.com/questions/

spark git commit: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite

2016-11-18 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 795e9fc92 -> 40d59ff5e


[SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite

## What changes were proposed in this pull request?

This PR fixes the test `wholeTextFiles` in `JavaAPISuite.java`. The test failed 
due to the different path format on Windows.

For example, the path in `container` was

```
C:\projects\spark\target\tmp\1478967560189-0/part-0
```

whereas `new URI(res._1()).getPath()` was as below:

```
/C:/projects/spark/target/tmp/1478967560189-0/part-0
```
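
As an aside, here is a minimal, self-contained sketch of the normalization the fix relies on (the paths below are made up for illustration and the class name is hypothetical): routing both the expected map key and the path reported by `wholeTextFiles` through `org.apache.hadoop.fs.Path` puts them in the same form, so the lookup succeeds regardless of platform.

```java
// Hedged sketch with made-up paths; not taken from the actual CI run.
import org.apache.hadoop.fs.Path;

public class PathNormalizationSketch {
  public static void main(String[] args) {
    // Key as the fixed test stores it: built from the temp dir and the part file name.
    String expectedKey = new Path("file:/C:/projects/spark/target/tmp/1478967560189-0", "part-0")
        .toUri().getPath();
    // Path as wholeTextFiles reports it on Windows (URI format, e.g. file:/C:/a/b/c).
    String reported = "file:/C:/projects/spark/target/tmp/1478967560189-0/part-0";
    String lookupKey = new Path(reported).toUri().getPath();

    System.out.println(expectedKey);                   // /C:/projects/spark/target/tmp/1478967560189-0/part-0
    System.out.println(expectedKey.equals(lookupKey)); // true
  }
}
```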

## How was this patch tested?

Tests in `JavaAPISuite.java`.

Tested via AppVeyor.

**Before**
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: https://github.com/apache/spark/compare/master...spark-test:JavaAPISuite-1

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: 
java.lang.AssertionError: expected: but was:, took 0.578 sec
[error] at 
org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...
```

**After**
Build started: [CORE] `org.apache.spark.JavaAPISuite` 
[![PR-15866](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=198DDA52-F201-4D2B-BE2F-244E0C1725B2&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/198DDA52-F201-4D2B-BE2F-244E0C1725B2)
Diff: 
https://github.com/apache/spark/compare/master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
```

Author: hyukjinkwon 

Closes #15866 from HyukjinKwon/SPARK-18422.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/40d59ff5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/40d59ff5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/40d59ff5

Branch: refs/heads/master
Commit: 40d59ff5eaac6df237fe3d50186695c3806b268c
Parents: 795e9fc
Author: hyukjinkwon 
Authored: Fri Nov 18 21:45:18 2016 +
Committer: Sean Owen 
Committed: Fri Nov 18 21:45:18 2016 +

--
 .../test/java/org/apache/spark/JavaAPISuite.java   | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/40d59ff5/core/src/test/java/org/apache/spark/JavaAPISuite.java
--
diff --git a/core/src/test/java/org/apache/spark/JavaAPISuite.java 
b/core/src/test/java/org/apache/spark/JavaAPISuite.java
index 533025b..7bebe06 100644
--- a/core/src/test/java/org/apache/spark/JavaAPISuite.java
+++ b/core/src/test/java/org/apache/spark/JavaAPISuite.java
@@ -20,7 +20,6 @@ package org.apache.spark;
 import java.io.*;
 import java.nio.channels.FileChannel;
 import java.nio.ByteBuffer;
-import java.net.URI;
 import java.nio.charset.StandardCharsets;
 import java.util.ArrayList;
 import java.util.Arrays;
@@ -46,6 +45,7 @@ import com.google.common.collect.Iterators;
 import com.google.common.collect.Lists;
 import com.google.common.base.Throwables;
 import com.google.common.io.Files;
+import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.io.compress.DefaultCodec;
@@ -1075,18 +1075,23 @@ public class JavaAPISuite implements Serializable {
 byte[] content2 = "spark is also easy to 
use.\n".getBytes(StandardCharsets.UTF_8);
 
 String tempDirName = tempDir.getAbsolutePath();
-Files.write(content1, new File(tempDirName + "/part-0"));
-Files.write(content2, new File(tempDirName + "/part-1"));
+String path1 = new Path(tempDirName, "part-0").toUri().getPath();
+String path2 = new Path(tempDirName, "part-1").toUri().getPath();
+
+Files.write(content1, new File(path1));
+Files.write(content2, new File(path2));
 
 Map container = new HashMap<>();
-container.put(tempDirName+"/part-0", new Text(content1).toString());
-container.put(tempDirName+"/part-1", new Text(content2).toString());
+container.put(path1, new Text(content1).toString());
+container.put(path2, new Text(content2).toString());
 
 JavaPairRDD readRDD = sc.wholeTextFiles(tempDirName, 3);
 List> result = readRDD.collect();
 
 for (Tuple2 res : result) {
-  assertEquals(res._2(), container.get(new URI(res._1()).getPath()));
+  // Note that the paths from `wholeTextFiles` are in URI format on 
Windows,
+  // for example, file:/C:/a/b/c.
+  assertEquals(res._2(), container.get(new 
Path(res._1()).toUri().getPath()));
 }
   }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite

2016-11-18 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 ec622eb7e -> 6717981e4


[SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite

## What changes were proposed in this pull request?

This PR fixes the test `wholeTextFiles` in `JavaAPISuite.java`. The test failed 
due to the different path format on Windows.

For example, the path in `container` was

```
C:\projects\spark\target\tmp\1478967560189-0/part-0
```

whereas `new URI(res._1()).getPath()` was as below:

```
/C:/projects/spark/target/tmp/1478967560189-0/part-0
```

## How was this patch tested?

Tests in `JavaAPISuite.java`.

Tested via AppVeyor.

**Before**
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: https://github.com/apache/spark/compare/master...spark-test:JavaAPISuite-1

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: 
java.lang.AssertionError: expected: but was:, took 0.578 sec
[error] at 
org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...
```

**After**
Build started: [CORE] `org.apache.spark.JavaAPISuite` 
[![PR-15866](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=198DDA52-F201-4D2B-BE2F-244E0C1725B2&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/198DDA52-F201-4D2B-BE2F-244E0C1725B2)
Diff: 
https://github.com/apache/spark/compare/master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
```

Author: hyukjinkwon 

Closes #15866 from HyukjinKwon/SPARK-18422.

(cherry picked from commit 40d59ff5eaac6df237fe3d50186695c3806b268c)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6717981e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6717981e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6717981e

Branch: refs/heads/branch-2.1
Commit: 6717981e4d76f0794a75c60586de4677c49659ad
Parents: ec622eb
Author: hyukjinkwon 
Authored: Fri Nov 18 21:45:18 2016 +
Committer: Sean Owen 
Committed: Fri Nov 18 21:45:36 2016 +

--
 .../test/java/org/apache/spark/JavaAPISuite.java   | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6717981e/core/src/test/java/org/apache/spark/JavaAPISuite.java
--
diff --git a/core/src/test/java/org/apache/spark/JavaAPISuite.java 
b/core/src/test/java/org/apache/spark/JavaAPISuite.java
index 533025b..7bebe06 100644
--- a/core/src/test/java/org/apache/spark/JavaAPISuite.java
+++ b/core/src/test/java/org/apache/spark/JavaAPISuite.java
@@ -20,7 +20,6 @@ package org.apache.spark;
 import java.io.*;
 import java.nio.channels.FileChannel;
 import java.nio.ByteBuffer;
-import java.net.URI;
 import java.nio.charset.StandardCharsets;
 import java.util.ArrayList;
 import java.util.Arrays;
@@ -46,6 +45,7 @@ import com.google.common.collect.Iterators;
 import com.google.common.collect.Lists;
 import com.google.common.base.Throwables;
 import com.google.common.io.Files;
+import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.io.compress.DefaultCodec;
@@ -1075,18 +1075,23 @@ public class JavaAPISuite implements Serializable {
 byte[] content2 = "spark is also easy to 
use.\n".getBytes(StandardCharsets.UTF_8);
 
 String tempDirName = tempDir.getAbsolutePath();
-Files.write(content1, new File(tempDirName + "/part-0"));
-Files.write(content2, new File(tempDirName + "/part-1"));
+String path1 = new Path(tempDirName, "part-0").toUri().getPath();
+String path2 = new Path(tempDirName, "part-1").toUri().getPath();
+
+Files.write(content1, new File(path1));
+Files.write(content2, new File(path2));
 
 Map container = new HashMap<>();
-container.put(tempDirName+"/part-0", new Text(content1).toString());
-container.put(tempDirName+"/part-1", new Text(content2).toString());
+container.put(path1, new Text(content1).toString());
+container.put(path2, new Text(content2).toString());
 
 JavaPairRDD readRDD = sc.wholeTextFiles(tempDirName, 3);
 List> result = readRDD.collect();
 
 for (Tuple2 res : result) {
-  assertEquals(res._2(), container.get(new URI(res._1()).getPath()));
+  // Note that the paths from `wholeTextFiles` are in URI format on 
Windows,
+  // for example, file:/C:/a/b/c.
+  assertEquals(res._2(), container.get(new 
Path(res._1()).toUri().getPath()));
 }
   }
 


-
To unsubscribe, e-m

spark git commit: [SPARK-18448][CORE] SparkSession should implement java.lang.AutoCloseable like JavaSparkContext

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 2a40de408 -> db9fb9baa


[SPARK-18448][CORE] SparkSession should implement java.lang.AutoCloseable like 
JavaSparkContext

## What changes were proposed in this pull request?

Just adds `close()` + `Closeable` as a synonym for `stop()`. This makes it 
usable in Java in try-with-resources, as suggested by ash211  (`Closeable` 
extends `AutoCloseable` BTW)
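
For illustration, a minimal Java sketch of what this enables (the app name and local master are placeholders, not part of the change): since `SparkSession` now implements `Closeable`, it can be scoped with try-with-resources, and `close()`, the new synonym for `stop()`, runs automatically on exit.

```java
// Hedged sketch; app name and master are illustrative placeholders.
import org.apache.spark.sql.SparkSession;

public class CloseableSessionSketch {
  public static void main(String[] args) {
    try (SparkSession spark = SparkSession.builder()
        .appName("closeable-session-sketch")
        .master("local[*]")
        .getOrCreate()) {
      spark.range(5).show();  // use the session as usual
    }                         // close() runs here, which in turn calls stop()
  }
}
```

In Scala, `stop()` remains the usual way to shut the session down; `close()` simply delegates to it.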

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15932 from srowen/SPARK-18448.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/db9fb9ba
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/db9fb9ba
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/db9fb9ba

Branch: refs/heads/master
Commit: db9fb9baacbf8640dd37a507b7450db727c7e6ea
Parents: 2a40de4
Author: Sean Owen 
Authored: Sat Nov 19 09:00:11 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 09:00:11 2016 +

--
 .../main/scala/org/apache/spark/sql/SparkSession.scala| 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/db9fb9ba/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index 3045eb6..58b2ab3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.sql
 
 import java.beans.Introspector
+import java.io.Closeable
 import java.util.concurrent.atomic.AtomicReference
 
 import scala.collection.JavaConverters._
@@ -72,7 +73,7 @@ import org.apache.spark.util.Utils
 class SparkSession private(
 @transient val sparkContext: SparkContext,
 @transient private val existingSharedState: Option[SharedState])
-  extends Serializable with Logging { self =>
+  extends Serializable with Closeable with Logging { self =>
 
   private[sql] def this(sc: SparkContext) {
 this(sc, None)
@@ -648,6 +649,13 @@ class SparkSession private(
   }
 
   /**
+   * Synonym for `stop()`.
+   *
+   * @since 2.2.0
+   */
+  override def close(): Unit = stop()
+
+  /**
* Parses the data type in our internal string representation. The data type 
string should
* have the same format as the one generated by `toString` in scala.
* It is only used by PySpark.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18448][CORE] SparkSession should implement java.lang.AutoCloseable like JavaSparkContext

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 b4bad04c5 -> 693401be2


[SPARK-18448][CORE] SparkSession should implement java.lang.AutoCloseable like 
JavaSparkContext

## What changes were proposed in this pull request?

Just adds `close()` + `Closeable` as a synonym for `stop()`. This makes it 
usable in Java in try-with-resources, as suggested by ash211  (`Closeable` 
extends `AutoCloseable` BTW)

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15932 from srowen/SPARK-18448.

(cherry picked from commit db9fb9baacbf8640dd37a507b7450db727c7e6ea)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/693401be
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/693401be
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/693401be

Branch: refs/heads/branch-2.1
Commit: 693401be24bfefe5305038b87888cdeb641d7642
Parents: b4bad04
Author: Sean Owen 
Authored: Sat Nov 19 09:00:11 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 09:00:21 2016 +

--
 .../main/scala/org/apache/spark/sql/SparkSession.scala| 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/693401be/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index 3045eb6..58b2ab3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.sql
 
 import java.beans.Introspector
+import java.io.Closeable
 import java.util.concurrent.atomic.AtomicReference
 
 import scala.collection.JavaConverters._
@@ -72,7 +73,7 @@ import org.apache.spark.util.Utils
 class SparkSession private(
 @transient val sparkContext: SparkContext,
 @transient private val existingSharedState: Option[SharedState])
-  extends Serializable with Logging { self =>
+  extends Serializable with Closeable with Logging { self =>
 
   private[sql] def this(sc: SparkContext) {
 this(sc, None)
@@ -648,6 +649,13 @@ class SparkSession private(
   }
 
   /**
+   * Synonym for `stop()`.
+   *
+   * @since 2.2.0
+   */
+  override def close(): Unit = stop()
+
+  /**
* Parses the data type in our internal string representation. The data type 
string should
* have the same format as the one generated by `toString` in scala.
* It is only used by PySpark.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[2/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
--
diff --git 
a/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
 
b/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
index b17e198..56f0cb0 100644
--- 
a/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
+++ 
b/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
@@ -223,7 +223,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition.
+   * Create an RDD from Kafka using offset ranges for each topic and partition.
*
* @param sc SparkContext object
* @param kafkaParams Kafka http://kafka.apache.org/documentation.html#configuration";>
@@ -255,7 +255,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition. 
This allows you
+   * Create an RDD from Kafka using offset ranges for each topic and 
partition. This allows you
* specify the Kafka leader to connect to (to optimize fetching) and access 
the message as well
* as the metadata.
*
@@ -303,7 +303,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition.
+   * Create an RDD from Kafka using offset ranges for each topic and partition.
*
* @param jsc JavaSparkContext object
* @param kafkaParams Kafka http://kafka.apache.org/documentation.html#configuration";>
@@ -340,7 +340,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition. 
This allows you
+   * Create an RDD from Kafka using offset ranges for each topic and 
partition. This allows you
* specify the Kafka leader to connect to (to optimize fetching) and access 
the message as well
* as the metadata.
*

http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
--
diff --git 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
index a0007d3..b2daffa 100644
--- 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
+++ 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
@@ -33,10 +33,6 @@ object KinesisUtils {
* Create an input stream that pulls messages from a Kinesis stream.
* This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.
*
-   * Note: The AWS credentials will be discovered using the 
DefaultAWSCredentialsProviderChain
-   * on the workers. See AWS documentation to understand how 
DefaultAWSCredentialsProviderChain
-   * gets the AWS credentials.
-   *
* @param ssc StreamingContext object
* @param kinesisAppName  Kinesis application name used by the Kinesis 
Client Library
*(KCL) to update DynamoDB
@@ -57,6 +53,10 @@ object KinesisUtils {
* StorageLevel.MEMORY_AND_DISK_2 is recommended.
* @param messageHandler A custom message handler that can generate a 
generic output from a
*   Kinesis `Record`, which contains both message data, 
and metadata.
+   *
+   * @note The AWS credentials will be discovered using the 
DefaultAWSCredentialsProviderChain
+   * on the workers. See AWS documentation to understand how 
DefaultAWSCredentialsProviderChain
+   * gets the AWS credentials.
*/
   def createStream[T: ClassTag](
   ssc: StreamingContext,
@@ -81,10 +81,6 @@ object KinesisUtils {
* Create an input stream that pulls messages from a Kinesis stream.
* This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.
*
-   * Note:
-   *  The given AWS credentials will get saved in DStream checkpoints if 
checkpointing
-   *  is enabled. Make sure that your checkpoint directory is secure.
-   *
* @param ssc StreamingContext object
* @param kinesisAppName  Kinesis application name used by the Kinesis 
Client Library
*(KCL) to update DynamoDB
@@ -107,6 +103,9 @@ object KinesisUtils {
*   Kinesis `Record`, which contains both message data, 
and metadata.
* @param awsAccessKeyId  AWS AccessKeyId (if null, will use 
DefaultAWSCredentialsProviderChain)
* @param awsSecretKey  AWS SecretKey (if null, will use 
DefaultAWSCredentialsProviderChain)
+   *
+   * @note The given AWS credentials will get saved in DStream checkpoints if 
checkpointing
+   * is enabled. Make sure that your checkpoint directory is se

[3/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
[SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note 
that`/`'''Note:'''` across Scala/Java API documentation

## What changes were proposed in this pull request?

It seems in Scala/Java,

- `Note:`
- `NOTE:`
- `Note that`
- `'''Note:'''`
- `note`

This PR proposes to fix those to `note` to be consistent.

**Before**

- Scala
  ![2016-11-17 6 16 
39](https://cloud.githubusercontent.com/assets/6477701/20383180/1a7aed8c-acf2-11e6-9611-5eaf6d52c2e0.png)

- Java
  ![2016-11-17 6 14 
41](https://cloud.githubusercontent.com/assets/6477701/20383096/c8ffc680-acf1-11e6-914a-33460bf1401d.png)

**After**

- Scala
  ![2016-11-17 6 16 
44](https://cloud.githubusercontent.com/assets/6477701/20383167/09940490-acf2-11e6-937a-0d5e1dc2cadf.png)

- Java
  ![2016-11-17 6 13 
39](https://cloud.githubusercontent.com/assets/6477701/20383132/e7c2a57e-acf1-11e6-9c47-b849674d4d88.png)

## How was this patch tested?

The notes were found via

```bash
grep -r "NOTE: " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// NOTE: " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \ # note that this is a regular 
expression. So actual matches were mostly `org/apache/spark/api/java/functions 
...`
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note that " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note that " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note: " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note: " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "'''Note:'''" . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// '''Note:''' " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

And then fixed one by one comparing with API documentation/access modifiers.

After that, manually tested via `jekyll build`.

Author: hyukjinkwon 

Closes #15889 from HyukjinKwon/SPARK-18437.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d5b1d5fc
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d5b1d5fc
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d5b1d5fc

Branch: refs/heads/master
Commit: d5b1d5fc80153571c308130833d0c0774de62c92
Parents: db9fb9b
Author: hyukjinkwon 
Authored: Sat Nov 19 11:24:15 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 11:24:15 2016 +

--
 .../scala/org/apache/spark/ContextCleaner.scala |  2 +-
 .../scala/org/apache/spark/Partitioner.scala|  2 +-
 .../main/scala/org/apache/spark/SparkConf.scala |  6 +-
 .../scala/org/apache/spark/SparkContext.scala   | 47 ---
 .../apache/spark/api/java/JavaDoubleRDD.scala   |  4 +-
 .../org/apache/spark/api/java/JavaPairRDD.scala | 26 +
 .../org/apache/spark/api/java/JavaRDD.scala | 12 ++--
 .../org/apache/spark/api/java/JavaRDDLike.scala |  3 +-
 .../spark/api/java/JavaSparkContext.scala   | 21 +++
 .../spark/api/java/JavaSparkStatusTracker.scala |  2 +-
 .../io/SparkHadoopMapReduceWriter.scala |  2 +-
 .../org/apache/spark/io/CompressionCodec.scala  | 23 
 .../apache/spark/partial/BoundedDouble.scala|  2 +-
 .../org/apache/spark/rdd/CoGroupedRDD.scala |  8 +--
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  2 +-
 .../scala/org/apache/spark/rdd/HadoopRDD.scala  |  6 +-
 .../org/apache/spark/rdd/NewHadoopRDD.scala |  6 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala | 23 
 .../apache/spark/rdd/PartitionPruningRDD.scala  |  2 +-
 .../spark/rdd/PartitionwiseSampledRDD.scala |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 46 +++
 .../apache/spark/rdd/RDDCheckpointData.scala|  2 +-
 .../spark/rdd/ReliableCheckpointRDD.scala   |  2 +-
 .../spark/rdd/SequenceFileRDDFunctions.scala|  5 +-
 .../apache/spark/rdd/ZippedWithIndexRDD.scala   |  2 +-
 .../spark/sche

[1/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master db9fb9baa -> d5b1d5fc8


http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/python/pyspark/rdd.py
--
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index a163cea..641787e 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -1218,7 +1218,7 @@ class RDD(object):
 
 def top(self, num, key=None):
 """
-Get the top N elements from a RDD.
+Get the top N elements from an RDD.
 
 Note that this method should only be used if the resulting array is 
expected
 to be small, as all the data is loaded into the driver's memory.
@@ -1242,7 +1242,7 @@ class RDD(object):
 
 def takeOrdered(self, num, key=None):
 """
-Get the N elements from a RDD ordered in ascending order or as
+Get the N elements from an RDD ordered in ascending order or as
 specified by the optional key function.
 
 Note that this method should only be used if the resulting array is 
expected

http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/python/pyspark/streaming/kafka.py
--
diff --git a/python/pyspark/streaming/kafka.py 
b/python/pyspark/streaming/kafka.py
index bf27d80..134424a 100644
--- a/python/pyspark/streaming/kafka.py
+++ b/python/pyspark/streaming/kafka.py
@@ -144,7 +144,7 @@ class KafkaUtils(object):
 """
 .. note:: Experimental
 
-Create a RDD from Kafka using offset ranges for each topic and 
partition.
+Create an RDD from Kafka using offset ranges for each topic and 
partition.
 
 :param sc:  SparkContext object
 :param kafkaParams: Additional params for Kafka
@@ -155,7 +155,7 @@ class KafkaUtils(object):
 :param valueDecoder:  A function used to decode value (default is 
utf8_decoder)
 :param messageHandler: A function used to convert 
KafkaMessageAndMetadata. You can assess
meta using messageHandler (default is None).
-:return: A RDD object
+:return: An RDD object
 """
 if leaders is None:
 leaders = dict()

http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
index dc90659..0b95a88 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
@@ -165,10 +165,10 @@ object Encoders {
* (Scala-specific) Creates an encoder that serializes objects of type T 
using generic Java
* serialization. This encoder maps T into a single byte array (binary) 
field.
*
-   * Note that this is extremely inefficient and should only be used as the 
last resort.
-   *
* T must be publicly accessible.
*
+   * @note This is extremely inefficient and should only be used as the last 
resort.
+   *
* @since 1.6.0
*/
   def javaSerialization[T: ClassTag]: Encoder[T] = genericSerializer(useKryo = 
false)
@@ -177,10 +177,10 @@ object Encoders {
* Creates an encoder that serializes objects of type T using generic Java 
serialization.
* This encoder maps T into a single byte array (binary) field.
*
-   * Note that this is extremely inefficient and should only be used as the 
last resort.
-   *
* T must be publicly accessible.
*
+   * @note This is extremely inefficient and should only be used as the last 
resort.
+   *
* @since 1.6.0
*/
   def javaSerialization[T](clazz: Class[T]): Encoder[T] = 
javaSerialization(ClassTag[T](clazz))

http://git-wip-us.apache.org/repos/asf/spark/blob/d5b1d5fc/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
index e121044..21f3497 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
@@ -23,10 +23,10 @@ import org.apache.spark.annotation.InterfaceStability
  * The data type representing calendar time intervals. The calendar time 
interval is stored
  * internally in two components: the number of months and the number of microseconds.
  *
- * Note that calendar intervals are not comparable.
- *
  * Please use the singleton [[DataTypes.CalendarIntervalType]].
  *
+ * @note Calendar intervals are not comparable.
+ *
  * @since 1.5.0
  */
 @InterfaceStability.Stable

[2/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
--
diff --git 
a/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
 
b/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
index b17e198..56f0cb0 100644
--- 
a/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
+++ 
b/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala
@@ -223,7 +223,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition.
+   * Create an RDD from Kafka using offset ranges for each topic and partition.
*
* @param sc SparkContext object
* @param kafkaParams Kafka http://kafka.apache.org/documentation.html#configuration";>
@@ -255,7 +255,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition. 
This allows you
+   * Create an RDD from Kafka using offset ranges for each topic and 
partition. This allows you
* specify the Kafka leader to connect to (to optimize fetching) and access 
the message as well
* as the metadata.
*
@@ -303,7 +303,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition.
+   * Create an RDD from Kafka using offset ranges for each topic and partition.
*
* @param jsc JavaSparkContext object
* @param kafkaParams Kafka http://kafka.apache.org/documentation.html#configuration";>
@@ -340,7 +340,7 @@ object KafkaUtils {
   }
 
   /**
-   * Create a RDD from Kafka using offset ranges for each topic and partition. 
This allows you
+   * Create an RDD from Kafka using offset ranges for each topic and 
partition. This allows you
* specify the Kafka leader to connect to (to optimize fetching) and access 
the message as well
* as the metadata.
*

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
--
diff --git 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
index a0007d3..b2daffa 100644
--- 
a/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
+++ 
b/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisUtils.scala
@@ -33,10 +33,6 @@ object KinesisUtils {
* Create an input stream that pulls messages from a Kinesis stream.
* This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.
*
-   * Note: The AWS credentials will be discovered using the 
DefaultAWSCredentialsProviderChain
-   * on the workers. See AWS documentation to understand how 
DefaultAWSCredentialsProviderChain
-   * gets the AWS credentials.
-   *
* @param ssc StreamingContext object
* @param kinesisAppName  Kinesis application name used by the Kinesis 
Client Library
*(KCL) to update DynamoDB
@@ -57,6 +53,10 @@ object KinesisUtils {
* StorageLevel.MEMORY_AND_DISK_2 is recommended.
* @param messageHandler A custom message handler that can generate a 
generic output from a
*   Kinesis `Record`, which contains both message data, 
and metadata.
+   *
+   * @note The AWS credentials will be discovered using the 
DefaultAWSCredentialsProviderChain
+   * on the workers. See AWS documentation to understand how 
DefaultAWSCredentialsProviderChain
+   * gets the AWS credentials.
*/
   def createStream[T: ClassTag](
   ssc: StreamingContext,
@@ -81,10 +81,6 @@ object KinesisUtils {
* Create an input stream that pulls messages from a Kinesis stream.
* This uses the Kinesis Client Library (KCL) to pull messages from Kinesis.
*
-   * Note:
-   *  The given AWS credentials will get saved in DStream checkpoints if 
checkpointing
-   *  is enabled. Make sure that your checkpoint directory is secure.
-   *
* @param ssc StreamingContext object
* @param kinesisAppName  Kinesis application name used by the Kinesis 
Client Library
*(KCL) to update DynamoDB
@@ -107,6 +103,9 @@ object KinesisUtils {
*   Kinesis `Record`, which contains both message data, 
and metadata.
* @param awsAccessKeyId  AWS AccessKeyId (if null, will use 
DefaultAWSCredentialsProviderChain)
* @param awsSecretKey  AWS SecretKey (if null, will use 
DefaultAWSCredentialsProviderChain)
+   *
+   * @note The given AWS credentials will get saved in DStream checkpoints if 
checkpointing
+   * is enabled. Make sure that your checkpoint directory is se

[1/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 693401be2 -> 4b396a654


http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/python/pyspark/rdd.py
--
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index a163cea..641787e 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -1218,7 +1218,7 @@ class RDD(object):
 
 def top(self, num, key=None):
 """
-Get the top N elements from a RDD.
+Get the top N elements from an RDD.
 
 Note that this method should only be used if the resulting array is 
expected
 to be small, as all the data is loaded into the driver's memory.
@@ -1242,7 +1242,7 @@ class RDD(object):
 
 def takeOrdered(self, num, key=None):
 """
-Get the N elements from a RDD ordered in ascending order or as
+Get the N elements from an RDD ordered in ascending order or as
 specified by the optional key function.
 
 Note that this method should only be used if the resulting array is 
expected

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/python/pyspark/streaming/kafka.py
--
diff --git a/python/pyspark/streaming/kafka.py 
b/python/pyspark/streaming/kafka.py
index bf27d80..134424a 100644
--- a/python/pyspark/streaming/kafka.py
+++ b/python/pyspark/streaming/kafka.py
@@ -144,7 +144,7 @@ class KafkaUtils(object):
 """
 .. note:: Experimental
 
-Create a RDD from Kafka using offset ranges for each topic and 
partition.
+Create an RDD from Kafka using offset ranges for each topic and 
partition.
 
 :param sc:  SparkContext object
 :param kafkaParams: Additional params for Kafka
@@ -155,7 +155,7 @@ class KafkaUtils(object):
 :param valueDecoder:  A function used to decode value (default is 
utf8_decoder)
 :param messageHandler: A function used to convert 
KafkaMessageAndMetadata. You can assess
meta using messageHandler (default is None).
-:return: A RDD object
+:return: An RDD object
 """
 if leaders is None:
 leaders = dict()

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
--
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
index dc90659..0b95a88 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala
@@ -165,10 +165,10 @@ object Encoders {
* (Scala-specific) Creates an encoder that serializes objects of type T 
using generic Java
* serialization. This encoder maps T into a single byte array (binary) 
field.
*
-   * Note that this is extremely inefficient and should only be used as the 
last resort.
-   *
* T must be publicly accessible.
*
+   * @note This is extremely inefficient and should only be used as the last 
resort.
+   *
* @since 1.6.0
*/
   def javaSerialization[T: ClassTag]: Encoder[T] = genericSerializer(useKryo = 
false)
@@ -177,10 +177,10 @@ object Encoders {
* Creates an encoder that serializes objects of type T using generic Java 
serialization.
* This encoder maps T into a single byte array (binary) field.
*
-   * Note that this is extremely inefficient and should only be used as the 
last resort.
-   *
* T must be publicly accessible.
*
+   * @note This is extremely inefficient and should only be used as the last 
resort.
+   *
* @since 1.6.0
*/
   def javaSerialization[T](clazz: Class[T]): Encoder[T] = 
javaSerialization(ClassTag[T](clazz))

http://git-wip-us.apache.org/repos/asf/spark/blob/4b396a65/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
index e121044..21f3497 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/CalendarIntervalType.scala
@@ -23,10 +23,10 @@ import org.apache.spark.annotation.InterfaceStability
  * The data type representing calendar time intervals. The calendar time 
interval is stored
  * internally in two components: the number of months and the number of microseconds.
  *
- * Note that calendar intervals are not comparable.
- *
  * Please use the singleton [[DataTypes.CalendarIntervalType]].
  *
+ * @note Calendar intervals are not comparable.
+ *
  * @since 1.5.0
  */
 @InterfaceStability.St

[3/3] spark git commit: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that`/`'''Note:'''` across Scala/Java API documentation

2016-11-19 Thread srowen
[SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note 
that`/`'''Note:'''` across Scala/Java API documentation

It seems in Scala/Java,

- `Note:`
- `NOTE:`
- `Note that`
- `'''Note:'''`
- `note`

This PR proposes to fix those to `note` to be consistent.

**Before**

- Scala
  ![2016-11-17 6 16 
39](https://cloud.githubusercontent.com/assets/6477701/20383180/1a7aed8c-acf2-11e6-9611-5eaf6d52c2e0.png)

- Java
  ![2016-11-17 6 14 
41](https://cloud.githubusercontent.com/assets/6477701/20383096/c8ffc680-acf1-11e6-914a-33460bf1401d.png)

**After**

- Scala
  ![2016-11-17 6 16 
44](https://cloud.githubusercontent.com/assets/6477701/20383167/09940490-acf2-11e6-937a-0d5e1dc2cadf.png)

- Java
  ![2016-11-17 6 13 
39](https://cloud.githubusercontent.com/assets/6477701/20383132/e7c2a57e-acf1-11e6-9c47-b849674d4d88.png)

The notes were found via

```bash
grep -r "NOTE: " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// NOTE: " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \ # note that this is a regular 
expression. So actual matches were mostly `org/apache/spark/api/java/functions 
...`
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note that " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note that " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "Note: " . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// Note: " | \  # starting with // does not appear in API 
documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

```bash
grep -r "'''Note:'''" . | \ # Note:|NOTE:|Note that|'''Note:'''
grep -v "// '''Note:''' " | \  # starting with // does not appear in API documentation.
grep -E '.scala|.java' | \ # java/scala files
grep -v Suite | \ # exclude tests
grep -v Test | \ # exclude tests
grep -e 'org.apache.spark.api.java' \ # packages appear in API documentation
-e 'org.apache.spark.api.java.function' \
-e 'org.apache.spark.api.r' \
...
```

And then fixed one by one comparing with API documentation/access modifiers.

After that, manually tested via `jekyll build`.
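
For reference, a minimal sketch (not taken from the patch; the object, method and
wording are hypothetical) of the convention this change standardises on: plain
`Note:`/`NOTE:`/`Note that` text in the comment body becomes the Scaladoc `@note`
tag, which the doc build renders as a distinct note block.

```scala
object NoteTagExample {
  /**
   * Returns a randomly sampled subset of the given items.
   *
   * @note This is not guaranteed to return exactly `fraction` of the elements.
   */
  def sample(items: Seq[Int], fraction: Double): Seq[Int] =
    items.filter(_ => scala.util.Random.nextDouble() < fraction)
}
```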

Author: hyukjinkwon 

Closes #15889 from HyukjinKwon/SPARK-18437.

(cherry picked from commit d5b1d5fc80153571c308130833d0c0774de62c92)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4b396a65
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4b396a65
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4b396a65

Branch: refs/heads/branch-2.1
Commit: 4b396a6545ec0f1e31b0e211228f04bdc5660300
Parents: 693401b
Author: hyukjinkwon 
Authored: Sat Nov 19 11:24:15 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 11:25:07 2016 +

--
 .../scala/org/apache/spark/ContextCleaner.scala |  2 +-
 .../scala/org/apache/spark/Partitioner.scala|  2 +-
 .../main/scala/org/apache/spark/SparkConf.scala |  6 +-
 .../scala/org/apache/spark/SparkContext.scala   | 47 ---
 .../apache/spark/api/java/JavaDoubleRDD.scala   |  4 +-
 .../org/apache/spark/api/java/JavaPairRDD.scala | 26 +
 .../org/apache/spark/api/java/JavaRDD.scala | 12 ++--
 .../org/apache/spark/api/java/JavaRDDLike.scala |  3 +-
 .../spark/api/java/JavaSparkContext.scala   | 21 +++
 .../spark/api/java/JavaSparkStatusTracker.scala |  2 +-
 .../org/apache/spark/io/CompressionCodec.scala  | 23 
 .../apache/spark/partial/BoundedDouble.scala|  2 +-
 .../org/apache/spark/rdd/CoGroupedRDD.scala |  8 +--
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  2 +-
 .../scala/org/apache/spark/rdd/HadoopRDD.scala  |  6 +-
 .../org/apache/spark/rdd/NewHadoopRDD.scala |  6 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala | 23 
 .../apache/spark/rdd/PartitionPruningRDD.scala  |  2 +-
 .../spark/rdd/PartitionwiseSampledRDD.scala |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 46 +++
 .../apache/spark/rdd/RDDCheckpointData.scala|  2 +-
 .../spark/rdd/ReliableCheckpointRDD.scala   |  2 +-
 .../spark/rdd/SequenceFileRDDFunctions.scala|  5 +-
 .../apache/spark/rdd/ZippedWithIndexRDD.scala   |  2 +-
 .../spark/scheduler/AccumulableInfo.scala   | 10 ++

spark git commit: [SPARK-18353][CORE] spark.rpc.askTimeout default value is not 120s

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master d5b1d5fc8 -> 8b1e1088e


[SPARK-18353][CORE] spark.rpc.askTimeout default value is not 120s

## What changes were proposed in this pull request?

Avoid hard-coding spark.rpc.askTimeout to a non-default value in Client; fix the 
docs about the spark.rpc.askTimeout default
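
As a sketch of the user-facing behaviour (not part of the patch): an explicitly
configured `spark.rpc.askTimeout` is still honoured, and with this change Client
only applies its own short value when the property is absent; otherwise the
timeout falls back to `spark.network.timeout`. The snippet below is hypothetical
and can be pasted into spark-shell; the application name is a placeholder.

```scala
import org.apache.spark.SparkConf

// Hypothetical driver-side configuration; "120s" restores the previously documented default.
val conf = new SparkConf()
  .setAppName("ask-timeout-example")     // placeholder app name
  .set("spark.rpc.askTimeout", "120s")   // an explicit value always wins over the fallback
```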

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15833 from srowen/SPARK-18353.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8b1e1088
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8b1e1088
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8b1e1088

Branch: refs/heads/master
Commit: 8b1e1088eb274fb15260cd5d6d9508d42837a4d6
Parents: d5b1d5f
Author: Sean Owen 
Authored: Sat Nov 19 11:28:25 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 11:28:25 2016 +

--
 core/src/main/scala/org/apache/spark/deploy/Client.scala | 4 +++-
 docs/configuration.md| 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/8b1e1088/core/src/main/scala/org/apache/spark/deploy/Client.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/Client.scala 
b/core/src/main/scala/org/apache/spark/deploy/Client.scala
index ee276e1..a4de3d7 100644
--- a/core/src/main/scala/org/apache/spark/deploy/Client.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/Client.scala
@@ -221,7 +221,9 @@ object Client {
 val conf = new SparkConf()
 val driverArgs = new ClientArguments(args)
 
-conf.set("spark.rpc.askTimeout", "10")
+if (!conf.contains("spark.rpc.askTimeout")) {
+  conf.set("spark.rpc.askTimeout", "10s")
+}
 Logger.getRootLogger.setLevel(driverArgs.logLevel)
 
 val rpcEnv =

http://git-wip-us.apache.org/repos/asf/spark/blob/8b1e1088/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index c021a37..a3b4ff0 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1184,7 +1184,7 @@ Apart from these, the following properties are also 
available, and may be useful
 
 
   spark.rpc.askTimeout
-  120s
+  spark.network.timeout
   
 Duration for an RPC ask operation to wait before timing out.
   
@@ -1566,7 +1566,7 @@ Apart from these, the following properties are also 
available, and may be useful
 
 
   spark.core.connection.ack.wait.timeout
-  60s
+  spark.network.timeout
   
 How long for the connection to wait for ack to occur before timing
 out and giving up. To avoid unwilling timeout caused by long pause like GC,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18353][CORE] spark.rpc.askTimeout default value is not 120s

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 4b396a654 -> 30a6fbbb0


[SPARK-18353][CORE] spark.rpc.askTimeout default value is not 120s

## What changes were proposed in this pull request?

Avoid hard-coding spark.rpc.askTimeout to a non-default value in Client; fix the 
docs about the spark.rpc.askTimeout default

## How was this patch tested?

Existing tests

Author: Sean Owen 

Closes #15833 from srowen/SPARK-18353.

(cherry picked from commit 8b1e1088eb274fb15260cd5d6d9508d42837a4d6)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/30a6fbbb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/30a6fbbb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/30a6fbbb

Branch: refs/heads/branch-2.1
Commit: 30a6fbbb0fb47f5b74ceba3384f28a61bf4e4740
Parents: 4b396a6
Author: Sean Owen 
Authored: Sat Nov 19 11:28:25 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 11:28:33 2016 +

--
 core/src/main/scala/org/apache/spark/deploy/Client.scala | 4 +++-
 docs/configuration.md| 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/30a6fbbb/core/src/main/scala/org/apache/spark/deploy/Client.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/Client.scala 
b/core/src/main/scala/org/apache/spark/deploy/Client.scala
index ee276e1..a4de3d7 100644
--- a/core/src/main/scala/org/apache/spark/deploy/Client.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/Client.scala
@@ -221,7 +221,9 @@ object Client {
 val conf = new SparkConf()
 val driverArgs = new ClientArguments(args)
 
-conf.set("spark.rpc.askTimeout", "10")
+if (!conf.contains("spark.rpc.askTimeout")) {
+  conf.set("spark.rpc.askTimeout", "10s")
+}
 Logger.getRootLogger.setLevel(driverArgs.logLevel)
 
 val rpcEnv =

http://git-wip-us.apache.org/repos/asf/spark/blob/30a6fbbb/docs/configuration.md
--
diff --git a/docs/configuration.md b/docs/configuration.md
index e0c6613..c2329b4 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1175,7 +1175,7 @@ Apart from these, the following properties are also 
available, and may be useful
 
 
   spark.rpc.askTimeout
-  120s
+  spark.network.timeout
   
 Duration for an RPC ask operation to wait before timing out.
   
@@ -1531,7 +1531,7 @@ Apart from these, the following properties are also 
available, and may be useful
 
 
   spark.core.connection.ack.wait.timeout
-  60s
+  spark.network.timeout
   
 How long for the connection to wait for ack to occur before timing
 out and giving up. To avoid unwilling timeout caused by long pause like GC,


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark-website git commit: (Again again) make web site HTML consistent with latest jekyll output

2016-11-19 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 80a543b56 -> 59fca9b5c


(Again again) make web site HTML consistent with latest jekyll output


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/59fca9b5
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/59fca9b5
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/59fca9b5

Branch: refs/heads/asf-site
Commit: 59fca9b5c34fe66bd85466b2564a2b288ba8d815
Parents: 80a543b
Author: Sean Owen 
Authored: Sat Nov 19 12:25:16 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 12:25:16 2016 +

--
 site/documentation.html |  5 +--
 site/mailing-lists.html |  2 +-
 site/news/index.html| 10 +++---
 site/news/spark-0-9-1-released.html |  2 +-
 site/news/spark-0-9-2-released.html |  2 +-
 site/news/spark-1-1-0-released.html |  2 +-
 site/news/spark-1-2-2-released.html |  2 +-
 site/news/spark-and-shark-in-the-news.html  |  2 +-
 .../spark-summit-east-2015-videos-posted.html   |  2 +-
 site/releases/spark-release-0-8-0.html  |  4 +--
 site/releases/spark-release-0-9-1.html  | 20 +--
 site/releases/spark-release-1-0-1.html  |  8 ++---
 site/releases/spark-release-1-0-2.html  |  2 +-
 site/releases/spark-release-1-1-0.html  |  6 ++--
 site/releases/spark-release-1-2-0.html  |  2 +-
 site/releases/spark-release-1-3-0.html  |  6 ++--
 site/releases/spark-release-1-3-1.html  |  6 ++--
 site/releases/spark-release-1-4-0.html  |  4 +--
 site/releases/spark-release-1-5-0.html  | 30 
 site/releases/spark-release-1-6-0.html  | 20 +--
 site/releases/spark-release-2-0-0.html  | 36 ++--
 21 files changed, 87 insertions(+), 86 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/59fca9b5/site/documentation.html
--
diff --git a/site/documentation.html b/site/documentation.html
index 657fdbe..2c580c9 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -255,12 +255,13 @@
 
 
 Meetup Talk Videos
-In addition to the videos listed below, you can also view http://www.meetup.com/spark-users/files/";>all slides from Bay Area 
meetups here.
+In addition to the videos listed below, you can also view http://www.meetup.com/spark-users/files/";>all slides from Bay Area 
meetups here.
 
   .video-meta-info {
 font-size: 0.95em;
   }
-
+
+
 
   http://www.youtube.com/watch?v=NUQ-8to2XAk&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a";>Spark
 1.0 and Beyond (http://files.meetup.com/3138542/Spark%201.0%20Meetup.ppt";>slides) 
by Patrick Wendell, at Cisco in San Jose, 
2014-04-23
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/59fca9b5/site/mailing-lists.html
--
diff --git a/site/mailing-lists.html b/site/mailing-lists.html
index f3e475e..c0ce369 100644
--- a/site/mailing-lists.html
+++ b/site/mailing-lists.html
@@ -12,7 +12,7 @@
 
   
 
-http://localhost:4000community.html"; />
+
   
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/59fca9b5/site/news/index.html
--
diff --git a/site/news/index.html b/site/news/index.html
index 488a16d..9dad603 100644
--- a/site/news/index.html
+++ b/site/news/index.html
@@ -424,7 +424,7 @@ The Summit will contain https://spark-summit.org/2015/schedule/";>presen
   Spark Summit East 2015 
Videos Posted
   April 20, 2015
 
-The videos and slides for Spark Summit East 
2015 are now all http://spark-summit.org/east/2015";>available 
online. Watch them to get the latest news from the Spark community as well 
as use cases and applications built on top. 
+The videos and slides for Spark Summit East 
2015 are now all http://spark-summit.org/east/2015";>available 
online. Watch them to get the latest news from the Spark community as well 
as use cases and applications built on top.
 
 
   
@@ -434,7 +434,7 @@ The Summit will contain https://spark-summit.org/2015/schedule/";>presen
   Spark 
1.2.2 and 1.3.1 released
   April 17, 2015
 
-We are happy to announce the availability of 
Spark 
1.2.2 and Spark 1.3.1! These are both maintenance releases that collectively 
feature the work of more than 90 developers. 
+We are happy to announce the availability of 
Spark 
1.2.2 and Spark 1.3.1! These are both maintenance releases that collectively 
feature the work of more than 90 developers.
 
 
   
@@ -546,7 +546,7 @@ The Summit will contai

spark git commit: [SPARK-18448][CORE] Fix @since 2.1.0 on new SparkSession.close() method

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 30a6fbbb0 -> 15ad3a319


[SPARK-18448][CORE] Fix @since 2.1.0 on new SparkSession.close() method

## What changes were proposed in this pull request?

Fix the `@since` tag to 2.1.0 on the new SparkSession.close() method. I goofed in 
https://github.com/apache/spark/pull/15932 because it was back-ported to 2.1 
instead of just master as originally planned.
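
For context, a small hypothetical usage of the method whose tag is corrected
here; `close()` is simply a synonym for `stop()`. The app name and local master
below are placeholders for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object CloseExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("close-example")   // placeholder name
      .master("local[*]")         // assumption: local mode, for illustration
      .getOrCreate()
    try {
      spark.range(5).count()      // any trivial job
    } finally {
      spark.close()               // equivalent to spark.stop()
    }
  }
}
```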

Author: Sean Owen 

Closes #15938 from srowen/SPARK-18448.2.

(cherry picked from commit ded5fefb6f5c0a97bf3d7fa1c0494dc434b6ee40)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/15ad3a31
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/15ad3a31
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/15ad3a31

Branch: refs/heads/branch-2.1
Commit: 15ad3a319b91a8b495da9a0e6f5386417991d30d
Parents: 30a6fbb
Author: Sean Owen 
Authored: Sat Nov 19 13:48:56 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 13:49:06 2016 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/15ad3a31/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index e09e3ca..71b1880 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -652,7 +652,7 @@ class SparkSession private(
   /**
* Synonym for `stop()`.
*
-   * @since 2.2.0
+   * @since 2.1.0
*/
   override def close(): Unit = stop()
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18448][CORE] Fix @since 2.1.0 on new SparkSession.close() method

2016-11-19 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 8b1e1088e -> ded5fefb6


[SPARK-18448][CORE] Fix @since 2.1.0 on new SparkSession.close() method

## What changes were proposed in this pull request?

Fix the `@since` tag to 2.1.0 on the new SparkSession.close() method. I goofed in 
https://github.com/apache/spark/pull/15932 because it was back-ported to 2.1 
instead of just master as originally planned.

Author: Sean Owen 

Closes #15938 from srowen/SPARK-18448.2.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ded5fefb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ded5fefb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ded5fefb

Branch: refs/heads/master
Commit: ded5fefb6f5c0a97bf3d7fa1c0494dc434b6ee40
Parents: 8b1e108
Author: Sean Owen 
Authored: Sat Nov 19 13:48:56 2016 +
Committer: Sean Owen 
Committed: Sat Nov 19 13:48:56 2016 +

--
 sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ded5fefb/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
index e09e3ca..71b1880 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala
@@ -652,7 +652,7 @@ class SparkSession private(
   /**
* Synonym for `stop()`.
*
-   * @since 2.2.0
+   * @since 2.1.0
*/
   override def close(): Unit = stop()
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark-website git commit: remove broken links from the documentation site

2016-11-19 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 59fca9b5c -> d85c53c4d


remove broken links from the documentation site


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/d85c53c4
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/d85c53c4
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/d85c53c4

Branch: refs/heads/asf-site
Commit: d85c53c4d3862e2da162eb178ed808649b6320ae
Parents: 59fca9b
Author: Christine Koppelt 
Authored: Sat Nov 19 14:22:30 2016 +0100
Committer: Christine Koppelt 
Committed: Sat Nov 19 16:32:13 2016 +0100

--
 documentation.md| 4 
 site/documentation.html | 4 
 2 files changed, 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/d85c53c4/documentation.md
--
diff --git a/documentation.md b/documentation.md
index c2f4506..d5597f0 100644
--- a/documentation.md
+++ b/documentation.md
@@ -145,13 +145,9 @@ Slides, videos and EC2-based exercises from each of these 
are available online:
   http://engineering.ooyala.com/blog/using-parquet-and-scrooge-spark";>Using 
Parquet and Scrooge with Spark — Scala-friendly Parquet and Avro 
usage tutorial from Ooyala's Evan Chan
   http://codeforhire.com/2014/02/18/using-spark-with-mongodb/";>Using Spark 
with MongoDB — by Sampo Niskanen from Wellmo
   http://spark-summit.org/2013";>Spark Summit 2013 — 
contained 30 talks about Spark use cases, available as slides and videos
-  http://www.pwendell.com/2013/09/28/declarative-streams.html";>Sampling 
Twitter Using Declarative Streams — Spark Streaming tutorial by 
Patrick Wendell
   http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/";>A 
Powerful Big Data Trio: Spark, Parquet and Avro — Using Parquet in 
Spark by Matt Massie
   http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final";>Real-time
 Analytics with Cassandra, Spark, and Shark — Presentation by Evan 
Chan from Ooyala at 2013 Cassandra Summit
-  http://syndeticlogic.net/?p=311";>Getting Spark Setup in 
Eclipse — Developer blog post by James Percent
   http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923";>Run 
Spark and Shark on Amazon Elastic MapReduce — Article by Amazon 
Elastic MapReduce team member Parviz Deyhim
-  http://blog.quantifind.com/posts/spark-unit-test/";>Unit testing 
with Spark — Quantifind tech blog post by Imran Rashid
-  http://blog.quantifind.com/posts/logging-post/";>Configuring 
Spark logs — Quantifind tech blog by Imran Rashid
   http://www.ibm.com/developerworks/library/os-spark/";>Spark, an 
alternative for fast data analytics — IBM Developer Works article by 
M. Tim Jones
 
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d85c53c4/site/documentation.html
--
diff --git a/site/documentation.html b/site/documentation.html
index 2c580c9..f30830f 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -320,13 +320,9 @@ Slides, videos and EC2-based exercises from each of these 
are available online:
   http://engineering.ooyala.com/blog/using-parquet-and-scrooge-spark";>Using 
Parquet and Scrooge with Spark — Scala-friendly Parquet and Avro 
usage tutorial from Ooyala's Evan Chan
   http://codeforhire.com/2014/02/18/using-spark-with-mongodb/";>Using Spark 
with MongoDB — by Sampo Niskanen from Wellmo
   http://spark-summit.org/2013";>Spark Summit 2013 — 
contained 30 talks about Spark use cases, available as slides and videos
-  http://www.pwendell.com/2013/09/28/declarative-streams.html";>Sampling 
Twitter Using Declarative Streams — Spark Streaming tutorial by 
Patrick Wendell
   http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/";>A 
Powerful Big Data Trio: Spark, Parquet and Avro — Using Parquet in 
Spark by Matt Massie
   http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final";>Real-time
 Analytics with Cassandra, Spark, and Shark — Presentation by Evan 
Chan from Ooyala at 2013 Cassandra Summit
-  http://syndeticlogic.net/?p=311";>Getting Spark Setup in 
Eclipse — Developer blog post by James Percent
   http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923";>Run 
Spark and Shark on Amazon Elastic MapReduce — Article by Amazon 
Elastic MapReduce team member Parviz Deyhim
-  http://blog.quantifind.com/posts/spark-unit-test/";>Unit testing 
with Spark — Quantifind tech blog post by Imran Rashid
-  http://blog.quantifind.com/posts/logging-post/";>Configuring 
Spark logs — Quantifind tech blog by Imran Rashid
   http://www.ibm.com/developerworks/library/os-spark/";>Spark, an 
alternative for fast data analytics — IBM Developer Works article by 
M. Tim Jones
 
 


-

spark git commit: [SPARK-3359][BUILD][DOCS] Print examples and disable group and tparam tags in javadoc

2016-11-20 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 7ca7a6352 -> c528812ce


[SPARK-3359][BUILD][DOCS] Print examples and disable group and tparam tags in 
javadoc

## What changes were proposed in this pull request?

This PR proposes/fixes two things.

- Remove many of the errors raised when generating javadoc with Java 8, caused by 
the unrecognisable tags `tparam` and `group`.

  ```
  [error] 
.../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:18:
 error: unknown tag: group
  [error]   /** group setParam */
  [error]   ^
  [error] 
.../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:8:
 error: unknown tag: tparam
  [error]  * tparam FeaturesType  Type of input features.  E.g., 
Vector
  [error]^
  ...
  ```

  It does not fully resolve the problem, but it removes many errors. It seems both 
`group` and `tparam` are unrecognisable to javadoc, and we can't render them as 
nicely as `example` here because they appear differently (both examples can be 
found in 
http://spark.apache.org/docs/2.0.2/api/scala/index.html#org.apache.spark.ml.classification.Classifier).

- Print `example` in javadoc.
  Currently, there are a few `example` tags in several places.

  ```
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
operation might be used to evaluate a graph
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example We 
might use this operation to change the vertex values
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example We 
can use this function to compute the in-degree of each
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function is used to update the vertices with new values based on external data.
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphLoader.scala:   * 
example Loads a file in the following format:
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * example 
This function is used to update the vertices with new
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * example 
This function can be used to filter the graph based on some property, without
  ./graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala: * example We 
can use the Pregel abstraction to implement PageRank:
  ./graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala: * example 
Construct a `VertexRDD` from a plain RDD:
  
./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkCommandLine.scala: 
* example new SparkCommandLine(Nil).settings
  ./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkIMain.scala:   * 
example addImports("org.apache.spark.SparkContext")
  
./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala:
 * example {{{
  ```

**Before**

  https://cloud.githubusercontent.com/assets/6477701/20457285/26f07e1c-aecb-11e6-9ae9-d9dee66845f4.png";>

**After**
  https://cloud.githubusercontent.com/assets/6477701/20457240/409124e4-aeca-11e6-9a91-0ba514148b52.png";>

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Note: this does not make sbt unidoc succeed with Java 8 yet but it reduces the 
number of errors with Java 8.
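
For reference, a minimal sketch (hypothetical code, not from Spark) of an
`@example` block of the kind this change makes visible in the generated docs:

```scala
object ExampleTagDemo {
  /**
   * Builds a greeting for the given name.
   *
   * @example {{{
   * ExampleTagDemo.greet("Spark")   // returns "Hello, Spark"
   * }}}
   */
  def greet(name: String): String = s"Hello, $name"
}
```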

Author: hyukjinkwon 

Closes #15939 from HyukjinKwon/SPARK-3359-javadoc.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c528812c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c528812c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c528812c

Branch: refs/heads/master
Commit: c528812ce770fd8a6626e7f9d2f8ca9d1e84642b
Parents: 7ca7a63
Author: hyukjinkwon 
Authored: Sun Nov 20 09:52:03 2016 +
Committer: Sean Owen 
Committed: Sun Nov 20 09:52:03 2016 +

--
 pom.xml  | 13 +
 project/SparkBuild.scala |  5 -
 2 files changed, 17 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c528812c/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 024b285..7c0b0b5 100644
--- a/pom.xml
+++ b/pom.xml
@@ -2478,10 +24

spark git commit: [SPARK-3359][BUILD][DOCS] Print examples and disable group and tparam tags in javadoc

2016-11-20 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 063da0c8d -> bc3e7b3b8


[SPARK-3359][BUILD][DOCS] Print examples and disable group and tparam tags in 
javadoc

## What changes were proposed in this pull request?

This PR proposes/fixes two things.

- Remove many of the errors raised when generating javadoc with Java 8, caused by 
the unrecognisable tags `tparam` and `group`.

  ```
  [error] 
.../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:18:
 error: unknown tag: group
  [error]   /** group setParam */
  [error]   ^
  [error] 
.../spark/mllib/target/java/org/apache/spark/ml/classification/Classifier.java:8:
 error: unknown tag: tparam
  [error]  * tparam FeaturesType  Type of input features.  E.g., 
Vector
  [error]^
  ...
  ```

  It does not fully resolve the problem, but it removes many errors. It seems both 
`group` and `tparam` are unrecognisable to javadoc, and we can't render them as 
nicely as `example` here because they appear differently (both examples can be 
found in 
http://spark.apache.org/docs/2.0.2/api/scala/index.html#org.apache.spark.ml.classification.Classifier).

- Print `example` in javadoc.
  Currently, there are a few `example` tags in several places.

  ```
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
operation might be used to evaluate a graph
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example We 
might use this operation to change the vertex values
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function might be used to initialize edge
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example We 
can use this function to compute the in-degree of each
  ./graphx/src/main/scala/org/apache/spark/graphx/Graph.scala:   * example This 
function is used to update the vertices with new values based on external data.
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphLoader.scala:   * 
example Loads a file in the following format:
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * example 
This function is used to update the vertices with new
  ./graphx/src/main/scala/org/apache/spark/graphx/GraphOps.scala:   * example 
This function can be used to filter the graph based on some property, without
  ./graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala: * example We 
can use the Pregel abstraction to implement PageRank:
  ./graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala: * example 
Construct a `VertexRDD` from a plain RDD:
  
./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkCommandLine.scala: 
* example new SparkCommandLine(Nil).settings
  ./repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkIMain.scala:   * 
example addImports("org.apache.spark.SparkContext")
  
./sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralGenerator.scala:
 * example {{{
  ```

**Before**

  https://cloud.githubusercontent.com/assets/6477701/20457285/26f07e1c-aecb-11e6-9ae9-d9dee66845f4.png";>

**After**
  https://cloud.githubusercontent.com/assets/6477701/20457240/409124e4-aeca-11e6-9a91-0ba514148b52.png";>

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Note: this does not make sbt unidoc succeed with Java 8 yet but it reduces the 
number of errors with Java 8.

Author: hyukjinkwon 

Closes #15939 from HyukjinKwon/SPARK-3359-javadoc.

(cherry picked from commit c528812ce770fd8a6626e7f9d2f8ca9d1e84642b)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bc3e7b3b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bc3e7b3b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bc3e7b3b

Branch: refs/heads/branch-2.1
Commit: bc3e7b3b8a0dfc00d22bf5ee168f308a6ef5d78b
Parents: 063da0c
Author: hyukjinkwon 
Authored: Sun Nov 20 09:52:03 2016 +
Committer: Sean Owen 
Committed: Sun Nov 20 09:52:13 2016 +

--
 pom.xml  | 13 +
 project/SparkBuild.scala |  5 -
 2 files changed, 17 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bc3e7b3b/pom.xml
--

spark-website git commit: Add Spark 2.0.2 to documentation.html

2016-11-20 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site d85c53c4d -> 3529c3f19


Add Spark 2.0.2 to documentation.html


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/3529c3f1
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/3529c3f1
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/3529c3f1

Branch: refs/heads/asf-site
Commit: 3529c3f195432eb0cef899a173ba7ffc12109a19
Parents: d85c53c
Author: Sean Owen 
Authored: Sat Nov 19 17:39:59 2016 +
Committer: Sean Owen 
Committed: Sun Nov 20 10:38:49 2016 +

--
 documentation.md| 3 ++-
 site/documentation.html | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/3529c3f1/documentation.md
--
diff --git a/documentation.md b/documentation.md
index d5597f0..ad1c529 100644
--- a/documentation.md
+++ b/documentation.md
@@ -12,7 +12,8 @@ navigation:
 Setup instructions, programming guides, and other documentation are 
available for each stable version of Spark below:
 
 
-  Spark 2.0.1 (latest 
release)
+  Spark 2.0.2
+  Spark 2.0.1
   Spark 2.0.0
   Spark 1.6.3
   Spark 1.6.2

http://git-wip-us.apache.org/repos/asf/spark-website/blob/3529c3f1/site/documentation.html
--
diff --git a/site/documentation.html b/site/documentation.html
index f30830f..7f7830f 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -188,7 +188,8 @@
 Setup instructions, programming guides, and other documentation are 
available for each stable version of Spark below:
 
 
-  Spark 2.0.1 (latest release)
+  Spark 2.0.2
+  Spark 2.0.1
   Spark 2.0.0
   Spark 1.6.3
   Spark 1.6.2


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[1/2] spark-website git commit: Update sitemap and make it partly auto-generate; fix site URL

2016-11-20 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 3529c3f19 -> 8d0a10ae0


http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d0a10ae/sitemap.xml
--
diff --git a/sitemap.xml b/sitemap.xml
index c2b271e..ac3b8a0 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -1,1871 +1,151 @@
+---
+sitemap: false
+---
 
 http://www.sitemaps.org/schemas/sitemap/0.9";
-  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
-  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
+xmlns="http://www.sitemaps.org/schemas/sitemap/0.9";
+xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd";>
-
+
 
   http://spark.apache.org/
-  2015-01-22T00:27:22+00:00
   daily
-
-
-  http://spark.apache.org/downloads.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/sql/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/streaming/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/mllib/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/graphx/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/documentation.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/docs/latest/
   1.0
-  2014-12-19T00:12:40+00:00
-  weekly
-
-
-  http://spark.apache.org/examples.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/community.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/faq.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-summit-east-agenda-posted.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-1-2-0-released.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-1-1-1-released.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  
http://spark.apache.org/news/registration-open-for-spark-summit-east.html
-  2015-01-22T00:27:22+00:00
-  weekly
 
+
 
-  http://spark.apache.org/news/index.html
-  2015-01-22T00:27:22+00:00
+  http://spark.apache.org/docs/latest/index.html
   daily
-
-
-  http://spark.apache.org/docs/latest/spark-standalone.html
-  2014-12-19T00:12:40+00:00
-  weekly
-  1.0
-
-
-  http://spark.apache.org/docs/latest/ec2-scripts.html
-  2014-12-19T00:12:40+00:00
-  weekly
   1.0
 
 
   http://spark.apache.org/docs/latest/quick-start.html
-  2014-12-19T00:12:40+00:00
-  weekly
-  1.0
-
-
-  http://spark.apache.org/releases/spark-release-1-2-0.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/docs/latest/building-spark.html
-  2014-12-19T00:12:40+00:00
-  weekly
+  daily
   1.0
 
 
-  http://spark.apache.org/docs/latest/sql-programming-guide.html
-  2014-12-19T00:12:40+00:00
-  weekly
+  http://spark.apache.org/docs/latest/programming-guide.html
+  daily
   1.0
 
 
   
http://spark.apache.org/docs/latest/streaming-programming-guide.html
-  2014-12-19T00:12:40+00:00
-  weekly
-  1.0
-
-
-  http://spark.apache.org/docs/latest/mllib-guide.html
-  2015-01-15T02:38:52+00:00
-  weekly
-  1.0
-
-
-  http://spark.apache.org/docs/latest/graphx-programming-guide.html
-  2014-12-19T00:12:40+00:00
-  weekly
-  1.0
-
-
-  http://spark.apache.org/docs/1.2.0/
-  2014-11-24T23:38:52+00:00
-  weekly
-  0.5
-
-
-  http://spark.apache.org/docs/1.1.1/
-  2014-11-24T23:38:52+00:00
-  weekly
-  0.5
-
-
-  http://spark.apache.org/docs/1.0.2/
-  2014-08-06T00:40:54+00:00
-  weekly
-  0.5
-
-
-  http://spark.apache.org/docs/0.9.2/
-  2014-07-23T23:08:20+00:00
-  weekly
-  0.4
-
-
-  http://spark.apache.org/docs/0.8.1/
-  2013-12-19T23:20:24+00:00
-  weekly
-  0.4
-
-
-  http://spark.apache.org/docs/0.7.3/
-  2013-08-24T03:23:13+00:00
-  weekly
-  0.3
-
-
-  http://spark.apache.org/docs/0.6.2/
-  2013-08-24T03:23:13+00:00
-  weekly
-  0.3
-
-
-  http://spark.apache.org/screencasts/1-first-steps-with-spark.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  
http://spark.apache.org/screencasts/2-spark-documentation-overview.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  
http://spark.apache.org/screencasts/3-transformations-and-caching.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  
http://spark.apache.org/screencasts/4-a-standalone-job-in-spark.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/research.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/docs/latest/index.html
-  2014-12-19T00:12:40+00:00
-  weekly
+  daily
   1.0
 
 
-  http://spark.apache.org/docs/latest/programming-guide.html
-  2014-12-19T00:12:40+00:00
-  weekly
+  http://spark.apache.org/docs/latest/sql-programming-guide.html
+  daily
   1.0
 
 
-  http://spark.apache.org/docs/latest/bagel-programming-guide.html
-  2014-12-19T00:12:40+00:00
-  weekly
+  
http://spark.apache.org/docs/latest

[2/2] spark-website git commit: Update sitemap and make it partly auto-generate; fix site URL

2016-11-20 Thread srowen
Update sitemap and make it partly auto-generate; fix site URL


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/8d0a10ae
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/8d0a10ae
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/8d0a10ae

Branch: refs/heads/asf-site
Commit: 8d0a10ae0be7df4058b38d298df391fe1fc1cdab
Parents: 3529c3f
Author: Sean Owen 
Authored: Sat Nov 19 19:14:33 2016 +
Committer: Sean Owen 
Committed: Sun Nov 20 12:32:55 2016 +

--
 _config.yml |9 +-
 _layouts/global.html|2 +-
 site/mailing-lists.html |2 +-
 site/sitemap.xml| 1602 +
 sitemap.xml | 1824 ++
 5 files changed, 245 insertions(+), 3194 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d0a10ae/_config.yml
--
diff --git a/_config.yml b/_config.yml
index cf7c694..18ba30f 100644
--- a/_config.yml
+++ b/_config.yml
@@ -6,11 +6,4 @@ permalink: none
 destination: site
 exclude: ['README.md','content']
 keep_files: ['docs']
-
-# The recommended way of viewing the website on your local machine is via 
jekyll using
-# a webserver, e.g. with a command like: jekyll serve --watch --trace
-# To compile the website such that it is viewable in a sane way via a file 
browser
-# replace '/' here with the url that represents the website root dir in your 
file browser.
-# E.g. on OS X this might be:
-#url: file:///Users/andyk/Development/spark-website/content/
-url: /
+url: http://spark.apache.org
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d0a10ae/_layouts/global.html
--
diff --git a/_layouts/global.html b/_layouts/global.html
index b35de94..6f02c16 100644
--- a/_layouts/global.html
+++ b/_layouts/global.html
@@ -13,7 +13,7 @@
 
   {% if page.redirect %}
 
-
+
   {% endif %}
 
   {% if page.description %}

http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d0a10ae/site/mailing-lists.html
--
diff --git a/site/mailing-lists.html b/site/mailing-lists.html
index c0ce369..aec2aa0 100644
--- a/site/mailing-lists.html
+++ b/site/mailing-lists.html
@@ -12,7 +12,7 @@
 
   
 
-
+http://spark.apache.org/community.html"; />
   
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d0a10ae/site/sitemap.xml
--
diff --git a/site/sitemap.xml b/site/sitemap.xml
index c2b271e..db7822b 100644
--- a/site/sitemap.xml
+++ b/site/sitemap.xml
@@ -1,1871 +1,649 @@
 
 http://www.sitemaps.org/schemas/sitemap/0.9";
-  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
-  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
+xmlns="http://www.sitemaps.org/schemas/sitemap/0.9";
+xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd";>
-
-
-  http://spark.apache.org/
-  2015-01-22T00:27:22+00:00
-  daily
-
-
-  http://spark.apache.org/downloads.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/sql/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/streaming/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/mllib/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/graphx/
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/documentation.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/docs/latest/
-  1.0
-  2014-12-19T00:12:40+00:00
-  weekly
-
-
-  http://spark.apache.org/examples.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/community.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/faq.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-summit-east-agenda-posted.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-1-2-0-released.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/spark-1-1-1-released.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  
http://spark.apache.org/news/registration-open-for-spark-summit-east.html
-  2015-01-22T00:27:22+00:00
-  weekly
-
-
-  http://spark.apache.org/news/index.html
-  2015-01-22T00:27:22+00:00
-  daily
-
-
-  http://spark.apache.org/docs/latest/spark-standalone.html
-  2014-12-19T00:12:40+00:00
-  weekly
-  1.0
-
-
-  http://spark.apach

spark-website git commit: Fix doc link and touch up sitemap; mostly an excuse to trigger github sync again

2016-11-20 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 8d0a10ae0 -> e45462a52


Fix doc link and touch up sitemap; mostly an excuse to trigger github sync again


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/e45462a5
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/e45462a5
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/e45462a5

Branch: refs/heads/asf-site
Commit: e45462a529fdefa776cbaf1eb41d021b0a5e4f8a
Parents: 8d0a10a
Author: Sean Owen 
Authored: Sun Nov 20 13:18:53 2016 +
Committer: Sean Owen 
Committed: Sun Nov 20 13:18:53 2016 +

--
 documentation.md|  2 +-
 site/documentation.html |  2 +-
 site/sitemap.xml| 11 +--
 sitemap.xml |  7 +--
 4 files changed, 12 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/e45462a5/documentation.md
--
diff --git a/documentation.md b/documentation.md
index ad1c529..0fa10c2 100644
--- a/documentation.md
+++ b/documentation.md
@@ -12,7 +12,7 @@ navigation:
 Setup instructions, programming guides, and other documentation are 
available for each stable version of Spark below:
 
 
-  Spark 2.0.2
+  Spark 2.0.2
   Spark 2.0.1
   Spark 2.0.0
   Spark 1.6.3

http://git-wip-us.apache.org/repos/asf/spark-website/blob/e45462a5/site/documentation.html
--
diff --git a/site/documentation.html b/site/documentation.html
index 7f7830f..d7f2a80 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -188,7 +188,7 @@
 Setup instructions, programming guides, and other documentation are 
available for each stable version of Spark below:
 
 
-  Spark 2.0.2
+  Spark 2.0.2
   Spark 2.0.1
   Spark 2.0.0
   Spark 1.6.3

http://git-wip-us.apache.org/repos/asf/spark-website/blob/e45462a5/site/sitemap.xml
--
diff --git a/site/sitemap.xml b/site/sitemap.xml
index db7822b..4b7f11f 100644
--- a/site/sitemap.xml
+++ b/site/sitemap.xml
@@ -116,7 +116,6 @@
   daily
   1.0
 
-
 
 
   
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package
@@ -125,22 +124,20 @@
 
 
   http://spark.apache.org/docs/latest/api/java/index.html
-  daily
+  weekly
   1.0
 
 
   http://spark.apache.org/docs/latest/api/python/index.html
-  daily
+  weekly
   1.0
 
 
   http://spark.apache.org/docs/latest/api/R/index.html
-  daily
+  weekly
   1.0
 
-
 
-
 
   
http://spark.apache.org/news/spark-wins-cloudsort-100tb-benchmark.html
   weekly
@@ -642,8 +639,10 @@
   http://spark.apache.org/research.html
   weekly
 
+
 
   http://spark.apache.org/trademarks.html
   weekly
 
+
 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark-website/blob/e45462a5/sitemap.xml
--
diff --git a/sitemap.xml b/sitemap.xml
index ac3b8a0..73b2780 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -144,8 +144,11 @@ sitemap: false
 {% for post in site.posts %}
   {{ site.url }}{{ post.url }}
   weekly
-{% endfor %}{% for page in site.pages %}{% if page.sitemap != false 
%}
+
+{% endfor %}
+{% for page in site.pages %}{% if page.sitemap != false %}
   {{ site.url }}{{ page.url }}
   weekly
-{% endif %}{% endfor %}
+{% endif %}
+{% endfor %}
 
\ No newline at end of file


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark-website git commit: Updated FAQ to point to Community Page / StackOverflow

2016-11-21 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site e45462a52 -> d10414f83


Updated FAQ to point to Community Page / StackOverflow


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/d10414f8
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/d10414f8
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/d10414f8

Branch: refs/heads/asf-site
Commit: d10414f8390c5e2a6a4f11cfcf731d5b4c52b289
Parents: e45462a
Author: Denny Lee 
Authored: Wed Nov 16 09:12:38 2016 -0800
Committer: Sean Owen 
Committed: Mon Nov 21 09:22:52 2016 +

--
 faq.md| 3 ++-
 site/faq.html | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/d10414f8/faq.md
--
diff --git a/faq.md b/faq.md
index f8aa072..8d048aa 100644
--- a/faq.md
+++ b/faq.md
@@ -70,4 +70,5 @@ Please also refer to our
 See the https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>Contributing
 to Spark wiki for more information.
 
 Where can I get more help?
-Please post on the http://apache-spark-user-list.1001560.n3.nabble.com";>Spark Users 
mailing list.  We'll be glad to help!
+
+Please post on StackOverflow's http://stackoverflow.com/questions/tagged/apache-spark";>apache-spark
 tag or http://apache-spark-user-list.1001560.n3.nabble.com";>Spark 
Users mailing list.  For more information, please refer to http://spark.apache.org/community.html#have-questions";>Have 
Questions?.  We'll be glad to help!

http://git-wip-us.apache.org/repos/asf/spark-website/blob/d10414f8/site/faq.html
--
diff --git a/site/faq.html b/site/faq.html
index 3c35726..c7650a1 100644
--- a/site/faq.html
+++ b/site/faq.html
@@ -246,7 +246,8 @@ Please also refer to our
 See the https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>Contributing
 to Spark wiki for more information.
 
 Where can I get more help?
-Please post on the http://apache-spark-user-list.1001560.n3.nabble.com";>Spark Users 
mailing list.  We'll be glad to help!
+
+Please post on StackOverflow's http://stackoverflow.com/questions/tagged/apache-spark";>apache-spark
 tag or http://apache-spark-user-list.1001560.n3.nabble.com";>Spark 
Users mailing list.  For more information, please refer to http://spark.apache.org/community.html#have-questions";>Have 
Questions?.  We'll be glad to help!
 
 
   


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark-website git commit: add Pull Request Template

2016-11-21 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site d10414f83 -> 46fb91025


add Pull Request Template


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/46fb9102
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/46fb9102
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/46fb9102

Branch: refs/heads/asf-site
Commit: 46fb91025482150f7e8eba20606211d9c016009f
Parents: d10414f
Author: Christine Koppelt 
Authored: Sat Nov 19 17:02:39 2016 +0100
Committer: Sean Owen 
Committed: Mon Nov 21 09:26:18 2016 +

--
 .github/CONTRIBUTING.md  | 1 +
 .github/PULL_REQUEST_TEMPLATE.md | 1 +
 2 files changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/46fb9102/.github/CONTRIBUTING.md
--
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
new file mode 100644
index 000..7b5dc0b
--- /dev/null
+++ b/.github/CONTRIBUTING.md
@@ -0,0 +1 @@
+Make sure that you generate site HTML with `jekyll build`, and include the 
changes to the HTML in your pull request also. See README.md for more 
information.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/46fb9102/.github/PULL_REQUEST_TEMPLATE.md
--
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 000..5121ede
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1 @@
+*Make sure that you generate site HTML with `jekyll build`, and include the 
changes to the HTML in your pull request also. See README.md for more 
information. Please remove this message.*


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-21 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 9f262ae16 -> 07beb5d21


[SPARK-18413][SQL] Add `maxConnections` JDBCOption

## What changes were proposed in this pull request?

This PR adds a new JDBC option, `maxConnections`, which sets the maximum number 
of simultaneous JDBC connections allowed. The option applies only to writing and 
works by coalescing the output when needed; it defaults to the number of 
partitions of the RDD. Previously, SQL users could not control this, while 
Scala/Java/Python users could use the `coalesce` (or `repartition`) API.

**Reported Scenario**

For the following cases, the number of connections becomes 200 and the database 
cannot handle all of them.

```sql
CREATE OR REPLACE TEMPORARY VIEW resultview
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:oracle:thin:10.129.10.111:1521:BKDB",
  dbtable "result",
  user "HIVE",
  password "HIVE"
);
-- set spark.sql.shuffle.partitions=200
INSERT OVERWRITE TABLE resultview SELECT g, count(1) AS COUNT FROM 
tnet.DT_LIVE_INFO GROUP BY g
```

## How was this patch tested?

Manual. Do the followings and see Spark UI.

**Step 1 (MySQL)**
```
CREATE TABLE t1 (a INT);
CREATE TABLE data (a INT);
INSERT INTO data VALUES (1);
INSERT INTO data VALUES (2);
INSERT INTO data VALUES (3);
```

**Step 2 (Spark)**
```scala
SPARK_HOME=$PWD bin/spark-shell --driver-memory 4G --driver-class-path 
mysql-connector-java-5.1.40-bin.jar
scala> sql("SET spark.sql.shuffle.partitions=3")
scala> sql("CREATE OR REPLACE TEMPORARY VIEW data USING 
org.apache.spark.sql.jdbc OPTIONS (url 'jdbc:mysql://localhost:3306/t', dbtable 
'data', user 'root', password '')")
scala> sql("CREATE OR REPLACE TEMPORARY VIEW t1 USING org.apache.spark.sql.jdbc 
OPTIONS (url 'jdbc:mysql://localhost:3306/t', dbtable 't1', user 'root', 
password '', maxConnections '1')")
scala> sql("INSERT OVERWRITE TABLE t1 SELECT a FROM data GROUP BY a")
scala> sql("CREATE OR REPLACE TEMPORARY VIEW t1 USING org.apache.spark.sql.jdbc 
OPTIONS (url 'jdbc:mysql://localhost:3306/t', dbtable 't1', user 'root', 
password '', maxConnections '2')")
scala> sql("INSERT OVERWRITE TABLE t1 SELECT a FROM data GROUP BY a")
scala> sql("CREATE OR REPLACE TEMPORARY VIEW t1 USING org.apache.spark.sql.jdbc 
OPTIONS (url 'jdbc:mysql://localhost:3306/t', dbtable 't1', user 'root', 
password '', maxConnections '3')")
scala> sql("INSERT OVERWRITE TABLE t1 SELECT a FROM data GROUP BY a")
scala> sql("CREATE OR REPLACE TEMPORARY VIEW t1 USING org.apache.spark.sql.jdbc 
OPTIONS (url 'jdbc:mysql://localhost:3306/t', dbtable 't1', user 'root', 
password '', maxConnections '4')")
scala> sql("INSERT OVERWRITE TABLE t1 SELECT a FROM data GROUP BY a")
```

![maxconnections](https://cloud.githubusercontent.com/assets/9700541/20287987/ed8409c2-aa84-11e6-8aab-ae28e63fe54d.png)
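
As a rough sketch of supplying the new option through the DataFrame writer API
rather than SQL (the URL, table and credentials below are placeholders, the MySQL
driver is assumed to be on the classpath, and it is assumed the option is picked
up from the connection properties just as it is from the SQL OPTIONS clause):

```scala
import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

object MaxConnectionsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("maxConnections-example").getOrCreate()
    val df = spark.range(0, 100).toDF("a")   // toy data to write

    val props = new Properties()
    props.put("user", "root")                // placeholder credentials
    props.put("password", "")
    props.put("maxConnections", "2")         // cap concurrent JDBC connections for this write

    df.write
      .mode(SaveMode.Overwrite)
      .jdbc("jdbc:mysql://localhost:3306/t", "t1", props)   // placeholder URL and table

    spark.stop()
  }
}
```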

Author: Dongjoon Hyun 

Closes #15868 from dongjoon-hyun/SPARK-18413.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/07beb5d2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/07beb5d2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/07beb5d2

Branch: refs/heads/master
Commit: 07beb5d21c6803e80733149f1560c71cd3cacc86
Parents: 9f262ae
Author: Dongjoon Hyun 
Authored: Mon Nov 21 13:57:36 2016 +
Committer: Sean Owen 
Committed: Mon Nov 21 13:57:36 2016 +

--
 docs/sql-programming-guide.md   |  7 +++
 .../sql/execution/datasources/jdbc/JDBCOptions.scala|  6 ++
 .../sql/execution/datasources/jdbc/JdbcUtils.scala  |  9 -
 .../org/apache/spark/sql/jdbc/JDBCWriteSuite.scala  | 12 
 4 files changed, 33 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/07beb5d2/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index ba3e55f..656e7ec 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1087,6 +1087,13 @@ the following case-sensitive options:
   
 
   
+ maxConnections
+ 
+   The maximum number of concurrent JDBC connections that can be used, if 
set. Only applies when writing. It works by limiting the operation's 
parallelism, which depends on the input's partition count. If its partition 
count exceeds this limit, the operation will coalesce the input to fewer 
partitions before writing.
+ 
+  
+
+  
  isolationLevel
  
The transaction isolation level, which applies to current connection. 
It can be one of NONE, READ_COMMITTED, 
READ_UNCOMMITTED, REPEATABLE_READ, or 
SERIALIZABLE, corresponding to standard transaction isolation 
levels defined by JDBC's Connection object, with default of 
READ_UNCOMMITTED. This option applies only to writing. Please refer 
the documentation 

[3/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-summit-agenda-posted.html
--
diff --git a/site/news/spark-summit-agenda-posted.html 
b/site/news/spark-summit-agenda-posted.html
index 2970d73..2ba41f0 100644
--- a/site/news/spark-summit-agenda-posted.html
+++ b/site/news/spark-summit-agenda-posted.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-summit-east-2015-videos-posted.html
--
diff --git a/site/news/spark-summit-east-2015-videos-posted.html 
b/site/news/spark-summit-east-2015-videos-posted.html
index fc1cbb8..0491329 100644
--- a/site/news/spark-summit-east-2015-videos-posted.html
+++ b/site/news/spark-summit-east-2015-videos-posted.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-summit-east-2016-cfp-closing.html
--
diff --git a/site/news/spark-summit-east-2016-cfp-closing.html 
b/site/news/spark-summit-east-2016-cfp-closing.html
index bec8c2b..c748312 100644
--- a/site/news/spark-summit-east-2016-cfp-closing.html
+++ b/site/news/spark-summit-east-2016-cfp-closing.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-summit-east-agenda-posted.html
--
diff --git a/site/news/spark-summit-east-agenda-posted.html 
b/site/news/spark-summit-east-agenda-posted.html
index 1328dc9..5852bfb 100644
--- a/site/news/spark-summit-east-agenda-posted.html
+++ b/site

[1/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 46fb91025 -> 0744e8fdd


http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/third-party-projects.html
--
diff --git a/site/third-party-projects.html b/site/third-party-projects.html
new file mode 100644
index 000..58d4893
--- /dev/null
+++ b/site/third-party-projects.html
@@ -0,0 +1,287 @@
+
+
+
+  
+  
+  
+
+  
+ Third-Party Projects | Apache Spark
+
+  
+
+  
+
+  
+
+  
+  
+  
+
+  
+  
+
+  
+  
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', 'UA-32518208-2']);
+  _gaq.push(['_trackPageview']);
+  (function() {
+var ga = document.createElement('script'); ga.type = 'text/javascript'; 
ga.async = true;
+ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 
'http://www') + '.google-analytics.com/ga.js';
+var s = document.getElementsByTagName('script')[0]; 
s.parentNode.insertBefore(ga, s);
+  })();
+
+  
+  function trackOutboundLink(link, category, action) {
+try {
+  _gaq.push(['_trackEvent', category , action]);
+} catch(err){}
+
+setTimeout(function() {
+  document.location.href = link.href;
+}, 100);
+  }
+  
+
+  
+  
+
+
+
+
+https://code.jquery.com/jquery.js";>
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
+
+
+
+
+
+
+  
+
+  
+  
+  Lightning-fast cluster computing
+  
+
+  
+
+
+
+  
+  
+
+  Toggle navigation
+  
+  
+  
+
+  
+
+  
+  
+
+  Download
+  
+
+  Libraries 
+
+
+  SQL and DataFrames
+  Spark Streaming
+  MLlib (machine learning)
+  GraphX (graph)
+  
+  Third-Party 
Projects
+
+  
+  
+
+  Documentation 
+
+
+  Latest Release (Spark 2.0.2)
+  Older Versions and Other 
Resources
+
+  
+  Examples
+  
+
+  Community 
+
+
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Events and Meetups
+  Project History
+  Powered By
+  Project Committers
+
+  
+  FAQ
+
+
+  
+http://www.apache.org/"; class="dropdown-toggle" 
data-toggle="dropdown">
+  Apache Software Foundation 
+
+  http://www.apache.org/";>Apache Homepage
+  http://www.apache.org/licenses/";>License
+  http://www.apache.org/foundation/sponsorship.html";>Sponsorship
+  http://www.apache.org/foundation/thanks.html";>Thanks
+  http://www.apache.org/security/";>Security
+
+  
+
+  
+  
+
+
+
+
+  
+
+  Latest News
+  
+
+  Spark 
wins CloudSort Benchmark as the most efficient engine
+  (Nov 15, 2016)
+
+  Spark 2.0.2 
released
+  (Nov 14, 2016)
+
+  Spark 1.6.3 
released
+  (Nov 07, 2016)
+
+  Spark 2.0.1 
released
+  (Oct 03, 2016)
+
+  
+  Archive
+
+
+  
+Download Spark
+  
+  
+Built-in Libraries:
+  
+  
+SQL and DataFrames
+Spark Streaming
+MLlib (machine learning)
+GraphX (graph)
+  
+  Third-Party Projects
+
+  
+
+  
+This page tracks external software projects that supplement Apache 
Spark and add to its ecosystem.
+
+spark-packages.org
+
+https://spark-packages.org/";>spark-packages.org is an 
external, 
+community-managed list of third-party libraries, add-ons, and applications 
that work with 
+Apache Spark. You can add a package as long as you have a GitHub 
repository.
+
+Infrastructure Projects
+
+
+  https://github.com/spark-jobserver/spark-jobserver";>Spark Job 
Server - 
+REST interface for managing and submitting Spark jobs on the same cluster 
+(see http://engineering.ooyala.com/blog/open-sourcing-our-spark-job-server";>blog
 post 
+for details)
+  https://github.com/amplab-extras/SparkR-pkg";>SparkR - R 
frontend for Spark
+  http://mlbase.org/";>MLbase - Machine Learning research 
project on top of Spark
+  http://mesos.apache.org/";>Apache Mesos - Cluster management 
system that supports 
+running Spark
+  http://alluxio.org/";>Alluxio (née Tachyon) - Memory speed 
virtual distributed 
+storage system that supports running Spark
+  https://github.com/datastax/spark-cassandra-connector";>Spark 
Cassandra Connector - 
+Easily load your Cassandra data into Spark and Spark SQL; from Datastax
+  http://github.com/tuplejump/FiloDB";>FiloDB - a Spark 
integrated analytical/columnar 
+database

[5/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/community.html
--
diff --git a/site/community.html b/site/community.html
index fe4aa35..1a5e80f 100644
--- a/site/community.html
+++ b/site/community.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 
@@ -380,7 +381,7 @@ and include only a few lines of the pertinent code / log 
within the email.
 
 Powered By
 
-Our wiki has a list of https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>projects
 and organizations powered by Spark.
+Our wiki has a list of projects and 
organizations powered by Spark.
 
 
 Project History

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/contributing.html
--
diff --git a/site/contributing.html b/site/contributing.html
new file mode 100644
index 000..72b5292
--- /dev/null
+++ b/site/contributing.html
@@ -0,0 +1,771 @@
+
+
+
+  
+  
+  
+
+  
+ Contributing to Spark | Apache Spark
+
+  
+
+  
+
+  
+
+  
+  
+  
+
+  
+  
+
+  
+  
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', 'UA-32518208-2']);
+  _gaq.push(['_trackPageview']);
+  (function() {
+var ga = document.createElement('script'); ga.type = 'text/javascript'; 
ga.async = true;
+ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 
'http://www') + '.google-analytics.com/ga.js';
+var s = document.getElementsByTagName('script')[0]; 
s.parentNode.insertBefore(ga, s);
+  })();
+
+  
+  function trackOutboundLink(link, category, action) {
+try {
+  _gaq.push(['_trackEvent', category , action]);
+} catch(err){}
+
+setTimeout(function() {
+  document.location.href = link.href;
+}, 100);
+  }
+  
+
+  
+  
+
+
+
+
+https://code.jquery.com/jquery.js";>
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
+
+
+
+
+
+
+  
+
+  
+  
+  Lightning-fast cluster computing
+  
+
+  
+
+
+
+  
+  
+
+  Toggle navigation
+  
+  
+  
+
+  
+
+  
+  
+
+  Download
+  
+
+  Libraries 
+
+
+  SQL and DataFrames
+  Spark Streaming
+  MLlib (machine learning)
+  GraphX (graph)
+  
+  Third-Party 
Projects
+
+  
+  
+
+  Documentation 
+
+
+  Latest Release (Spark 2.0.2)
+  Older Versions and Other 
Resources
+
+  
+  Examples
+  
+
+  Community 
+
+
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Events and Meetups
+  Project History
+  Powered By
+  Project Committers
+
+  
+  FAQ
+
+
+  
+http://www.apache.org/"; class="dropdown-toggle" 
data-toggle="dropdown">
+  Apache Software Foundation 
+
+  http://www.apache.org/";>Apache Homepage
+  http://www.apache.org/licenses/";>License
+  http://www.apache.org/foundation/sponsorship.html";>Sponsorship
+  http://www.apache.org/foundation/thanks.html";>Thanks
+  http://www.apache.org/security/";>Security
+
+  
+
+  
+  
+
+
+
+
+  
+
+  Latest News
+  
+
+  Spark 
wins CloudSort Benchmark as the most efficient engine
+  (Nov 15, 2016)
+
+  Spark 2.0.2 
released
+  (Nov 14, 2016)
+
+  Spark 1.6.3 
released
+  (Nov 07, 2016)
+
+  Spark 2.0.1 
released
+  (Oct 03, 2016

[2/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/releases/spark-release-0-7-2.html
--
diff --git a/site/releases/spark-release-0-7-2.html 
b/site/releases/spark-release-0-7-2.html
index 2506300..0ccc6a9 100644
--- a/site/releases/spark-release-0-7-2.html
+++ b/site/releases/spark-release-0-7-2.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/releases/spark-release-0-7-3.html
--
diff --git a/site/releases/spark-release-0-7-3.html 
b/site/releases/spark-release-0-7-3.html
index 0d04344..671c0c9 100644
--- a/site/releases/spark-release-0-7-3.html
+++ b/site/releases/spark-release-0-7-3.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/releases/spark-release-0-8-0.html
--
diff --git a/site/releases/spark-release-0-8-0.html 
b/site/releases/spark-release-0-8-0.html
index 8592aa4..bcdb04a 100644
--- a/site/releases/spark-release-0-8-0.html
+++ b/site/releases/spark-release-0-8-0.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/releases/spark-release-0-8-1.html
--
diff --git a/site/releases/spark-release-0-8-1.html 
b/site/releases/spark-release-0-8-1.html
index 9279f5d..02d03fd 100644
--- a/site/releases/spark-release-0-8-1.html
+++ b/site/releases/spark-release-0-8-1.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/conf

[4/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-0-7-2-released.html
--
diff --git a/site/news/spark-0-7-2-released.html 
b/site/news/spark-0-7-2-released.html
index 06f1766..12c8d35 100644
--- a/site/news/spark-0-7-2-released.html
+++ b/site/news/spark-0-7-2-released.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-0-7-3-released.html
--
diff --git a/site/news/spark-0-7-3-released.html 
b/site/news/spark-0-7-3-released.html
index 6f090ba..0c7ef4b 100644
--- a/site/news/spark-0-7-3-released.html
+++ b/site/news/spark-0-7-3-released.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-0-8-0-released.html
--
diff --git a/site/news/spark-0-8-0-released.html 
b/site/news/spark-0-8-0-released.html
index 7ca7379..f2448cf 100644
--- a/site/news/spark-0-8-0-released.html
+++ b/site/news/spark-0-8-0-released.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party 
Projects
 
   
   
@@ -116,12 +116,13 @@
   Community 
 
 
-  Mailing Lists
+  Mailing Lists
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
   Events and Meetups
   Project History
-  https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By
-  https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers
-  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
 
   
   FAQ
@@ -178,7 +179,7 @@
 MLlib (machine learning)
 GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages
+  Third-Party Projects
 
   
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/news/spark-0-8-1-released.html
--
diff --git a/site/news/spark-0-8-1-released.html 
b/site/news/spark-0-8-1-released.html
index c35c559..c0c3140 100644
--- a/site/news/spark-0-8-1-released.html
+++ b/site/news/spark-0-8-1-released.html
@@ -98,7 +98,7 @@
   MLlib (machine learning)
   GraphX (graph)
   
-  https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Pac

[6/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to

2016-11-21 Thread srowen
Port wiki page Committers to committers.html, Contributing to Spark and Code 
Style Guide to contributing.html, Third Party Projects and Additional Language 
Bindings to third-party-projects.html, Powered By to powered-by.html


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0744e8fd
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/0744e8fd
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/0744e8fd

Branch: refs/heads/asf-site
Commit: 0744e8fdd9f954a6552c968be50604241097dbbc
Parents: 46fb910
Author: Sean Owen 
Authored: Sat Nov 19 12:35:02 2016 +
Committer: Sean Owen 
Committed: Mon Nov 21 20:57:42 2016 +

--
 _layouts/global.html|  13 +-
 committers.md   | 167 
 community.md|   2 +-
 contributing.md | 523 +
 documentation.md|   2 +-
 faq.md  |   4 +-
 graphx/index.md |   2 +-
 index.md|   9 +-
 mllib/index.md  |   2 +-
 ...-05-spark-user-survey-and-powered-by-page.md |   2 +-
 powered-by.md   | 239 ++
 site/committers.html| 518 +
 site/community.html |  15 +-
 site/contributing.html  | 771 +++
 site/documentation.html |  15 +-
 site/downloads.html |  13 +-
 site/examples.html  |  13 +-
 site/faq.html   |  17 +-
 site/graphx/index.html  |  15 +-
 site/index.html |  22 +-
 site/mailing-lists.html |  13 +-
 site/mllib/index.html   |  15 +-
 site/news/amp-camp-2013-registration-ope.html   |  13 +-
 .../news/announcing-the-first-spark-summit.html |  13 +-
 .../news/fourth-spark-screencast-published.html |  13 +-
 site/news/index.html|  13 +-
 site/news/nsdi-paper.html   |  13 +-
 site/news/one-month-to-spark-summit-2015.html   |  13 +-
 .../proposals-open-for-spark-summit-east.html   |  13 +-
 ...registration-open-for-spark-summit-east.html |  13 +-
 .../news/run-spark-and-shark-on-amazon-emr.html |  13 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   |  13 +-
 site/news/spark-0-6-2-released.html |  13 +-
 site/news/spark-0-7-0-released.html |  13 +-
 site/news/spark-0-7-2-released.html |  13 +-
 site/news/spark-0-7-3-released.html |  13 +-
 site/news/spark-0-8-0-released.html |  13 +-
 site/news/spark-0-8-1-released.html |  13 +-
 site/news/spark-0-9-0-released.html |  13 +-
 site/news/spark-0-9-1-released.html |  13 +-
 site/news/spark-0-9-2-released.html |  13 +-
 site/news/spark-1-0-0-released.html |  13 +-
 site/news/spark-1-0-1-released.html |  13 +-
 site/news/spark-1-0-2-released.html |  13 +-
 site/news/spark-1-1-0-released.html |  13 +-
 site/news/spark-1-1-1-released.html |  13 +-
 site/news/spark-1-2-0-released.html |  13 +-
 site/news/spark-1-2-1-released.html |  13 +-
 site/news/spark-1-2-2-released.html |  13 +-
 site/news/spark-1-3-0-released.html |  13 +-
 site/news/spark-1-4-0-released.html |  13 +-
 site/news/spark-1-4-1-released.html |  13 +-
 site/news/spark-1-5-0-released.html |  13 +-
 site/news/spark-1-5-1-released.html |  13 +-
 site/news/spark-1-5-2-released.html |  13 +-
 site/news/spark-1-6-0-released.html |  13 +-
 site/news/spark-1-6-1-released.html |  13 +-
 site/news/spark-1-6-2-released.html |  13 +-
 site/news/spark-1-6-3-released.html |  13 +-
 site/news/spark-2-0-0-released.html |  13 +-
 site/news/spark-2-0-1-released.html |  13 +-
 site/news/spark-2-0-2-released.html |  13 +-
 site/news/spark-2.0.0-preview.html  |  13 +-
 .../spark-accepted-into-apache-incubator.html   |  13 +-
 site/news/spark-and-shark-in-the-news.html  |  13 +-
 site/news/spark-becomes-tlp.html|  13 +-
 site/news/spark-featured-in-wired.html  |  13 +-
 .../spark-mailing-lists-moving-to-apache.html   |  13 +-
 site/news/spark-meetups.html|  13 +-
 site/news/spark-screencasts-published.html  |  13 +-
 site/news/spark-summit-2013-is-a-wrap.html  |  13 +-
 site/news/spark-summit-20

spark git commit: [SPARK-18514][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across R API documentation

2016-11-22 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master acb971577 -> 4922f9cdc


[SPARK-18514][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across R 
API documentation

## What changes were proposed in this pull request?

It seems that in the R API documentation, there are

- `Note:`
- `NOTE:`
- `Note that`

This PR proposes to fix those to `Note:` to be consistent.

**Before**

![2016-11-21 11 30 
07](https://cloud.githubusercontent.com/assets/6477701/20468848/2f27b0fa-afde-11e6-89e3-993701269dbe.png)

**After**

![2016-11-21 11 29 
44](https://cloud.githubusercontent.com/assets/6477701/20468851/39469664-afde-11e6-9929-ad80be7fc405.png)

## How was this patch tested?

The notes were found via

```bash
grep -r "NOTE: " .
grep -r "Note that " .
```

Each occurrence was then fixed individually, comparing against the API documentation.

After that, manually tested via `sh create-docs.sh` under `./R`.
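
As a hedged aside (the patch itself relied only on the `grep` commands above), the same check could be scripted; the sketch below assumes it is run from the repository root and that the R sources live under `R/pkg/R`:

```python
import os
import re

# Scan R sources for the inconsistent note prefixes this patch replaces.
OLD_PREFIXES = re.compile(r"NOTE: |Note that ")

def find_inconsistent_notes(root="R/pkg/R"):
    """Yield (path, line_number, line) for lines still using the old prefixes."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".R"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                for lineno, line in enumerate(f, 1):
                    if OLD_PREFIXES.search(line):
                        yield path, lineno, line.rstrip()

if __name__ == "__main__":
    for path, lineno, line in find_inconsistent_notes():
        print("%s:%d: %s" % (path, lineno, line))
```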

Author: hyukjinkwon 

Closes #15952 from HyukjinKwon/SPARK-18514.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4922f9cd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4922f9cd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4922f9cd

Branch: refs/heads/master
Commit: 4922f9cdcac8b7c10320ac1fb701997fffa45d46
Parents: acb9715
Author: hyukjinkwon 
Authored: Tue Nov 22 11:26:10 2016 +
Committer: Sean Owen 
Committed: Tue Nov 22 11:26:10 2016 +

--
 R/pkg/R/DataFrame.R | 6 --
 R/pkg/R/functions.R | 7 ---
 2 files changed, 8 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/4922f9cd/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 4e3d97b..9a51d53 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -2541,7 +2541,8 @@ generateAliasesForIntersectedCols <- function (x, 
intersectedColNames, suffix) {
 #'
 #' Return a new SparkDataFrame containing the union of rows in this 
SparkDataFrame
 #' and another SparkDataFrame. This is equivalent to \code{UNION ALL} in SQL.
-#' Note that this does not remove duplicate rows across the two 
SparkDataFrames.
+#'
+#' Note: This does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x A SparkDataFrame
 #' @param y A SparkDataFrame
@@ -2584,7 +2585,8 @@ setMethod("unionAll",
 #' Union two or more SparkDataFrames
 #'
 #' Union two or more SparkDataFrames. This is equivalent to \code{UNION ALL} 
in SQL.
-#' Note that this does not remove duplicate rows across the two 
SparkDataFrames.
+#'
+#' Note: This does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x a SparkDataFrame.
 #' @param ... additional SparkDataFrame(s).

http://git-wip-us.apache.org/repos/asf/spark/blob/4922f9cd/R/pkg/R/functions.R
--
diff --git a/R/pkg/R/functions.R b/R/pkg/R/functions.R
index f8a9d3c..bf5c963 100644
--- a/R/pkg/R/functions.R
+++ b/R/pkg/R/functions.R
@@ -2296,7 +2296,7 @@ setMethod("n", signature(x = "Column"),
 #' A pattern could be for instance \preformatted{dd.MM.} and could return 
a string like '18.03.1993'. All
 #' pattern letters of \code{java.text.SimpleDateFormat} can be used.
 #'
-#' NOTE: Use when ever possible specialized functions like \code{year}. These 
benefit from a
+#' Note: Use when ever possible specialized functions like \code{year}. These 
benefit from a
 #' specialized implementation.
 #'
 #' @param y Column to compute on.
@@ -2341,7 +2341,7 @@ setMethod("from_utc_timestamp", signature(y = "Column", x 
= "character"),
 #' Locate the position of the first occurrence of substr column in the given 
string.
 #' Returns null if either of the arguments are null.
 #'
-#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
+#' Note: The position is not zero based, but 1 based index. Returns 0 if substr
 #' could not be found in str.
 #'
 #' @param y column to check
@@ -2779,7 +2779,8 @@ setMethod("window", signature(x = "Column"),
 #' locate
 #'
 #' Locate the position of the first occurrence of substr.
-#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
+#'
+#' Note: The position is not zero based, but 1 based index. Returns 0 if substr
 #' could not be found in str.
 #'
 #' @param substr a character string to be matched.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18514][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across R API documentation

2016-11-22 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 c70214075 -> 63aa01ffe


[SPARK-18514][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across R 
API documentation

## What changes were proposed in this pull request?

It seems that in the R API documentation, there are

- `Note:`
- `NOTE:`
- `Note that`

This PR proposes to fix those to `Note:` to be consistent.

**Before**

![2016-11-21 11 30 
07](https://cloud.githubusercontent.com/assets/6477701/20468848/2f27b0fa-afde-11e6-89e3-993701269dbe.png)

**After**

![2016-11-21 11 29 
44](https://cloud.githubusercontent.com/assets/6477701/20468851/39469664-afde-11e6-9929-ad80be7fc405.png)

## How was this patch tested?

The notes were found via

```bash
grep -r "NOTE: " .
grep -r "Note that " .
```

Each occurrence was then fixed individually, comparing against the API documentation.

After that, manually tested via `sh create-docs.sh` under `./R`.

Author: hyukjinkwon 

Closes #15952 from HyukjinKwon/SPARK-18514.

(cherry picked from commit 4922f9cdcac8b7c10320ac1fb701997fffa45d46)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/63aa01ff
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/63aa01ff
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/63aa01ff

Branch: refs/heads/branch-2.1
Commit: 63aa01ffe06e49af032b57ba2eb28dfb8f14f779
Parents: c702140
Author: hyukjinkwon 
Authored: Tue Nov 22 11:26:10 2016 +
Committer: Sean Owen 
Committed: Tue Nov 22 11:26:20 2016 +

--
 R/pkg/R/DataFrame.R | 6 --
 R/pkg/R/functions.R | 7 ---
 2 files changed, 8 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/63aa01ff/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 4e3d97b..9a51d53 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -2541,7 +2541,8 @@ generateAliasesForIntersectedCols <- function (x, 
intersectedColNames, suffix) {
 #'
 #' Return a new SparkDataFrame containing the union of rows in this 
SparkDataFrame
 #' and another SparkDataFrame. This is equivalent to \code{UNION ALL} in SQL.
-#' Note that this does not remove duplicate rows across the two 
SparkDataFrames.
+#'
+#' Note: This does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x A SparkDataFrame
 #' @param y A SparkDataFrame
@@ -2584,7 +2585,8 @@ setMethod("unionAll",
 #' Union two or more SparkDataFrames
 #'
 #' Union two or more SparkDataFrames. This is equivalent to \code{UNION ALL} 
in SQL.
-#' Note that this does not remove duplicate rows across the two 
SparkDataFrames.
+#'
+#' Note: This does not remove duplicate rows across the two SparkDataFrames.
 #'
 #' @param x a SparkDataFrame.
 #' @param ... additional SparkDataFrame(s).

http://git-wip-us.apache.org/repos/asf/spark/blob/63aa01ff/R/pkg/R/functions.R
--
diff --git a/R/pkg/R/functions.R b/R/pkg/R/functions.R
index f8a9d3c..bf5c963 100644
--- a/R/pkg/R/functions.R
+++ b/R/pkg/R/functions.R
@@ -2296,7 +2296,7 @@ setMethod("n", signature(x = "Column"),
 #' A pattern could be for instance \preformatted{dd.MM.} and could return 
a string like '18.03.1993'. All
 #' pattern letters of \code{java.text.SimpleDateFormat} can be used.
 #'
-#' NOTE: Use when ever possible specialized functions like \code{year}. These 
benefit from a
+#' Note: Use when ever possible specialized functions like \code{year}. These 
benefit from a
 #' specialized implementation.
 #'
 #' @param y Column to compute on.
@@ -2341,7 +2341,7 @@ setMethod("from_utc_timestamp", signature(y = "Column", x 
= "character"),
 #' Locate the position of the first occurrence of substr column in the given 
string.
 #' Returns null if either of the arguments are null.
 #'
-#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
+#' Note: The position is not zero based, but 1 based index. Returns 0 if substr
 #' could not be found in str.
 #'
 #' @param y column to check
@@ -2779,7 +2779,8 @@ setMethod("window", signature(x = "Column"),
 #' locate
 #'
 #' Locate the position of the first occurrence of substr.
-#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
+#'
+#' Note: The position is not zero based, but 1 based index. Returns 0 if substr
 #' could not be found in str.
 #'
 #' @param substr a character string to be matched.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-18447][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across Python API documentation

2016-11-22 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 4922f9cdc -> 933a6548d


[SPARK-18447][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across 
Python API documentation

## What changes were proposed in this pull request?

It seems that in the Python API documentation, there are

- `Note:`
- `NOTE:`
- `Note that`
- `.. note::`

This PR proposes to fix those to `.. note::` to be consistent.

**Before**

https://cloud.githubusercontent.com/assets/6477701/20464305/85144c86-af88-11e6-8ee9-90f584dd856c.png";>

https://cloud.githubusercontent.com/assets/6477701/20464263/27be5022-af88-11e6-8577-4bbca7cdf36c.png";>

**After**

https://cloud.githubusercontent.com/assets/6477701/20464306/8fe48932-af88-11e6-83e1-fc3cbf74407d.png";>

https://cloud.githubusercontent.com/assets/6477701/20464264/2d3e156e-af88-11e6-93f3-cab8d8d02983.png";>

## How was this patch tested?

The notes were found via

```bash
grep -r "Note: " .
grep -r "NOTE: " .
grep -r "Note that " .
```

Each occurrence was then fixed individually, comparing against the API documentation.

After that, manually tested via `make html` under `./python/docs`.

Author: hyukjinkwon 

Closes #15947 from HyukjinKwon/SPARK-18447.
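
As a hedged illustration of the docstring style this change converges on (the function below is hypothetical, not an actual PySpark API):

```python
def binary_files(path, min_partitions=None):
    """Read a directory of binary files as (path, content) pairs.

    .. note:: Small files are preferred, as each file will be
        loaded fully in memory.
    """
    # Illustration only: the real logic lives in SparkContext.binaryFiles.
    raise NotImplementedError
```

Sphinx renders `.. note::` as a highlighted admonition, which is what the **After** screenshots above show.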


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/933a6548
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/933a6548
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/933a6548

Branch: refs/heads/master
Commit: 933a6548d423cf17448207a99299cf36fc1a95f6
Parents: 4922f9c
Author: hyukjinkwon 
Authored: Tue Nov 22 11:40:18 2016 +
Committer: Sean Owen 
Committed: Tue Nov 22 11:40:18 2016 +

--
 python/pyspark/conf.py |  4 +-
 python/pyspark/context.py  |  8 ++--
 python/pyspark/ml/classification.py| 45 +++--
 python/pyspark/ml/clustering.py|  8 ++--
 python/pyspark/ml/feature.py   | 13 +++---
 python/pyspark/ml/linalg/__init__.py   | 11 ++---
 python/pyspark/ml/regression.py| 32 +++
 python/pyspark/mllib/clustering.py |  6 +--
 python/pyspark/mllib/feature.py| 24 +--
 python/pyspark/mllib/linalg/__init__.py| 11 ++---
 python/pyspark/mllib/linalg/distributed.py | 15 ---
 python/pyspark/mllib/regression.py |  2 +-
 python/pyspark/mllib/stat/_statistics.py   |  3 +-
 python/pyspark/mllib/tree.py   | 12 +++---
 python/pyspark/rdd.py  | 54 +
 python/pyspark/sql/dataframe.py| 28 ++---
 python/pyspark/sql/functions.py| 11 ++---
 python/pyspark/sql/streaming.py| 10 +++--
 python/pyspark/streaming/context.py|  2 +-
 python/pyspark/streaming/kinesis.py|  4 +-
 20 files changed, 157 insertions(+), 146 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/933a6548/python/pyspark/conf.py
--
diff --git a/python/pyspark/conf.py b/python/pyspark/conf.py
index 64b6f23..491b3a8 100644
--- a/python/pyspark/conf.py
+++ b/python/pyspark/conf.py
@@ -90,8 +90,8 @@ class SparkConf(object):
 All setter methods in this class support chaining. For example,
 you can write C{conf.setMaster("local").setAppName("My app")}.
 
-Note that once a SparkConf object is passed to Spark, it is cloned
-and can no longer be modified by the user.
+.. note:: Once a SparkConf object is passed to Spark, it is cloned
+and can no longer be modified by the user.
 """
 
 def __init__(self, loadDefaults=True, _jvm=None, _jconf=None):

http://git-wip-us.apache.org/repos/asf/spark/blob/933a6548/python/pyspark/context.py
--
diff --git a/python/pyspark/context.py b/python/pyspark/context.py
index 2c2cf6a..2fd3aee 100644
--- a/python/pyspark/context.py
+++ b/python/pyspark/context.py
@@ -520,8 +520,8 @@ class SparkContext(object):
   ...
   (a-hdfs-path/part-n, its content)
 
-NOTE: Small files are preferred, as each file will be loaded
-fully in memory.
+.. note:: Small files are preferred, as each file will be loaded
+fully in memory.
 
 >>> dirPath = os.path.join(tempdir, "files")
 >>> os.mkdir(dirPath)
@@ -547,8 +547,8 @@ class SparkContext(object):
 in a key-value pair, where the key is the path of each file, the
 value is the content of each file.
 
-Note: Small files are preferred, large file is also allowable, but
-may cause bad performance.
+.. note:: Small files are preferred, large file is also allowable, but
+may cause bad performance.
 """
 minPartitions = minPartitions or self.defaultMinPartitions
 return RD

spark git commit: [SPARK-18447][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across Python API documentation

2016-11-22 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 63aa01ffe -> 36cd10d19


[SPARK-18447][DOCS] Fix the markdown for `Note:`/`NOTE:`/`Note that` across 
Python API documentation

## What changes were proposed in this pull request?

It seems that in the Python API documentation, there are

- `Note:`
- `NOTE:`
- `Note that`
- `.. note::`

This PR proposes to fix those to `.. note::` to be consistent.

**Before**

https://cloud.githubusercontent.com/assets/6477701/20464305/85144c86-af88-11e6-8ee9-90f584dd856c.png";>

https://cloud.githubusercontent.com/assets/6477701/20464263/27be5022-af88-11e6-8577-4bbca7cdf36c.png";>

**After**

https://cloud.githubusercontent.com/assets/6477701/20464306/8fe48932-af88-11e6-83e1-fc3cbf74407d.png";>

https://cloud.githubusercontent.com/assets/6477701/20464264/2d3e156e-af88-11e6-93f3-cab8d8d02983.png";>

## How was this patch tested?

The notes were found via

```bash
grep -r "Note: " .
grep -r "NOTE: " .
grep -r "Note that " .
```

Each occurrence was then fixed individually, comparing against the API documentation.

After that, manually tested via `make html` under `./python/docs`.

Author: hyukjinkwon 

Closes #15947 from HyukjinKwon/SPARK-18447.

(cherry picked from commit 933a6548d423cf17448207a99299cf36fc1a95f6)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/36cd10d1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/36cd10d1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/36cd10d1

Branch: refs/heads/branch-2.1
Commit: 36cd10d19d95418cec4b789545afc798088be315
Parents: 63aa01f
Author: hyukjinkwon 
Authored: Tue Nov 22 11:40:18 2016 +
Committer: Sean Owen 
Committed: Tue Nov 22 11:40:29 2016 +

--
 python/pyspark/conf.py |  4 +-
 python/pyspark/context.py  |  8 ++--
 python/pyspark/ml/classification.py| 45 +++--
 python/pyspark/ml/clustering.py|  8 ++--
 python/pyspark/ml/feature.py   | 13 +++---
 python/pyspark/ml/linalg/__init__.py   | 11 ++---
 python/pyspark/ml/regression.py| 32 +++
 python/pyspark/mllib/clustering.py |  6 +--
 python/pyspark/mllib/feature.py| 24 +--
 python/pyspark/mllib/linalg/__init__.py| 11 ++---
 python/pyspark/mllib/linalg/distributed.py | 15 ---
 python/pyspark/mllib/regression.py |  2 +-
 python/pyspark/mllib/stat/_statistics.py   |  3 +-
 python/pyspark/mllib/tree.py   | 12 +++---
 python/pyspark/rdd.py  | 54 +
 python/pyspark/sql/dataframe.py| 28 ++---
 python/pyspark/sql/functions.py| 11 ++---
 python/pyspark/sql/streaming.py| 10 +++--
 python/pyspark/streaming/context.py|  2 +-
 python/pyspark/streaming/kinesis.py|  4 +-
 20 files changed, 157 insertions(+), 146 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/36cd10d1/python/pyspark/conf.py
--
diff --git a/python/pyspark/conf.py b/python/pyspark/conf.py
index 64b6f23..491b3a8 100644
--- a/python/pyspark/conf.py
+++ b/python/pyspark/conf.py
@@ -90,8 +90,8 @@ class SparkConf(object):
 All setter methods in this class support chaining. For example,
 you can write C{conf.setMaster("local").setAppName("My app")}.
 
-Note that once a SparkConf object is passed to Spark, it is cloned
-and can no longer be modified by the user.
+.. note:: Once a SparkConf object is passed to Spark, it is cloned
+and can no longer be modified by the user.
 """
 
 def __init__(self, loadDefaults=True, _jvm=None, _jconf=None):

http://git-wip-us.apache.org/repos/asf/spark/blob/36cd10d1/python/pyspark/context.py
--
diff --git a/python/pyspark/context.py b/python/pyspark/context.py
index 2c2cf6a..2fd3aee 100644
--- a/python/pyspark/context.py
+++ b/python/pyspark/context.py
@@ -520,8 +520,8 @@ class SparkContext(object):
   ...
   (a-hdfs-path/part-n, its content)
 
-NOTE: Small files are preferred, as each file will be loaded
-fully in memory.
+.. note:: Small files are preferred, as each file will be loaded
+fully in memory.
 
 >>> dirPath = os.path.join(tempdir, "files")
 >>> os.mkdir(dirPath)
@@ -547,8 +547,8 @@ class SparkContext(object):
 in a key-value pair, where the key is the path of each file, the
 value is the content of each file.
 
-Note: Small files are preferred, large file is also allowable, but
-may cause bad performance.
+.. note:: Small files are preferred, large file is also allowable, but
+may cause bad performa

[4/5] spark-website git commit: Port wiki Useful Developer Tools and Profiling Spark Apps to /developer-tools.html; port Spark Versioning Policy and main wiki to /versioning-policy.html; port Preparin

2016-11-23 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/announcing-the-first-spark-summit.html
--
diff --git a/site/news/announcing-the-first-spark-summit.html 
b/site/news/announcing-the-first-spark-summit.html
index c1e1b8e..2215895 100644
--- a/site/news/announcing-the-first-spark-summit.html
+++ b/site/news/announcing-the-first-spark-summit.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/fourth-spark-screencast-published.html
--
diff --git a/site/news/fourth-spark-screencast-published.html 
b/site/news/fourth-spark-screencast-published.html
index a11aba7..fe28ecf 100644
--- a/site/news/fourth-spark-screencast-published.html
+++ b/site/news/fourth-spark-screencast-published.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/index.html
--
diff --git a/site/news/index.html b/site/news/index.html
index f6e17ca..6bebe93 100644
--- a/site/news/index.html
+++ b/site/news/index.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/nsdi-paper.html
--
diff --git a/site/news/nsdi-paper.html b/site/news/nsdi-paper.html
index bdd78a0..1ea92f1 100644
--- a/site/news/nsdi-paper.html
+++ b/site/news/nsdi-paper.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/one-month-to-spark-summit-2015.html
--
diff --git a/site/news/one-month-to-spark-summit-2015.html 
b/site/news/one-month-to-spark-summit-2015.html
index 919c446..4566e64 100644
--- a/site/news/one-month-to-spark-summit-2015.html
+++ b/site/news/one-month-to-spark-summit-2015.html
@@ -108,24 +108,32 @@
 
  

[2/5] spark-website git commit: Port wiki Useful Developer Tools and Profiling Spark Apps to /developer-tools.html; port Spark Versioning Policy and main wiki to /versioning-policy.html; port Preparin

2016-11-23 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/release-process.html
--
diff --git a/site/release-process.html b/site/release-process.html
new file mode 100644
index 000..c19c40a
--- /dev/null
+++ b/site/release-process.html
@@ -0,0 +1,475 @@
+
+
+
+  
+  
+  
+
+  
+ Release Process | Apache Spark
+
+  
+
+  
+
+  
+
+  
+  
+  
+
+  
+  
+
+  
+  
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', 'UA-32518208-2']);
+  _gaq.push(['_trackPageview']);
+  (function() {
+var ga = document.createElement('script'); ga.type = 'text/javascript'; 
ga.async = true;
+ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 
'http://www') + '.google-analytics.com/ga.js';
+var s = document.getElementsByTagName('script')[0]; 
s.parentNode.insertBefore(ga, s);
+  })();
+
+  
+  function trackOutboundLink(link, category, action) {
+try {
+  _gaq.push(['_trackEvent', category , action]);
+} catch(err){}
+
+setTimeout(function() {
+  document.location.href = link.href;
+}, 100);
+  }
+  
+
+  
+  
+
+
+
+
+https://code.jquery.com/jquery.js";>
+https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js";>
+
+
+
+
+
+
+  
+
+  
+  
+  Lightning-fast cluster computing
+  
+
+  
+
+
+
+  
+  
+
+  Toggle navigation
+  
+  
+  
+
+  
+
+  
+  
+
+  Download
+  
+
+  Libraries 
+
+
+  SQL and DataFrames
+  Spark Streaming
+  MLlib (machine learning)
+  GraphX (graph)
+  
+  Third-Party 
Projects
+
+  
+  
+
+  Documentation 
+
+
+  Latest Release (Spark 2.0.2)
+  Older Versions and Other 
Resources
+  Frequently Asked Questions
+
+  
+  Examples
+  
+
+  Community 
+
+
+  Mailing Lists & Resources
+  Contributing to Spark
+  https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
+  Powered By
+  Project Committers
+
+  
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
+
+
+  
+http://www.apache.org/"; class="dropdown-toggle" 
data-toggle="dropdown">
+  Apache Software Foundation 
+
+  http://www.apache.org/";>Apache Homepage
+  http://www.apache.org/licenses/";>License
+  http://www.apache.org/foundation/sponsorship.html";>Sponsorship
+  http://www.apache.org/foundation/thanks.html";>Thanks
+  http://www.apache.org/security/";>Security
+
+  
+
+  
+  
+
+
+
+
+  
+
+  Latest News
+  
+
+  Spark 
wins CloudSort Benchmark as the most efficient engine
+  (Nov 15, 2016)
+
+  Spark 2.0.2 
released
+  (Nov 14, 2016)
+
+  Spark 1.6.3 
released
+  (Nov 07, 2016)
+
+  Spark 2.0.1 
released
+  (Oct 03, 2016)
+
+  
+  Archive
+
+
+  
+Download Spark
+  
+  
+Built-in Libraries:
+  
+  
+SQL and DataFrames
+Spark Streaming
+MLlib (machine learning)
+GraphX (graph)
+  
+  Third-Party Projects
+
+  
+
+  
+Preparing Spark Releases
+
+Background
+
+The release manager role in Spark means you are responsible for a few 
different things:
+
+
+  Preparing for release candidates:
+
+  cutting a release branch
+  informing the community of timing
+  working with component leads to clean up JIRA
+  making code changes in that branch with necessary version 
updates
+
+  
+  Running the voting process for a release:
+
+  creating release candidates using automated tooling
+  calling votes and triaging issues
+
+  
+  Finalizing and posting a release:
+
+  updating the Spark website
+  writing release notes
+  announcing the release
+
+  
+
+
+Preparing Spark for Release
+
+The main step towards preparing a release is to create a release branch. 
This is done via the 
+standard Git branching mechanism and should be announced to the community once 
the branch is 
+created. It is also good to set up Jenkins jobs for the release branch once it 
is cut to 
+ensure tests are passing (consult Josh Rosen and Shane Knapp for help with 
this).
+
+Next, ensure that all Spark versions are correct in the code base on the 
release branch (see 
+https://github.com/apache/spark/commit/01d233e4aede65ffa39b9d2322196d4b6418652

[3/5] spark-website git commit: Port wiki Useful Developer Tools and Profiling Spark Apps to /developer-tools.html; port Spark Versioning Policy and main wiki to /versioning-policy.html; port Preparin

2016-11-23 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/spark-accepted-into-apache-incubator.html
--
diff --git a/site/news/spark-accepted-into-apache-incubator.html 
b/site/news/spark-accepted-into-apache-incubator.html
index 571d35b..57e4881 100644
--- a/site/news/spark-accepted-into-apache-incubator.html
+++ b/site/news/spark-accepted-into-apache-incubator.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/spark-and-shark-in-the-news.html
--
diff --git a/site/news/spark-and-shark-in-the-news.html 
b/site/news/spark-and-shark-in-the-news.html
index 92d56f3..3994fe0 100644
--- a/site/news/spark-and-shark-in-the-news.html
+++ b/site/news/spark-and-shark-in-the-news.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/spark-becomes-tlp.html
--
diff --git a/site/news/spark-becomes-tlp.html b/site/news/spark-becomes-tlp.html
index f6e7f4a..803c919 100644
--- a/site/news/spark-becomes-tlp.html
+++ b/site/news/spark-becomes-tlp.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/spark-featured-in-wired.html
--
diff --git a/site/news/spark-featured-in-wired.html 
b/site/news/spark-featured-in-wired.html
index 507273b..6f2398f 100644
--- a/site/news/spark-featured-in-wired.html
+++ b/site/news/spark-featured-in-wired.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/news/spark-mailing-lists-moving-to-apache.html
--
diff --git a/site/news/spark-mailing-lists-moving-to-apache.html 
b/site/news/spark-mailing-lists-moving-to-apache.html
index 7effed1..3f089bd 100644
--- a/site/ne

[5/5] spark-website git commit: Port wiki Useful Developer Tools and Profiling Spark Apps to /developer-tools.html; port Spark Versioning Policy and main wiki to /versioning-policy.html; port Preparin

2016-11-23 Thread srowen
Port wiki Useful Developer Tools and Profiling Spark Apps to 
/developer-tools.html; port Spark Versioning Policy and main wiki to 
/versioning-policy.html; port Preparing Spark Releases to 
/release-process.html; rearrange menu with new Developer menu


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/cf21826b
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/cf21826b
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/cf21826b

Branch: refs/heads/asf-site
Commit: cf21826beab2b83cca0028c555e1008ae2f2ed93
Parents: 0744e8f
Author: Sean Owen 
Authored: Tue Nov 22 14:21:33 2016 +
Committer: Sean Owen 
Committed: Tue Nov 22 14:38:50 2016 +

--
 _layouts/global.html|  18 +-
 developer-tools.md  | 287 +++
 documentation.md|   7 -
 downloads.md|   2 +-
 release-process.md  | 263 ++
 site/committers.html|  18 +-
 site/community.html |  18 +-
 site/contributing.html  |  18 +-
 site/developer-tools.html   | 494 +++
 site/documentation.html |  25 +-
 site/downloads.html |  20 +-
 site/examples.html  |  18 +-
 site/faq.html   |  18 +-
 site/graphx/index.html  |  18 +-
 site/index.html |  18 +-
 site/mailing-lists.html |  18 +-
 site/mllib/index.html   |  18 +-
 site/news/amp-camp-2013-registration-ope.html   |  18 +-
 .../news/announcing-the-first-spark-summit.html |  18 +-
 .../news/fourth-spark-screencast-published.html |  18 +-
 site/news/index.html|  18 +-
 site/news/nsdi-paper.html   |  18 +-
 site/news/one-month-to-spark-summit-2015.html   |  18 +-
 .../proposals-open-for-spark-summit-east.html   |  18 +-
 ...registration-open-for-spark-summit-east.html |  18 +-
 .../news/run-spark-and-shark-on-amazon-emr.html |  18 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   |  18 +-
 site/news/spark-0-6-2-released.html |  18 +-
 site/news/spark-0-7-0-released.html |  18 +-
 site/news/spark-0-7-2-released.html |  18 +-
 site/news/spark-0-7-3-released.html |  18 +-
 site/news/spark-0-8-0-released.html |  18 +-
 site/news/spark-0-8-1-released.html |  18 +-
 site/news/spark-0-9-0-released.html |  18 +-
 site/news/spark-0-9-1-released.html |  18 +-
 site/news/spark-0-9-2-released.html |  18 +-
 site/news/spark-1-0-0-released.html |  18 +-
 site/news/spark-1-0-1-released.html |  18 +-
 site/news/spark-1-0-2-released.html |  18 +-
 site/news/spark-1-1-0-released.html |  18 +-
 site/news/spark-1-1-1-released.html |  18 +-
 site/news/spark-1-2-0-released.html |  18 +-
 site/news/spark-1-2-1-released.html |  18 +-
 site/news/spark-1-2-2-released.html |  18 +-
 site/news/spark-1-3-0-released.html |  18 +-
 site/news/spark-1-4-0-released.html |  18 +-
 site/news/spark-1-4-1-released.html |  18 +-
 site/news/spark-1-5-0-released.html |  18 +-
 site/news/spark-1-5-1-released.html |  18 +-
 site/news/spark-1-5-2-released.html |  18 +-
 site/news/spark-1-6-0-released.html |  18 +-
 site/news/spark-1-6-1-released.html |  18 +-
 site/news/spark-1-6-2-released.html |  18 +-
 site/news/spark-1-6-3-released.html |  18 +-
 site/news/spark-2-0-0-released.html |  18 +-
 site/news/spark-2-0-1-released.html |  18 +-
 site/news/spark-2-0-2-released.html |  18 +-
 site/news/spark-2.0.0-preview.html  |  18 +-
 .../spark-accepted-into-apache-incubator.html   |  18 +-
 site/news/spark-and-shark-in-the-news.html  |  18 +-
 site/news/spark-becomes-tlp.html|  18 +-
 site/news/spark-featured-in-wired.html  |  18 +-
 .../spark-mailing-lists-moving-to-apache.html   |  18 +-
 site/news/spark-meetups.html|  18 +-
 site/news/spark-screencasts-published.html  |  18 +-
 site/news/spark-summit-2013-is-a-wrap.html  |  18 +-
 site/news/spark-summit-2014-videos-posted.html  |  18 +-
 site/news/spark-summit-2015-videos-posted.html  |  18 +-
 site/news/spark-summit-agenda-posted.html   |  18 +-
 .../spark-summit-east-2015-videos-posted.html   |  18 +-
 .../spark-summit-east-2016-cfp-closing.html |  18 +-
 site/news/s

[1/5] spark-website git commit: Port wiki Useful Developer Tools and Profiling Spark Apps to /developer-tools.html; port Spark Versioning Policy and main wiki to /versioning-policy.html; port Preparin

2016-11-23 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 0744e8fdd -> cf21826be


http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/releases/spark-release-1-6-0.html
--
diff --git a/site/releases/spark-release-1-6-0.html 
b/site/releases/spark-release-1-6-0.html
index 95f2589..3aeb3fb 100644
--- a/site/releases/spark-release-1-6-0.html
+++ b/site/releases/spark-release-1-6-0.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/releases/spark-release-1-6-1.html
--
diff --git a/site/releases/spark-release-1-6-1.html 
b/site/releases/spark-release-1-6-1.html
index 8bd6172..33d6ebb 100644
--- a/site/releases/spark-release-1-6-1.html
+++ b/site/releases/spark-release-1-6-1.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/releases/spark-release-1-6-2.html
--
diff --git a/site/releases/spark-release-1-6-2.html 
b/site/releases/spark-release-1-6-2.html
index da97150..c721e56 100644
--- a/site/releases/spark-release-1-6-2.html
+++ b/site/releases/spark-release-1-6-2.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/releases/spark-release-1-6-3.html
--
diff --git a/site/releases/spark-release-1-6-3.html 
b/site/releases/spark-release-1-6-3.html
index d293bcc..b193762 100644
--- a/site/releases/spark-release-1-6-3.html
+++ b/site/releases/spark-release-1-6-3.html
@@ -108,24 +108,32 @@
 
   Latest Release (Spark 2.0.2)
   Older Versions and Other 
Resources
+  Frequently Asked Questions
 
   
   Examples
   
-
+
   Community 
 
 
-  Mailing Lists
+  Mailing Lists & Resources
   Contributing to Spark
   https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker
-  Events and Meetups
-  Project History
   Powered By
   Project Committers
 
   
-  FAQ
+  
+
+   Developers 
+
+
+  Useful Developer Tools
+  Versioning Policy
+  Release Process
+
+  
 
 
   

http://git-wip-us.apache.org/repos/asf/spark-website/blob/cf21826b/site/releases/spark-release-2-0-0.html
--
diff --git a/site/releases/spark-release-2-0-0.html 
b/site/releases/spark-release-2-0-0.html
index e18bd4b..a1959e1 100644
--- a/site/rele

spark git commit: [SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site

2016-11-23 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 fabb5aeaf -> 5f198d200


[SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site

## What changes were proposed in this pull request?

Updates links that pointed to the wiki so they point to the new location of the 
content on spark.apache.org.

## How was this patch tested?

Doc builds

Author: Sean Owen 

Closes #15967 from srowen/SPARK-18073.1.

(cherry picked from commit 7e0cd1d9b168286386f15e9b55988733476ae2bb)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5f198d20
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5f198d20
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5f198d20

Branch: refs/heads/branch-2.1
Commit: 5f198d200d47703f6ab770e592c0a1d9f8d7b0dc
Parents: fabb5ae
Author: Sean Owen 
Authored: Wed Nov 23 11:25:47 2016 +
Committer: Sean Owen 
Committed: Wed Nov 23 11:25:55 2016 +

--
 .github/PULL_REQUEST_TEMPLATE|  2 +-
 CONTRIBUTING.md  |  4 ++--
 R/README.md  |  2 +-
 R/pkg/DESCRIPTION|  2 +-
 README.md| 11 ++-
 dev/checkstyle.xml   |  2 +-
 docs/_layouts/global.html|  4 ++--
 docs/building-spark.md   |  4 ++--
 docs/contributing-to-spark.md|  2 +-
 docs/index.md|  4 ++--
 docs/sparkr.md   |  2 +-
 docs/streaming-programming-guide.md  |  2 +-
 .../spark/sql/execution/datasources/DataSource.scala |  5 ++---
 13 files changed, 23 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/5f198d20/.github/PULL_REQUEST_TEMPLATE
--
diff --git a/.github/PULL_REQUEST_TEMPLATE b/.github/PULL_REQUEST_TEMPLATE
index 0e41cf1..5af45d6 100644
--- a/.github/PULL_REQUEST_TEMPLATE
+++ b/.github/PULL_REQUEST_TEMPLATE
@@ -7,4 +7,4 @@
 (Please explain how this patch was tested. E.g. unit tests, integration tests, 
manual tests)
 (If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)
 
-Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.
+Please review http://spark.apache.org/contributing.html before opening a pull 
request.

http://git-wip-us.apache.org/repos/asf/spark/blob/5f198d20/CONTRIBUTING.md
--
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1a8206a..8fdd5aa 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,12 +1,12 @@
 ## Contributing to Spark
 
 *Before opening a pull request*, review the 
-[Contributing to Spark 
wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark). 
+[Contributing to Spark guide](http://spark.apache.org/contributing.html). 
 It lists steps that are required before creating a PR. In particular, consider:
 
 - Is the change important and ready enough to ask the community to spend time 
reviewing?
 - Have you searched for existing, related JIRAs and pull requests?
-- Is this a new feature that can stand alone as a [third party 
project](https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects)
 ?
+- Is this a new feature that can stand alone as a [third party 
project](http://spark.apache.org/third-party-projects.html) ?
 - Is the change being proposed clearly explained and motivated?
 
 When you contribute code, you affirm that the contribution is your original 
work and that you 

http://git-wip-us.apache.org/repos/asf/spark/blob/5f198d20/R/README.md
--
diff --git a/R/README.md b/R/README.md
index 47f9a86..4c40c59 100644
--- a/R/README.md
+++ b/R/README.md
@@ -51,7 +51,7 @@ sparkR.session()
 
  Making changes to SparkR
 
-The 
[instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark)
 for making contributions to Spark also apply to SparkR.
+The [instructions](http://spark.apache.org/contributing.html) for making 
contributions to Spark also apply to SparkR.
 If you only make R file changes (i.e. no Scala changes) then you can just 
re-install the R package using `R/install-dev.sh` and test your changes.
 Once you have made your changes, please include unit tests for them and run 
existing unit tests using the `R/run-tests.sh` script as described below.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/5f198d20/R/pkg/DESCRIPT

spark git commit: [SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site

2016-11-23 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 2559fb4b4 -> 7e0cd1d9b


[SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site

## What changes were proposed in this pull request?

Updates links that pointed to the wiki so they point to the new location of the 
content on spark.apache.org.

## How was this patch tested?

Doc builds

Author: Sean Owen 

Closes #15967 from srowen/SPARK-18073.1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7e0cd1d9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7e0cd1d9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7e0cd1d9

Branch: refs/heads/master
Commit: 7e0cd1d9b168286386f15e9b55988733476ae2bb
Parents: 2559fb4
Author: Sean Owen 
Authored: Wed Nov 23 11:25:47 2016 +
Committer: Sean Owen 
Committed: Wed Nov 23 11:25:47 2016 +

--
 .github/PULL_REQUEST_TEMPLATE|  2 +-
 CONTRIBUTING.md  |  4 ++--
 R/README.md  |  2 +-
 R/pkg/DESCRIPTION|  2 +-
 README.md| 11 ++-
 dev/checkstyle.xml   |  2 +-
 docs/_layouts/global.html|  4 ++--
 docs/building-spark.md   |  4 ++--
 docs/contributing-to-spark.md|  2 +-
 docs/index.md|  4 ++--
 docs/sparkr.md   |  2 +-
 docs/streaming-programming-guide.md  |  2 +-
 .../spark/sql/execution/datasources/DataSource.scala |  5 ++---
 13 files changed, 23 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7e0cd1d9/.github/PULL_REQUEST_TEMPLATE
--
diff --git a/.github/PULL_REQUEST_TEMPLATE b/.github/PULL_REQUEST_TEMPLATE
index 0e41cf1..5af45d6 100644
--- a/.github/PULL_REQUEST_TEMPLATE
+++ b/.github/PULL_REQUEST_TEMPLATE
@@ -7,4 +7,4 @@
 (Please explain how this patch was tested. E.g. unit tests, integration tests, 
manual tests)
 (If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)
 
-Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.
+Please review http://spark.apache.org/contributing.html before opening a pull 
request.

http://git-wip-us.apache.org/repos/asf/spark/blob/7e0cd1d9/CONTRIBUTING.md
--
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1a8206a..8fdd5aa 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,12 +1,12 @@
 ## Contributing to Spark
 
 *Before opening a pull request*, review the 
-[Contributing to Spark 
wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark). 
+[Contributing to Spark guide](http://spark.apache.org/contributing.html). 
 It lists steps that are required before creating a PR. In particular, consider:
 
 - Is the change important and ready enough to ask the community to spend time 
reviewing?
 - Have you searched for existing, related JIRAs and pull requests?
-- Is this a new feature that can stand alone as a [third party 
project](https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects)
 ?
+- Is this a new feature that can stand alone as a [third party 
project](http://spark.apache.org/third-party-projects.html) ?
 - Is the change being proposed clearly explained and motivated?
 
 When you contribute code, you affirm that the contribution is your original 
work and that you 

http://git-wip-us.apache.org/repos/asf/spark/blob/7e0cd1d9/R/README.md
--
diff --git a/R/README.md b/R/README.md
index 47f9a86..4c40c59 100644
--- a/R/README.md
+++ b/R/README.md
@@ -51,7 +51,7 @@ sparkR.session()
 
  Making changes to SparkR
 
-The 
[instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark)
 for making contributions to Spark also apply to SparkR.
+The [instructions](http://spark.apache.org/contributing.html) for making 
contributions to Spark also apply to SparkR.
 If you only make R file changes (i.e. no Scala changes) then you can just 
re-install the R package using `R/install-dev.sh` and test your changes.
 Once you have made your changes, please include unit tests for them and run 
existing unit tests using the `R/run-tests.sh` script as described below.
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7e0cd1d9/R/pkg/DESCRIPTION
--
diff --git a/R/pkg/DESCRIPTION 

spark-website git commit: Add chat rooms to community links

2016-11-23 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site cf21826be -> 9418d7be6


Add chat rooms to community links


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/9418d7be
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/9418d7be
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/9418d7be

Branch: refs/heads/asf-site
Commit: 9418d7be6bd6dd5edf7b34ee78f9a33cdb6c2324
Parents: cf21826
Author: Jakob Odersky 
Authored: Mon Nov 21 18:30:12 2016 -0800
Committer: Sean Owen 
Committed: Wed Nov 23 11:34:38 2016 +

--
 community.md| 10 ++
 site/community.html | 10 ++
 2 files changed, 20 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/9418d7be/community.md
--
diff --git a/community.md b/community.md
index 3bff6ad..d887d31 100644
--- a/community.md
+++ b/community.md
@@ -72,6 +72,16 @@ Some quick tips when using email:
 and include only a few lines of the pertinent code / log within the email.
 - No jobs, sales, or solicitation is permitted on the Apache Spark mailing 
lists.
 
+
+Chat Rooms
+
+Chat rooms are great for quick questions or discussions on specialized topics. 
The following chat rooms are not officially part of Apache Spark; they are 
provided for reference only.
+
+  
+https://gitter.im/spark-scala/Lobby";>Spark with Scala is for 
questions and discussions related to using Spark with the Scala programming 
language.
+  
+
+
 
 Events and Meetups
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/9418d7be/site/community.html
--
diff --git a/site/community.html b/site/community.html
index a83ae1b..ea360f5 100644
--- a/site/community.html
+++ b/site/community.html
@@ -270,6 +270,16 @@ and include only a few lines of the pertinent code / log 
within the email.
   No jobs, sales, or solicitation is permitted on the Apache Spark mailing 
lists.
 
 
+
+Chat Rooms
+
+Chat rooms are great for quick questions or discussions on specialized 
topics. The following chat rooms are not officially part of Apache Spark; they 
are provided for reference only.
+
+  
+https://gitter.im/spark-scala/Lobby";>Spark with Scala is for 
questions and discussions related to using Spark with the Scala programming 
language.
+  
+
+
 
 Events and Meetups
 





spark-website git commit: Manually patch ec2-scripts.html file, missing from 2.0.1, 2.0.2 docs

2016-11-24 Thread srowen
Repository: spark-website
Updated Branches:
  refs/heads/asf-site 9418d7be6 -> 8d5d77c65


Manually patch ec2-scripts.html file, missing from 2.0.1, 2.0.2 docs


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/8d5d77c6
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/8d5d77c6
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/8d5d77c6

Branch: refs/heads/asf-site
Commit: 8d5d77c656006edb783eb38b799b2dd514b0ed95
Parents: 9418d7b
Author: Sean Owen 
Authored: Thu Nov 24 10:42:20 2016 +
Committer: Sean Owen 
Committed: Thu Nov 24 10:42:20 2016 +

--
 site/docs/2.0.1/ec2-scripts.html | 160 ++
 site/docs/2.0.1/ec2-scripts.md   |   7 --
 site/docs/2.0.2/ec2-scripts.html | 160 ++
 site/docs/2.0.2/ec2-scripts.md   |   7 --
 4 files changed, 320 insertions(+), 14 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark-website/blob/8d5d77c6/site/docs/2.0.1/ec2-scripts.html
--
diff --git a/site/docs/2.0.1/ec2-scripts.html b/site/docs/2.0.1/ec2-scripts.html
new file mode 100644
index 000..9c19ca7
--- /dev/null
+++ b/site/docs/2.0.1/ec2-scripts.html
@@ -0,0 +1,160 @@
+
+
+
+
+
+  
+
+
+
+Running Spark on EC2 - Spark 2.0.1 Documentation
+
+
+
+  https://github.com/amplab/spark-ec2#readme";>
+  https://github.com/amplab/spark-ec2#readme"; />
+
+
+
+
+body {
+padding-top: 60px;
+padding-bottom: 40px;
+}
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  2.0.0
+
+
+
+Overview
+
+
+Programming Guides
+
+Quick 
Start
+Spark 
Programming Guide
+
+Spark Streaming
+DataFrames, Datasets and SQL
+MLlib (Machine 
Learning)
+GraphX (Graph Processing)
+SparkR (R on 
Spark)
+
+
+
+
+API Docs
+
+Scala
+Java
+Python
+R
+
+
+
+
+Deploying
+
+Overview
+Submitting Applications
+
+Spark 
Standalone
+Mesos
+YARN
+
+
+
+
+More
+
+Configuration
+Monitoring
+Tuning Guide
+Job 
Scheduling
+Security
+Hardware Provisioning
+
+Building 
Spark
+https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>Contributing
 to Spark
+https://cwiki.apache.org/confluence/display/SPARK/Supplemental+Spark+Projects";>Supplemental
 Projects
+
+
+
+
+
+
+
+
+
+
+
+
+
+Running Spark on EC2
+
+
+This document has been superseded and replaced by 
documentation at https://github.com/amplab/spark-ec2#readme
+
+
+
+
+ 
+
+
+
+
+
+
+
+
+
+MathJax.Hub.Config({
+TeX: { equationNumbers: { autoNumber: "AMS" } }
+});
+
+
+// Note that we load MathJax this way to work with local file 
(file://), HTTP and HTTPS.
+// We could use "//cdn.m

spark git commit: [SPARK-18575][WEB] Keep same style: adjust the position of driver log links

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master a367d5ff0 -> f58a8aa20


[SPARK-18575][WEB] Keep same style: adjust the position of driver log links

## What changes were proposed in this pull request?

Not a bug: this just adjusts the position of the driver log links so they keep 
the same style as the other executor log links.

![image](https://cloud.githubusercontent.com/assets/7402327/20590092/f8bddbb8-b25b-11e6-9aaf-3b5b3073df10.png)
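
For reference, a minimal sketch of the resulting link map (the base URL, host, container id 
and user below are made-up values, not taken from the patch); the whole change is simply that 
the entries are listed stdout-first, matching the executor log pages:

```scala
// Illustrative only: hypothetical NodeManager host, container id and user.
val baseUrl = "http://nm-host:8042/node/containerlogs/container_0001_01_000001/alice"

// stdout is listed before stderr; for a small immutable Map, iteration follows
// insertion order, so the UI renders the driver links in the same order as the
// executor log links.
val driverLogs: Map[String, String] = Map(
  "stdout" -> s"$baseUrl/stdout?start=-4096",
  "stderr" -> s"$baseUrl/stderr?start=-4096")
```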

## How was this patch tested?
 no

Author: uncleGen 

Closes #16001 from uncleGen/SPARK-18575.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f58a8aa2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f58a8aa2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f58a8aa2

Branch: refs/heads/master
Commit: f58a8aa20106ea36386db79a8a66f529a8da75c9
Parents: a367d5f
Author: uncleGen 
Authored: Fri Nov 25 09:10:17 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 09:10:17 2016 +

--
 .../spark/scheduler/cluster/YarnClusterSchedulerBackend.scala| 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f58a8aa2/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
--
diff --git 
a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
 
b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
index ced597b..4f3d5eb 100644
--- 
a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
+++ 
b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
@@ -55,8 +55,8 @@ private[spark] class YarnClusterSchedulerBackend(
   val baseUrl = 
s"$httpScheme$httpAddress/node/containerlogs/$containerId/$user"
   logDebug(s"Base URL for logs: $baseUrl")
   driverLogs = Some(Map(
-"stderr" -> s"$baseUrl/stderr?start=-4096",
-"stdout" -> s"$baseUrl/stdout?start=-4096"))
+"stdout" -> s"$baseUrl/stdout?start=-4096",
+"stderr" -> s"$baseUrl/stderr?start=-4096"))
 } catch {
   case e: Exception =>
 logInfo("Error while building AM log links, so AM" +





spark git commit: [SPARK-18575][WEB] Keep same style: adjust the position of driver log links

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 a7f414561 -> 57dbc682d


[SPARK-18575][WEB] Keep same style: adjust the position of driver log links

## What changes were proposed in this pull request?

Not a bug: this just adjusts the position of the driver log links so they keep 
the same style as the other executor log links.

![image](https://cloud.githubusercontent.com/assets/7402327/20590092/f8bddbb8-b25b-11e6-9aaf-3b5b3073df10.png)

## How was this patch tested?
 no

Author: uncleGen 

Closes #16001 from uncleGen/SPARK-18575.

(cherry picked from commit f58a8aa20106ea36386db79a8a66f529a8da75c9)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/57dbc682
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/57dbc682
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/57dbc682

Branch: refs/heads/branch-2.1
Commit: 57dbc682dfafc87076dcaafd29c637cb16ace91a
Parents: a7f4145
Author: uncleGen 
Authored: Fri Nov 25 09:10:17 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 09:10:27 2016 +

--
 .../spark/scheduler/cluster/YarnClusterSchedulerBackend.scala| 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/57dbc682/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
--
diff --git 
a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
 
b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
index ced597b..4f3d5eb 100644
--- 
a/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
+++ 
b/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala
@@ -55,8 +55,8 @@ private[spark] class YarnClusterSchedulerBackend(
   val baseUrl = 
s"$httpScheme$httpAddress/node/containerlogs/$containerId/$user"
   logDebug(s"Base URL for logs: $baseUrl")
   driverLogs = Some(Map(
-"stderr" -> s"$baseUrl/stderr?start=-4096",
-"stdout" -> s"$baseUrl/stdout?start=-4096"))
+"stdout" -> s"$baseUrl/stdout?start=-4096",
+"stderr" -> s"$baseUrl/stderr?start=-4096"))
 } catch {
   case e: Exception =>
 logInfo("Error while building AM log links, so AM" +





spark git commit: [SPARK-18119][SPARK-CORE] Namenode safemode check is only performed on one namenode which can stuck the startup of SparkHistory server

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f58a8aa20 -> f42db0c0c


[SPARK-18119][SPARK-CORE] Namenode safemode check is only performed on one 
namenode which can stuck the startup of SparkHistory server

## What changes were proposed in this pull request?

Instead of using the setSafeMode method that checks only the first namenode, use 
the overload that permits checking the status of active NNs only.
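
A minimal sketch of how the two-argument setSafeMode call behaves (the helper name and 
polling interval below are made up for illustration; only the HDFS API call mirrors the patch):

```scala
import org.apache.hadoop.hdfs.DistributedFileSystem
import org.apache.hadoop.hdfs.protocol.HdfsConstants

// Poll the safe-mode status until HDFS leaves safe mode. Passing `true` as the
// second argument resolves SAFEMODE_GET against active namenodes only, so a
// standby NN that is still in safe mode cannot block the caller.
def waitUntilOutOfSafeMode(dfs: DistributedFileSystem, pollMs: Long = 5000L): Unit = {
  while (dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET, true)) {
    Thread.sleep(pollMs)
  }
}
```
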
## How was this patch tested?

manual tests

Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.

This commit is contributed by Criteo SA under the Apache v2 licence.

Author: n.fraison 

Closes #15648 from ashangit/SPARK-18119.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f42db0c0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f42db0c0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f42db0c0

Branch: refs/heads/master
Commit: f42db0c0c1434bfcccaa70d0db55e16c4396af04
Parents: f58a8aa
Author: n.fraison 
Authored: Fri Nov 25 09:45:51 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 09:45:51 2016 +

--
 .../org/apache/spark/deploy/history/FsHistoryProvider.scala  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f42db0c0/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index ca38a47..8ef69b1 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -663,9 +663,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, 
clock: Clock)
   false
   }
 
-  // For testing.
   private[history] def isFsInSafeMode(dfs: DistributedFileSystem): Boolean = {
-dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET)
+/* true to check only for Active NNs status */
+dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET, true)
   }
 
   /**





spark git commit: [SPARK-18119][SPARK-CORE] Namenode safemode check is only performed on one namenode which can stuck the startup of SparkHistory server

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 57dbc682d -> a49dfa93e


[SPARK-18119][SPARK-CORE] Namenode safemode check is only performed on one 
namenode which can stuck the startup of SparkHistory server

## What changes were proposed in this pull request?

Instead of using the setSafeMode method that checks only the first namenode, use 
the overload that permits checking the status of active NNs only.
## How was this patch tested?

manual tests

Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.

This commit is contributed by Criteo SA under the Apache v2 licence.

Author: n.fraison 

Closes #15648 from ashangit/SPARK-18119.

(cherry picked from commit f42db0c0c1434bfcccaa70d0db55e16c4396af04)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a49dfa93
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a49dfa93
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a49dfa93

Branch: refs/heads/branch-2.1
Commit: a49dfa93e160d63e806f35cb6b6953367916f44b
Parents: 57dbc68
Author: n.fraison 
Authored: Fri Nov 25 09:45:51 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 09:46:05 2016 +

--
 .../org/apache/spark/deploy/history/FsHistoryProvider.scala  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a49dfa93/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index ca38a47..8ef69b1 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -663,9 +663,9 @@ private[history] class FsHistoryProvider(conf: SparkConf, 
clock: Clock)
   false
   }
 
-  // For testing.
   private[history] def isFsInSafeMode(dfs: DistributedFileSystem): Boolean = {
-dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET)
+/* true to check only for Active NNs status */
+dfs.setSafeMode(HdfsConstants.SafeModeAction.SAFEMODE_GET, true)
   }
 
   /**





[1/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f42db0c0c -> 51b1c1551


http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
index e96c2bc..6bb3271 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
@@ -213,7 +213,7 @@ object MLUtils extends Logging {
   }
 
   /**
-   * Version of [[kFold()]] taking a Long seed.
+   * Version of `kFold()` taking a Long seed.
*/
   @Since("2.0.0")
   def kFold[T: ClassTag](rdd: RDD[T], numFolds: Int, seed: Long): 
Array[(RDD[T], RDD[T])] = {
@@ -262,7 +262,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of vector columns to be converted. New vector columns 
will be ignored. If
* unspecified, all old vector columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with old vector columns converted to the 
new vector type
+   * @return the input `DataFrame` with old vector columns converted to the 
new vector type
*/
   @Since("2.0.0")
   @varargs
@@ -314,7 +314,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of vector columns to be converted. Old vector columns 
will be ignored. If
* unspecified, all new vector columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with new vector columns converted to the 
old vector type
+   * @return the input `DataFrame` with new vector columns converted to the 
old vector type
*/
   @Since("2.0.0")
   @varargs
@@ -366,7 +366,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of matrix columns to be converted. New matrix columns 
will be ignored. If
* unspecified, all old matrix columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with old matrix columns converted to the 
new matrix type
+   * @return the input `DataFrame` with old matrix columns converted to the 
new matrix type
*/
   @Since("2.0.0")
   @varargs
@@ -416,7 +416,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of matrix columns to be converted. Old matrix columns 
will be ignored. If
* unspecified, all new matrix columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with new matrix columns converted to the 
old matrix type
+   * @return the input `DataFrame` with new matrix columns converted to the 
old matrix type
*/
   @Since("2.0.0")
   @varargs

http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
index c881c8e..da0eb04 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
@@ -72,7 +72,7 @@ trait Loader[M <: Saveable] {
   /**
* Load a model from the given path.
*
-   * The model should have been saved by [[Saveable.save]].
+   * The model should have been saved by `Saveable.save`.
*
* @param sc  Spark context used for loading model files.
* @param path  Path specifying the directory to which the model was saved.

http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 7c0b0b5..5c417d2 100644
--- a/pom.xml
+++ b/pom.xml
@@ -2495,6 +2495,18 @@
   tparam
   X
 
+
+  constructor
+  X
+
+
+  todo
+  X
+
+
+  groupname
+  X
+
   
 
   

http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/project/SparkBuild.scala
--
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index 429a163..e3fbe03 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -745,7 +745,10 @@ object Unidoc {
   "-tag", """example:a:Example\:""",
   "-tag", """note:a:Note\:""",
   "-tag", "group:X",
-  "-tag", "tparam:X"
+  "-tag", "tparam:X",
+  "-tag", "constructor:X",
+  "-tag", "todo:X",
+  "-tag", 

[2/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
 
b/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
index 8a6b862..143bf53 100644
--- 
a/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
@@ -50,9 +50,10 @@ private[ml] class IterativelyReweightedLeastSquaresModel(
  * @param maxIter maximum number of iterations.
  * @param tol the convergence tolerance.
  *
- * @see [[http://www.jstor.org/stable/2345503 P. J. Green, Iteratively 
Reweighted Least Squares
- * for Maximum Likelihood Estimation, and some Robust and Resistant 
Alternatives,
- * Journal of the Royal Statistical Society. Series B, 1984.]]
+ * @see http://www.jstor.org/stable/2345503";>P. J. Green, Iteratively
+ * Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust
+ * and Resistant Alternatives, Journal of the Royal Statistical Society.
+ * Series B, 1984.
  */
 private[ml] class IterativelyReweightedLeastSquares(
 val initialModel: WeightedLeastSquaresModel,

http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala 
b/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
index fa45309..e3e03df 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
@@ -29,7 +29,7 @@ import org.apache.spark.ml.param._
 private[ml] trait HasRegParam extends Params {
 
   /**
-   * Param for regularization parameter (>= 0).
+   * Param for regularization parameter (&gt;= 0).
* @group param
*/
   final val regParam: DoubleParam = new DoubleParam(this, "regParam", 
"regularization parameter (>= 0)", ParamValidators.gtEq(0))
@@ -44,7 +44,7 @@ private[ml] trait HasRegParam extends Params {
 private[ml] trait HasMaxIter extends Params {
 
   /**
-   * Param for maximum number of iterations (>= 0).
+   * Param for maximum number of iterations (&gt;= 0).
* @group param
*/
   final val maxIter: IntParam = new IntParam(this, "maxIter", "maximum number 
of iterations (>= 0)", ParamValidators.gtEq(0))
@@ -238,7 +238,7 @@ private[ml] trait HasOutputCol extends Params {
 private[ml] trait HasCheckpointInterval extends Params {
 
   /**
-   * Param for set checkpoint interval (>= 1) or disable checkpoint (-1). E.g. 
10 means that the cache will get checkpointed every 10 iterations.
+   * Param for set checkpoint interval (&gt;= 1) or disable checkpoint (-1). 
E.g. 10 means that the cache will get checkpointed every 10 iterations.
* @group param
*/
   final val checkpointInterval: IntParam = new IntParam(this, 
"checkpointInterval", "set checkpoint interval (>= 1) or disable checkpoint 
(-1). E.g. 10 means that the cache will get checkpointed every 10 iterations", 
(interval: Int) => interval == -1 || interval >= 1)
@@ -334,7 +334,7 @@ private[ml] trait HasElasticNetParam extends Params {
 private[ml] trait HasTol extends Params {
 
   /**
-   * Param for the convergence tolerance for iterative algorithms (>= 0).
+   * Param for the convergence tolerance for iterative algorithms (&gt;= 0).
* @group param
*/
   final val tol: DoubleParam = new DoubleParam(this, "tol", "the convergence 
tolerance for iterative algorithms (>= 0)", ParamValidators.gtEq(0))
@@ -349,7 +349,7 @@ private[ml] trait HasTol extends Params {
 private[ml] trait HasStepSize extends Params {
 
   /**
-   * Param for Step size to be used for each iteration of optimization (> 0).
+   * Param for Step size to be used for each iteration of optimization (&gt; 
0).
* @group param
*/
   final val stepSize: DoubleParam = new DoubleParam(this, "stepSize", "Step 
size to be used for each iteration of optimization (> 0)", 
ParamValidators.gt(0))
@@ -396,7 +396,7 @@ private[ml] trait HasSolver extends Params {
 private[ml] trait HasAggregationDepth extends Params {
 
   /**
-   * Param for suggested depth for treeAggregate (>= 2).
+   * Param for suggested depth for treeAggregate (&gt;= 2).
* @group expertParam
*/
   final val aggregationDepth: IntParam = new IntParam(this, 
"aggregationDepth", "suggested depth for treeAggregate (>= 2)", 
ParamValidators.gtEq(2))

http://git-wip-us.apache.org/repos/asf/spark/blob/51b1c155/mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala
--
dif

[2/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
 
b/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
index 8a6b862..143bf53 100644
--- 
a/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/ml/optim/IterativelyReweightedLeastSquares.scala
@@ -50,9 +50,10 @@ private[ml] class IterativelyReweightedLeastSquaresModel(
  * @param maxIter maximum number of iterations.
  * @param tol the convergence tolerance.
  *
- * @see [[http://www.jstor.org/stable/2345503 P. J. Green, Iteratively 
Reweighted Least Squares
- * for Maximum Likelihood Estimation, and some Robust and Resistant 
Alternatives,
- * Journal of the Royal Statistical Society. Series B, 1984.]]
+ * @see http://www.jstor.org/stable/2345503";>P. J. Green, Iteratively
+ * Reweighted Least Squares for Maximum Likelihood Estimation, and some Robust
+ * and Resistant Alternatives, Journal of the Royal Statistical Society.
+ * Series B, 1984.
  */
 private[ml] class IterativelyReweightedLeastSquares(
 val initialModel: WeightedLeastSquaresModel,

http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala 
b/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
index fa45309..e3e03df 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala
@@ -29,7 +29,7 @@ import org.apache.spark.ml.param._
 private[ml] trait HasRegParam extends Params {
 
   /**
-   * Param for regularization parameter (>= 0).
+   * Param for regularization parameter (&gt;= 0).
* @group param
*/
   final val regParam: DoubleParam = new DoubleParam(this, "regParam", 
"regularization parameter (>= 0)", ParamValidators.gtEq(0))
@@ -44,7 +44,7 @@ private[ml] trait HasRegParam extends Params {
 private[ml] trait HasMaxIter extends Params {
 
   /**
-   * Param for maximum number of iterations (>= 0).
+   * Param for maximum number of iterations (&gt;= 0).
* @group param
*/
   final val maxIter: IntParam = new IntParam(this, "maxIter", "maximum number 
of iterations (>= 0)", ParamValidators.gtEq(0))
@@ -238,7 +238,7 @@ private[ml] trait HasOutputCol extends Params {
 private[ml] trait HasCheckpointInterval extends Params {
 
   /**
-   * Param for set checkpoint interval (>= 1) or disable checkpoint (-1). E.g. 
10 means that the cache will get checkpointed every 10 iterations.
+   * Param for set checkpoint interval (&gt;= 1) or disable checkpoint (-1). 
E.g. 10 means that the cache will get checkpointed every 10 iterations.
* @group param
*/
   final val checkpointInterval: IntParam = new IntParam(this, 
"checkpointInterval", "set checkpoint interval (>= 1) or disable checkpoint 
(-1). E.g. 10 means that the cache will get checkpointed every 10 iterations", 
(interval: Int) => interval == -1 || interval >= 1)
@@ -334,7 +334,7 @@ private[ml] trait HasElasticNetParam extends Params {
 private[ml] trait HasTol extends Params {
 
   /**
-   * Param for the convergence tolerance for iterative algorithms (>= 0).
+   * Param for the convergence tolerance for iterative algorithms (&gt;= 0).
* @group param
*/
   final val tol: DoubleParam = new DoubleParam(this, "tol", "the convergence 
tolerance for iterative algorithms (>= 0)", ParamValidators.gtEq(0))
@@ -349,7 +349,7 @@ private[ml] trait HasTol extends Params {
 private[ml] trait HasStepSize extends Params {
 
   /**
-   * Param for Step size to be used for each iteration of optimization (> 0).
+   * Param for Step size to be used for each iteration of optimization (&gt; 
0).
* @group param
*/
   final val stepSize: DoubleParam = new DoubleParam(this, "stepSize", "Step 
size to be used for each iteration of optimization (> 0)", 
ParamValidators.gt(0))
@@ -396,7 +396,7 @@ private[ml] trait HasSolver extends Params {
 private[ml] trait HasAggregationDepth extends Params {
 
   /**
-   * Param for suggested depth for treeAggregate (>= 2).
+   * Param for suggested depth for treeAggregate (&gt;= 2).
* @group expertParam
*/
   final val aggregationDepth: IntParam = new IntParam(this, 
"aggregationDepth", "suggested depth for treeAggregate (>= 2)", 
ParamValidators.gtEq(2))

http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala
--
dif

[3/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
[SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will 
help unidoc/genjavadoc compatibility

## What changes were proposed in this pull request?

This PR only tries to fix things that look pretty straightforward and were 
already fixed in previous PRs.

This PR roughly fixes the following:

- Fix unrecognisable class and method links in javadoc by changing them from 
`[[..]]` to `` `...` `` (see the short scaladoc sketch after this list)

  ```
  [error] 
.../spark/sql/core/target/java/org/apache/spark/sql/streaming/DataStreamReader.java:226:
 error: reference not found
  [error]* Loads text files and returns a {link DataFrame} whose schema 
starts with a string column named
  ```

- Fix an exception annotation and remove code backticks in `throws` annotation

  Currently, sbt unidoc with Java 8 complains as below:

  ```
  [error] .../java/org/apache/spark/sql/streaming/StreamingQuery.java:72: 
error: unexpected text
  [error]* throws StreamingQueryException, if this query has 
terminated with an exception.
  ```

  `throws` should specify the correct class name from 
`StreamingQueryException,` to `StreamingQueryException` without backticks. (see 
[JDK-8007644](https://bugs.openjdk.java.net/browse/JDK-8007644)).

- Fix `[[http..]]` links to HTML `<a href=...>` anchors.

  ```diff
  -   * 
[[https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https
 Oracle
  -   * blog page]].
  +   * https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https";>
  +   * Oracle blog page.
  ```

   `[[http...]]` link markdown in scaladoc is unrecognisable in javadoc.

- It seems a class can't have a `return` annotation, so two cases of this were 
removed.

  ```
  [error] 
.../java/org/apache/spark/mllib/regression/IsotonicRegression.java:27: error: 
invalid use of return
  [error]* return New instance of IsotonicRegression.
  ```

- Fix `<` to `&lt;` and `>` to `&gt;` according to HTML rules.

- Fix `` complaint

- Exclude unrecognisable in javadoc, `constructor`, `todo` and `groupname`.
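
To make the first bullet and the HTML-escaping rule concrete, here is a made-up scaladoc 
comment (not taken from the patch) written in the javadoc-8-friendly style described above:

```scala
object ScaladocStyleExample {
  /**
   * Clamps a value to be non-negative (illustrative only): class names go in
   * backticks, e.g. `DataFrame`, instead of scaladoc wiki links that genjavadoc
   * cannot resolve; comparison signs are HTML-escaped, e.g. the input should be
   * &gt;= 0; and external links use plain anchors such as
   * <a href="http://spark.apache.org/">the Spark site</a>.
   *
   * @param x the value to clamp; ideally &gt;= 0 already
   * @return `x` if it is &gt;= 0, otherwise 0.0
   */
  def clamp(x: Double): Double = math.max(x, 0.0)
}
```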

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Note: this does not yet make sbt unidoc succeed with Java 8, but it reduces 
the number of errors with Java 8.

Author: hyukjinkwon 

Closes #15999 from HyukjinKwon/SPARK-3359-errors.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51b1c155
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/51b1c155
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/51b1c155

Branch: refs/heads/master
Commit: 51b1c1551d3a7147403b9e821fcc7c8f57b4824c
Parents: f42db0c
Author: hyukjinkwon 
Authored: Fri Nov 25 11:27:07 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 11:27:07 2016 +

--
 .../scala/org/apache/spark/SSLOptions.scala |  4 +-
 .../org/apache/spark/api/java/JavaPairRDD.scala |  6 +-
 .../org/apache/spark/api/java/JavaRDD.scala | 10 +--
 .../spark/api/java/JavaSparkContext.scala   | 14 ++--
 .../org/apache/spark/io/CompressionCodec.scala  |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 18 ++---
 .../spark/security/CryptoStreamUtils.scala  |  4 +-
 .../spark/serializer/KryoSerializer.scala   |  3 +-
 .../spark/storage/BlockReplicationPolicy.scala  |  7 +-
 .../scala/org/apache/spark/ui/UIUtils.scala |  4 +-
 .../org/apache/spark/util/AccumulatorV2.scala   |  2 +-
 .../scala/org/apache/spark/util/RpcUtils.scala  |  2 +-
 .../org/apache/spark/util/StatCounter.scala |  4 +-
 .../org/apache/spark/util/ThreadUtils.scala |  6 +-
 .../scala/org/apache/spark/util/Utils.scala | 10 +--
 .../spark/util/io/ChunkedByteBuffer.scala   |  2 +-
 .../scala/org/apache/spark/graphx/Graph.scala   |  4 +-
 .../org/apache/spark/graphx/GraphLoader.scala   |  2 +-
 .../apache/spark/graphx/impl/EdgeRDDImpl.scala  |  2 +-
 .../org/apache/spark/graphx/lib/PageRank.scala  |  4 +-
 .../apache/spark/graphx/lib/SVDPlusPlus.scala   |  3 +-
 .../apache/spark/graphx/lib/TriangleCount.scala |  2 +-
 .../distribution/MultivariateGaussian.scala |  3 +-
 .../scala/org/apache/spark/ml/Predictor.scala   |  2 +-
 .../spark/ml/attribute/AttributeGroup.scala |  2 +-
 .../apache/spark/ml/attribute/attributes.scala  |  4 +-
 .../ml/classification/LogisticRegression.scala  | 74 ++--
 .../MultilayerPerceptronClassifier.scala|  1 -
 .../spark/ml/classification/NaiveBayes.scala|  8 ++-
 .../classification/RandomForestClassifier.scala |  6 +-
 .../spark/ml/clustering/BisectingKMeans.scala   | 14 ++--
 .../spark/ml/clustering/ClusteringSummary.scala |  2 +-
 .../spark/ml/cluste

[3/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
[SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will 
help unidoc/genjavadoc compatibility

## What changes were proposed in this pull request?

This PR only tries to fix things that look pretty straightforward and were 
already fixed in previous PRs.

This PR roughly fixes the following:

- Fix unrecognisable class and method links in javadoc by changing them from 
`[[..]]` to `` `...` ``

  ```
  [error] 
.../spark/sql/core/target/java/org/apache/spark/sql/streaming/DataStreamReader.java:226:
 error: reference not found
  [error]* Loads text files and returns a {link DataFrame} whose schema 
starts with a string column named
  ```

- Fix an exception annotation and remove code backticks in `throws` annotation

  Currently, sbt unidoc with Java 8 complains as below:

  ```
  [error] .../java/org/apache/spark/sql/streaming/StreamingQuery.java:72: 
error: unexpected text
  [error]* throws StreamingQueryException, if this query has 
terminated with an exception.
  ```

  `throws` should specify the correct class name from 
`StreamingQueryException,` to `StreamingQueryException` without backticks. (see 
[JDK-8007644](https://bugs.openjdk.java.net/browse/JDK-8007644)).

- Fix `[[http..]]` links to HTML `<a href=...>` anchors.

  ```diff
  -   * 
[[https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https
 Oracle
  -   * blog page]].
  +   * https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https";>
  +   * Oracle blog page.
  ```

   `[[http...]]` link markdown in scaladoc is unrecognisable in javadoc.

- It seems a class can't have a `return` annotation, so two cases of this were 
removed.

  ```
  [error] 
.../java/org/apache/spark/mllib/regression/IsotonicRegression.java:27: error: 
invalid use of return
  [error]* return New instance of IsotonicRegression.
  ```

- Fix `<` to `&lt;` and `>` to `&gt;` according to HTML rules.

- Fix `` complaint

- Exclude unrecognisable in javadoc, `constructor`, `todo` and `groupname`.

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Note: this does not yet make sbt unidoc succeed with Java 8, but it reduces 
the number of errors with Java 8.

Author: hyukjinkwon 

Closes #15999 from HyukjinKwon/SPARK-3359-errors.

(cherry picked from commit 51b1c1551d3a7147403b9e821fcc7c8f57b4824c)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/69856f28
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/69856f28
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/69856f28

Branch: refs/heads/branch-2.1
Commit: 69856f28361022812d2af83128d8591694bcef4b
Parents: a49dfa9
Author: hyukjinkwon 
Authored: Fri Nov 25 11:27:07 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 11:27:18 2016 +

--
 .../scala/org/apache/spark/SSLOptions.scala |  4 +-
 .../org/apache/spark/api/java/JavaPairRDD.scala |  6 +-
 .../org/apache/spark/api/java/JavaRDD.scala | 10 +--
 .../spark/api/java/JavaSparkContext.scala   | 14 ++--
 .../org/apache/spark/io/CompressionCodec.scala  |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 18 ++---
 .../spark/security/CryptoStreamUtils.scala  |  4 +-
 .../spark/serializer/KryoSerializer.scala   |  3 +-
 .../spark/storage/BlockReplicationPolicy.scala  |  7 +-
 .../scala/org/apache/spark/ui/UIUtils.scala |  4 +-
 .../org/apache/spark/util/AccumulatorV2.scala   |  2 +-
 .../scala/org/apache/spark/util/RpcUtils.scala  |  2 +-
 .../org/apache/spark/util/StatCounter.scala |  4 +-
 .../org/apache/spark/util/ThreadUtils.scala |  6 +-
 .../scala/org/apache/spark/util/Utils.scala | 10 +--
 .../spark/util/io/ChunkedByteBuffer.scala   |  2 +-
 .../scala/org/apache/spark/graphx/Graph.scala   |  4 +-
 .../org/apache/spark/graphx/GraphLoader.scala   |  2 +-
 .../apache/spark/graphx/impl/EdgeRDDImpl.scala  |  2 +-
 .../org/apache/spark/graphx/lib/PageRank.scala  |  4 +-
 .../apache/spark/graphx/lib/SVDPlusPlus.scala   |  3 +-
 .../apache/spark/graphx/lib/TriangleCount.scala |  2 +-
 .../distribution/MultivariateGaussian.scala |  3 +-
 .../scala/org/apache/spark/ml/Predictor.scala   |  2 +-
 .../spark/ml/attribute/AttributeGroup.scala |  2 +-
 .../apache/spark/ml/attribute/attributes.scala  |  4 +-
 .../ml/classification/LogisticRegression.scala  | 74 ++--
 .../MultilayerPerceptronClassifier.scala|  1 -
 .../spark/ml/classification/NaiveBayes.scala|  8 ++-
 .../classification/RandomForestClassifier.scala |  6 +-
 .../spark/ml/clustering/BisectingKM

[1/3] spark git commit: [SPARK-3359][BUILD][DOCS] More changes to resolve javadoc 8 errors that will help unidoc/genjavadoc compatibility

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 a49dfa93e -> 69856f283


http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
index e96c2bc..6bb3271 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
@@ -213,7 +213,7 @@ object MLUtils extends Logging {
   }
 
   /**
-   * Version of [[kFold()]] taking a Long seed.
+   * Version of `kFold()` taking a Long seed.
*/
   @Since("2.0.0")
   def kFold[T: ClassTag](rdd: RDD[T], numFolds: Int, seed: Long): 
Array[(RDD[T], RDD[T])] = {
@@ -262,7 +262,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of vector columns to be converted. New vector columns 
will be ignored. If
* unspecified, all old vector columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with old vector columns converted to the 
new vector type
+   * @return the input `DataFrame` with old vector columns converted to the 
new vector type
*/
   @Since("2.0.0")
   @varargs
@@ -314,7 +314,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of vector columns to be converted. Old vector columns 
will be ignored. If
* unspecified, all new vector columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with new vector columns converted to the 
old vector type
+   * @return the input `DataFrame` with new vector columns converted to the 
old vector type
*/
   @Since("2.0.0")
   @varargs
@@ -366,7 +366,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of matrix columns to be converted. New matrix columns 
will be ignored. If
* unspecified, all old matrix columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with old matrix columns converted to the 
new matrix type
+   * @return the input `DataFrame` with old matrix columns converted to the 
new matrix type
*/
   @Since("2.0.0")
   @varargs
@@ -416,7 +416,7 @@ object MLUtils extends Logging {
* @param dataset input dataset
* @param cols a list of matrix columns to be converted. Old matrix columns 
will be ignored. If
* unspecified, all new matrix columns will be converted except 
nested ones.
-   * @return the input [[DataFrame]] with new matrix columns converted to the 
old matrix type
+   * @return the input `DataFrame` with new matrix columns converted to the 
old matrix type
*/
   @Since("2.0.0")
   @varargs

http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
index c881c8e..da0eb04 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
@@ -72,7 +72,7 @@ trait Loader[M <: Saveable] {
   /**
* Load a model from the given path.
*
-   * The model should have been saved by [[Saveable.save]].
+   * The model should have been saved by `Saveable.save`.
*
* @param sc  Spark context used for loading model files.
* @param path  Path specifying the directory to which the model was saved.

http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 7c0b0b5..5c417d2 100644
--- a/pom.xml
+++ b/pom.xml
@@ -2495,6 +2495,18 @@
   tparam
   X
 
+
+  constructor
+  X
+
+
+  todo
+  X
+
+
+  groupname
+  X
+
   
 
   

http://git-wip-us.apache.org/repos/asf/spark/blob/69856f28/project/SparkBuild.scala
--
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index 429a163..e3fbe03 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -745,7 +745,10 @@ object Unidoc {
   "-tag", """example:a:Example\:""",
   "-tag", """note:a:Note\:""",
   "-tag", "group:X",
-  "-tag", "tparam:X"
+  "-tag", "tparam:X",
+  "-tag", "constructor:X",
+  "-tag", "todo:X",
+  "-ta

spark git commit: [SPARK-18356][ML] Improve MLKmeans Performance

2016-11-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 5ecdc7c5c -> 445d4d9e1


[SPARK-18356][ML] Improve MLKmeans Performance

## What changes were proposed in this pull request?

Spark KMeans fit() doesn't cache the RDD, which generates a lot of warnings:
 WARN KMeans: The input data is not directly cached, which may hurt performance 
if its parent RDDs are also uncached.
So, KMeans should cache the internal RDD before calling the MLlib KMeans 
algorithm; doing so improved Spark KMeans performance by 14%.

https://github.com/ZakariaHili/spark/commit/a9cf905cf7dbd50eeb9a8b4f891f2f41ea672472
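
In rough outline, the change amounts to the caching pattern sketched below (a 
minimal, hypothetical `trainKMeans` helper over an `RDD[Vector]`; the actual 
patch applies the same idea inside `ml.clustering.KMeans.fit`, as the diff 
further down shows):

```scala
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object KMeansCachingSketch {
  // Persist the input only if the caller has not already cached it,
  // and release it once training finishes.
  def trainKMeans(featuresRdd: RDD[Vector], k: Int): KMeansModel = {
    val handlePersistence = featuresRdd.getStorageLevel == StorageLevel.NONE
    if (handlePersistence) {
      featuresRdd.persist(StorageLevel.MEMORY_AND_DISK)
    }
    val model = new KMeans().setK(k).run(featuresRdd)
    if (handlePersistence) {
      featuresRdd.unpersist()
    }
    model
  }
}
```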

hhbyyh
## How was this patch tested?
Pass Kmeans tests and existing tests

Author: Zakaria_Hili 
Author: HILI Zakaria 

Closes #15965 from ZakariaHili/zakbranch.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/445d4d9e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/445d4d9e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/445d4d9e

Branch: refs/heads/master
Commit: 445d4d9e13ebaee9eceea6135fe7ee47812d97de
Parents: 5ecdc7c
Author: Zakaria_Hili 
Authored: Fri Nov 25 13:19:26 2016 +
Committer: Sean Owen 
Committed: Fri Nov 25 13:19:26 2016 +

--
 .../org/apache/spark/ml/clustering/KMeans.scala | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/445d4d9e/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala 
b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
index 6e124eb..ad4f79a 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
@@ -33,6 +33,7 @@ import org.apache.spark.rdd.RDD
 import org.apache.spark.sql.{DataFrame, Dataset, Row}
 import org.apache.spark.sql.functions.{col, udf}
 import org.apache.spark.sql.types.{IntegerType, StructType}
+import org.apache.spark.storage.StorageLevel
 import org.apache.spark.util.VersionUtils.majorVersion
 
 /**
@@ -305,12 +306,20 @@ class KMeans @Since("1.5.0") (
 
   @Since("2.0.0")
   override def fit(dataset: Dataset[_]): KMeansModel = {
+val handlePersistence = dataset.rdd.getStorageLevel == StorageLevel.NONE
+fit(dataset, handlePersistence)
+  }
+
+  @Since("2.2.0")
+  protected def fit(dataset: Dataset[_], handlePersistence: Boolean): 
KMeansModel = {
 transformSchema(dataset.schema, logging = true)
-val rdd: RDD[OldVector] = dataset.select(col($(featuresCol))).rdd.map {
+val instances: RDD[OldVector] = 
dataset.select(col($(featuresCol))).rdd.map {
   case Row(point: Vector) => OldVectors.fromML(point)
 }
-
-val instr = Instrumentation.create(this, rdd)
+if (handlePersistence) {
+  instances.persist(StorageLevel.MEMORY_AND_DISK)
+}
+val instr = Instrumentation.create(this, instances)
 instr.logParams(featuresCol, predictionCol, k, initMode, initSteps, 
maxIter, seed, tol)
 
 val algo = new MLlibKMeans()
@@ -320,12 +329,15 @@ class KMeans @Since("1.5.0") (
   .setMaxIterations($(maxIter))
   .setSeed($(seed))
   .setEpsilon($(tol))
-val parentModel = algo.run(rdd, Option(instr))
+val parentModel = algo.run(instances, Option(instr))
 val model = copyValues(new KMeansModel(uid, parentModel).setParent(this))
 val summary = new KMeansSummary(
   model.transform(dataset), $(predictionCol), $(featuresCol), $(k))
 model.setSummary(Some(summary))
 instr.logSuccess(model)
+if (handlePersistence) {
+  instances.unpersist()
+}
 model
   }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [WIP][SQL][DOC] Fix incorrect `code` tag

2016-11-26 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 830ee1345 -> ff699332c


[WIP][SQL][DOC] Fix incorrect `code` tag

## What changes were proposed in this pull request?
This PR fixes an incorrect `code` tag in `sql-programming-guide.md`.

## How was this patch tested?
Manually.

Author: Weiqing Yang 

Closes #15941 from weiqingy/fixtag.

(cherry picked from commit f4a98e421e14434fddc3f9f1018a17124d660ef0)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ff699332
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ff699332
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ff699332

Branch: refs/heads/branch-2.1
Commit: ff699332c113e21b942f5a62f475ae79ac6c0ee5
Parents: 830ee13
Author: Weiqing Yang 
Authored: Sat Nov 26 15:41:37 2016 +
Committer: Sean Owen 
Committed: Sat Nov 26 15:41:49 2016 +

--
 docs/sql-programming-guide.md  | 2 +-
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ff699332/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index ba3e55f..3093d48 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1089,7 +1089,7 @@ the following case-sensitive options:
   
  isolationLevel
  
-   The transaction isolation level, which applies to current connection. 
It can be one of NONE, READ_COMMITTED, 
READ_UNCOMMITTED, REPEATABLE_READ, or 
SERIALIZABLE, corresponding to standard transaction isolation 
levels defined by JDBC's Connection object, with default of 
READ_UNCOMMITTED. This option applies only to writing. Please refer 
the documentation in java.sql.Connection.
+   The transaction isolation level, which applies to current connection. 
It can be one of NONE, READ_COMMITTED, 
READ_UNCOMMITTED, REPEATABLE_READ, or 
SERIALIZABLE, corresponding to standard transaction isolation 
levels defined by JDBC's Connection object, with default of 
READ_UNCOMMITTED. This option applies only to writing. Please 
refer the documentation in java.sql.Connection.
  

 

http://git-wip-us.apache.org/repos/asf/spark/blob/ff699332/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 7cca9db..5589805 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -108,7 +108,7 @@ object SQLConf {
 .doc("Configures the maximum size in bytes for a table that will be 
broadcast to all worker " +
   "nodes when performing a join.  By setting this value to -1 broadcasting 
can be disabled. " +
   "Note that currently statistics are only supported for Hive Metastore 
tables where the " +
-  "commandANALYZE TABLE  COMPUTE STATISTICS 
noscan has been " +
+  "command ANALYZE TABLE  COMPUTE STATISTICS 
noscan has been " +
   "run, and file-based data source tables where the statistics are 
computed directly on " +
   "the files of data.")
 .longConf


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [WIP][SQL][DOC] Fix incorrect `code` tag

2016-11-26 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master c4a7eef0c -> f4a98e421


[WIP][SQL][DOC] Fix incorrect `code` tag

## What changes were proposed in this pull request?
This PR fixes an incorrect `code` tag in `sql-programming-guide.md`.

## How was this patch tested?
Manually.

Author: Weiqing Yang 

Closes #15941 from weiqingy/fixtag.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f4a98e42
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f4a98e42
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f4a98e42

Branch: refs/heads/master
Commit: f4a98e421e14434fddc3f9f1018a17124d660ef0
Parents: c4a7eef
Author: Weiqing Yang 
Authored: Sat Nov 26 15:41:37 2016 +
Committer: Sean Owen 
Committed: Sat Nov 26 15:41:37 2016 +

--
 docs/sql-programming-guide.md  | 2 +-
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f4a98e42/docs/sql-programming-guide.md
--
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index be53a8d..3adbe23 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1100,7 +1100,7 @@ the following case-sensitive options:
   
  isolationLevel
  
-   The transaction isolation level, which applies to current connection. 
It can be one of NONE, READ_COMMITTED, 
READ_UNCOMMITTED, REPEATABLE_READ, or 
SERIALIZABLE, corresponding to standard transaction isolation 
levels defined by JDBC's Connection object, with default of 
READ_UNCOMMITTED. This option applies only to writing. Please refer 
the documentation in java.sql.Connection.
+   The transaction isolation level, which applies to current connection. 
It can be one of NONE, READ_COMMITTED, 
READ_UNCOMMITTED, REPEATABLE_READ, or 
SERIALIZABLE, corresponding to standard transaction isolation 
levels defined by JDBC's Connection object, with default of 
READ_UNCOMMITTED. This option applies only to writing. Please 
refer the documentation in java.sql.Connection.
  

 

http://git-wip-us.apache.org/repos/asf/spark/blob/f4a98e42/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index b2a50c6..206c08b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -108,7 +108,7 @@ object SQLConf {
 .doc("Configures the maximum size in bytes for a table that will be 
broadcast to all worker " +
   "nodes when performing a join.  By setting this value to -1 broadcasting 
can be disabled. " +
   "Note that currently statistics are only supported for Hive Metastore 
tables where the " +
-  "commandANALYZE TABLE  COMPUTE STATISTICS 
noscan has been " +
+  "command ANALYZE TABLE  COMPUTE STATISTICS 
noscan has been " +
   "run, and file-based data source tables where the statistics are 
computed directly on " +
   "the files of data.")
 .longConf


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[1/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 7d5cb3af7 -> f830bb917


http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
index 83857c3..e328b86 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
@@ -40,8 +40,8 @@ case class JdbcType(databaseTypeDefinition : String, 
jdbcNullType : Int)
  * SQL dialect of a certain database or jdbc driver.
  * Lots of databases define types that aren't explicitly supported
  * by the JDBC spec.  Some JDBC drivers also report inaccurate
- * information---for instance, BIT(n>1) being reported as a BIT type is 
quite
- * common, even though BIT in JDBC is meant for single-bit values.  Also, there
+ * information---for instance, BIT(n{@literal >}1) being reported as a BIT 
type is quite
+ * common, even though BIT in JDBC is meant for single-bit values. Also, there
  * does not appear to be a standard name for an unbounded string or binary
  * type; we use BLOB and CLOB by default but override with database-specific
  * alternatives when these are absent or do not behave correctly.

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
index ff6dd8c..f288ad6 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
@@ -112,7 +112,7 @@ trait SchemaRelationProvider {
 
 /**
  * ::Experimental::
- * Implemented by objects that can produce a streaming [[Source]] for a 
specific format or system.
+ * Implemented by objects that can produce a streaming `Source` for a specific 
format or system.
  *
  * @since 2.0.0
  */
@@ -143,7 +143,7 @@ trait StreamSourceProvider {
 
 /**
  * ::Experimental::
- * Implemented by objects that can produce a streaming [[Sink]] for a specific 
format or system.
+ * Implemented by objects that can produce a streaming `Sink` for a specific 
format or system.
  *
  * @since 2.0.0
  */
@@ -185,7 +185,7 @@ trait CreatableRelationProvider {
 
 /**
  * Represents a collection of tuples with a known schema. Classes that extend 
BaseRelation must
- * be able to produce the schema of their data in the form of a 
[[StructType]]. Concrete
+ * be able to produce the schema of their data in the form of a `StructType`. 
Concrete
  * implementation should inherit from one of the descendant `Scan` classes, 
which define various
  * abstract methods for execution.
  *
@@ -216,10 +216,10 @@ abstract class BaseRelation {
 
   /**
* Whether does it need to convert the objects in Row to internal 
representation, for example:
-   *  java.lang.String -> UTF8String
-   *  java.lang.Decimal -> Decimal
+   *  java.lang.String to UTF8String
+   *  java.lang.Decimal to Decimal
*
-   * If `needConversion` is `false`, buildScan() should return an [[RDD]] of 
[[InternalRow]]
+   * If `needConversion` is `false`, buildScan() should return an `RDD` of 
`InternalRow`
*
* @note The internal representation is not stable across releases and thus 
data sources outside
* of Spark SQL should leave this as true.

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
--
diff --git 
a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
 
b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
index a2d64da..5f5c8e2 100644
--- 
a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
+++ 
b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
@@ -57,9 +57,17 @@ import org.apache.spark.util.SerializableJobConf
  * @param partition a map from the partition key to the partition value 
(optional). If the partition
  *  value is optional, dynamic partition insert will be 
performed.
  *  As an example, `INSERT INTO tbl PARTITION (a=1, b=2) AS 
...` would have
- *  Map('a' -> Some('1'), 'b' -> Some('2')),
+ *
+ *  {{{
+ *  Map('a' -> Some('1'), 'b' -> Some('2'))
+ *  }}}
+ *
  *  and `INSERT INTO tbl PARTITION (a=1, b) AS ...`
- *  would have Map('a' -> Some('1'), '

[2/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
index d8405d1..4334b31 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
@@ -36,14 +36,14 @@ import org.apache.spark.mllib.tree.loss.{LogLoss, Loss, 
SquaredError}
  * @param validationTol validationTol is a condition which decides iteration 
termination when
  *  runWithValidation is used.
  *  The end of iteration is decided based on below logic:
- *  If the current loss on the validation set is > 0.01, 
the diff
+ *  If the current loss on the validation set is greater 
than 0.01, the diff
  *  of validation error is compared to relative tolerance 
which is
  *  validationTol * (current loss on the validation set).
- *  If the current loss on the validation set is <= 0.01, 
the diff
- *  of validation error is compared to absolute tolerance 
which is
+ *  If the current loss on the validation set is less than 
or equal to 0.01,
+ *  the diff of validation error is compared to absolute 
tolerance which is
  *  validationTol * 0.01.
  *  Ignored when
- *  
[[org.apache.spark.mllib.tree.GradientBoostedTrees.run()]] is used.
+ *  
`org.apache.spark.mllib.tree.GradientBoostedTrees.run()` is used.
  */
 @Since("1.2.0")
 case class BoostingStrategy @Since("1.4.0") (
@@ -92,8 +92,8 @@ object BoostingStrategy {
   /**
* Returns default configuration for the boosting algorithm
* @param algo Learning goal.  Supported:
-   * 
[[org.apache.spark.mllib.tree.configuration.Algo.Classification]],
-   * [[org.apache.spark.mllib.tree.configuration.Algo.Regression]]
+   * 
`org.apache.spark.mllib.tree.configuration.Algo.Classification`,
+   * `org.apache.spark.mllib.tree.configuration.Algo.Regression`
* @return Configuration for boosting algorithm
*/
   @Since("1.3.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
index b34e1b1..58e8f5b 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
@@ -28,8 +28,8 @@ import org.apache.spark.mllib.tree.impurity.{Entropy, Gini, 
Impurity, Variance}
 /**
  * Stores all the configuration options for tree construction
  * @param algo  Learning goal.  Supported:
- *  
[[org.apache.spark.mllib.tree.configuration.Algo.Classification]],
- *  [[org.apache.spark.mllib.tree.configuration.Algo.Regression]]
+ *  
`org.apache.spark.mllib.tree.configuration.Algo.Classification`,
+ *  `org.apache.spark.mllib.tree.configuration.Algo.Regression`
  * @param impurity Criterion used for information gain calculation.
  * Supported for Classification: 
[[org.apache.spark.mllib.tree.impurity.Gini]],
  *  [[org.apache.spark.mllib.tree.impurity.Entropy]].
@@ -43,9 +43,9 @@ import org.apache.spark.mllib.tree.impurity.{Entropy, Gini, 
Impurity, Variance}
  *for choosing how to split on features at each node.
  *More bins give higher granularity.
  * @param quantileCalculationStrategy Algorithm for calculating quantiles.  
Supported:
- * 
[[org.apache.spark.mllib.tree.configuration.QuantileStrategy.Sort]]
+ * 
`org.apache.spark.mllib.tree.configuration.QuantileStrategy.Sort`
  * @param categoricalFeaturesInfo A map storing information about the 
categorical variables and the
- *number of discrete values they take. An 
entry (n -> k)
+ *number of discrete values they take. An 
entry (n to k)
  *indicates that feature n is categorical with 
k categories
  *indexed from 0: {0, 1, ..., k-1}.
  * @param minInstancesPerNode Minimum number of instances each child must have 
after split.

http://git-wip-

[3/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
index b9e01dd..d8f33cd 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
@@ -35,7 +35,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
 
   /**
* Number of buckets (quantiles, or categories) into which data points are 
grouped. Must
-   * be >= 2.
+   * be greater than or equal to 2.
*
* See also [[handleInvalid]], which can optionally create an additional 
bucket for NaN values.
*
@@ -52,7 +52,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
 
   /**
* Relative error (see documentation for
-   * [[org.apache.spark.sql.DataFrameStatFunctions.approxQuantile 
approxQuantile]] for description)
+   * `org.apache.spark.sql.DataFrameStatFunctions.approxQuantile` for 
description)
* Must be in the range [0, 1].
* default: 0.001
* @group param
@@ -99,7 +99,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
  * but NaNs will be counted in a special bucket[4].
  *
  * Algorithm: The bin ranges are chosen using an approximate algorithm (see 
the documentation for
- * [[org.apache.spark.sql.DataFrameStatFunctions.approxQuantile 
approxQuantile]]
+ * `org.apache.spark.sql.DataFrameStatFunctions.approxQuantile`
  * for a detailed description). The precision of the approximation can be 
controlled with the
  * `relativeError` parameter. The lower and upper bin bounds will be 
`-Infinity` and `+Infinity`,
  * covering all real values.

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
index b25fff9..65db06c 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
@@ -32,9 +32,11 @@ import org.apache.spark.sql.types.StructType
  * the output, it can be any select clause that Spark SQL supports. Users can 
also
  * use Spark SQL built-in function and UDFs to operate on these selected 
columns.
  * For example, [[SQLTransformer]] supports statements like:
- *  - SELECT a, a + b AS a_b FROM __THIS__
- *  - SELECT a, SQRT(b) AS b_sqrt FROM __THIS__ where a > 5
- *  - SELECT a, b, SUM(c) AS c_sum FROM __THIS__ GROUP BY a, b
+ * {{{
+ *  SELECT a, a + b AS a_b FROM __THIS__
+ *  SELECT a, SQRT(b) AS b_sqrt FROM __THIS__ where a > 5
+ *  SELECT a, b, SUM(c) AS c_sum FROM __THIS__ GROUP BY a, b
+ * }}}
  */
 @Since("1.6.0")
 class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: 
String) extends Transformer

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
index a558162..3fcd84c 100755
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
@@ -52,7 +52,7 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") 
override val uid: String
   /**
* The words to be filtered out.
* Default: English stop words
-   * @see [[StopWordsRemover.loadDefaultStopWords()]]
+   * @see `StopWordsRemover.loadDefaultStopWords()`
* @group param
*/
   @Since("1.5.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/f830bb91/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
index 8b155f0..0a4d31d 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
@@ -60,7 +60,7 @@ private[feature] trait StringIndexerBase extends Params with 
HasInputCol with Ha
  * The indices are in [0, numLabels), ordered by label frequencies.
  * So the most frequent label gets index 0.
  *
- * @see [[IndexToString]] for the inverse transformation
+ * @see `IndexToString` for the inverse transformation
  */
 @Since("1.4.0")
 class 

[4/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
[SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in 
Java API documentation

## What changes were proposed in this pull request?

This PR make `sbt unidoc` complete with Java 8.

This PR roughly includes several fixes as below:

- Fix unrecognisable class and method links in javadoc by changing them from 
`[[..]]` to `` `...` ``

  ```diff
  - * A column that will be computed based on the data in a [[DataFrame]].
  + * A column that will be computed based on the data in a `DataFrame`.
  ```

- Fix throws annotations so that they are recognisable in javadoc

- Fix URL links by rewriting them as HTML `a href` tags.

  ```diff
  - * [[http://en.wikipedia.org/wiki/Decision_tree_learning Decision tree]] 
model for regression.
  + * http://en.wikipedia.org/wiki/Decision_tree_learning";>
  + * Decision tree (Wikipedia) model for regression.
  ```

  ```diff
  -   * see http://en.wikipedia.org/wiki/Receiver_operating_characteristic
  +   * see http://en.wikipedia.org/wiki/Receiver_operating_characteristic";>
  +   * Receiver operating characteristic (Wikipedia)
  ```

- Fix `<` and `>` characters that break javadoc, in one of two ways:

  - `greater than`/`greater than or equal to` or `less than`/`less than or 
equal to` where applicable.

  - Wrap them with `{{{...}}}` to print them in javadoc, or use `{code ...}` or 
`{literal ..}`. Please refer to 
https://github.com/apache/spark/pull/16013#discussion_r89665558

- Fix `` complaint
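
Taken together, a doc comment that follows these rules looks roughly like the 
sketch below (the object, method and wording are illustrative only, not taken 
from this patch):

```scala
object QuantileDocExample {
  /**
   * Computes an approximate quantile (see `DataFrameStatFunctions.approxQuantile`
   * for the underlying algorithm), linked with backticks instead of wiki-link
   * brackets.
   *
   * The relative error must be greater than or equal to 0 and less than 1, i.e.
   * {{{
   *   0 <= relativeError < 1
   * }}}
   *
   * @see <a href="http://en.wikipedia.org/wiki/Quantile">Quantile (Wikipedia)</a>
   */
  def approxQuantile(relativeError: Double): Double = ???
}
```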

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Author: hyukjinkwon 

Closes #16013 from HyukjinKwon/SPARK-3359-errors-more.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f830bb91
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f830bb91
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f830bb91

Branch: refs/heads/master
Commit: f830bb9170f6b853565d9dd30ca7418b93a54fe3
Parents: 7d5cb3a
Author: hyukjinkwon 
Authored: Tue Nov 29 09:41:32 2016 +
Committer: Sean Owen 
Committed: Tue Nov 29 09:41:32 2016 +

--
 .../scala/org/apache/spark/Accumulator.scala|  2 +-
 .../main/scala/org/apache/spark/SparkConf.scala | 12 ++--
 .../scala/org/apache/spark/SparkContext.scala   | 14 ++---
 .../scala/org/apache/spark/TaskContext.scala|  4 +-
 .../scala/org/apache/spark/TaskEndReason.scala  |  2 +-
 .../main/scala/org/apache/spark/TestUtils.scala |  2 +-
 .../org/apache/spark/api/java/JavaRDD.scala |  8 ++-
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  4 +-
 .../scala/org/apache/spark/rdd/HadoopRDD.scala  |  2 +-
 .../scala/org/apache/spark/rdd/JdbcRDD.scala| 15 -
 .../org/apache/spark/rdd/NewHadoopRDD.scala |  2 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala | 20 +++
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 24 +---
 .../apache/spark/rdd/RDDCheckpointData.scala|  3 +-
 .../org/apache/spark/rdd/coalesce-public.scala  |  4 +-
 .../spark/rpc/netty/RpcEndpointVerifier.scala   |  4 +-
 .../spark/scheduler/InputFormatInfo.scala   |  2 +-
 .../org/apache/spark/scheduler/ResultTask.scala |  2 +-
 .../apache/spark/scheduler/ShuffleMapTask.scala |  2 +-
 .../scala/org/apache/spark/scheduler/Task.scala |  2 +-
 .../spark/scheduler/TaskDescription.scala   |  2 +-
 .../spark/storage/BlockManagerMessages.scala|  2 +-
 .../storage/ShuffleBlockFetcherIterator.scala   |  4 +-
 .../scala/org/apache/spark/ui/UIUtils.scala |  2 +-
 .../scala/org/apache/spark/util/Utils.scala |  7 ++-
 .../spark/util/random/SamplingUtils.scala   | 18 +++---
 .../util/random/StratifiedSamplingUtils.scala   | 33 +++
 .../flume/FlumePollingInputDStream.scala|  2 +-
 .../spark/streaming/kafka/KafkaCluster.scala| 20 +--
 .../streaming/kafka/KafkaInputDStream.scala |  2 +-
 .../spark/streaming/kafka/KafkaUtils.scala  | 18 +++---
 .../spark/streaming/kafka/OffsetRange.scala |  2 +-
 .../org/apache/spark/graphx/GraphLoader.scala   |  2 +-
 .../spark/graphx/impl/VertexPartitionBase.scala |  2 +-
 .../graphx/impl/VertexPartitionBaseOps.scala|  2 +-
 .../apache/spark/graphx/lib/TriangleCount.scala |  2 +-
 .../ml/classification/LogisticRegression.scala  | 15 ++---
 .../spark/ml/clustering/BisectingKMeans.scala   |  4 +-
 .../spark/ml/clustering/GaussianMixture.scala   |  2 +-
 .../org/apache/spark/ml/clustering/LDA.scala| 10 ++--
 .../apache/spark/ml/feature/Bucketizer.scala|  2 +-
 .../spark/ml/feature/CountVectorizer.scala  |  9 +--
 .../org/apache/spark/ml/feature/HashingTF.scala |  2 +-
 .../org/apache/spark/ml/feature/NGram.scala |  2 +-
 .../apache/spark/ml/feature/Normalizer.scala 

[4/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
[SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in 
Java API documentation

## What changes were proposed in this pull request?

This PR make `sbt unidoc` complete with Java 8.

This PR roughly includes several fixes as below:

- Fix unrecognisable class and method links in javadoc by changing them from 
`[[..]]` to `` `...` ``

  ```diff
  - * A column that will be computed based on the data in a [[DataFrame]].
  + * A column that will be computed based on the data in a `DataFrame`.
  ```

- Fix throws annotations so that they are recognisable in javadoc

- Fix URL links by rewriting them as HTML `a href` tags.

  ```diff
  - * [[http://en.wikipedia.org/wiki/Decision_tree_learning Decision tree]] 
model for regression.
  + * http://en.wikipedia.org/wiki/Decision_tree_learning";>
  + * Decision tree (Wikipedia) model for regression.
  ```

  ```diff
  -   * see http://en.wikipedia.org/wiki/Receiver_operating_characteristic
  +   * see http://en.wikipedia.org/wiki/Receiver_operating_characteristic";>
  +   * Receiver operating characteristic (Wikipedia)
  ```

- Fix `<` and `>` characters that break javadoc, in one of two ways:

  - `greater than`/`greater than or equal to` or `less than`/`less than or 
equal to` where applicable.

  - Wrap them with `{{{...}}}` to print them in javadoc, or use `{code ...}` or 
`{literal ..}`. Please refer to 
https://github.com/apache/spark/pull/16013#discussion_r89665558

- Fix `` complaint

## How was this patch tested?

Manually tested by `jekyll build` with Java 7 and 8

```
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
```

```
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
```

Author: hyukjinkwon 

Closes #16013 from HyukjinKwon/SPARK-3359-errors-more.

(cherry picked from commit f830bb9170f6b853565d9dd30ca7418b93a54fe3)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/84b2af22
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/84b2af22
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/84b2af22

Branch: refs/heads/branch-2.1
Commit: 84b2af229ca312023cd6343ecd2b1278542d9b9a
Parents: 06a56df
Author: hyukjinkwon 
Authored: Tue Nov 29 09:41:32 2016 +
Committer: Sean Owen 
Committed: Tue Nov 29 09:41:52 2016 +

--
 .../scala/org/apache/spark/Accumulator.scala|  2 +-
 .../main/scala/org/apache/spark/SparkConf.scala | 12 ++--
 .../scala/org/apache/spark/SparkContext.scala   | 14 ++---
 .../scala/org/apache/spark/TaskContext.scala|  4 +-
 .../scala/org/apache/spark/TaskEndReason.scala  |  2 +-
 .../main/scala/org/apache/spark/TestUtils.scala |  2 +-
 .../org/apache/spark/api/java/JavaRDD.scala |  8 ++-
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  4 +-
 .../scala/org/apache/spark/rdd/HadoopRDD.scala  |  2 +-
 .../scala/org/apache/spark/rdd/JdbcRDD.scala| 15 -
 .../org/apache/spark/rdd/NewHadoopRDD.scala |  2 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala | 20 +++
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 24 +---
 .../apache/spark/rdd/RDDCheckpointData.scala|  3 +-
 .../org/apache/spark/rdd/coalesce-public.scala  |  4 +-
 .../spark/rpc/netty/RpcEndpointVerifier.scala   |  4 +-
 .../spark/scheduler/InputFormatInfo.scala   |  2 +-
 .../org/apache/spark/scheduler/ResultTask.scala |  2 +-
 .../apache/spark/scheduler/ShuffleMapTask.scala |  2 +-
 .../scala/org/apache/spark/scheduler/Task.scala |  2 +-
 .../spark/scheduler/TaskDescription.scala   |  2 +-
 .../spark/storage/BlockManagerMessages.scala|  2 +-
 .../storage/ShuffleBlockFetcherIterator.scala   |  4 +-
 .../scala/org/apache/spark/ui/UIUtils.scala |  2 +-
 .../scala/org/apache/spark/util/Utils.scala |  7 ++-
 .../spark/util/random/SamplingUtils.scala   | 18 +++---
 .../util/random/StratifiedSamplingUtils.scala   | 33 +++
 .../flume/FlumePollingInputDStream.scala|  2 +-
 .../spark/streaming/kafka/KafkaCluster.scala| 20 +--
 .../streaming/kafka/KafkaInputDStream.scala |  2 +-
 .../spark/streaming/kafka/KafkaUtils.scala  | 18 +++---
 .../spark/streaming/kafka/OffsetRange.scala |  2 +-
 .../org/apache/spark/graphx/GraphLoader.scala   |  2 +-
 .../spark/graphx/impl/VertexPartitionBase.scala |  2 +-
 .../graphx/impl/VertexPartitionBaseOps.scala|  2 +-
 .../apache/spark/graphx/lib/TriangleCount.scala |  2 +-
 .../ml/classification/LogisticRegression.scala  | 15 ++---
 .../spark/ml/clustering/BisectingKMeans.scala   |  4 +-
 .../spark/ml/clustering/GaussianMixture.scala   |  2 +-
 .../org/apache/spark/ml/clustering/LDA.scala| 10 ++--
 .../apache/spark/ml/feature/Bucketizer.scala|  2 +-
 .../spark/ml/feature/CountVectorizer.scala  |  9 +--
 .../org/apache/spark/ml/feature/HashingTF.scala |  2 +-
 ..

[2/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
index d8405d1..4334b31 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/BoostingStrategy.scala
@@ -36,14 +36,14 @@ import org.apache.spark.mllib.tree.loss.{LogLoss, Loss, 
SquaredError}
  * @param validationTol validationTol is a condition which decides iteration 
termination when
  *  runWithValidation is used.
  *  The end of iteration is decided based on below logic:
- *  If the current loss on the validation set is > 0.01, 
the diff
+ *  If the current loss on the validation set is greater 
than 0.01, the diff
  *  of validation error is compared to relative tolerance 
which is
  *  validationTol * (current loss on the validation set).
- *  If the current loss on the validation set is <= 0.01, 
the diff
- *  of validation error is compared to absolute tolerance 
which is
+ *  If the current loss on the validation set is less than 
or equal to 0.01,
+ *  the diff of validation error is compared to absolute 
tolerance which is
  *  validationTol * 0.01.
  *  Ignored when
- *  
[[org.apache.spark.mllib.tree.GradientBoostedTrees.run()]] is used.
+ *  
`org.apache.spark.mllib.tree.GradientBoostedTrees.run()` is used.
  */
 @Since("1.2.0")
 case class BoostingStrategy @Since("1.4.0") (
@@ -92,8 +92,8 @@ object BoostingStrategy {
   /**
* Returns default configuration for the boosting algorithm
* @param algo Learning goal.  Supported:
-   * 
[[org.apache.spark.mllib.tree.configuration.Algo.Classification]],
-   * [[org.apache.spark.mllib.tree.configuration.Algo.Regression]]
+   * 
`org.apache.spark.mllib.tree.configuration.Algo.Classification`,
+   * `org.apache.spark.mllib.tree.configuration.Algo.Regression`
* @return Configuration for boosting algorithm
*/
   @Since("1.3.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
index b34e1b1..58e8f5b 100644
--- 
a/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/mllib/tree/configuration/Strategy.scala
@@ -28,8 +28,8 @@ import org.apache.spark.mllib.tree.impurity.{Entropy, Gini, 
Impurity, Variance}
 /**
  * Stores all the configuration options for tree construction
  * @param algo  Learning goal.  Supported:
- *  
[[org.apache.spark.mllib.tree.configuration.Algo.Classification]],
- *  [[org.apache.spark.mllib.tree.configuration.Algo.Regression]]
+ *  
`org.apache.spark.mllib.tree.configuration.Algo.Classification`,
+ *  `org.apache.spark.mllib.tree.configuration.Algo.Regression`
  * @param impurity Criterion used for information gain calculation.
  * Supported for Classification: 
[[org.apache.spark.mllib.tree.impurity.Gini]],
  *  [[org.apache.spark.mllib.tree.impurity.Entropy]].
@@ -43,9 +43,9 @@ import org.apache.spark.mllib.tree.impurity.{Entropy, Gini, 
Impurity, Variance}
  *for choosing how to split on features at each node.
  *More bins give higher granularity.
  * @param quantileCalculationStrategy Algorithm for calculating quantiles.  
Supported:
- * 
[[org.apache.spark.mllib.tree.configuration.QuantileStrategy.Sort]]
+ * 
`org.apache.spark.mllib.tree.configuration.QuantileStrategy.Sort`
  * @param categoricalFeaturesInfo A map storing information about the 
categorical variables and the
- *number of discrete values they take. An 
entry (n -> k)
+ *number of discrete values they take. An 
entry (n to k)
  *indicates that feature n is categorical with 
k categories
  *indexed from 0: {0, 1, ..., k-1}.
  * @param minInstancesPerNode Minimum number of instances each child must have 
after split.

http://git-wip-

[1/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 06a56df22 -> 84b2af229


http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
index 83857c3..e328b86 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
@@ -40,8 +40,8 @@ case class JdbcType(databaseTypeDefinition : String, 
jdbcNullType : Int)
  * SQL dialect of a certain database or jdbc driver.
  * Lots of databases define types that aren't explicitly supported
  * by the JDBC spec.  Some JDBC drivers also report inaccurate
- * information---for instance, BIT(n>1) being reported as a BIT type is 
quite
- * common, even though BIT in JDBC is meant for single-bit values.  Also, there
+ * information---for instance, BIT(n{@literal >}1) being reported as a BIT 
type is quite
+ * common, even though BIT in JDBC is meant for single-bit values. Also, there
  * does not appear to be a standard name for an unbounded string or binary
  * type; we use BLOB and CLOB by default but override with database-specific
  * alternatives when these are absent or do not behave correctly.

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
index ff6dd8c..f288ad6 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala
@@ -112,7 +112,7 @@ trait SchemaRelationProvider {
 
 /**
  * ::Experimental::
- * Implemented by objects that can produce a streaming [[Source]] for a 
specific format or system.
+ * Implemented by objects that can produce a streaming `Source` for a specific 
format or system.
  *
  * @since 2.0.0
  */
@@ -143,7 +143,7 @@ trait StreamSourceProvider {
 
 /**
  * ::Experimental::
- * Implemented by objects that can produce a streaming [[Sink]] for a specific 
format or system.
+ * Implemented by objects that can produce a streaming `Sink` for a specific 
format or system.
  *
  * @since 2.0.0
  */
@@ -185,7 +185,7 @@ trait CreatableRelationProvider {
 
 /**
  * Represents a collection of tuples with a known schema. Classes that extend 
BaseRelation must
- * be able to produce the schema of their data in the form of a 
[[StructType]]. Concrete
+ * be able to produce the schema of their data in the form of a `StructType`. 
Concrete
  * implementation should inherit from one of the descendant `Scan` classes, 
which define various
  * abstract methods for execution.
  *
@@ -216,10 +216,10 @@ abstract class BaseRelation {
 
   /**
* Whether does it need to convert the objects in Row to internal 
representation, for example:
-   *  java.lang.String -> UTF8String
-   *  java.lang.Decimal -> Decimal
+   *  java.lang.String to UTF8String
+   *  java.lang.Decimal to Decimal
*
-   * If `needConversion` is `false`, buildScan() should return an [[RDD]] of 
[[InternalRow]]
+   * If `needConversion` is `false`, buildScan() should return an `RDD` of 
`InternalRow`
*
* @note The internal representation is not stable across releases and thus 
data sources outside
* of Spark SQL should leave this as true.

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
--
diff --git 
a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
 
b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
index a2d64da..5f5c8e2 100644
--- 
a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
+++ 
b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
@@ -57,9 +57,17 @@ import org.apache.spark.util.SerializableJobConf
  * @param partition a map from the partition key to the partition value 
(optional). If the partition
  *  value is optional, dynamic partition insert will be 
performed.
  *  As an example, `INSERT INTO tbl PARTITION (a=1, b=2) AS 
...` would have
- *  Map('a' -> Some('1'), 'b' -> Some('2')),
+ *
+ *  {{{
+ *  Map('a' -> Some('1'), 'b' -> Some('2'))
+ *  }}}
+ *
  *  and `INSERT INTO tbl PARTITION (a=1, b) AS ...`
- *  would have Map('a' -> Some('1'

[3/4] spark git commit: [SPARK-3359][DOCS] Make javadoc8 working for unidoc/genjavadoc compatibility in Java API documentation

2016-11-29 Thread srowen
http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
index b9e01dd..d8f33cd 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
@@ -35,7 +35,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
 
   /**
* Number of buckets (quantiles, or categories) into which data points are 
grouped. Must
-   * be >= 2.
+   * be greater than or equal to 2.
*
* See also [[handleInvalid]], which can optionally create an additional 
bucket for NaN values.
*
@@ -52,7 +52,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
 
   /**
* Relative error (see documentation for
-   * [[org.apache.spark.sql.DataFrameStatFunctions.approxQuantile 
approxQuantile]] for description)
+   * `org.apache.spark.sql.DataFrameStatFunctions.approxQuantile` for 
description)
* Must be in the range [0, 1].
* default: 0.001
* @group param
@@ -99,7 +99,7 @@ private[feature] trait QuantileDiscretizerBase extends Params
  * but NaNs will be counted in a special bucket[4].
  *
  * Algorithm: The bin ranges are chosen using an approximate algorithm (see 
the documentation for
- * [[org.apache.spark.sql.DataFrameStatFunctions.approxQuantile 
approxQuantile]]
+ * `org.apache.spark.sql.DataFrameStatFunctions.approxQuantile`
  * for a detailed description). The precision of the approximation can be 
controlled with the
  * `relativeError` parameter. The lower and upper bin bounds will be 
`-Infinity` and `+Infinity`,
  * covering all real values.

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
index b25fff9..65db06c 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
@@ -32,9 +32,11 @@ import org.apache.spark.sql.types.StructType
  * the output, it can be any select clause that Spark SQL supports. Users can 
also
  * use Spark SQL built-in function and UDFs to operate on these selected 
columns.
  * For example, [[SQLTransformer]] supports statements like:
- *  - SELECT a, a + b AS a_b FROM __THIS__
- *  - SELECT a, SQRT(b) AS b_sqrt FROM __THIS__ where a > 5
- *  - SELECT a, b, SUM(c) AS c_sum FROM __THIS__ GROUP BY a, b
+ * {{{
+ *  SELECT a, a + b AS a_b FROM __THIS__
+ *  SELECT a, SQRT(b) AS b_sqrt FROM __THIS__ where a > 5
+ *  SELECT a, b, SUM(c) AS c_sum FROM __THIS__ GROUP BY a, b
+ * }}}
  */
 @Since("1.6.0")
 class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: 
String) extends Transformer

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
index a558162..3fcd84c 100755
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StopWordsRemover.scala
@@ -52,7 +52,7 @@ class StopWordsRemover @Since("1.5.0") (@Since("1.5.0") 
override val uid: String
   /**
* The words to be filtered out.
* Default: English stop words
-   * @see [[StopWordsRemover.loadDefaultStopWords()]]
+   * @see `StopWordsRemover.loadDefaultStopWords()`
* @group param
*/
   @Since("1.5.0")

http://git-wip-us.apache.org/repos/asf/spark/blob/84b2af22/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
index 8b155f0..0a4d31d 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
@@ -60,7 +60,7 @@ private[feature] trait StringIndexerBase extends Params with 
HasInputCol with Ha
  * The indices are in [0, numLabels), ordered by label frequencies.
  * So the most frequent label gets index 0.
  *
- * @see [[IndexToString]] for the inverse transformation
+ * @see `IndexToString` for the inverse transformation
  */
 @Since("1.4.0")
 class 

spark git commit: [MINOR][DOCS] Updates to the Accumulator example in the programming guide. Fixed typos, AccumulatorV2 in Java

2016-11-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f830bb917 -> f045d9dad


[MINOR][DOCS] Updates to the Accumulator example in the programming guide. 
Fixed typos, AccumulatorV2 in Java

## What changes were proposed in this pull request?

This pull request contains updates to Scala and Java Accumulator code snippets 
in the programming guide.

- For Scala, the pull request fixes the signature of the 'add()' method in the 
custom Accumulator, which contained two params (as the old AccumulatorParam) 
instead of one (as in AccumulatorV2).

- The Java example was updated to use the AccumulatorV2 class since 
AccumulatorParam is marked as deprecated.

- Scala and Java examples are more consistent now.
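
For reference, a minimal self-contained AccumulatorV2 in the spirit of the 
updated guide (the `SetAccumulator` name and its `Set[Long]` payload are 
illustrative and not part of the guide's `MyVector` example):

```scala
import org.apache.spark.util.AccumulatorV2

// Collects distinct Long values seen across tasks. It fills in the methods the
// updated guide calls out (reset, add, merge) plus the other members that
// AccumulatorV2 requires (isZero, copy, value).
class SetAccumulator extends AccumulatorV2[Long, Set[Long]] {
  private var _set: Set[Long] = Set.empty

  override def isZero: Boolean = _set.isEmpty

  override def copy(): SetAccumulator = {
    val acc = new SetAccumulator
    acc._set = _set
    acc
  }

  override def reset(): Unit = { _set = Set.empty }

  override def add(v: Long): Unit = { _set += v }

  override def merge(other: AccumulatorV2[Long, Set[Long]]): Unit = {
    _set ++= other.value
  }

  override def value: Set[Long] = _set
}

// Typical usage: register the accumulator with the SparkContext, update it in
// a job, then read the merged value on the driver.
//   val acc = new SetAccumulator
//   sc.register(acc, "distinctIds")
//   rdd.foreach(id => acc.add(id))
//   acc.value
```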

## How was this patch tested?

This patch was tested manually by building the docs locally.

![image](https://cloud.githubusercontent.com/assets/6235869/20652099/77d98d18-b4f3-11e6-8565-a995fe8cf8e5.png)

Author: aokolnychyi 

Closes #16024 from aokolnychyi/fixed_accumulator_example.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f045d9da
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f045d9da
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f045d9da

Branch: refs/heads/master
Commit: f045d9dade66d44f5ca4768bfe6a484e9288ec8d
Parents: f830bb9
Author: aokolnychyi 
Authored: Tue Nov 29 13:49:39 2016 +
Committer: Sean Owen 
Committed: Tue Nov 29 13:49:39 2016 +

--
 docs/programming-guide.md | 54 ++
 1 file changed, 33 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f045d9da/docs/programming-guide.md
--
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 58bf17b..4267b8c 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -1378,18 +1378,23 @@ res2: Long = 10
 
 While this code used the built-in support for accumulators of type Long, 
programmers can also
 create their own types by subclassing 
[AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
-The AccumulatorV2 abstract class has several methods which need to override: 
-`reset` for resetting the accumulator to zero, and `add` for add anothor value 
into the accumulator, `merge` for merging another same-type accumulator into 
this one. Other methods need to override can refer to scala API document. For 
example, supposing we had a `MyVector` class
+The AccumulatorV2 abstract class has several methods which one has to 
override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods 
that must be overridden
+are contained in the [API 
documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For 
example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight scala %}
-object VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
-  val vec_ : MyVector = MyVector.createZeroVector
-  def reset(): MyVector = {
-vec_.reset()
+class VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
+
+  private val myVector: MyVector = MyVector.createZeroVector
+
+  def reset(): Unit = {
+myVector.reset()
   }
-  def add(v1: MyVector, v2: MyVector): MyVector = {
-vec_.add(v2)
+
+  def add(v: MyVector): Unit = {
+myVector.add(v)
   }
   ...
 }
@@ -1424,29 +1429,36 @@ accum.value();
 // returns 10
 {% endhighlight %}
 
-Programmers can also create their own types by subclassing
-[AccumulatorParam](api/java/index.html?org/apache/spark/AccumulatorParam.html).
-The AccumulatorParam interface has two methods: `zero` for providing a "zero 
value" for your data
-type, and `addInPlace` for adding two values together. For example, supposing 
we had a `Vector` class
+While this code used the built-in support for accumulators of type Long, 
programmers can also
+create their own types by subclassing 
[AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
+The AccumulatorV2 abstract class has several methods which one has to 
override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods 
that must be overridden
+are contained in the [API 
documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For 
example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight java %}
-class VectorAccumulatorParam implements AccumulatorParam {
-  public Vector zero(Vector initialValue) {
-return Vector.zeros(initialValue.size());

spark git commit: [MINOR][DOCS] Updates to the Accumulator example in the programming guide. Fixed typos, AccumulatorV2 in Java

2016-11-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/branch-2.1 84b2af229 -> 124944ab6


[MINOR][DOCS] Updates to the Accumulator example in the programming guide. 
Fixed typos, AccumulatorV2 in Java

## What changes were proposed in this pull request?

This pull request contains updates to Scala and Java Accumulator code snippets 
in the programming guide.

- For Scala, the pull request fixes the signature of the 'add()' method in the 
custom Accumulator, which contained two params (as the old AccumulatorParam) 
instead of one (as in AccumulatorV2).

- The Java example was updated to use the AccumulatorV2 class since 
AccumulatorParam is marked as deprecated.

- Scala and Java examples are more consistent now.

## How was this patch tested?

This patch was tested manually by building the docs locally.

![image](https://cloud.githubusercontent.com/assets/6235869/20652099/77d98d18-b4f3-11e6-8565-a995fe8cf8e5.png)

Author: aokolnychyi 

Closes #16024 from aokolnychyi/fixed_accumulator_example.

(cherry picked from commit f045d9dade66d44f5ca4768bfe6a484e9288ec8d)
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/124944ab
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/124944ab
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/124944ab

Branch: refs/heads/branch-2.1
Commit: 124944ab639b879c43c07415ceb6de6b4dc2517a
Parents: 84b2af2
Author: aokolnychyi 
Authored: Tue Nov 29 13:49:39 2016 +
Committer: Sean Owen 
Committed: Tue Nov 29 13:49:49 2016 +

--
 docs/programming-guide.md | 54 ++
 1 file changed, 33 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/124944ab/docs/programming-guide.md
--
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 58bf17b..4267b8c 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -1378,18 +1378,23 @@ res2: Long = 10
 
 While this code used the built-in support for accumulators of type Long, 
programmers can also
 create their own types by subclassing 
[AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
-The AccumulatorV2 abstract class has several methods which need to override: 
-`reset` for resetting the accumulator to zero, and `add` for add anothor value 
into the accumulator, `merge` for merging another same-type accumulator into 
this one. Other methods need to override can refer to scala API document. For 
example, supposing we had a `MyVector` class
+The AccumulatorV2 abstract class has several methods which one has to 
override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods 
that must be overridden
+are contained in the [API 
documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For 
example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight scala %}
-object VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
-  val vec_ : MyVector = MyVector.createZeroVector
-  def reset(): MyVector = {
-vec_.reset()
+class VectorAccumulatorV2 extends AccumulatorV2[MyVector, MyVector] {
+
+  private val myVector: MyVector = MyVector.createZeroVector
+
+  def reset(): Unit = {
+myVector.reset()
   }
-  def add(v1: MyVector, v2: MyVector): MyVector = {
-vec_.add(v2)
+
+  def add(v: MyVector): Unit = {
+myVector.add(v)
   }
   ...
 }
@@ -1424,29 +1429,36 @@ accum.value();
 // returns 10
 {% endhighlight %}
 
-Programmers can also create their own types by subclassing
-[AccumulatorParam](api/java/index.html?org/apache/spark/AccumulatorParam.html).
-The AccumulatorParam interface has two methods: `zero` for providing a "zero 
value" for your data
-type, and `addInPlace` for adding two values together. For example, supposing 
we had a `Vector` class
+While this code used the built-in support for accumulators of type Long, 
programmers can also
+create their own types by subclassing 
[AccumulatorV2](api/scala/index.html#org.apache.spark.util.AccumulatorV2).
+The AccumulatorV2 abstract class has several methods which one has to 
override: `reset` for resetting
+the accumulator to zero, `add` for adding another value into the accumulator,
+`merge` for merging another same-type accumulator into this one. Other methods 
that must be overridden
+are contained in the [API 
documentation](api/scala/index.html#org.apache.spark.util.AccumulatorV2). For 
example, supposing we had a `MyVector` class
 representing mathematical vectors, we could write:
 
 {% highlight java %}
-class VectorAccumulatorParam implements Accumu

spark git commit: [SPARK-18615][DOCS] Switch to multi-line doc to avoid a genjavadoc bug for backticks

2016-11-29 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master f045d9dad -> 1a870090e


[SPARK-18615][DOCS] Switch to multi-line doc to avoid a genjavadoc bug for 
backticks

## What changes were proposed in this pull request?

Currently, a single-line doc comment does not render backticks as code in the 
generated Javadoc, but prints them as they are (`` `..` ``). For example, the line 
below:

```scala
/** Return an RDD with the pairs from `this` whose keys are not in `other`. */
```

So, we can work around this as shown below:

```scala
/**
 * Return an RDD with the pairs from `this` whose keys are not in `other`.
 */
```
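
Put side by side, the two forms look like this; the object and method names below are 
made up for this illustration and are not from the patch:

```scala
// Illustration only: the same doc text in both comment styles. With the
// genjavadoc version in use, the single-line form can leave the backticks
// as literal characters in the generated Javadoc, while the multi-line
// form renders them as code.
object DocCommentExample {

  /** Return an RDD with the pairs from `this` whose keys are not in `other`. */
  def singleLineForm(): Unit = ()

  /**
   * Return an RDD with the pairs from `this` whose keys are not in `other`.
   */
  def multiLineForm(): Unit = ()
}
```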

- javadoc

  - **Before**
![2016-11-29 10 39 
14](https://cloud.githubusercontent.com/assets/6477701/20693606/e64c8f90-b622-11e6-8dfc-4a029216e23d.png)

  - **After**
![2016-11-29 10 39 
08](https://cloud.githubusercontent.com/assets/6477701/20693607/e7280d36-b622-11e6-8502-d2e21cd5556b.png)

- scaladoc (this one looks fine either way)

  - **Before**
![2016-11-29 10 38 
22](https://cloud.githubusercontent.com/assets/6477701/20693640/12c18aa8-b623-11e6-901a-693e2f6f8066.png)

  - **After**
![2016-11-29 10 40 
05](https://cloud.githubusercontent.com/assets/6477701/20693642/14eb043a-b623-11e6-82ac-7cd106d1.png)

I suspect this is related to SPARK-16153 and the genjavadoc issue in 
`typesafehub/genjavadoc#85`.

## How was this patch tested?

I found them via

```
grep -r "\/\*\*.*\`" . | grep .scala
```

and then checked whether each occurrence appears in the public API documentation, 
using manually built docs (`jekyll build`) with Java 7.

Author: hyukjinkwon 

Closes #16050 from HyukjinKwon/javadoc-markdown.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1a870090
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1a870090
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1a870090

Branch: refs/heads/master
Commit: 1a870090e4266df570c3f56c1e2ea12d090d03d1
Parents: f045d9d
Author: hyukjinkwon 
Authored: Tue Nov 29 13:50:24 2016 +
Committer: Sean Owen 
Committed: Tue Nov 29 13:50:24 2016 +

--
 .../main/scala/org/apache/spark/SparkConf.scala |  4 +++-
 .../apache/spark/api/java/JavaDoubleRDD.scala   |  4 +++-
 .../org/apache/spark/api/java/JavaPairRDD.scala | 12 +---
 .../org/apache/spark/api/java/JavaRDD.scala |  4 +++-
 .../org/apache/spark/rdd/PairRDDFunctions.scala |  8 ++--
 .../main/scala/org/apache/spark/rdd/RDD.scala   |  8 ++--
 .../apache/spark/graphx/impl/EdgeRDDImpl.scala  |  4 +++-
 .../apache/spark/graphx/impl/GraphImpl.scala| 12 +---
 .../spark/graphx/impl/VertexRDDImpl.scala   |  4 +++-
 .../org/apache/spark/ml/linalg/Matrices.scala   | 16 
 .../scala/org/apache/spark/ml/Pipeline.scala|  4 +++-
 .../spark/ml/attribute/AttributeGroup.scala |  4 +++-
 .../apache/spark/ml/attribute/attributes.scala  | 20 +++-
 .../ml/classification/LogisticRegression.scala  |  4 +++-
 .../GeneralizedLinearRegression.scala   |  4 +++-
 .../spark/mllib/feature/ChiSqSelector.scala |  8 ++--
 .../apache/spark/mllib/linalg/Matrices.scala| 16 
 .../mllib/linalg/distributed/BlockMatrix.scala  |  4 +++-
 .../linalg/distributed/CoordinateMatrix.scala   |  4 +++-
 .../linalg/distributed/IndexedRowMatrix.scala   |  4 +++-
 .../apache/spark/mllib/stat/Statistics.scala|  8 ++--
 .../scala/org/apache/spark/sql/Encoder.scala|  4 +++-
 .../org/apache/spark/sql/types/ArrayType.scala  |  4 +++-
 .../org/apache/spark/streaming/StateSpec.scala  |  8 ++--
 24 files changed, 129 insertions(+), 43 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1a870090/core/src/main/scala/org/apache/spark/SparkConf.scala
--
diff --git a/core/src/main/scala/org/apache/spark/SparkConf.scala 
b/core/src/main/scala/org/apache/spark/SparkConf.scala
index 0c1c68d..d78b9f1 100644
--- a/core/src/main/scala/org/apache/spark/SparkConf.scala
+++ b/core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -378,7 +378,9 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable 
with Logging with Seria
 settings.entrySet().asScala.map(x => (x.getKey, x.getValue)).toArray
   }
 
-  /** Get all parameters that start with `prefix` */
+  /**
+   * Get all parameters that start with `prefix`
+   */
   def getAllWithPrefix(prefix: String): Array[(String, String)] = {
 getAll.filter { case (k, v) => k.startsWith(prefix) }
   .map { case (k, v) => (k.substring(prefix.length), v) }

http://git-wip-us.apache.org/repos/asf/spark/blob/1a870090/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scala
--
diff --git a/core/src/main/scala/org/apache/spark/api/java/JavaDoubleRDD.scal
