Port wiki pages Committers to committers.html, Contributing to Spark and Code 
Style Guide to contributing.html, Third Party Projects and Additional Language 
Bindings to third-party-projects.html, Powered By to powered-by.html


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0744e8fd
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/0744e8fd
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/0744e8fd

Branch: refs/heads/asf-site
Commit: 0744e8fdd9f954a6552c968be50604241097dbbc
Parents: 46fb910
Author: Sean Owen <so...@cloudera.com>
Authored: Sat Nov 19 12:35:02 2016 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Mon Nov 21 20:57:42 2016 +0000

----------------------------------------------------------------------
 _layouts/global.html                            |  13 +-
 committers.md                                   | 167 ++++
 community.md                                    |   2 +-
 contributing.md                                 | 523 +++++++++++++
 documentation.md                                |   2 +-
 faq.md                                          |   4 +-
 graphx/index.md                                 |   2 +-
 index.md                                        |   9 +-
 mllib/index.md                                  |   2 +-
 ...-05-spark-user-survey-and-powered-by-page.md |   2 +-
 powered-by.md                                   | 239 ++++++
 site/committers.html                            | 518 +++++++++++++
 site/community.html                             |  15 +-
 site/contributing.html                          | 771 +++++++++++++++++++
 site/documentation.html                         |  15 +-
 site/downloads.html                             |  13 +-
 site/examples.html                              |  13 +-
 site/faq.html                                   |  17 +-
 site/graphx/index.html                          |  15 +-
 site/index.html                                 |  22 +-
 site/mailing-lists.html                         |  13 +-
 site/mllib/index.html                           |  15 +-
 site/news/amp-camp-2013-registration-ope.html   |  13 +-
 .../news/announcing-the-first-spark-summit.html |  13 +-
 .../news/fourth-spark-screencast-published.html |  13 +-
 site/news/index.html                            |  13 +-
 site/news/nsdi-paper.html                       |  13 +-
 site/news/one-month-to-spark-summit-2015.html   |  13 +-
 .../proposals-open-for-spark-summit-east.html   |  13 +-
 ...registration-open-for-spark-summit-east.html |  13 +-
 .../news/run-spark-and-shark-on-amazon-emr.html |  13 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   |  13 +-
 site/news/spark-0-6-2-released.html             |  13 +-
 site/news/spark-0-7-0-released.html             |  13 +-
 site/news/spark-0-7-2-released.html             |  13 +-
 site/news/spark-0-7-3-released.html             |  13 +-
 site/news/spark-0-8-0-released.html             |  13 +-
 site/news/spark-0-8-1-released.html             |  13 +-
 site/news/spark-0-9-0-released.html             |  13 +-
 site/news/spark-0-9-1-released.html             |  13 +-
 site/news/spark-0-9-2-released.html             |  13 +-
 site/news/spark-1-0-0-released.html             |  13 +-
 site/news/spark-1-0-1-released.html             |  13 +-
 site/news/spark-1-0-2-released.html             |  13 +-
 site/news/spark-1-1-0-released.html             |  13 +-
 site/news/spark-1-1-1-released.html             |  13 +-
 site/news/spark-1-2-0-released.html             |  13 +-
 site/news/spark-1-2-1-released.html             |  13 +-
 site/news/spark-1-2-2-released.html             |  13 +-
 site/news/spark-1-3-0-released.html             |  13 +-
 site/news/spark-1-4-0-released.html             |  13 +-
 site/news/spark-1-4-1-released.html             |  13 +-
 site/news/spark-1-5-0-released.html             |  13 +-
 site/news/spark-1-5-1-released.html             |  13 +-
 site/news/spark-1-5-2-released.html             |  13 +-
 site/news/spark-1-6-0-released.html             |  13 +-
 site/news/spark-1-6-1-released.html             |  13 +-
 site/news/spark-1-6-2-released.html             |  13 +-
 site/news/spark-1-6-3-released.html             |  13 +-
 site/news/spark-2-0-0-released.html             |  13 +-
 site/news/spark-2-0-1-released.html             |  13 +-
 site/news/spark-2-0-2-released.html             |  13 +-
 site/news/spark-2.0.0-preview.html              |  13 +-
 .../spark-accepted-into-apache-incubator.html   |  13 +-
 site/news/spark-and-shark-in-the-news.html      |  13 +-
 site/news/spark-becomes-tlp.html                |  13 +-
 site/news/spark-featured-in-wired.html          |  13 +-
 .../spark-mailing-lists-moving-to-apache.html   |  13 +-
 site/news/spark-meetups.html                    |  13 +-
 site/news/spark-screencasts-published.html      |  13 +-
 site/news/spark-summit-2013-is-a-wrap.html      |  13 +-
 site/news/spark-summit-2014-videos-posted.html  |  13 +-
 site/news/spark-summit-2015-videos-posted.html  |  13 +-
 site/news/spark-summit-agenda-posted.html       |  13 +-
 .../spark-summit-east-2015-videos-posted.html   |  13 +-
 .../spark-summit-east-2016-cfp-closing.html     |  13 +-
 site/news/spark-summit-east-agenda-posted.html  |  13 +-
 .../news/spark-summit-europe-agenda-posted.html |  13 +-
 site/news/spark-summit-europe.html              |  13 +-
 .../spark-summit-june-2016-agenda-posted.html   |  13 +-
 site/news/spark-tips-from-quantifind.html       |  13 +-
 .../spark-user-survey-and-powered-by-page.html  |  15 +-
 site/news/spark-version-0-6-0-released.html     |  13 +-
 .../spark-wins-cloudsort-100tb-benchmark.html   |  13 +-
 ...-wins-daytona-gray-sort-100tb-benchmark.html |  13 +-
 .../strata-exercises-now-available-online.html  |  13 +-
 .../news/submit-talks-to-spark-summit-2014.html |  13 +-
 .../news/submit-talks-to-spark-summit-2016.html |  13 +-
 .../submit-talks-to-spark-summit-east-2016.html |  13 +-
 .../submit-talks-to-spark-summit-eu-2016.html   |  13 +-
 site/news/two-weeks-to-spark-summit-2014.html   |  13 +-
 ...deo-from-first-spark-development-meetup.html |  13 +-
 site/powered-by.html                            | 563 ++++++++++++++
 site/releases/spark-release-0-3.html            |  13 +-
 site/releases/spark-release-0-5-0.html          |  13 +-
 site/releases/spark-release-0-5-1.html          |  13 +-
 site/releases/spark-release-0-5-2.html          |  13 +-
 site/releases/spark-release-0-6-0.html          |  13 +-
 site/releases/spark-release-0-6-1.html          |  13 +-
 site/releases/spark-release-0-6-2.html          |  13 +-
 site/releases/spark-release-0-7-0.html          |  13 +-
 site/releases/spark-release-0-7-2.html          |  13 +-
 site/releases/spark-release-0-7-3.html          |  13 +-
 site/releases/spark-release-0-8-0.html          |  13 +-
 site/releases/spark-release-0-8-1.html          |  13 +-
 site/releases/spark-release-0-9-0.html          |  13 +-
 site/releases/spark-release-0-9-1.html          |  13 +-
 site/releases/spark-release-0-9-2.html          |  13 +-
 site/releases/spark-release-1-0-0.html          |  13 +-
 site/releases/spark-release-1-0-1.html          |  13 +-
 site/releases/spark-release-1-0-2.html          |  13 +-
 site/releases/spark-release-1-1-0.html          |  13 +-
 site/releases/spark-release-1-1-1.html          |  13 +-
 site/releases/spark-release-1-2-0.html          |  13 +-
 site/releases/spark-release-1-2-1.html          |  13 +-
 site/releases/spark-release-1-2-2.html          |  13 +-
 site/releases/spark-release-1-3-0.html          |  13 +-
 site/releases/spark-release-1-3-1.html          |  13 +-
 site/releases/spark-release-1-4-0.html          |  13 +-
 site/releases/spark-release-1-4-1.html          |  13 +-
 site/releases/spark-release-1-5-0.html          |  13 +-
 site/releases/spark-release-1-5-1.html          |  13 +-
 site/releases/spark-release-1-5-2.html          |  13 +-
 site/releases/spark-release-1-6-0.html          |  13 +-
 site/releases/spark-release-1-6-1.html          |  13 +-
 site/releases/spark-release-1-6-2.html          |  13 +-
 site/releases/spark-release-1-6-3.html          |  13 +-
 site/releases/spark-release-2-0-0.html          |  13 +-
 site/releases/spark-release-2-0-1.html          |  13 +-
 site/releases/spark-release-2-0-2.html          |  13 +-
 site/research.html                              |  13 +-
 site/screencasts/1-first-steps-with-spark.html  |  13 +-
 .../2-spark-documentation-overview.html         |  13 +-
 .../3-transformations-and-caching.html          |  13 +-
 .../4-a-standalone-job-in-spark.html            |  13 +-
 site/screencasts/index.html                     |  13 +-
 site/sitemap.xml                                |  30 +-
 site/sql/index.html                             |  15 +-
 site/streaming/index.html                       |  15 +-
 site/third-party-projects.html                  | 287 +++++++
 site/trademarks.html                            |  13 +-
 sql/index.md                                    |   2 +-
 streaming/index.md                              |   2 +-
 third-party-projects.md                         |  84 ++
 144 files changed, 4081 insertions(+), 793 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/_layouts/global.html
----------------------------------------------------------------------
diff --git a/_layouts/global.html b/_layouts/global.html
index 6f02c16..662fb86 100644
--- a/_layouts/global.html
+++ b/_layouts/global.html
@@ -113,7 +113,7 @@
           <li><a href="{{site.baseurl}}/mllib/">MLlib (machine 
learning)</a></li>
           <li><a href="{{site.baseurl}}/graphx/">GraphX (graph)</a></li>
           <li class="divider"></li>
-          <li><a 
href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages</a></li>
+          <li><a href="{{site.baseurl}}/third-party-projects.html">Third-Party 
Projects</a></li>
         </ul>
       </li>
       <li class="dropdown">
@@ -131,12 +131,13 @@
           Community <b class="caret"></b>
         </a>
         <ul class="dropdown-menu">
-          <li><a href="{{site.baseurl}}/community.html">Mailing Lists</a></li>
+          <li><a href="{{site.baseurl}}/community.html#mailing-lists">Mailing 
Lists</a></li>
+          <li><a href="{{site.baseurl}}/contributing.html">Contributing to 
Spark</a></li>
+          <li><a href="https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker</a></li>
           <li><a href="{{site.baseurl}}/community.html#events">Events and 
Meetups</a></li>
           <li><a href="{{site.baseurl}}/community.html#history">Project 
History</a></li>
-          <li><a 
href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By</a></li>
-          <li><a 
href="https://cwiki.apache.org/confluence/display/SPARK/Committers";>Project 
Committers</a></li>
-          <li><a href="https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker</a></li>
+          <li><a href="{{site.baseurl}}/powered-by.html">Powered By</a></li>
+          <li><a href="{{site.baseurl}}/committers.html">Project 
Committers</a></li>
         </ul>
       </li>
       <li><a href="{{site.baseurl}}/faq.html">FAQ</a></li>
@@ -184,7 +185,7 @@
         <li><a href="{{site.baseurl}}/mllib/">MLlib (machine learning)</a></li>
         <li><a href="{{site.baseurl}}/graphx/">GraphX (graph)</a></li>
       </ul>
-      <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects";>Third-Party
 Packages</a>
+      <a href="{{site.baseurl}}/third-party-projects.html">Third-Party 
Projects</a>
     </div>
   </div>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/committers.md
----------------------------------------------------------------------
diff --git a/committers.md b/committers.md
new file mode 100644
index 0000000..03defa6
--- /dev/null
+++ b/committers.md
@@ -0,0 +1,167 @@
+---
+layout: global
+title: Committers
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+<h2>Current Committers</h2>
+
+|Name|Organization|
+|----|------------|
+|Michael Armbrust|Databricks|
+|Joseph Bradley|Databricks|
+|Felix Cheung|Automattic|
+|Mosharaf Chowdhury|University of Michigan, Ann Arbor|
+|Jason Dai|Intel|
+|Tathagata Das|Databricks|
+|Ankur Dave|UC Berkeley|
+|Aaron Davidson|Databricks|
+|Thomas Dudziak|Facebook|
+|Robert Evans|Yahoo!|
+|Wenchen Fan|Databricks|
+|Joseph Gonzalez|UC Berkeley|
+|Thomas Graves|Yahoo!|
+|Stephen Haberman|Bizo|
+|Mark Hamstra|ClearStory Data|
+|Herman van Hovell|QuestTec B.V.|
+|Yin Huai|Databricks|
+|Shane Huang|Intel|
+|Andy Konwinski|Databricks|
+|Ryan LeCompte|Quantifind|
+|Haoyuan Li|Alluxio, UC Berkeley|
+|Xiao Li|IBM|
+|Davies Liu|Databricks|
+|Cheng Lian|Databricks|
+|Yanbo Liang|Hortonworks|
+|Sean McNamara|Webtrends|
+|Xiangrui Meng|Databricks|
+|Mridul Muralidharam|Hortonworks|
+|Andrew Or|Princeton University|
+|Kay Ousterhout|UC Berkeley|
+|Sean Owen|Cloudera|
+|Nick Pentreath|IBM|
+|Imran Rashid|Cloudera|
+|Charles Reiss|UC Berkeley|
+|Josh Rosen|Databricks|
+|Sandy Ryza|Clover Health|
+|Kousuke Saruta|NTT Data|
+|Prashant Sharma|IBM|
+|Ram Sriharsha|Databricks|
+|DB Tsai|Netflix|
+|Marcelo Vanzin|Cloudera|
+|Shivaram Venkataraman|UC Berkeley|
+|Patrick Wendell|Databricks|
+|Andrew Xia|Alibaba|
+|Reynold Xin|Databricks|
+|Matei Zaharia|Databricks, Stanford|
+|Shixiong Zhu|Databricks|
+
+<h3>Becoming a Committer</h3>
+
+To get started contributing to Spark, learn 
+<a href="{{site.baseurl}}/contributing.html">how to contribute</a> – 
+anyone can submit patches, documentation and examples to the project.
+
+The PMC regularly adds new committers from the active contributors, based on 
their contributions 
+to Spark. The qualifications for new committers include:
+
+1. Sustained contributions to Spark: Committers should have a history of major 
contributions to 
+Spark. An ideal committer will have contributed broadly throughout the 
project, and have 
+contributed at least one major component where they have taken an "ownership" 
role. An ownership 
+role means that existing contributors feel that they should run patches for 
this component by 
+this person.
+2. Quality of contributions: Committers, more than any other community member, 
should submit simple, 
+well-tested, and well-designed patches. In addition, they should show 
sufficient expertise to be 
+able to review patches, including making sure they fit within Spark's 
engineering practices 
+(testability, documentation, API stability, code style, etc). The 
committership is collectively 
+responsible for the software quality and maintainability of Spark.
+3. Community involvement: Committers should have a constructive and friendly 
attitude in all 
+community interactions. They should also be active on the dev and user lists 
and help mentor 
+newer contributors and users. In design discussions, committers should 
maintain a professional 
+and diplomatic approach, even in the face of disagreement.
+
+The type and level of contributions considered may vary by project area -- for 
example, we 
+greatly encourage contributors who want to work mainly on documentation, 
or mainly on 
+platform support for specific OSes, storage systems, etc.
+
+<h3>Review Process</h3>
+
+All contributions should be reviewed before merging as described in 
+<a href="{{site.baseurl}}/contributing.html">Contributing to Spark</a>. 
+In particular, if you are working on an area of the codebase you are 
unfamiliar with, look at the 
+Git history for that code to see who reviewed patches before. You can do this 
using 
+`git log --format=full <filename>`, by examining the "Commit" field to see who 
committed each patch.
+
+<h3>How to Merge a Pull Request</h3>
+
+Changes pushed to the master branch on Apache cannot be removed; that is, we 
can't force-push to 
+it. So please don't add any test commits or anything like that, only real 
patches.
+
+All merges should be done using the 
+[dev/merge_spark_pr.py](https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py)
 
+script, which squashes the pull request's changes into one commit. To use this 
script, you 
+will need to add a git remote called "apache" at 
https://git-wip-us.apache.org/repos/asf/spark.git, 
+as well as one called "apache-github" at `git://github.com/apache/spark`. For 
the `apache` repo, 
+you can authenticate using your ASF username and password. Ask Patrick if you 
have trouble with 
+this or want help doing your first merge.
+
+The script is fairly self-explanatory and walks you through steps and options 
interactively.
+
+If you want to amend a commit before merging – which should be used for 
trivial touch-ups – 
+then simply let the script wait at the point where it asks you if you want to 
push to Apache. 
+Then, in a separate window, modify the code and push a commit. Run `git rebase 
-i HEAD~2` and 
+"squash" your new commit. Edit the commit message just after to remove your 
commit message. 
+You can verify the result is one change with `git log`. Then resume the script 
in the other window.
+
+Also, please remember to set Assignee on JIRAs where applicable when they are 
resolved. The script 
+can't do this automatically.
+
+<!--
+<h3>Minimize use of MINOR, BUILD, and HOTFIX with no JIRA</h3>
+
+From pwendell at 
https://www.mail-archive.com/dev@spark.apache.org/msg09565.html:
+It would be great if people could create JIRA's for any and all merged pull 
requests. The reason is 
+that when patches get reverted due to build breaks or other issues, it is very 
difficult to keep 
+track of what is going on if there is no JIRA. 
+Here is a list of 5 patches we had to revert recently that didn't include a 
JIRA:
+    Revert "[MINOR] [BUILD] Use custom temp directory during build."
+    Revert "[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in 
HiveThriftServer2Test to ensure expected logging behavior"
+    Revert "[BUILD] Always run SQL tests in master build."
+    Revert "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic 
allocation on."
+    Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing its 
files"
+
+The cost overhead of creating a JIRA relative to other aspects of development 
is very small. 
+If it's really a documentation change or something small, that's okay.
+
+But anything affecting the build, packaging, etc. These all need to have a 
JIRA to ensure that 
+follow-up can be well communicated to all Spark developers.
+-->
+
+<h3>Policy on Backporting Bug Fixes</h3>
+
+From <a 
href="https://www.mail-archive.com/dev@spark.apache.org/msg10284.html";>`pwendell`</a>:
+
+The trade-off when backporting is that you get to deliver the fix to people running 
older versions 
+(great!), but you risk introducing new or even worse bugs in maintenance 
releases (bad!). 
+The decision point is when you have a bug fix and it's not clear whether it is 
worth backporting.
+
+I think the following facets are important to consider:
+- Backports are an extremely valuable service to the community and should be 
considered for 
+any bug fix.
+- Introducing a new bug in a maintenance release must be avoided at all costs. 
Over time, it would 
+erode confidence in our release process.
+- Distributions or advanced users can always backport risky patches on their 
own, if they see fit.
+
+For me, the consequence of these is that we should backport in the following 
situations:
+- Both the bug and the fix are well understood and isolated. Code being 
modified is well tested.
+- The bug being addressed is high priority to the community.
+- The backported fix does not vary widely from the master branch fix.
+
+We tend to avoid backports in the converse situations:
+- The bug or fix is not well understood. For instance, it relates to 
interactions between complex 
+components or third party libraries (e.g. Hadoop libraries). The code is not 
well tested outside 
+of the immediate bug being fixed.
+- The bug is not clearly a high priority for the community.
+- The backported fix is widely different from the master branch fix.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/community.md
----------------------------------------------------------------------
diff --git a/community.md b/community.md
index c4f83a5..3bff6ad 100644
--- a/community.md
+++ b/community.md
@@ -191,7 +191,7 @@ Spark Meetups are grass-roots events organized and hosted 
by leaders and champio
 
 <h3>Powered By</h3>
 
-<p>Our wiki has a list of <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>projects
 and organizations powered by Spark</a>.</p>
+<p>Our site has a list of <a href="{{site.baseurl}}/powered-by.html">projects 
and organizations powered by Spark</a>.</p>
 
 <a name="history"></a>
 <h3>Project History</h3>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/contributing.md
----------------------------------------------------------------------
diff --git a/contributing.md b/contributing.md
new file mode 100644
index 0000000..5ee066f
--- /dev/null
+++ b/contributing.md
@@ -0,0 +1,523 @@
+---
+layout: global
+title: Contributing to Spark
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+
+This guide documents the best way to make various types of contribution to 
Apache Spark, 
+including what is required before submitting a code change.
+
+Contributing to Spark doesn't just mean writing code. Helping new users on the 
mailing list, 
+testing releases, and improving documentation are also welcome. In fact, 
proposing significant 
+code changes usually requires first gaining experience and credibility within 
the community by 
+helping in other ways. This is also a guide to becoming an effective 
contributor.
+
+So, this guide organizes contributions in the order in which new contributors 
who intend to 
+get involved long-term should probably consider them. Build some track record of 
helping others, 
+rather than just opening pull requests.
+
+<h2>Contributing by Helping Other Users</h2>
+
+A great way to contribute to Spark is to help answer user questions on the 
`u...@spark.apache.org` 
+mailing list or on StackOverflow. There are always many new Spark users; 
taking a few minutes to 
+help answer a question is a very valuable community service.
+
+Contributors should subscribe to this list and follow it in order to keep up 
to date on what's 
+happening in Spark. Answering questions is an excellent and visible way to 
help the community, 
+which also demonstrates your expertise.
+
+See the <a href="{{site.baseurl}}/mailing-lists.html">Mailing Lists guide</a> 
for guidelines 
+about how to effectively participate in discussions on the mailing list, as 
well as forums 
+like StackOverflow.
+
+<h2>Contributing by Testing Releases</h2>
+
+Spark's release process is community-oriented, and members of the community 
can vote on new 
+releases on the `d...@spark.apache.org` mailing list. Spark users are invited 
to subscribe to 
+this list to receive announcements, test their workloads on newer releases, 
and provide 
+feedback on any performance or correctness issues they find.
+
+<h2>Contributing by Reviewing Changes</h2>
+
+Changes to Spark source code are proposed, reviewed and committed via 
+<a href="http://github.com/apache/spark/pulls";>Github pull requests</a> 
(described later). 
+Anyone can view and comment on active changes here. 
+Reviewing others' changes is a good way to learn how the change process works 
and gain exposure 
+to activity in various parts of the code. You can help by reviewing the 
changes and asking 
+questions or pointing out issues -- as simple as typos or small issues of 
style.
+See also https://spark-prs.appspot.com/ for a convenient way to view and 
filter open PRs.
+
+<h2>Contributing Documentation Changes</h2>
+
+To propose a change to _release_ documentation (that is, docs that appear 
under 
+<a href="https://spark.apache.org/docs/";>https://spark.apache.org/docs/</a>), 
+edit the Markdown source files in Spark's 
+<a href="https://github.com/apache/spark/tree/master/docs";>`docs/`</a> 
directory, 
+whose `README` file shows how to build the documentation locally to test your 
changes.
+The process to propose a doc change is otherwise the same as the process for 
proposing code 
+changes below. 
+
+To propose a change to the rest of the documentation (that is, docs that do 
_not_ appear under 
+<a href="https://spark.apache.org/docs/";>https://spark.apache.org/docs/</a>), 
similarly, edit the Markdown in the 
+<a href="https://github.com/apache/spark-website";>spark-website repository</a> 
and open a pull request.
+
+<h2>Contributing User Libraries to Spark</h2>
+
+Just as Java and Scala applications can access a huge selection of libraries 
and utilities, 
+none of which are part of Java or Scala themselves, Spark aims to support a 
rich ecosystem of 
+libraries. Many new useful utilities or features belong outside of Spark 
rather than in the core. 
+For example: language support probably has to be a part of core Spark, but 
useful machine 
+learning algorithms can happily exist outside of MLlib.
+
+To that end, large and independent new functionality is often rejected for 
inclusion in Spark 
+itself, but can and should be hosted as a separate project and repository, 
and included in 
+the <a href="http://spark-packages.org/";>spark-packages.org</a> collection.
+
+<h2>Contributing Bug Reports</h2>
+
+Ideally, bug reports are accompanied by a proposed code change to fix the bug. 
This isn't 
+always possible, as those who discover a bug may not have the experience to 
fix it. A bug 
+may be reported by creating a JIRA but without creating a pull request (see 
below).
+
+Bug reports are only useful, however, if they include enough information to 
understand, isolate 
+and ideally reproduce the bug. Simply encountering an error does not mean a 
bug should be 
+reported; as described below, first search JIRA, and search and inquire on the 
Spark user / dev mailing lists. 
+Unreproducible bugs, or simple error reports, may be closed.
+
+It is possible to propose new features as well. These are generally not 
helpful unless 
+accompanied by detail, such as a design document and/or code change. Large new 
contributions 
+should consider <a href="http://spark-packages.org/";>spark-packages.org</a> 
first (see above), 
+or be discussed on the mailing 
+list first. Feature requests may be rejected, or closed after a long period of 
inactivity.
+
+<h2>Contributing to JIRA Maintenance</h2>
+
+Given the sheer volume of issues raised in the Apache Spark JIRA, inevitably 
some issues are 
+duplicates, or become obsolete and eventually fixed otherwise, or can't be 
reproduced, or could 
+benefit from more detail, and so on. It's useful to help identify these issues 
and resolve them, 
+either by advancing the discussion or even resolving the JIRA. Most 
contributors are able to 
+directly resolve JIRAs. Use judgment in determining whether you are quite 
confident the issue 
+should be resolved, although changes can be easily undone. If in doubt, just 
leave a comment 
+on the JIRA.
+
+When resolving JIRAs, observe a few useful conventions:
+
+- Resolve as **Fixed** if there's a change you can point to that resolved the 
issue
+  - Set Fix Version(s), if and only if the resolution is Fixed
+  - Set Assignee to the person who most contributed to the resolution, which 
is usually the person 
+  who opened the PR that resolved the issue.
+  - In case several people contributed, prefer to assign to the more 'junior', 
non-committer contributor
+- For issues that can't be reproduced against master as reported, resolve as 
**Cannot Reproduce**
+  - Fixed is reasonable too, if it's clear what other previous pull request 
resolved it. Link to it.
+- If the issue is the same as or a subset of another issue, resolve as 
**Duplicate**
+  - Make sure to link to the JIRA it duplicates
+  - Prefer to resolve the issue that has less activity or discussion as the 
duplicate
+- If the issue seems clearly obsolete and applies to issues or components that 
have changed 
+radically since it was opened, resolve as **Not a Problem**
+- If the issue doesn't make sense – not actionable, for example, a non-Spark 
issue, resolve 
+as **Invalid**
+- If it's a coherent issue, but there is a clear indication that there is no 
support or interest 
+in acting on it, then resolve as **Won't Fix**
+- Umbrellas are frequently marked **Done** if they are just container issues 
that don't correspond 
+to an actionable change of their own
+
+<h2>Preparing to Contribute Code Changes</h2>
+
+<h3>Choosing What to Contribute</h3>
+
+Spark is an exceptionally busy project, with a new JIRA or pull request every 
few hours on average. 
+Review can take hours or days of committer time. Everyone benefits if 
contributors focus on 
+changes that are useful, clear, easy to evaluate, and already pass basic 
checks.
+
+Sometimes, a contributor will already have a particular new change or bug in 
mind. If seeking 
+ideas, consult the list of starter tasks in JIRA, or ask the 
`u...@spark.apache.org` mailing list.
+
+Before proceeding, contributors should evaluate if the proposed change is 
likely to be relevant, 
+new and actionable:
+
+- Is it clear that code must change? Proposing a JIRA and pull request is 
appropriate only when a 
+clear problem or change has been identified. If you are simply having trouble 
using Spark, use the mailing 
+lists first, rather than filing a JIRA or proposing a change. When in 
doubt, email 
+`u...@spark.apache.org` first about the possible change
+- Search the `u...@spark.apache.org` and `d...@spark.apache.org` mailing list 
+<a href="{{site.baseurl}}/community.html#mailing-lists">archives</a> for 
+related discussions. Use <a 
href="http://search-hadoop.com/?q=&fc_project=Spark";>search-hadoop.com</a> 
+or similar search tools. 
+Often, the problem has been discussed before, with a resolution that doesn't 
require a code 
+change, or recording what kinds of changes will not be accepted as a 
resolution.
+- Search JIRA for existing issues: 
+<a 
href="https://issues.apache.org/jira/browse/SPARK";>https://issues.apache.org/jira/browse/SPARK</a>
 
+- Type `spark [search terms]` in the top-right search box. If a logically 
similar issue already 
+exists, then contribute to the discussion on the existing JIRA and pull 
request first, instead of 
+creating a new one.
+- Is the scope of the change matched to the contributor's level of experience? 
Anyone is qualified 
+to suggest a typo fix, but refactoring core scheduling logic requires much 
more understanding of 
+Spark. Some changes require building up experience first (see above).
+
+<h3>MLlib-specific Contribution Guidelines</h3>
+
+While a rich set of algorithms is an important goal for MLlib, scaling the 
project requires 
+that maintainability, consistency, and code quality come first. New algorithms 
should:
+
+- Be widely known
+- Be used and accepted (academic citations and concrete use cases can help 
justify this)
+- Be highly scalable
+- Be well documented
+- Have APIs consistent with other algorithms in MLlib that accomplish the same 
thing
+- Come with a reasonable expectation of developer support.
+- Have `@Since` annotations on public classes, methods, and variables (see the 
sketch below).
+
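+For illustration, here is a minimal sketch of `@Since` placement on a hypothetical 
+class (the class and method are invented for this example; the annotation itself is 
+Spark's `org.apache.spark.annotation.Since`):
+
+```scala
+import org.apache.spark.annotation.Since
+
+/** A hypothetical transformer, shown only to illustrate where `@Since` goes. */
+@Since("2.1.0")
+class ExampleScaler @Since("2.1.0") (@Since("2.1.0") val factor: Double) {
+
+  /** Scales the input value by `factor`. */
+  @Since("2.1.0")
+  def transform(value: Double): Double = value * factor
+}
+```
+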
+<h3>Code Review Criteria</h3>
+
+Before considering how to contribute code, it's useful to understand how code 
is reviewed, 
+and why changes may be rejected. Simply put, changes that have many or large 
positives, and 
+few negative effects or risks, are much more likely to be merged, and merged 
quickly. 
+Risky and less valuable changes are very unlikely to be merged, and may be 
rejected outright 
+rather than receive iterations of review.
+
+<h4>Positives</h4>
+
+- Fixes the root cause of a bug in existing functionality
+- Adds functionality or fixes a problem needed by a large number of users
+- Simple, targeted
+- Maintains or improves consistency across Python, Java, Scala
+- Easily tested; has tests
+- Reduces complexity and lines of code
+- Change has already been discussed and is known to committers
+
+<h4>Negatives, Risks</h4>
+
+- Band-aids a symptom of a bug only
+- Introduces complex new functionality, especially an API that needs to be 
supported
+- Adds complexity that only helps a niche use case
+- Adds user-space functionality that does not need to be maintained in Spark, 
but could be hosted 
+externally and indexed by <a 
href="http://spark-packages.org/";>spark-packages.org</a> 
+- Changes a public API or semantics (rarely allowed)
+- Adds large dependencies
+- Changes versions of existing dependencies
+- Adds a large amount of code
+- Makes lots of modifications in one "big bang" change
+
+<h2>Contributing Code Changes</h2>
+
+Please review the preceding section before proposing a code change. This 
section documents how to do so.
+
+**When you contribute code, you affirm that the contribution is your original 
work and that you 
+license the work to the project under the project's open source license. 
Whether or not you state 
+this explicitly, by submitting any copyrighted material via pull request, 
email, or other means 
+you agree to license the material under the project's open source license and 
warrant that you 
+have the legal authority to do so.**
+
+<h3>JIRA</h3>
+
+Generally, Spark uses JIRA to track logical issues, including bugs and 
improvements, and uses 
+Github pull requests to manage the review and merge of specific code changes. 
That is, JIRAs are 
+used to describe _what_ should be fixed or changed, and high-level approaches, 
and pull requests 
+describe _how_ to implement that change in the project's source code. For 
example, major design 
+decisions are discussed in JIRA.
+
+1. Find the existing Spark JIRA that the change pertains to.
+    1. Do not create a new JIRA if creating a change to address an existing 
issue in JIRA; add to 
+    the existing discussion and work instead
+    1. Look for existing pull requests that are linked from the JIRA, to 
understand if someone is 
+    already working on the JIRA
+1. If the change is new, then it usually needs a new JIRA. However, trivial 
changes, where 
+_what_ should change is virtually the same as _how_ it should change, do not 
require a JIRA. 
+Example: `Fix typos in Foo scaladoc`
+1. If required, create a new JIRA:
+    1. Provide a descriptive Title. "Update web UI" or "Problem in scheduler" 
is not sufficient.
+    "Kafka Streaming support fails to handle empty queue in YARN cluster mode" 
is good.
+    1. Write a detailed Description. For bug reports, this should ideally 
include a short 
+    reproduction of the problem. For new features, it may include a design 
document.
+    1. Set required fields:
+        1. **Issue Type**. Generally, Bug, Improvement and New Feature are the 
only types used in Spark.
+        1. **Priority**. Set to Major or below; higher priorities are 
generally reserved for 
+        committers to set. JIRA tends to unfortunately conflate "size" and 
"importance" in its 
+        Priority field values. Their meaning is roughly:
+             1. Blocker: pointless to release without this change as the 
release would be unusable 
+             to a large minority of users
+             1. Critical: a large minority of users are missing important 
functionality without 
+             this, and/or a workaround is difficult
+             1. Major: a small minority of users are missing important 
functionality without this, 
+             and there is a workaround
+             1. Minor: a niche use case is missing some support, but it does 
not affect usage or 
+             is easily worked around
+             1. Trivial: a nice-to-have change but unlikely to be any problem 
in practice otherwise 
+        1. **Component**
+        1. **Affects Version**. For Bugs, assign at least one version that is 
known to exhibit the 
+        problem or need the change
+    1. Do not set the following fields:
+        1. **Fix Version**. This is assigned by committers only when resolved.
+        1. **Target Version**. This is assigned by committers to indicate that 
a PR has been accepted for 
+        possible inclusion in the target version.
+    1. Do not include a patch file; pull requests are used to propose the 
actual change.
+1. If the change is a large change, consider inviting discussion on the issue 
at 
+`d...@spark.apache.org` first before proceeding to implement the change.
+
+<h3>Pull Request</h3>
+
+1. <a href="https://help.github.com/articles/fork-a-repo/";>Fork</a> the Github 
repository at 
+<a href="http://github.com/apache/spark";>http://github.com/apache/spark</a> if 
you haven't already
+1. Clone your fork, create a new branch, push commits to the branch.
+1. Consider whether documentation or tests need to be added or updated as part 
of the change, 
+and add them as needed.
+1. Run all tests with `./dev/run-tests` to verify that the code still 
compiles, passes tests, and 
+passes style checks. If style checks fail, review the Code Style Guide below.
+1. <a href="https://help.github.com/articles/using-pull-requests/";>Open a pull 
request</a> against 
+the `master` branch of `apache/spark`. (Only in special cases would the PR be 
opened against other branches.)
+     1. The PR title should be of the form `[SPARK-xxxx][COMPONENT] Title`, 
where `SPARK-xxxx` is 
+     the relevant JIRA number, `COMPONENT` is one of the PR categories shown 
at 
+     <a href="https://spark-prs.appspot.com/">spark-prs.appspot.com</a> and 
+     Title may be the JIRA's title or a more specific title describing the PR 
itself.
+     1. If the pull request is still a work in progress, and so is not ready 
to be merged, 
+     but needs to be pushed to Github to facilitate review, then add `[WIP]` 
after the component.
+     1. Consider identifying committers or other contributors who have worked 
on the code being 
+     changed. Find the file(s) in Github and click "Blame" to see a 
line-by-line annotation of 
+     who changed the code last. You can add `@username` in the PR description 
to ping them 
+     immediately.
+     1. Please state that the contribution is your original work and that you 
license the work 
+     to the project under the project's open source license.
+1. The related JIRA, if any, will be marked as "In Progress" and your pull 
request will 
+automatically be linked to it. There is no need to be the Assignee of the JIRA 
to work on it, 
+though you are welcome to comment that you have begun work.
+1. The Jenkins automatic pull request builder will test your changes
+     1. If it is your first contribution, Jenkins will wait for confirmation 
before building 
+     your code and post "Can one of the admins verify this patch?"
+     1. A committer can authorize testing with a comment like "ok to test"
+     1. A committer can automatically allow future pull requests from a 
contributor to be 
+     tested with a comment like "Jenkins, add to whitelist"
+1. After about 2 hours, Jenkins will post the results of the test to the pull 
request, along 
+with a link to the full results on Jenkins.
+1. Watch for the results, and investigate and fix failures promptly
+     1. Fixes can simply be pushed to the same branch from which you opened 
your pull request
+     1. Jenkins will automatically re-test when new commits are pushed
+     1. If the tests failed for reasons unrelated to the change (e.g. Jenkins 
outage), then a 
+     committer can request a re-test with "Jenkins, retest this please". 
+     Ask if you need a test restarted.
+
+<h3>The Review Process</h3>
+
+- Other reviewers, including committers, may comment on the changes and 
suggest modifications. 
+Changes can be added by simply pushing more commits to the same branch.
+- Lively, polite, rapid technical debate is encouraged from everyone in the 
community. The outcome 
+may be a rejection of the entire change.
+- Reviewers can indicate that a change looks suitable for merging with a 
comment such as: "I think 
+this patch looks good". Spark uses the LGTM convention for indicating the 
strongest level of 
+technical sign-off on a patch: simply comment with the word "LGTM". It 
specifically means: "I've 
+looked at this thoroughly and take as much ownership as if I wrote the patch 
myself". If you 
+comment LGTM you will be expected to help with bugs or follow-up issues on the 
patch. Consistent, 
+judicious use of LGTMs is a great way to gain credibility as a reviewer with 
the broader community.
+- Sometimes, other changes will be merged which conflict with your pull 
request's changes. The 
+PR can't be merged until the conflict is resolved. This can be resolved with 
`git fetch origin` 
+followed by `git merge origin/master` and resolving the conflicts by hand, 
then pushing the result 
+to your branch.
+- Try to be responsive to the discussion rather than let days pass between 
replies
+
+<h3>Closing Your Pull Request / JIRA</h3>
+
+- If a change is accepted, it will be merged and the pull request will 
automatically be closed, 
+along with the associated JIRA if any
+  - Note that in the rare case you are asked to open a pull request against a 
branch besides 
+  `master`, you will actually have to close the pull request manually
+  - The JIRA will be Assigned to the primary contributor to the change as a 
way of giving credit. 
+  If the JIRA isn't closed and/or Assigned promptly, comment on the JIRA.
+- If your pull request is ultimately rejected, please close it promptly
+  - ... because committers can't close PRs directly
+  - Pull requests will be automatically closed by an automated process at 
Apache after about a 
+  week if a committer has made a comment like "mind closing this PR?" This 
means that the 
+  committer is specifically requesting that it be closed.
+- If a pull request has gotten little or no attention, consider improving the 
description or 
+the change itself and ping likely reviewers again after a few days. Consider 
proposing a 
+change that's easier to include, like a smaller and/or less invasive change.
+- If it has been reviewed but not taken up after weeks, even after soliciting 
review from the 
+most relevant reviewers, or has met with neutral reactions, the outcome may 
be considered a 
+"soft no". It is helpful to withdraw and close the PR in this case.
+- If a pull request is closed because it is deemed not the right approach to 
resolve a JIRA, 
+then leave the JIRA open. However, if the review makes it clear that the issue 
identified in 
+the JIRA is not going to be resolved by any pull request (not a problem, won't 
fix) then also 
+resolve the JIRA.
+
+<a name="code-style-guide"></a>
+<h2>Code Style Guide</h2>
+
+Please follow the style of the existing codebase.
+
+- For Python code, Apache Spark follows 
+<a href="http://legacy.python.org/dev/peps/pep-0008/";>PEP 8</a> with one 
exception: 
+lines can be up to 100 characters in length, not 79.
+- For Java code, Apache Spark follows 
+<a 
href="http://www.oracle.com/technetwork/java/codeconvtoc-136057.html";>Oracle's 
Java code conventions</a>. 
+Many Scala guidelines below also apply to Java.
+- For Scala code, Apache Spark follows the official 
+<a href="http://docs.scala-lang.org/style/";>Scala style guide</a>, but with 
the following changes, below.
+
+<h3>Line Length</h3>
+
+Limit lines to 100 characters. The only exceptions are import statements 
(although even for 
+those, try to keep them under 100 chars).
+
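+For illustration, a long expression can be wrapped across several lines instead of 
+exceeding the limit (a minimal, self-contained sketch):
+
+```scala
+// Correct: wrap a long chain of calls, one call per line,
+// rather than writing it as a single 100+ character line
+val frequentWords = List("spark", "hadoop", "spark", "yarn", "spark")
+  .groupBy(identity)
+  .map { case (word, occurrences) => (word, occurrences.size) }
+  .filter { case (_, count) => count > 1 }
+```
+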
+<h3>Indentation</h3>
+
+Use 2-space indentation in general. For function declarations, use 4-space 
indentation for the 
+parameters when they don't fit on a single line. For example:
+
+```scala
+// Correct:
+if (true) {
+  println("Wow!")
+}
+ 
+// Wrong:
+if (true) {
+    println("Wow!")
+}
+ 
+// Correct:
+def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
+    path: String,
+    fClass: Class[F],
+    kClass: Class[K],
+    vClass: Class[V],
+    conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
+  // function body
+}
+ 
+// Wrong
+def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
+  path: String,
+  fClass: Class[F],
+  kClass: Class[K],
+  vClass: Class[V],
+  conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
+  // function body
+}
+```
+
+<h3>Code documentation style</h3>
+
+For Scaladoc / Javadoc comments before classes, objects and methods, use the 
Javadoc style 
+instead of the Scaladoc style.
+
+```scala
+/** This is a correct one-liner, short description. */
+ 
+/**
+ * This is a correct multi-line JavaDoc comment. And
+ * this is my second line, and if I keep typing, this would be
+ * my third line.
+ */
+ 
+/** In Spark, we don't use the ScalaDoc style so this
+  * is not correct.
+  */
+```
+ 
+For inline comments within the code, use `//` and not `/* .. */`.
+
+```scala
+// This is a short, single line comment
+ 
+// This is a multi line comment.
+// Bla bla bla
+ 
+/*
+ * Do not use this style for multi line comments. This
+ * style of comment interferes with commenting out
+ * blocks of code, and also makes code comments harder
+ * to distinguish from Scala doc / Java doc comments.
+ */
+ 
+/**
+ * Do not use scala doc style for inline comments.
+ */
+```
+
+<h3>Imports</h3>
+
+Always import packages using absolute paths (e.g. `scala.util.Random`) instead 
of relative ones 
+(e.g. `util.Random`). In addition, sort imports in the following order 
+(use alphabetical order within each group):
+- `java.*` and `javax.*`
+- `scala.*`
+- Third-party libraries (`org.*`, `com.*`, etc)
+- Project classes (`org.apache.spark.*`)
+
+The <a href="https://plugins.jetbrains.com/plugin/7350";>IntelliJ import 
organizer plugin</a> 
+can organize imports for you. Use this configuration for the plugin 
(configured under 
+Preferences / Editor / Code Style / Scala Imports Organizer):
+
+```scala
+import java.*
+import javax.*
+ 
+import scala.*
+ 
+import *
+ 
+import org.apache.spark.*
+```
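+
+Put concretely, a correctly grouped and sorted import block might look like this 
+(the library classes are chosen only for illustration):
+
+```scala
+import java.io.File
+import javax.annotation.Nullable
+ 
+import scala.collection.mutable
+import scala.util.Random
+ 
+import com.google.common.io.Files
+import org.slf4j.LoggerFactory
+ 
+import org.apache.spark.SparkConf
+import org.apache.spark.rdd.RDD
+```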
+
+<h3>Infix Methods</h3>
+
+Don't use infix notation for methods that aren't operators. For example, 
instead of 
+`list map func`, use `list.map(func)`, or instead of `string contains "foo"`, 
use 
+`string.contains("foo")`. This is to improve familiarity to developers coming 
from other languages.
+
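+For illustration, the same calls in both notations (a minimal sketch):
+
+```scala
+// Correct:
+val lengths = Seq("spark", "graphx").map(s => s.length)
+val found = "Apache Spark".contains("Spark")
+ 
+// Wrong:
+val lengths = Seq("spark", "graphx") map (s => s.length)
+val found = "Apache Spark" contains "Spark"
+```
+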
+<h3>Curly Braces</h3>
+
+Put curly braces even around one-line `if`, `else` or loop statements. The 
only exception is if 
+you are using `if/else` as a one-line ternary operator.
+
+```scala
+// Correct:
+if (true) {
+  println("Wow!")
+}
+ 
+// Correct:
+if (true) statement1 else statement2
+ 
+// Wrong:
+if (true)
+  println("Wow!")
+```
+
+<h3>Return Types</h3>
+
+Always specify the return types of methods where possible. If a method returns no 
value, specify 
+`Unit` explicitly, in accordance with the Scala style guide. Explicit types for 
variables are not 
+required unless the definition involves huge code blocks with potentially 
ambiguous values.
+
+```scala
+// Correct:
+def getSize(partitionId: String): Long = { ... }
+def compute(partitionId: String): Unit = { ... }
+ 
+// Wrong:
+def getSize(partitionId: String) = { ... }
+def compute(partitionId: String) = { ... }
+def compute(partitionId: String) { ... }
+ 
+// Correct:
+val name = "black-sheep"
+val path: Option[String] =
+  try {
+    Option(names)
+      .map { ns => ns.split(",") }
+      .flatMap { ns => ns.filter(_.nonEmpty).headOption }
+      .map { n => "prefix" + n + "suffix" }
+      .flatMap { n => if (n.hashCode % 3 == 0) Some(n + n) else None }
+  } catch {
+    case e: SomeSpecialException =>
+      computePath(names)
+  }
+```
+
+<h3>If in Doubt</h3>
+
+If you're not sure about the right style for something, try to follow the 
style of the existing 
+codebase. Look at whether there are other examples in the code that use your 
feature. Feel free 
+to ask on the `d...@spark.apache.org` list as well.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/documentation.md
----------------------------------------------------------------------
diff --git a/documentation.md b/documentation.md
index 0fa10c2..465f432 100644
--- a/documentation.md
+++ b/documentation.md
@@ -178,7 +178,7 @@ Slides, videos and EC2-based exercises from each of these 
are available online:
 
 <ul><li>
 The <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage";>Spark 
wiki</a> contains
-information for developers, such as architecture documents and how to <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>contribute</a>
 to Spark.
+information for developers, such as architecture documents and how to <a 
href="{{site.baseurl}}/contributing.html">">contribute</a> to Spark.
 </li></ul>
 
 <h3>Research Papers</h3>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/faq.md
----------------------------------------------------------------------
diff --git a/faq.md b/faq.md
index 8d048aa..7b2fa15 100644
--- a/faq.md
+++ b/faq.md
@@ -15,7 +15,7 @@ Spark is a fast and general processing engine compatible with 
Hadoop data. It ca
 
 <p class="question">Who is using Spark in production?</p>
 
-<p class="answer">As of 2016, surveys show that more than 1000 organizations 
are using Spark in production. Some of them are listed on the <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By page</a> and at the <a href="http://spark-summit.org";>Spark Summit</a>.</p>
+<p class="answer">As of 2016, surveys show that more than 1000 organizations 
are using Spark in production. Some of them are listed on the <a 
href="{{site.baseurl}}/powered-by.html">Powered By page</a> and at the <a 
href="http://spark-summit.org";>Spark Summit</a>.</p>
 
 
 <p class="question">How large a cluster can Spark scale to?</p>
@@ -67,7 +67,7 @@ Please also refer to our
 
 <p class="question">How can I contribute to Spark?</p>
 
-<p class="answer">See the <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>Contributing
 to Spark wiki</a> for more information.</p>
+<p class="answer">See the <a 
href="{{site.baseurl}}/contributing.html">Contributing to Spark wiki</a> for 
more information.</p>
 
 <p class="question">Where can I get more help?</p>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/graphx/index.md
----------------------------------------------------------------------
diff --git a/graphx/index.md b/graphx/index.md
index a3aa8d2..dd283ef 100644
--- a/graphx/index.md
+++ b/graphx/index.md
@@ -87,7 +87,7 @@ subproject: GraphX
     </p>
     <p>
       GraphX is in the alpha stage and welcomes contributions. If you'd like 
to submit a change to GraphX,
-      read <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>how
 to
+      read <a href="{{site.baseurl}}/contributing.html">how to
       contribute to Spark</a> and send us a patch!
     </p>
   </div>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/index.md
----------------------------------------------------------------------
diff --git a/index.md b/index.md
index 14185d2..b20a4b0 100644
--- a/index.md
+++ b/index.md
@@ -130,9 +130,7 @@ navigation:
     <p>
       Spark is used at a wide range of organizations to process large datasets.
       You can find example use cases at the <a 
href="http://spark-summit.org/summit-2013/";>Spark Summit</a>
-      conference, or on the
-      <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>Powered
 By</a>
-      page.
+      conference, or on the <a href="{{site.baseurl}}/powered-by.html">Powered 
By</a> page.
     </p>
 
     <p>
@@ -156,14 +154,13 @@ navigation:
 
     <p>
       The project's
-      <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Committers";>committers</a>
+      <a href="{{site.baseurl}}/committers.html">committers</a>
       come from 19 organizations.
     </p>
 
     <p>
       If you'd like to participate in Spark, or contribute to the libraries on 
top of it, learn
-      <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>how
 to
-        contribute</a>.
+      <a href="{{site.baseurl}}/contributing.html">how to contribute</a>.
     </p>
   </div>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/mllib/index.md
----------------------------------------------------------------------
diff --git a/mllib/index.md b/mllib/index.md
index 61e65a8..9c43750 100644
--- a/mllib/index.md
+++ b/mllib/index.md
@@ -114,7 +114,7 @@ subproject: MLlib
     </p>
     <p>
       MLlib is still a rapidly growing project and welcomes contributions. If 
you'd like to submit an algorithm to MLlib,
-      read <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark";>how
 to
+      read <a href="{{site.baseurl}}/contributing.html">how to
       contribute to Spark</a> and send us a patch!
     </p>
   </div>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
----------------------------------------------------------------------
diff --git a/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md 
b/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
index 8a06597..542610a 100644
--- a/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
+++ b/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
@@ -13,6 +13,6 @@ meta:
 ---
 As we continue developing Spark, we would love to get feedback from users and 
hear what you'd like us to work on next. We've decided that a good way to do 
that is a survey -- we hope to run this at regular intervals. If you have a few 
minutes to participate, <a 
href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform";>fill
 in the survey here</a>. Your time is greatly appreciated.
 
-In parallel, we are starting a <a 
href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark";>"powered
 by" page</a> on the Apache Spark wiki for organizations that are using, or 
contributing to, Spark. Sign up if you'd like to support the project! This is a 
great way to let the world know you're using Spark, and can also be helpful to 
generate leads for recruiting. You can also add yourself when you fill the 
survey.
+In parallel, we are starting a <a 
href="{{site.baseurl}}/powered-by.html">"powered by" page</a> on the Apache 
Spark website for organizations that are using, or contributing to, Spark. Sign up 
if you'd like to support the project! This is a great way to let the world know 
you're using Spark, and can also be helpful to generate leads for recruiting. 
You can also add yourself when you fill the survey.
 
 Thanks for taking the time to give feedback.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/powered-by.md
----------------------------------------------------------------------
diff --git a/powered-by.md b/powered-by.md
new file mode 100644
index 0000000..5ecfafb
--- /dev/null
+++ b/powered-by.md
@@ -0,0 +1,239 @@
+---
+layout: global
+title: Powered By Spark
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+
+<h2>Project and Product names using "Spark"</h2>
+
+Organizations creating products and projects for use with Apache Spark, along 
with associated 
+marketing materials, should take care to respect the trademark in "Apache 
Spark" and its logo. 
+Please refer to <a href="http://www.apache.org/foundation/marks/";>ASF 
Trademarks Guidance</a> and 
+associated <a href="http://www.apache.org/foundation/marks/faq/";>FAQ</a> 
+for comprehensive and authoritative guidance on proper usage of ASF trademarks.
+
+Names that do not include "Spark" at all have no potential trademark issue 
with the Spark project. 
+This is recommended.
+
+Names like "Spark BigCoProduct" are not OK, as are names including "Spark" in 
general. 
+The above links, however, describe some exceptions, like for names such as 
"BigCoProduct, 
+powered by Apache Spark" or "BigCoProduct for Apache Spark".
+
+It is common practice to create software identifiers (Maven coordinates, 
module names, etc.) 
+like "spark-foo". These are permitted. Nominative use of trademarks in 
descriptions is also 
+always allowed, as in "BigCoProduct is a widget for Apache Spark".
+
+<h2>Companies and Organizations</h2>
+
+To add yourself to the list, please email `d...@spark.apache.org` with your organization name, URL, 
+a list of which Spark components you are using, and a short description of your use case.
+
+- <a href="http://amplab.cs.berkeley.edu">UC Berkeley AMPLab</a> - Big data research lab that 
+initially launched Spark
+  - We're building a variety of open source projects on Spark
+  - We have both graduate students and a team of professional software engineers working on the stack
+- <a href="http://4quant.com">4Quant</a>
+- <a href="http://www.actnowib.com">Act Now</a>
+  - Spark powers NOW APPS, a big data, real-time, predictive analytics platform. We use Spark SQL, 
+  MLlib and GraphX components for both batch ETL and analytics applied to telecommunication data, 
+  providing faster and more meaningful insights and actionable data to the operators.
+- <a href="http://adatao.com">Adatao, Inc.</a> - Data Intelligence for All
+  - Visual, Real-Time, Predictive Analytics on Spark+Hadoop, with built-in support for R, Python, 
+  SQL, and Natural Language.
+  - Team of ex-Googlers and Yahoos with large-scale infrastructure experience 
+  (including both flavors of MapReduce at Google and Yahoo) and PhDs in ML/Data Mining
+  - Determined that Spark, among the many alternatives, answered the right problem statements with 
+  the right design
+- <a href="http://www.agilelab.it">Agile Lab</a>
+  - Enhancing big data. 360 customer view, log analysis, BI
+- <a href="http://www.taobao.com/">Alibaba Taobao</a>
+  - We built one of the world's first Spark on YARN production clusters.
+  - See our blog posts (in Chinese) about Spark at Taobao: 
+  <a href="http://rdc.taobao.org/?tag=spark">http://rdc.taobao.org/?tag=spark</a>
+- <a href="http://alpinenow.com/">Alpine Data Labs</a>
+- <a href="http://amazon.com">Amazon</a>
+- <a href="http://www.amrita.edu/cyber/">Amrita Center for Cyber Security Systems and Networks</a>
+- <a href="http://www.art.com/">Art.com</a>
+  - Trending analytics and personalization
+- <a href="http://www.asiainfo.com">AsiaInfo</a>
+  - We are using Spark Core, Streaming, MLlib and GraphX. We leverage Spark and the Hadoop ecosystem 
+  to build cost-effective data center solutions for our customers in the telco industry as well as 
+  other industrial sectors.
+- <a href="http://www.atigeo.com">Atigeo</a> – integrated Spark in xPatterns, our big data 
+analytics platform, as a replacement for Hadoop MR
+- <a href="https://atp.io">atp</a>
+  - Predictive models and learning algorithms to improve the relevance of programmatic marketing.
+  - Components used: Spark SQL, MLlib.
+- <a href="http://www.autodesk.com">Autodesk</a>
+- <a href="http://www.baidu.com">Baidu</a>
+- <a href="http://www.bakdata.com/">Bakdata</a> – using Spark (and Shark) to perform interactive 
+exploration of large datasets
+- <a href="http://www.bigindustries.be/">Big Industries</a> - using Spark Streaming: The 
+Big Content Platform is a business-to-business content asset management service providing a 
+searchable, aggregated source of live news feeds, public domain media and archives of content.
+- <a href="http://www.bizo.com">Bizo</a>
+  - Check out our talk on <a href="http://www.meetup.com/spark-users/events/139804022/">Spark at Bizo</a> 
+  at the Spark user meetup
+- <a href="http://www.celtra.com">Celtra</a>
+- <a href="http://www.clearstorydata.com">ClearStory Data</a> – ClearStory's platform and 
+integrated Data Intelligence application leverages Spark to speed analysis across internal 
+and external data sources, driving holistic and actionable insights.
+- <a href="https://www.concur.com">Concur</a>
+  - Spark SQL, MLlib
+  - Using Spark for travel and expense analytics and personalization
+- <a href="http://www.contentsquare.com">Content Square</a>
+  - We use Spark to regularly read raw data, convert them into Parquet, and process them to 
+  create advanced analytics dashboards: aggregation, sampling, statistics computations, 
+  anomaly detection, machine learning.
+- <a href="http://www.conviva.com">Conviva</a> – Experience Live
+  - See our talk at <a href="http://ampcamp.berkeley.edu/3/">AmpCamp</a> on how we are 
+  <a href="http://www.youtube.com/watch?feature=player_detailpage&v=YaayAatdRNs">using Spark to 
+  provide real time video optimization</a>
+- <a href="https://www.creditkarma.com/">Credit Karma</a>
+  - We create personalized experiences using Spark.
+- <a href="http://databricks.com">Databricks</a>
+  - Formed by the creators of Apache Spark and Shark, Databricks is working to greatly expand these 
+  open source projects and transform big data analysis in the process. We're deeply committed to 
+  keeping all work on these systems open source.
+  - We provide a hosted service to run Spark, 
+  <a href="http://www.databricks.com/cloud">Databricks Cloud</a>, and partner to 
+  <a href="http://databricks.com/support/">support Apache Spark</a> with other Hadoop and big 
+  data companies.
+- <a href="http://dianping.com">Dianping.com</a>
+- <a href="http://www.digby.com">Digby</a>
+- <a href="http://www.drawbrid.ge/">Drawbridge</a>
+- <a href="http://www.ebay.com/">eBay Inc.</a>
+  - Using Spark Core for log transaction aggregation and analytics
+- <a href="http://labs.elsevier.com">Elsevier Labs</a>
+  - Use Case: Building Machine Reading Pipeline, Knowledge Graphs, Content as a Service, Content 
+  and Event Analytics, Content/Event based Predictive Models and Big Data Processing.
+  - We use Scala and Python over Databricks Notebooks for most of our work.
+- <a href="http://www.eurecom.fr/en">EURECOM</a>
+- <a href="http://www.exabeam.com">Exabeam</a>
+- <a href="http://www.faimdata.com/">Faimdata</a>
+  - Build eCommerce and data intelligence solutions for the retail industry on top of 
+  Spark/Shark/Spark Streaming
+- <a href="http://falkonry.com">Falkonry</a>
+- <a href="http://www.flytxt.com">Flytxt</a>
+  - Big Data analytics for subscriber profiling and personalization in the telecommunications domain. 
+  We are using Spark Core and MLlib.
+- <a href="http://www.jeremyfreeman.net">Freeman Lab at HHMI</a>
+  - We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain 
+  activity in real time
+- <a href="http://www.fundacionctic.org">Fundacion CTIC</a>
+- <a href="http://graphflow.com">GraphFlow, Inc.</a>
+- <a href="http://www.groupon.com/app/subscriptions/new_zip?division_p=san-francisco">Groupon</a>
+- <a href="http://www.guavus.com/">Guavus</a>
+  - Stream processing of network machine data
+- <a href="http://www.hitachi-solutions.com/">Hitachi Solutions</a>
+- <a href="http://hivedata.com/">The Hive</a>
+- <a href="http://www.research.ibm.com/labs/almaden/index.shtml">IBM Almaden</a>
+- <a href="http://www.infoobjects.com">InfoObjects</a>
+  - Award-winning Big Data consulting company with a focus on Spark and Hadoop
+- <a href="http://en.inspur.com">Inspur</a>
+- <a href="http://www.sehir.edu.tr/en/">Istanbul Sehir University</a>
+- <a href="http://www.kenshoo.com/">Kenshoo</a>
+  - Digital marketing solutions and predictive media optimization
+- <a href="http://www.kelkoo.co.uk">Kelkoo</a>
+  - Using Spark Core, SQL, and Streaming. Product recommendations, BI and analytics, 
+  real-time malicious activity filtering, and data mining.
+- <a href="http://www.knoldus.com">Knoldus Software LLC</a>
+- <a href="http://eng.localytics.com">Localytics</a>
+  - Batch, real-time, and predictive analytics driving our mobile app analytics and marketing 
+  automation product.
+  - Components used: Spark, Spark Streaming, MLlib.
+- <a href="http://magine.com">Magine TV</a>
+- <a href="http://mediacrossing.com">MediaCrossing</a> – Digital Media Trading Experts in the 
+New York and Boston areas
+  - We are using Spark as a drop-in replacement for Hadoop Map/Reduce to get the right answer 
+  to our queries in a much shorter amount of time.
+- <a href="http://www.myfitnesspal.com/">MyFitnessPal</a>
+  - Using Spark to clean up user-entered food data using both explicit and implicit user signals, 
+  with the final goal of identifying high-quality food items.
+  - Using Spark to build different recommendation systems for recipes and foods.
+- <a href="http://deepspace.jpl.nasa.gov/">NASA JPL - Deep Space Network</a>
+- <a href="http://www.163.com/">Netease</a>
+- <a href="http://www.nflabs.com">NFLabs</a>
+- <a href="http://nsn.com">Nokia Solutions and Networks</a>
+- <a href="http://www.nttdata.com/global/en/">NTT DATA</a>
+- <a href="http://www.nubetech.co">Nube Technologies</a>
+  - Nube provides solutions for data curation at scale, helping customer targeting, accurate 
+  inventory and efficient analysis.
+- <a href="http://ooyala.com">Ooyala, Inc.</a> – Powering personalized video experiences 
+across all screens
+  - See our blog post on how we use 
+  <a href="http://engineering.ooyala.com/blog/fast-spark-queries-memory-datasets">Spark for 
+  Fast Queries</a>
+  - See our presentation on 
+  <a href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Cassandra, Spark, 
+  and Shark</a>
+- <a href="http://www.opentable.com/">OpenTable</a>
+  - Using Apache Spark for log processing and ETL. The data obtained feeds the recommender 
+  system powered by Spark MLlib matrix factorization. We are evaluating the use of Spark 
+  Streaming for real-time analytics.
+- <a href="http://pantera.io">PanTera</a>
+  - PanTera is a tool for exploring large datasets. It uses Spark to create XY and geographic 
+  scatterplots from millions to billions of datapoints.
+  - Components we are using: Spark Core (Scala API), Spark SQL, and GraphX
+- <a href="http://www.peerialism.com">Peerialism</a>
+- <a href="http://www.planbmedia.com">PlanBMedia</a>
+- <a href="http://prediction.io/">PredictionIO</a> - PredictionIO currently offers two engine 
+templates for Apache Spark MLlib: recommendation (MLlib ALS) and classification (MLlib Naive 
+Bayes). With these templates, you can create a custom predictive engine for production deployment 
+efficiently.
+- <a href="http://premise.com">Premise</a>
+- <a href="http://www.quantifind.com">Quantifind</a>
+- <a href="http://radius.com">Radius Intelligence</a>
+  - Using Scala, Spark and MLlib for the Radius Marketing and Sales intelligence platform, including 
+  data aggregation, data processing, data clustering, data analysis and predictive modeling of all 
+  US businesses.
+- <a href="http://www.realimpactanalytics.com/">Real Impact Analytics</a>
+  - Building large scale analytics platforms for telecoms operators
+- <a href="http://rocketfuel.com/">RocketFuel</a>
+- <a href="http://www.rondhuit.com/">RONDHUIT</a>
+  - Machine Learning with Apache Mahout and Spark: 
+  <a href="http://www.rondhuit.com/services/training/mahout-ML.html">http://www.rondhuit.com/services/training/mahout-ML.html</a>
+- <a href="http://www.sailthru.com/">Sailthru</a>
+  - Uses Spark to build predictive models and recommendation systems for marketing automation 
+  and personalization.
+- <a href="http://www.sisa.samsung.com/">Samsung Research America</a>
+- <a href="http://www.shopify.com/">Shopify</a>
+- <a href="http://www.simba.com/">Simba Technologies</a>
+  - BI/reporting/ETL for Spark and beyond
+- <a href="http://www.sinnia.com">Sinnia</a>
+- <a href="http://www.sktelecom.com/en/main/index.do">SK Telecom</a>
+  - SK Telecom analyses mobile usage patterns of customers with Spark and Shark.
+- <a href="http://socialmetrix.com/">Socialmetrix</a>
+- <a href="http://www.sohu.com">Sohu</a>
+- <a href="http://www.stratio.com/">Stratio</a>
+  - Offers an open-source Big Data platform centered around Apache Spark.
+- <a href="https://www.taboola.com/">Taboola</a> – Powering 'Content You May Like' around the web
+- <a href="http://www.techbase.com.tr">Techbase</a>
+- <a href="http://tencent.com/">Tencent</a>
+- <a href="http://www.tetraconcepts.com/">Tetra Concepts</a>
+- <a href="http://www.trendmicro.com/us/index.html">TrendMicro</a>
+- <a href="http://engineering.tripadvisor.com/using-apache-spark-for-massively-parallel-nlp/">TripAdvisor</a>
+- <a href="http://truedash.io">truedash</a>
+  - Automatic pulling of all your data into Spark for enterprise visualisation, predictive 
+  analytics and data exploration at a low cost.
+- <a href="http://www.trueffect.com">TruEffect Inc</a>
+- <a href="http://www.tuplejump.com">Tuplejump</a>
+  - Software development partners for Apache Spark and Cassandra projects
+- <a href="http://www.ucsc.edu">UC Santa Cruz</a>
+- <a href="http://missouri.edu/">University of Missouri Data Analytics and Discovery Lab</a>
+- <a href="http://videoamp.com/">VideoAmp</a>
+  - Intelligent video ads for online and television viewing audiences.
+- <a href="http://www.vistarmedia.com">Vistar Media</a>
+  - Location technology company enabling brands to reach on-the-go consumers
+- <a href="http://www.yahoo.com">Yahoo!</a>
+- <a href="http://www.yandex.com">Yandex</a>
+  - Using Spark in 
+  <a href="http://www.searchenginejournal.com/yandex-islands-markup-issues-implementation/71891/">Yandex Islands</a>, 
+  to process islands identified by the search robot
+- <a href="http://www.zaloni.com/products/">Zaloni</a>
+  - Zaloni's data lake management platform (Bedrock) and self-service data preparation solution 
+  (Mica) leverage Spark for fast execution of transformations and data exploration.
+  
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/committers.html
----------------------------------------------------------------------
diff --git a/site/committers.html b/site/committers.html
new file mode 100644
index 0000000..bad4414
--- /dev/null
+++ b/site/committers.html
@@ -0,0 +1,518 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+  <title>
+     Committers | Apache Spark
+    
+  </title>
+
+  
+
+  
+
+  <!-- Bootstrap core CSS -->
+  <link href="/css/cerulean.min.css" rel="stylesheet">
+  <link href="/css/custom.css" rel="stylesheet">
+
+  <!-- Code highlighter CSS -->
+  <link href="/css/pygments-default.css" rel="stylesheet">
+
+  <script type="text/javascript">
+  <!-- Google Analytics initialization -->
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', 'UA-32518208-2']);
+  _gaq.push(['_trackPageview']);
+  (function() {
+    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+  })();
+
+  <!-- Adds slight delay to links to allow async reporting -->
+  function trackOutboundLink(link, category, action) {
+    try {
+      _gaq.push(['_trackEvent', category , action]);
+    } catch(err){}
+
+    setTimeout(function() {
+      document.location.href = link.href;
+    }, 100);
+  }
+  </script>
+
+  <!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
+  <!--[if lt IE 9]>
+  <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
+  <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
+  <![endif]-->
+</head>
+
+<body>
+
+<script src="https://code.jquery.com/jquery.js"></script>
+<script src="https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
+<script src="/js/lang-tabs.js"></script>
+<script src="/js/downloads.js"></script>
+
+<div class="container" style="max-width: 1200px;">
+
+<div class="masthead">
+  
+    <p class="lead">
+      <a href="/">
+      <img src="/images/spark-logo-trademark.png"
+        style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
+          Lightning-fast cluster computing
+      </span>
+    </p>
+  
+</div>
+
+<nav class="navbar navbar-default" role="navigation">
+  <!-- Brand and toggle get grouped for better mobile display -->
+  <div class="navbar-header">
+    <button type="button" class="navbar-toggle" data-toggle="collapse"
+            data-target="#navbar-collapse-1">
+      <span class="sr-only">Toggle navigation</span>
+      <span class="icon-bar"></span>
+      <span class="icon-bar"></span>
+      <span class="icon-bar"></span>
+    </button>
+  </div>
+
+  <!-- Collect the nav links, forms, and other content for toggling -->
+  <div class="collapse navbar-collapse" id="navbar-collapse-1">
+    <ul class="nav navbar-nav">
+      <li><a href="/downloads.html">Download</a></li>
+      <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+          Libraries <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/sql/">SQL and DataFrames</a></li>
+          <li><a href="/streaming/">Spark Streaming</a></li>
+          <li><a href="/mllib/">MLlib (machine learning)</a></li>
+          <li><a href="/graphx/">GraphX (graph)</a></li>
+          <li class="divider"></li>
+          <li><a href="/third-party-projects.html">Third-Party 
Projects</a></li>
+        </ul>
+      </li>
+      <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+          Documentation <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/docs/latest/">Latest Release (Spark 2.0.2)</a></li>
+          <li><a href="/documentation.html">Older Versions and Other 
Resources</a></li>
+        </ul>
+      </li>
+      <li><a href="/examples.html">Examples</a></li>
+      <li class="dropdown">
+        <a href="/community.html" class="dropdown-toggle" 
data-toggle="dropdown">
+          Community <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/community.html#mailing-lists">Mailing Lists</a></li>
+          <li><a href="/contributing.html">Contributing to Spark</a></li>
+          <li><a href="https://issues.apache.org/jira/browse/SPARK";>Issue 
Tracker</a></li>
+          <li><a href="/community.html#events">Events and Meetups</a></li>
+          <li><a href="/community.html#history">Project History</a></li>
+          <li><a href="/powered-by.html">Powered By</a></li>
+          <li><a href="/committers.html">Project Committers</a></li>
+        </ul>
+      </li>
+      <li><a href="/faq.html">FAQ</a></li>
+    </ul>
+    <ul class="nav navbar-nav navbar-right">
+      <li class="dropdown">
+        <a href="http://www.apache.org/"; class="dropdown-toggle" 
data-toggle="dropdown">
+          Apache Software Foundation <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+          <li><a href="http://www.apache.org/";>Apache Homepage</a></li>
+          <li><a href="http://www.apache.org/licenses/";>License</a></li>
+          <li><a 
href="http://www.apache.org/foundation/sponsorship.html";>Sponsorship</a></li>
+          <li><a 
href="http://www.apache.org/foundation/thanks.html";>Thanks</a></li>
+          <li><a href="http://www.apache.org/security/";>Security</a></li>
+        </ul>
+      </li>
+    </ul>
+  </div>
+  <!-- /.navbar-collapse -->
+</nav>
+
+
+<div class="row">
+  <div class="col-md-3 col-md-push-9">
+    <div class="news" style="margin-bottom: 20px;">
+      <h5>Latest News</h5>
+      <ul class="list-unstyled">
+        
+          <li><a href="/news/spark-wins-cloudsort-100tb-benchmark.html">Spark 
wins CloudSort Benchmark as the most efficient engine</a>
+          <span class="small">(Nov 15, 2016)</span></li>
+        
+          <li><a href="/news/spark-2-0-2-released.html">Spark 2.0.2 
released</a>
+          <span class="small">(Nov 14, 2016)</span></li>
+        
+          <li><a href="/news/spark-1-6-3-released.html">Spark 1.6.3 
released</a>
+          <span class="small">(Nov 07, 2016)</span></li>
+        
+          <li><a href="/news/spark-2-0-1-released.html">Spark 2.0.1 
released</a>
+          <span class="small">(Oct 03, 2016)</span></li>
+        
+      </ul>
+      <p class="small" style="text-align: right;"><a 
href="/news/index.html">Archive</a></p>
+    </div>
+    <div class="hidden-xs hidden-sm">
+      <a href="/downloads.html" class="btn btn-success btn-lg btn-block" 
style="margin-bottom: 30px;">
+        Download Spark
+      </a>
+      <p style="font-size: 16px; font-weight: 500; color: #555;">
+        Built-in Libraries:
+      </p>
+      <ul class="list-none">
+        <li><a href="/sql/">SQL and DataFrames</a></li>
+        <li><a href="/streaming/">Spark Streaming</a></li>
+        <li><a href="/mllib/">MLlib (machine learning)</a></li>
+        <li><a href="/graphx/">GraphX (graph)</a></li>
+      </ul>
+      <a href="/third-party-projects.html">Third-Party Projects</a>
+    </div>
+  </div>
+
+  <div class="col-md-9 col-md-pull-3">
+    <h2>Current Committers</h2>
+
+<table>
+  <thead>
+    <tr>
+      <th>Name</th>
+      <th>Organization</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Michael Armbrust</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Joseph Bradley</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Felix Cheung</td>
+      <td>Automattic</td>
+    </tr>
+    <tr>
+      <td>Mosharaf Chowdhury</td>
+      <td>University of Michigan, Ann Arbor</td>
+    </tr>
+    <tr>
+      <td>Jason Dai</td>
+      <td>Intel</td>
+    </tr>
+    <tr>
+      <td>Tathagata Das</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Ankur Dave</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Aaron Davidson</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Thomas Dudziak</td>
+      <td>Facebook</td>
+    </tr>
+    <tr>
+      <td>Robert Evans</td>
+      <td>Yahoo!</td>
+    </tr>
+    <tr>
+      <td>Wenchen Fan</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Joseph Gonzalez</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Thomas Graves</td>
+      <td>Yahoo!</td>
+    </tr>
+    <tr>
+      <td>Stephen Haberman</td>
+      <td>Bizo</td>
+    </tr>
+    <tr>
+      <td>Mark Hamstra</td>
+      <td>ClearStory Data</td>
+    </tr>
+    <tr>
+      <td>Herman van Hovell</td>
+      <td>QuestTec B.V.</td>
+    </tr>
+    <tr>
+      <td>Yin Huai</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Shane Huang</td>
+      <td>Intel</td>
+    </tr>
+    <tr>
+      <td>Andy Konwinski</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Ryan LeCompte</td>
+      <td>Quantifind</td>
+    </tr>
+    <tr>
+      <td>Haoyuan Li</td>
+      <td>Alluxio, UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Xiao Li</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Davies Liu</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Cheng Lian</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Yanbo Liang</td>
+      <td>Hortonworks</td>
+    </tr>
+    <tr>
+      <td>Sean McNamara</td>
+      <td>Webtrends</td>
+    </tr>
+    <tr>
+      <td>Xiangrui Meng</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Mridul Muralidharam</td>
+      <td>Hortonworks</td>
+    </tr>
+    <tr>
+      <td>Andrew Or</td>
+      <td>Princeton University</td>
+    </tr>
+    <tr>
+      <td>Kay Ousterhout</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Sean Owen</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Nick Pentreath</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Imran Rashid</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Charles Reiss</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Josh Rosen</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Sandy Ryza</td>
+      <td>Clover Health</td>
+    </tr>
+    <tr>
+      <td>Kousuke Saruta</td>
+      <td>NTT Data</td>
+    </tr>
+    <tr>
+      <td>Prashant Sharma</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Ram Sriharsha</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>DB Tsai</td>
+      <td>Netflix</td>
+    </tr>
+    <tr>
+      <td>Marcelo Vanzin</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Shivaram Venkataraman</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Patrick Wendell</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Andrew Xia</td>
+      <td>Alibaba</td>
+    </tr>
+    <tr>
+      <td>Reynold Xin</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Matei Zaharia</td>
+      <td>Databricks, Stanford</td>
+    </tr>
+    <tr>
+      <td>Shixiong Zhu</td>
+      <td>Databricks</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3>Becoming a Committer</h3>
+
+<p>To get started contributing to Spark, learn 
+<a href="/contributing.html">how to contribute</a> – 
+anyone can submit patches, documentation and examples to the project.</p>
+
+<p>The PMC regularly adds new committers from the active contributors, based on their contributions 
+to Spark. The qualifications for new committers include:</p>
+
+<ol>
+  <li>Sustained contributions to Spark: Committers should have a history of major contributions to 
+Spark. An ideal committer will have contributed broadly throughout the project, and have 
+contributed at least one major component where they have taken an &#8220;ownership&#8221; role. An ownership 
+role means that existing contributors feel that they should run patches for this component by 
+this person.</li>
+  <li>Quality of contributions: Committers more than any other community member should submit simple, 
+well-tested, and well-designed patches. In addition, they should show sufficient expertise to be 
+able to review patches, including making sure they fit within Spark&#8217;s engineering practices 
+(testability, documentation, API stability, code style, etc). The committership is collectively 
+responsible for the software quality and maintainability of Spark.</li>
+  <li>Community involvement: Committers should have a constructive and friendly attitude in all 
+community interactions. They should also be active on the dev and user lists and help mentor 
+newer contributors and users. In design discussions, committers should maintain a professional 
+and diplomatic approach, even in the face of disagreement.</li>
+</ol>
+
+<p>The type and level of contributions considered may vary by project area &#8211; for example, we 
+greatly encourage contributors who want to work mainly on documentation, or mainly on 
+platform support for specific OSes, storage systems, etc.</p>
+
+<h3>Review Process</h3>
+
+<p>All contributions should be reviewed before merging as described in 
+<a href="/contributing.html">Contributing to Spark</a>. 
+In particular, if you are working on an area of the codebase you are unfamiliar with, look at the 
+Git history for that code to see who reviewed patches before. You can do this using 
+<code>git log --format=full &lt;filename&gt;</code>, by examining the &#8220;Commit&#8221; field to see who committed each patch.</p>
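
For reference, the lookup described above can be run as follows (a sketch; the
file path is only an illustrative example):

    # Show full commit metadata, including the "Commit" (committer) field,
    # for every patch that has touched a given file
    git log --format=full core/src/main/scala/org/apache/spark/rdd/RDD.scala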
+
+<h3>How to Merge a Pull Request</h3>
+
+<p>Changes pushed to the master branch on Apache cannot be removed; that is, we can&#8217;t force-push to 
+it. So please don&#8217;t add any test commits or anything like that, only real patches.</p>
+
+<p>All merges should be done using the 
+<a href="https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py">dev/merge_spark_pr.py</a> 
+script, which squashes the pull request&#8217;s changes into one commit. To use this script, you 
+will need to add a git remote called &#8220;apache&#8221; at <code>https://git-wip-us.apache.org/repos/asf/spark.git</code>, 
+as well as one called &#8220;apache-github&#8221; at <code>git://github.com/apache/spark</code>. For the <code>apache</code> repo, 
+you can authenticate using your ASF username and password. Ask Patrick if you have trouble with 
+this or want help doing your first merge.</p>
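
Concretely, the remote setup described above amounts to the following sketch
(only the remote names and URLs come from the paragraph above; the script is
assumed to be run from the repository root):

    git remote add apache https://git-wip-us.apache.org/repos/asf/spark.git
    git remote add apache-github git://github.com/apache/spark
    # the script then walks through the merge interactively
    ./dev/merge_spark_pr.py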
+
+<p>The script is fairly self-explanatory and walks you through the steps and options interactively.</p>
+
+<p>If you want to amend a commit before merging – which should be used for trivial touch-ups – 
+then simply let the script wait at the point where it asks you if you want to push to Apache. 
+Then, in a separate window, modify the code and push a commit. Run <code>git rebase -i HEAD~2</code> and 
+&#8220;squash&#8221; your new commit. Edit the commit message just after to remove the extra commit message. 
+You can verify that the result is one change with <code>git log</code>. Then resume the script in the other window.</p>
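
As a sketch of that amend-and-squash sequence (the commit message here is
hypothetical):

    # in a separate window, while the merge script waits:
    git commit -am "fix typo in comment"   # the trivial touch-up
    git rebase -i HEAD~2                   # mark the new commit as "squash"
    git log                                # verify one squashed commit remains
    # then resume the merge script in the original window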
+
+<p>Also, please remember to set Assignee on JIRAs where applicable when they are resolved. The script 
+can&#8217;t do this automatically.</p>
+
+<!--
+<h3>Minimize use of MINOR, BUILD, and HOTFIX with no JIRA</h3>
+
+From pwendell at https://www.mail-archive.com/dev@spark.apache.org/msg09565.html:
+It would be great if people could create JIRAs for any and all merged pull requests. The reason is 
+that when patches get reverted due to build breaks or other issues, it is very difficult to keep 
+track of what is going on if there is no JIRA. 
+Here is a list of 5 patches we had to revert recently that didn't include a JIRA:
+    Revert "[MINOR] [BUILD] Use custom temp directory during build."
+    Revert "[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior"
+    Revert "[BUILD] Always run SQL tests in master build."
+    Revert "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic allocation on."
+    Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing its files"
+
+The cost overhead of creating a JIRA relative to other aspects of development is very small. 
+If it's really a documentation change or something small, that's okay.
+
+But anything affecting the build, packaging, etc. needs to have a JIRA to ensure that 
+follow-up can be well communicated to all Spark developers.
+-->
+
+<h3>Policy on Backporting Bug Fixes</h3>
+
+<p>From <a href="https://www.mail-archive.com/dev@spark.apache.org/msg10284.html"><code>pwendell</code></a>:</p>
+
+<p>The trade-off when backporting is that you get to deliver the fix to people running older versions 
+(great!), but you risk introducing new or even worse bugs in maintenance releases (bad!). 
+The decision point is when you have a bug fix and it&#8217;s not clear whether it is worth backporting.</p>
+
+<p>I think the following facets are important to consider:</p>
+<ul>
+  <li>Backports are an extremely valuable service to the community and should be considered for 
+any bug fix.</li>
+  <li>Introducing a new bug in a maintenance release must be avoided at all costs. Over time it would 
+erode confidence in our release process.</li>
+  <li>Distributions or advanced users can always backport risky patches on their own, if they see fit.</li>
+</ul>
+
+<p>For me, the consequence of these is that we should backport in the following situations:</p>
+<ul>
+  <li>Both the bug and the fix are well understood and isolated. Code being modified is well tested.</li>
+  <li>The bug being addressed is high priority to the community.</li>
+  <li>The backported fix does not vary widely from the master branch fix.</li>
+</ul>
+
+<p>We tend to avoid backports in the converse situations:</p>
+<ul>
+  <li>The bug or fix is not well understood. For instance, it relates to interactions between complex 
+components or third-party libraries (e.g. Hadoop libraries). The code is not well tested outside 
+of the immediate bug being fixed.</li>
+  <li>The bug is not clearly a high priority for the community.</li>
+  <li>The backported fix is widely different from the master branch fix.</li>
+</ul>
+
+  </div>
+</div>
+
+
+
+<footer class="small">
+  <hr>
+  Apache Spark, Spark, Apache, and the Spark logo are <a href="/trademarks.html">trademarks</a> of
+  <a href="http://www.apache.org">The Apache Software Foundation</a>.
+</footer>
+
+</div>
+
+</body>
+</html>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
