This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 0f40d122e1 Recover documentation mistakenly removed in 4.1.0-preview2 release
0f40d122e1 is described below
commit 0f40d122e1e775ec734af72de5f557a3d1ef5895
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Fri Jan 9 06:38:36 2026 +0900
Recover documentation mistakenly removed in 4.1.0-preview2 release
https://github.com/apache/spark-website/commit/e94b6c4786da5ba6c06af6320585b84fa889da7f
mistakenly removed this page. This PR recovers it.
Author: Hyukjin Kwon <[email protected]>
Closes #658 from HyukjinKwon/recover-docs.
---
documentation.md | 228 ++++++++++++++++++++++++++++++++++++++++++++++++
site/documentation.html | 226 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 454 insertions(+)
diff --git a/documentation.md b/documentation.md
index e82de21887..8d23c599dc 100644
--- a/documentation.md
+++ b/documentation.md
@@ -14,3 +14,231 @@ navigation:
<ul>
<li><a href="{{site.baseurl}}/docs/">Spark </a></li>
<li><a href="{{site.baseurl}}/docs/4.1.0/">Spark 4.1.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.0.1/">Spark 4.0.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.0.0/">Spark 4.0.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.7/">Spark 3.5.7</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.6/">Spark 3.5.6</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.5/">Spark 3.5.5</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.4/">Spark 3.5.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.3/">Spark 3.5.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.2/">Spark 3.5.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.1/">Spark 3.5.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.5.0/">Spark 3.5.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.4.4/">Spark 3.4.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.4.3/">Spark 3.4.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.4.2/">Spark 3.4.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.4.1/">Spark 3.4.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.4.0/">Spark 3.4.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.3.4/">Spark 3.3.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.3.3/">Spark 3.3.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.3.2/">Spark 3.3.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.3.1/">Spark 3.3.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.3.0/">Spark 3.3.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.2.4/">Spark 3.2.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.2.3/">Spark 3.2.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.2.2/">Spark 3.2.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.2.1/">Spark 3.2.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.2.0/">Spark 3.2.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.1.3/">Spark 3.1.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.1.2/">Spark 3.1.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.1.1/">Spark 3.1.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.3/">Spark 3.0.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.2/">Spark 3.0.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.1/">Spark 3.0.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.0/">Spark 3.0.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.8/">Spark 2.4.8</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.7/">Spark 2.4.7</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.6/">Spark 2.4.6</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.5/">Spark 2.4.5</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.4/">Spark 2.4.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.3/">Spark 2.4.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.2/">Spark 2.4.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.1/">Spark 2.4.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.4.0/">Spark 2.4.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.3.4/">Spark 2.3.4</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.3.3/">Spark 2.3.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.3.2/">Spark 2.3.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.3.1/">Spark 2.3.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.3.0/">Spark 2.3.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.2.3/">Spark 2.2.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.2.2/">Spark 2.2.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.2.1/">Spark 2.2.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.2.0/">Spark 2.2.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.1.3/">Spark 2.1.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.1.2/">Spark 2.1.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.1.1/">Spark 2.1.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.1.0/">Spark 2.1.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.0.2/">Spark 2.0.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.0.1/">Spark 2.0.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.0.0/">Spark 2.0.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.6.3/">Spark 1.6.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.6.2/">Spark 1.6.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.6.1/">Spark 1.6.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.6.0/">Spark 1.6.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.5.2/">Spark 1.5.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.5.1/">Spark 1.5.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.5.0/">Spark 1.5.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.4.1/">Spark 1.4.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.4.0/">Spark 1.4.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.3.1/">Spark 1.3.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.3.0/">Spark 1.3.0</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.2.1/">Spark 1.2.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.1.1/">Spark 1.1.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/1.0.2/">Spark 1.0.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/0.9.2/">Spark 0.9.2</a></li>
+ <li><a href="{{site.baseurl}}/docs/0.8.1/">Spark 0.8.1</a></li>
+ <li><a href="{{site.baseurl}}/docs/0.7.3/">Spark 0.7.3</a></li>
+ <li><a href="{{site.baseurl}}/docs/0.6.2/">Spark 0.6.2</a></li>
+</ul>
+
+<p>Documentation for preview releases:</p>
+
+<ul>
+ <li><a href="{{site.baseurl}}/docs/4.1.0-preview4/">Spark
4.1.0-preview4</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.1.0-preview3/">Spark
4.1.0-preview3</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.1.0-preview2/">Spark
4.1.0-preview2</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.1.0-preview1/">Spark
4.1.0-preview1</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.0.0-preview2/">Spark 4.0.0
preview2</a></li>
+ <li><a href="{{site.baseurl}}/docs/4.0.0-preview1/">Spark 4.0.0
preview1</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.0-preview2/">Spark 3.0.0
preview2</a></li>
+ <li><a href="{{site.baseurl}}/docs/3.0.0-preview/">Spark 3.0.0
preview</a></li>
+ <li><a href="{{site.baseurl}}/docs/2.0.0-preview/">Spark 2.0.0
preview</a></li>
+</ul>
+
+<p>The documentation linked to above covers getting started with Spark, as
well as the built-in components <a
href="{{site.baseurl}}/docs/latest/mllib-guide.html">MLlib</a>,
+<a href="{{site.baseurl}}/docs/latest/streaming-programming-guide.html">Spark
Streaming</a>, and <a
href="{{site.baseurl}}/docs/latest/graphx-programming-guide.html">GraphX</a>.</p>
+
+<p>In addition, this page lists other resources for learning Spark.</p>
+
+<h3>Videos</h3>
+See the <a
href="https://www.youtube.com/channel/UCRzsq7k4-kT-h3TDUBQ82-w">Apache Spark
YouTube Channel</a> for videos from Spark events. There are separate <a
href="https://www.youtube.com/channel/UCRzsq7k4-kT-h3TDUBQ82-w/playlists">playlists</a>
for videos of different topics. Besides browsing through playlists, you can
also find direct links to videos below.
+
+<h4>Screencast Tutorial Videos</h4>
+<ul>
+ <li><a
href="{{site.baseurl}}/screencasts/1-first-steps-with-spark.html">Screencast 1:
First Steps with Spark</a></li>
+ <li><a
href="{{site.baseurl}}/screencasts/2-spark-documentation-overview.html">Screencast
2: Spark Documentation Overview</a></li>
+<li><a
href="{{site.baseurl}}/screencasts/3-transformations-and-caching.html">Screencast
3: Transformations and Caching</a></li>
+<li><a
href="{{site.baseurl}}/screencasts/4-a-standalone-job-in-spark.html">Screencast
4: A Spark Standalone Job in Scala</a></li>
+
+</ul>
+
+<h4>Spark Summit Videos</h4>
+<ul>
+ <li>Videos from Spark Summit 2014, San Francisco, June 30 - July 2 2014
+ <ul>
+ <li><a href="https://spark-summit.org/2014/agenda">Full agenda with
links to all videos and slides</a></li>
+ <li><a href="https://spark-summit.org/2014/training">Training videos and
slides</a></li>
+ </ul>
+ </li>
+ <li>Videos from Spark Summit 2013, San Francisco, Dec 2-3 2013
+ <ul>
+ <li><a href="https://spark-summit.org/2013#agendapluginwidget-4">Full
agenda with links to all videos and slides</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwjXj33QvAXN0Vlx0gc6u0je">YouTube
playlist of all Keynotes</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwiNcKwIkDEQZBejiqxEJ79U">YouTube
playlist of Track A (Spark Applications)</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwiNcKwIkDEQZBejiqxEJ79U">YouTube
playlist of Track B (Spark Deployment, Scheduling & Perf, Related
projects)</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwjR1Umntxz52zv3EcKpbzCp">YouTube
playlist of the Training Day (i.e. the 2nd day of the summit)</a></li>
+ </ul>
+ </li>
+</ul>
+
+<h4><a name="meetup-videos"></a>Meetup Talk Videos</h4>
+In addition to the videos listed below, you can also view <a
href="http://www.meetup.com/spark-users/files/">all slides from Bay Area
meetups here</a>.
+<style type="text/css">
+ .video-meta-info {
+ font-size: 0.95em;
+ }
+</style>
+<ul>
+ <li><a
href="https://www.youtube.com/watch?v=NUQ-8to2XAk&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Spark
1.0 and Beyond</a> (<a
href="http://files.meetup.com/3138542/Spark%201.0%20Meetup.ppt">slides</a>)
<span class="video-meta-info">by Patrick Wendell, at Cisco in San Jose,
2014-04-23</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=ju2OQEXqONU&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Adding
Native SQL Support to Spark with Catalyst</a> (<a
href="http://files.meetup.com/3138542/Spark%20SQL%20Meetup%20-%204-8-2012.pdf">slides</a>)
<span class="video-meta-info">by Michael Armbrust, at Tagged in SF,
2014-04-08</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=MY0NkZY_tJw&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">SparkR
and GraphX</a> (slides: <a
href="http://files.meetup.com/3138542/SparkR-meetup.pdf">SparkR</a>, <a
href="http://files.meetup.com/3138542/graphx%40spark_meetup03_2014.pdf">GraphX</a>)
<span class="video-meta-info">by Shivaram Venkataraman & Dan Crankshaw, at
SkyDeck in Berkeley, 2014-03-25</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=5niXiiEX5pE&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Simple
deployment w/ SIMR & Advanced Shark Analytics w/ TGFs</a> (<a
href="http://files.meetup.com/3138542/tgf.pptx">slides</a>) <span
class="video-meta-info">by Ali Ghodsi, at Huawei in Santa Clara,
2014-02-05</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=C7gWtxelYNM&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Stores,
Monoids & Dependency Injection - Abstractions for Spark</a> (<a
href="http://files.meetup.com/3138542/Abstractions%20for%20spark%20streaming%20-%20spark%20meetup%20presentation.pdf">slides</a>)
<span class="video-meta-info">by Ryan Weald, at Sharethrough in SF,
2014-01-17</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=IxDnF_X4M-8">Distributed
Machine Learning using MLbase</a> (<a
href="http://files.meetup.com/3138542/sparkmeetup_8_6_13_final_reduced.pdf">slides</a>)
<span class="video-meta-info">by Evan Sparks & Ameet Talwalkar, at Twitter
in SF, 2013-08-06</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=vJQ2RZj9hqs">GraphX Preview:
Graph Analysis on Spark</a> <span class="video-meta-info">by Reynold Xin &
Joseph Gonzalez, at Flurry in SF, 2013-07-02</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=D1knCQZQQnw">Deep Dive with
Spark Streaming</a> (<a
href="http://www.slideshare.net/spark-project/deep-divewithsparkstreaming-tathagatadassparkmeetup20130617">slides</a>)
<span class="video-meta-info">by Tathagata Das, at Plug and Play in Sunnyvale,
2013-06-17</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=cAZ624-69PQ">Tachyon and Shark
update</a> (slides: <a
href="http://files.meetup.com/3138542/2013-05-09%20Shark%20%40%20Spark%20Meetup.pdf">Shark</a>,
<a
href="http://files.meetup.com/3138542/Tachyon_2013-05-09_Spark_Meetup.pdf">Tachyon</a>)
<span class="video-meta-info">by Ali Ghodsi, Haoyuan Li, Reynold Xin, at Google
Ventures, 2013-05-09</span></li>
+
+ <li><a
href="https://www.youtube.com/playlist?list=PLxwbieuTaYXmWTBovyyw2NibPfUaJk-h4">Spark
0.7: Overview, pySpark, & Streaming</a> <span class="video-meta-info">by
Matei Zaharia, Josh Rosen, Tathagata Das, at Conviva on 2013-02-21</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=49Hr5xZyTEA">Introduction to
Spark Internals</a> (<a
href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>)
<span class="video-meta-info">by Matei Zaharia, at Yahoo in Sunnyvale,
2012-12-18</span></li>
+
+
+
+
+</ul>
+
+
+<a name="summit"></a>
+<h3>Training Materials</h3>
+<ul>
+ <li><a href="https://spark-summit.org/2014/training">Training materials and
exercises from Spark Summit 2014</a> are available online. These include videos
and slides of talks as well as exercises you can run on your laptop. Topics
include Spark core, tuning and debugging, Spark SQL, Spark Streaming, GraphX
and MLlib.</li>
+ <li><a href="https://spark-summit.org/2013">Spark Summit 2013</a> included a
training session, with slides and videos available on <a
href="https://spark-summit.org/summit-2013/#agendapluginwidget-5">the training
day agenda</a>.
+ The session also included <a
href="https://spark-summit.org/2013/exercises/">exercises</a> that you can walk
through on Amazon EC2.</li>
+ <li>The <a href="https://amplab.cs.berkeley.edu/">UC Berkeley AMPLab</a>
regularly hosts training camps on Spark and related projects.
+Slides, videos and EC2-based exercises from each of these are available online:
+<ul>
+ <li><a href="http://ampcamp.berkeley.edu/4/">AMP Camp 4</a> (Strata Santa
Clara, Feb 2014) — focus on BlinkDB, MLlib, GraphX, Tachyon</li>
+ <li><a href="http://ampcamp.berkeley.edu/3/">AMP Camp 3</a> (Berkeley, CA,
Aug 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">AMP
Camp 2</a> (Strata Santa Clara, Feb 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/agenda-2012/">AMP Camp 1</a>
(Berkeley, CA, Aug 2012)</li>
+ </ul>
+ </li>
+</ul>
+
+
+<h3>Hands-On Exercises</h3>
+
+<ul>
+ <li><a href="https://spark-summit.org/2014/training">Hands-on exercises from
Spark Summit 2014</a>. These let you install Spark on your laptop and learn
basic concepts, Spark SQL, Spark Streaming, GraphX and MLlib.</li>
+ <li><a href="https://spark-summit.org/2013/exercises/">Hands-on exercises
from Spark Summit 2013</a>. These exercises let you launch a small EC2 cluster,
load a dataset, and query it with Spark, Shark, Spark Streaming, and MLlib.</li>
+</ul>
+
+<h3>External Tutorials, Blog Posts, and Talks</h3>
+
+<ul>
+ <li><a
href="http://codeforhire.com/2014/02/18/using-spark-with-mongodb/">Using Spark
with MongoDB</a> — by Sampo Niskanen from Wellmo</li>
+ <li><a href="https://spark-summit.org/2013">Spark Summit 2013</a> —
contained 30 talks about Spark use cases, available as slides and videos</li>
+ <li><a href="http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/">A
Powerful Big Data Trio: Spark, Parquet and Avro</a> — Using Parquet in
Spark by Matt Massie</li>
+ <li><a
href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Real-time
Analytics with Cassandra, Spark, and Shark</a> — Presentation by Evan
Chan from Ooyala at 2013 Cassandra Summit</li>
+ <li><a
href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Run
Spark and Shark on Amazon Elastic MapReduce</a> — Article by Amazon
Elastic MapReduce team member Parviz Deyhim</li>
+ <li><a href="http://www.ibm.com/developerworks/library/os-spark/">Spark, an
alternative for fast data analytics</a> — IBM Developer Works article by
M. Tim Jones</li>
+</ul>
+
+<h3>Books</h3>
+<ul>
+ <li><a href="http://shop.oreilly.com/product/0636920028512.do">Learning
Spark</a>, by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia
(O'Reilly Media)</li>
+ <li><a href="http://www.manning.com/bonaci/">Spark in Action</a>, by Marko
Bonaci and Petar Zecevic (Manning)</li>
+ <li><a href="http://shop.oreilly.com/product/0636920035091.do">Advanced
Analytics with Spark</a>, by Juliet Hougland, Uri Laserson, Sean Owen, Sandy
Ryza and Josh Wills (O'Reilly Media)</li>
+ <li><a href="https://www.manning.com/books/spark-graphx-in-action">Spark
GraphX in Action</a>, by Michael Malak (Manning)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/fast-data-processing-spark-second-edition">Fast
Data Processing with Spark</a>, by Krishna Sankar and Holden Karau (Packt
Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-spark">Machine
Learning with Spark</a>, by Nick Pentreath (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/spark-cookbook">Spark
Cookbook</a>, by Rishi Yadav (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/apache-spark-graph-processing">Apache
Spark Graph Processing</a>, by Rindra Ramamonjison (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/mastering-apache-spark">Mastering
Apache Spark</a>, by Mike Frampton (Packt Publishing)</li>
+ <li><a href="http://www.apress.com/9781484209653">Big Data Analytics with
Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis</a>,
by Mohammed Guller (Apress)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/large-scale-machine-learning-spark">Large
Scale Machine Learning with Spark</a>, by Md. Rezaul Karim, Md. Mahedi Kaysar
(Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/big-data-analytics">Big
Data Analytics with Spark and Hadoop</a>, by Venkat Ankam (Packt
Publishing)</li>
+</ul>
+
+<h3>Examples</h3>
+
+<ul>
+ <li>The <a href="{{site.baseurl}}/examples.html">Spark examples page</a>
shows the basic API in Scala, Java and Python.</li>
+</ul>
+
+<h3>Research Papers</h3>
+
+<p>
+Spark was initially developed as a UC Berkeley research project, and much of
the design is documented in papers.
+The <a href="{{site.baseurl}}/research.html">research page</a> lists some of
the original motivation and direction.
+</p>
+
diff --git a/site/documentation.html b/site/documentation.html
index 7f64d94f1b..d3bb1e837e 100644
--- a/site/documentation.html
+++ b/site/documentation.html
@@ -159,8 +159,234 @@
<ul>
<li><a href="/docs/">Spark </a></li>
<li><a href="/docs/4.1.0/">Spark 4.1.0</a></li>
+ <li><a href="/docs/4.0.1/">Spark 4.0.1</a></li>
+ <li><a href="/docs/4.0.0/">Spark 4.0.0</a></li>
+ <li><a href="/docs/3.5.7/">Spark 3.5.7</a></li>
+ <li><a href="/docs/3.5.6/">Spark 3.5.6</a></li>
+ <li><a href="/docs/3.5.5/">Spark 3.5.5</a></li>
+ <li><a href="/docs/3.5.4/">Spark 3.5.4</a></li>
+ <li><a href="/docs/3.5.3/">Spark 3.5.3</a></li>
+ <li><a href="/docs/3.5.2/">Spark 3.5.2</a></li>
+ <li><a href="/docs/3.5.1/">Spark 3.5.1</a></li>
+ <li><a href="/docs/3.5.0/">Spark 3.5.0</a></li>
+ <li><a href="/docs/3.4.4/">Spark 3.4.4</a></li>
+ <li><a href="/docs/3.4.3/">Spark 3.4.3</a></li>
+ <li><a href="/docs/3.4.2/">Spark 3.4.2</a></li>
+ <li><a href="/docs/3.4.1/">Spark 3.4.1</a></li>
+ <li><a href="/docs/3.4.0/">Spark 3.4.0</a></li>
+ <li><a href="/docs/3.3.4/">Spark 3.3.4</a></li>
+ <li><a href="/docs/3.3.3/">Spark 3.3.3</a></li>
+ <li><a href="/docs/3.3.2/">Spark 3.3.2</a></li>
+ <li><a href="/docs/3.3.1/">Spark 3.3.1</a></li>
+ <li><a href="/docs/3.3.0/">Spark 3.3.0</a></li>
+ <li><a href="/docs/3.2.4/">Spark 3.2.4</a></li>
+ <li><a href="/docs/3.2.3/">Spark 3.2.3</a></li>
+ <li><a href="/docs/3.2.2/">Spark 3.2.2</a></li>
+ <li><a href="/docs/3.2.1/">Spark 3.2.1</a></li>
+ <li><a href="/docs/3.2.0/">Spark 3.2.0</a></li>
+ <li><a href="/docs/3.1.3/">Spark 3.1.3</a></li>
+ <li><a href="/docs/3.1.2/">Spark 3.1.2</a></li>
+ <li><a href="/docs/3.1.1/">Spark 3.1.1</a></li>
+ <li><a href="/docs/3.0.3/">Spark 3.0.3</a></li>
+ <li><a href="/docs/3.0.2/">Spark 3.0.2</a></li>
+ <li><a href="/docs/3.0.1/">Spark 3.0.1</a></li>
+ <li><a href="/docs/3.0.0/">Spark 3.0.0</a></li>
+ <li><a href="/docs/2.4.8/">Spark 2.4.8</a></li>
+ <li><a href="/docs/2.4.7/">Spark 2.4.7</a></li>
+ <li><a href="/docs/2.4.6/">Spark 2.4.6</a></li>
+ <li><a href="/docs/2.4.5/">Spark 2.4.5</a></li>
+ <li><a href="/docs/2.4.4/">Spark 2.4.4</a></li>
+ <li><a href="/docs/2.4.3/">Spark 2.4.3</a></li>
+ <li><a href="/docs/2.4.2/">Spark 2.4.2</a></li>
+ <li><a href="/docs/2.4.1/">Spark 2.4.1</a></li>
+ <li><a href="/docs/2.4.0/">Spark 2.4.0</a></li>
+ <li><a href="/docs/2.3.4/">Spark 2.3.4</a></li>
+ <li><a href="/docs/2.3.3/">Spark 2.3.3</a></li>
+ <li><a href="/docs/2.3.2/">Spark 2.3.2</a></li>
+ <li><a href="/docs/2.3.1/">Spark 2.3.1</a></li>
+ <li><a href="/docs/2.3.0/">Spark 2.3.0</a></li>
+ <li><a href="/docs/2.2.3/">Spark 2.2.3</a></li>
+ <li><a href="/docs/2.2.2/">Spark 2.2.2</a></li>
+ <li><a href="/docs/2.2.1/">Spark 2.2.1</a></li>
+ <li><a href="/docs/2.2.0/">Spark 2.2.0</a></li>
+ <li><a href="/docs/2.1.3/">Spark 2.1.3</a></li>
+ <li><a href="/docs/2.1.2/">Spark 2.1.2</a></li>
+ <li><a href="/docs/2.1.1/">Spark 2.1.1</a></li>
+ <li><a href="/docs/2.1.0/">Spark 2.1.0</a></li>
+ <li><a href="/docs/2.0.2/">Spark 2.0.2</a></li>
+ <li><a href="/docs/2.0.1/">Spark 2.0.1</a></li>
+ <li><a href="/docs/2.0.0/">Spark 2.0.0</a></li>
+ <li><a href="/docs/1.6.3/">Spark 1.6.3</a></li>
+ <li><a href="/docs/1.6.2/">Spark 1.6.2</a></li>
+ <li><a href="/docs/1.6.1/">Spark 1.6.1</a></li>
+ <li><a href="/docs/1.6.0/">Spark 1.6.0</a></li>
+ <li><a href="/docs/1.5.2/">Spark 1.5.2</a></li>
+ <li><a href="/docs/1.5.1/">Spark 1.5.1</a></li>
+ <li><a href="/docs/1.5.0/">Spark 1.5.0</a></li>
+ <li><a href="/docs/1.4.1/">Spark 1.4.1</a></li>
+ <li><a href="/docs/1.4.0/">Spark 1.4.0</a></li>
+ <li><a href="/docs/1.3.1/">Spark 1.3.1</a></li>
+ <li><a href="/docs/1.3.0/">Spark 1.3.0</a></li>
+ <li><a href="/docs/1.2.1/">Spark 1.2.1</a></li>
+ <li><a href="/docs/1.1.1/">Spark 1.1.1</a></li>
+ <li><a href="/docs/1.0.2/">Spark 1.0.2</a></li>
+ <li><a href="/docs/0.9.2/">Spark 0.9.2</a></li>
+ <li><a href="/docs/0.8.1/">Spark 0.8.1</a></li>
+ <li><a href="/docs/0.7.3/">Spark 0.7.3</a></li>
+ <li><a href="/docs/0.6.2/">Spark 0.6.2</a></li>
</ul>
+<p>Documentation for preview releases:</p>
+
+<ul>
+ <li><a href="/docs/4.1.0-preview4/">Spark 4.1.0-preview4</a></li>
+ <li><a href="/docs/4.1.0-preview3/">Spark 4.1.0-preview3</a></li>
+ <li><a href="/docs/4.1.0-preview2/">Spark 4.1.0-preview2</a></li>
+ <li><a href="/docs/4.1.0-preview1/">Spark 4.1.0-preview1</a></li>
+ <li><a href="/docs/4.0.0-preview2/">Spark 4.0.0 preview2</a></li>
+ <li><a href="/docs/4.0.0-preview1/">Spark 4.0.0 preview1</a></li>
+ <li><a href="/docs/3.0.0-preview2/">Spark 3.0.0 preview2</a></li>
+ <li><a href="/docs/3.0.0-preview/">Spark 3.0.0 preview</a></li>
+ <li><a href="/docs/2.0.0-preview/">Spark 2.0.0 preview</a></li>
+</ul>
+
+<p>The documentation linked to above covers getting started with Spark, as
well as the built-in components <a href="/docs/latest/mllib-guide.html">MLlib</a>,
+<a href="/docs/latest/streaming-programming-guide.html">Spark Streaming</a>,
and <a href="/docs/latest/graphx-programming-guide.html">GraphX</a>.</p>
+
+<p>In addition, this page lists other resources for learning Spark.</p>
+
+<h3>Videos</h3>
+<p>See the <a
href="https://www.youtube.com/channel/UCRzsq7k4-kT-h3TDUBQ82-w">Apache Spark
YouTube Channel</a> for videos from Spark events. There are separate <a
href="https://www.youtube.com/channel/UCRzsq7k4-kT-h3TDUBQ82-w/playlists">playlists</a>
for videos of different topics. Besides browsing through playlists, you can
also find direct links to videos below.</p>
+
+<h4>Screencast Tutorial Videos</h4>
+<ul>
+ <li><a href="/screencasts/1-first-steps-with-spark.html">Screencast 1: First
Steps with Spark</a></li>
+ <li><a href="/screencasts/2-spark-documentation-overview.html">Screencast 2:
Spark Documentation Overview</a></li>
+<li><a href="/screencasts/3-transformations-and-caching.html">Screencast 3:
Transformations and Caching</a></li>
+<li><a href="/screencasts/4-a-standalone-job-in-spark.html">Screencast 4: A
Spark Standalone Job in Scala</a></li>
+
+</ul>
+
+<h4>Spark Summit Videos</h4>
+<ul>
+ <li>Videos from Spark Summit 2014, San Francisco, June 30 - July 2 2014
+ <ul>
+ <li><a href="https://spark-summit.org/2014/agenda">Full agenda with
links to all videos and slides</a></li>
+ <li><a href="https://spark-summit.org/2014/training">Training videos and
slides</a></li>
+ </ul>
+ </li>
+ <li>Videos from Spark Summit 2013, San Francisco, Dec 2-3 2013
+ <ul>
+ <li><a href="https://spark-summit.org/2013#agendapluginwidget-4">Full
agenda with links to all videos and slides</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwjXj33QvAXN0Vlx0gc6u0je">YouTube
playlist of all Keynotes</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwiNcKwIkDEQZBejiqxEJ79U">YouTube
playlist of Track A (Spark Applications)</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwiNcKwIkDEQZBejiqxEJ79U">YouTube
playlist of Track B (Spark Deployment, Scheduling & Perf, Related
projects)</a></li>
+ <li><a
href="https://www.youtube.com/playlist?list=PL-x35fyliRwjR1Umntxz52zv3EcKpbzCp">YouTube
playlist of the Training Day (i.e. the 2nd day of the summit)</a></li>
+ </ul>
+ </li>
+</ul>
+
+<h4><a name="meetup-videos"></a>Meetup Talk Videos</h4>
+<p>In addition to the videos listed below, you can also view <a
href="http://www.meetup.com/spark-users/files/">all slides from Bay Area
meetups here</a>.</p>
+<style type="text/css">
+ .video-meta-info {
+ font-size: 0.95em;
+ }
+</style>
+
+<ul>
+ <li><a
href="https://www.youtube.com/watch?v=NUQ-8to2XAk&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Spark
1.0 and Beyond</a> (<a
href="http://files.meetup.com/3138542/Spark%201.0%20Meetup.ppt">slides</a>)
<span class="video-meta-info">by Patrick Wendell, at Cisco in San Jose,
2014-04-23</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=ju2OQEXqONU&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Adding
Native SQL Support to Spark with Catalyst</a> (<a
href="http://files.meetup.com/3138542/Spark%20SQL%20Meetup%20-%204-8-2012.pdf">slides</a>)
<span class="video-meta-info">by Michael Armbrust, at Tagged in SF,
2014-04-08</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=MY0NkZY_tJw&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">SparkR
and GraphX</a> (slides: <a
href="http://files.meetup.com/3138542/SparkR-meetup.pdf">SparkR</a>, <a
href="http://files.meetup.com/3138542/graphx%40spark_meetup03_2014.pdf">GraphX</a>)
<span class="video-meta-info">by Shivaram Venkataraman & Dan Crankshaw, at
SkyDeck in Berkeley, 2014-03-25</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=5niXiiEX5pE&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Simple
deployment w/ SIMR & Advanced Shark Analytics w/ TGFs</a> (<a
href="http://files.meetup.com/3138542/tgf.pptx">slides</a>) <span
class="video-meta-info">by Ali Ghodsi, at Huawei in Santa Clara,
2014-02-05</span></li>
+
+ <li><a
href="https://www.youtube.com/watch?v=C7gWtxelYNM&list=PL-x35fyliRwiP3YteXbnhk0QGOtYLBT3a">Stores,
Monoids & Dependency Injection - Abstractions for Spark</a> (<a
href="http://files.meetup.com/3138542/Abstractions%20for%20spark%20streaming%20-%20spark%20meetup%20presentation.pdf">slides</a>)
<span class="video-meta-info">by Ryan Weald, at Sharethrough in SF,
2014-01-17</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=IxDnF_X4M-8">Distributed
Machine Learning using MLbase</a> (<a
href="http://files.meetup.com/3138542/sparkmeetup_8_6_13_final_reduced.pdf">slides</a>)
<span class="video-meta-info">by Evan Sparks & Ameet Talwalkar, at Twitter
in SF, 2013-08-06</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=vJQ2RZj9hqs">GraphX Preview:
Graph Analysis on Spark</a> <span class="video-meta-info">by Reynold Xin &
Joseph Gonzalez, at Flurry in SF, 2013-07-02</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=D1knCQZQQnw">Deep Dive with
Spark Streaming</a> (<a
href="http://www.slideshare.net/spark-project/deep-divewithsparkstreaming-tathagatadassparkmeetup20130617">slides</a>)
<span class="video-meta-info">by Tathagata Das, at Plug and Play in Sunnyvale,
2013-06-17</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=cAZ624-69PQ">Tachyon and Shark
update</a> (slides: <a
href="http://files.meetup.com/3138542/2013-05-09%20Shark%20%40%20Spark%20Meetup.pdf">Shark</a>,
<a
href="http://files.meetup.com/3138542/Tachyon_2013-05-09_Spark_Meetup.pdf">Tachyon</a>)
<span class="video-meta-info">by Ali Ghodsi, Haoyuan Li, Reynold Xin, at Google
Ventures, 2013-05-09</span></li>
+
+ <li><a
href="https://www.youtube.com/playlist?list=PLxwbieuTaYXmWTBovyyw2NibPfUaJk-h4">Spark
0.7: Overview, pySpark, & Streaming</a> <span class="video-meta-info">by
Matei Zaharia, Josh Rosen, Tathagata Das, at Conviva on 2013-02-21</span></li>
+
+ <li><a href="https://www.youtube.com/watch?v=49Hr5xZyTEA">Introduction to
Spark Internals</a> (<a
href="http://files.meetup.com/3138542/dev-meetup-dec-2012.pptx">slides</a>)
<span class="video-meta-info">by Matei Zaharia, at Yahoo in Sunnyvale,
2012-12-18</span></li>
+
+
+
+
+</ul>
+
+<p><a name="summit"></a></p>
+<h3>Training Materials</h3>
+<ul>
+ <li><a href="https://spark-summit.org/2014/training">Training materials and
exercises from Spark Summit 2014</a> are available online. These include videos
and slides of talks as well as exercises you can run on your laptop. Topics
include Spark core, tuning and debugging, Spark SQL, Spark Streaming, GraphX
and MLlib.</li>
+ <li><a href="https://spark-summit.org/2013">Spark Summit 2013</a> included a
training session, with slides and videos available on <a
href="https://spark-summit.org/summit-2013/#agendapluginwidget-5">the training
day agenda</a>.
+ The session also included <a
href="https://spark-summit.org/2013/exercises/">exercises</a> that you can walk
through on Amazon EC2.</li>
+ <li>The <a href="https://amplab.cs.berkeley.edu/">UC Berkeley AMPLab</a>
regularly hosts training camps on Spark and related projects.
+Slides, videos and EC2-based exercises from each of these are available online:
+<ul>
+ <li><a href="http://ampcamp.berkeley.edu/4/">AMP Camp 4</a> (Strata Santa
Clara, Feb 2014) — focus on BlinkDB, MLlib, GraphX, Tachyon</li>
+ <li><a href="http://ampcamp.berkeley.edu/3/">AMP Camp 3</a> (Berkeley, CA,
Aug 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">AMP
Camp 2</a> (Strata Santa Clara, Feb 2013)</li>
+ <li><a href="http://ampcamp.berkeley.edu/agenda-2012/">AMP Camp 1</a>
(Berkeley, CA, Aug 2012)</li>
+ </ul>
+ </li>
+</ul>
+
+<h3>Hands-On Exercises</h3>
+
+<ul>
+ <li><a href="https://spark-summit.org/2014/training">Hands-on exercises from
Spark Summit 2014</a>. These let you install Spark on your laptop and learn
basic concepts, Spark SQL, Spark Streaming, GraphX and MLlib.</li>
+ <li><a href="https://spark-summit.org/2013/exercises/">Hands-on exercises
from Spark Summit 2013</a>. These exercises let you launch a small EC2 cluster,
load a dataset, and query it with Spark, Shark, Spark Streaming, and MLlib.</li>
+</ul>
+
+<h3>External Tutorials, Blog Posts, and Talks</h3>
+
+<ul>
+ <li><a
href="http://codeforhire.com/2014/02/18/using-spark-with-mongodb/">Using Spark
with MongoDB</a> — by Sampo Niskanen from Wellmo</li>
+ <li><a href="https://spark-summit.org/2013">Spark Summit 2013</a> —
contained 30 talks about Spark use cases, available as slides and videos</li>
+ <li><a href="http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/">A
Powerful Big Data Trio: Spark, Parquet and Avro</a> — Using Parquet in
Spark by Matt Massie</li>
+ <li><a
href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Real-time
Analytics with Cassandra, Spark, and Shark</a> — Presentation by Evan
Chan from Ooyala at 2013 Cassandra Summit</li>
+ <li><a
href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Run
Spark and Shark on Amazon Elastic MapReduce</a> — Article by Amazon
Elastic MapReduce team member Parviz Deyhim</li>
+ <li><a href="http://www.ibm.com/developerworks/library/os-spark/">Spark, an
alternative for fast data analytics</a> — IBM Developer Works article by
M. Tim Jones</li>
+</ul>
+
+<h3>Books</h3>
+<ul>
+ <li><a href="http://shop.oreilly.com/product/0636920028512.do">Learning
Spark</a>, by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia
(O'Reilly Media)</li>
+ <li><a href="http://www.manning.com/bonaci/">Spark in Action</a>, by Marko
Bonaci and Petar Zecevic (Manning)</li>
+ <li><a href="http://shop.oreilly.com/product/0636920035091.do">Advanced
Analytics with Spark</a>, by Juliet Hougland, Uri Laserson, Sean Owen, Sandy
Ryza and Josh Wills (O'Reilly Media)</li>
+ <li><a href="https://www.manning.com/books/spark-graphx-in-action">Spark
GraphX in Action</a>, by Michael Malak (Manning)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/fast-data-processing-spark-second-edition">Fast
Data Processing with Spark</a>, by Krishna Sankar and Holden Karau (Packt
Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-spark">Machine
Learning with Spark</a>, by Nick Pentreath (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/spark-cookbook">Spark
Cookbook</a>, by Rishi Yadav (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/apache-spark-graph-processing">Apache
Spark Graph Processing</a>, by Rindra Ramamonjison (Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/mastering-apache-spark">Mastering
Apache Spark</a>, by Mike Frampton (Packt Publishing)</li>
+ <li><a href="http://www.apress.com/9781484209653">Big Data Analytics with
Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis</a>,
by Mohammed Guller (Apress)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/large-scale-machine-learning-spark">Large
Scale Machine Learning with Spark</a>, by Md. Rezaul Karim, Md. Mahedi Kaysar
(Packt Publishing)</li>
+ <li><a
href="https://www.packtpub.com/big-data-and-business-intelligence/big-data-analytics">Big
Data Analytics with Spark and Hadoop</a>, by Venkat Ankam (Packt
Publishing)</li>
+</ul>
+
+<h3>Examples</h3>
+
+<ul>
+ <li>The <a href="/examples.html">Spark examples page</a> shows the basic API
in Scala, Java and Python.</li>
+</ul>
+
+<h3>Research Papers</h3>
+
+<p>
+Spark was initially developed as a UC Berkeley research project, and much of
the design is documented in papers.
+The <a href="/research.html">research page</a> lists some of the original
motivation and direction.
+</p>
+
+
</div>
<div class="col-12 col-md-3">
<div class="news" style="margin-bottom: 20px;">