Modified: spark/site/releases/spark-release-1-1-0.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-1-0.html?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/site/releases/spark-release-1-1-0.html (original)
+++ spark/site/releases/spark-release-1-1-0.html Mon Jun 27 20:31:41 2016
@@ -197,7 +197,7 @@
 <p>Spark SQL adds a number of new features and performance improvements in this release. A <a href="http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#running-the-thrift-jdbc-server">JDBC/ODBC server</a> allows users to connect to SparkSQL from many different applications and provides shared access to cached tables. A new module provides <a href="http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#json-datasets">support for loading JSON data</a> directly into Spark’s SchemaRDD format, including automatic schema inference. Spark SQL introduces <a href="http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#other-configuration-options">dynamic bytecode generation</a> in this release, a technique which significantly speeds up execution for queries that perform complex expression evaluation. This release also adds support for registering Python, Scala, and Java lambda functions as UDFs, which can then be called directly in SQL. Spark 1.1 adds a <a href="http://spark.apache.org/docs/1.1.0/sql-programming-guide.html#programmatically-specifying-the-schema">public types API to allow users to create SchemaRDDs from custom data sources</a>. Finally, many optimizations have been added to the native Parquet support as well as throughout the engine.</p>
 
 <h3 id="mllib">MLlib</h3>
-<p>MLlib adds several new algorithms and optimizations in this release. 1.1 introduces a <a href="https://issues.apache.org/jira/browse/SPARK-2359">new library of statistical packages</a> which provides exploratory analytic functions. These include stratified sampling, correlations, chi-squared tests and support for creating random datasets. This release adds utilities for feature extraction (<a href="https://issues.apache.org/jira/browse/SPARK-2510">Word2Vec</a> and <a href="https://issues.apache.org/jira/browse/SPARK-2511">TF-IDF</a>) and feature transformation (<a href="https://issues.apache.org/jira/browse/SPARK-2272">normalization and standard scaling</a>). Also new is support for <a href="https://issues.apache.org/jira/browse/SPARK-1553">nonnegative matrix factorization</a> and <a href="https://issues.apache.org/jira/browse/SPARK-1782">SVD via Lanczos</a>. The decision tree algorithm has been <a href="https://issues.apache.org/jira/browse/SPARK-2478">added in Python and Java</a>. A tree aggregation primitive has been added to help optimize many existing algorithms. Performance improves across the board in MLlib 1.1, with improvements of around 2-3X for many algorithms and up to 5X for large scale decision tree problems. </p>
+<p>MLlib adds several new algorithms and optimizations in this release. 1.1 introduces a <a href="https://issues.apache.org/jira/browse/SPARK-2359">new library of statistical packages</a> which provides exploratory analytic functions. These include stratified sampling, correlations, chi-squared tests and support for creating random datasets. This release adds utilities for feature extraction (<a href="https://issues.apache.org/jira/browse/SPARK-2510">Word2Vec</a> and <a href="https://issues.apache.org/jira/browse/SPARK-2511">TF-IDF</a>) and feature transformation (<a href="https://issues.apache.org/jira/browse/SPARK-2272">normalization and standard scaling</a>). Also new is support for <a href="https://issues.apache.org/jira/browse/SPARK-1553">nonnegative matrix factorization</a> and <a href="https://issues.apache.org/jira/browse/SPARK-1782">SVD via Lanczos</a>. The decision tree algorithm has been <a href="https://issues.apache.org/jira/browse/SPARK-2478">added in Python and Java</a>. A tree aggregation primitive has been added to help optimize many existing algorithms. Performance improves across the board in MLlib 1.1, with improvements of around 2-3X for many algorithms and up to 5X for large scale decision tree problems.</p>
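The statistics utilities described above are exposed through MLlib's Statistics object. A minimal Scala sketch, assuming an existing SparkContext named sc and made-up input data:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    // Hypothetical dataset: an RDD of dense vectors
    val observations = sc.parallelize(Seq(
      Vectors.dense(1.0, 10.0),
      Vectors.dense(2.0, 20.0),
      Vectors.dense(3.0, 30.0)))

    // Column-wise summary statistics (mean, variance, count, ...)
    val summary = Statistics.colStats(observations)
    println(summary.mean)

    // Pearson correlation between two RDD[Double]s
    val x = sc.parallelize(Seq(1.0, 2.0, 3.0))
    val y = sc.parallelize(Seq(1.1, 2.2, 3.3))
    println(Statistics.corr(x, y, "pearson"))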
 
 <h3 id="graphx-and-spark-streaming">GraphX and Spark Streaming</h3>
 <p>Spark streaming adds a new data source, <a href="https://issues.apache.org/jira/browse/SPARK-1981">Amazon Kinesis</a>. For Apache Flume, a new mode is supported which <a href="https://issues.apache.org/jira/browse/SPARK-1729">pulls data from Flume</a>, simplifying deployment and providing high availability. The first of a set of <a href="https://issues.apache.org/jira/browse/SPARK-2438">streaming machine learning algorithms</a> is introduced with streaming linear regression. Finally, <a href="https://issues.apache.org/jira/browse/SPARK-1341">rate limiting</a> has been added for streaming inputs. GraphX adds <a href="https://issues.apache.org/jira/browse/SPARK-1991">custom storage levels for vertices and edges</a> along with <a href="https://issues.apache.org/jira/browse/SPARK-2748">improved numerical precision</a> across the board. Finally, GraphX adds a new label propagation algorithm.</p>

@@ -215,7 +215,7 @@
 <ul>
   <li>The default value of <code>spark.io.compression.codec</code> is now <code>snappy</code> for improved memory usage. Old behavior can be restored by switching to <code>lzf</code>.</li>
-  <li>The default value of <code>spark.broadcast.factory</code> is now <code>org.apache.spark.broadcast.TorrentBroadcastFactory</code> for improved efficiency of broadcasts. Old behavior can be restored by switching to <code>org.apache.spark.broadcast.HttpBroadcastFactory</code>. </li>
+  <li>The default value of <code>spark.broadcast.factory</code> is now <code>org.apache.spark.broadcast.TorrentBroadcastFactory</code> for improved efficiency of broadcasts. Old behavior can be restored by switching to <code>org.apache.spark.broadcast.HttpBroadcastFactory</code>.</li>
   <li>PySpark now performs external spilling during aggregations. Old behavior can be restored by setting <code>spark.shuffle.spill</code> to <code>false</code>.</li>
   <li>PySpark uses a new heuristic for determining the parallelism of shuffle operations. Old behavior can be restored by setting <code>spark.default.parallelism</code> to the number of cores in the cluster.</li>
 </ul>
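Each of the behavior changes above maps to a configuration property, so reverting to the pre-1.1 defaults is a matter of setting those properties. A minimal Scala sketch; the parallelism value is illustrative, since the text leaves it cluster-specific:

    import org.apache.spark.SparkConf

    // Revert the Spark 1.1 behavior changes listed above
    val conf = new SparkConf()
      .set("spark.io.compression.codec", "lzf")               // pre-1.1 default codec
      .set("spark.broadcast.factory",
           "org.apache.spark.broadcast.HttpBroadcastFactory") // pre-1.1 broadcast factory
      .set("spark.shuffle.spill", "false")                    // disable PySpark external spilling
      // Restore the old shuffle-parallelism heuristic by pinning
      // spark.default.parallelism to the number of cores in the cluster
      // (an assumed 8-core cluster here)
      .set("spark.default.parallelism", "8")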
@@ -275,7 +275,7 @@
   <li>Daneil Darabos – bug fixes and UI enhancements</li>
   <li>Daoyuan Wang – SQL fixes</li>
   <li>David Lemieux – bug fix</li>
-  <li>Davies Liu – PySpark fixes and spilling </li>
+  <li>Davies Liu – PySpark fixes and spilling</li>
   <li>DB Tsai – online summaries in MLlib and other MLlib features</li>
   <li>Derek Ma – bug fix</li>
   <li>Doris Xin – MLlib stats library and several fixes</li>

Modified: spark/site/releases/spark-release-1-2-0.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-2-0.html?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/site/releases/spark-release-1-2-0.html (original)
+++ spark/site/releases/spark-release-1-2-0.html Mon Jun 27 20:31:41 2016
@@ -194,7 +194,7 @@
 <p>In 1.2 Spark core upgrades two major subsystems to improve the performance and stability of very large scale shuffles. The first is Spark’s communication manager used during bulk transfers, which upgrades to a <a href="https://issues.apache.org/jira/browse/SPARK-2468">netty-based implementation</a>. The second is Spark’s shuffle mechanism, which upgrades to the <a href="https://issues.apache.org/jira/browse/SPARK-3280">“sort based” shuffle initially released in Spark 1.1</a>. These both improve the performance and stability of very large scale shuffles. Spark also adds an <a href="https://issues.apache.org/jira/browse/SPARK-3174">elastic scaling mechanism</a> designed to improve cluster utilization during long running ETL-style jobs. This is currently supported on YARN and will make its way to other cluster managers in future versions. Finally, Spark 1.2 adds support for Scala 2.11. For instructions on building for Scala 2.11 see the <a href="/docs/1.2.0/building-spark.html#building-for-scala-211">build documentation</a>.</p>
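A hedged sketch of how the elastic scaling mechanism is switched on; the spark.dynamicAllocation.* property names are the ones documented for this feature, and the executor bounds are illustrative:

    import org.apache.spark.SparkConf

    // Enable dynamic executor allocation on YARN; requires the external
    // shuffle service so executors can be removed without losing shuffle data.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.minExecutors", "2")   // illustrative lower bound
      .set("spark.dynamicAllocation.maxExecutors", "20")  // illustrative upper bound
      .set("spark.shuffle.service.enabled", "true")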
 
 <h3 id="spark-streaming">Spark Streaming</h3>
-<p>This release includes two major feature additions to Spark’s streaming library, a Python API and a write ahead log for full driver H/A. The <a href="https://issues.apache.org/jira/browse/SPARK-2377">Python API</a> covers almost all the DStream transformations and output operations. Input sources based on text files and text over sockets are currently supported. Support for Kafka and Flume input streams in Python will be added in the next release. Second, Spark streaming now features H/A driver support through a <a href="https://issues.apache.org/jira/browse/SPARK-3129">write ahead log (WAL)</a>. In Spark 1.1 and earlier, some buffered (received but not yet processed) data can be lost during driver restarts. To prevent this, Spark 1.2 adds an optional WAL, which buffers received data into a fault-tolerant file system (e.g. HDFS). See the <a href="/docs/1.2.0/streaming-programming-guide.html">streaming programming guide</a> for more details. </p>
+<p>This release includes two major feature additions to Spark’s streaming library, a Python API and a write ahead log for full driver H/A. The <a href="https://issues.apache.org/jira/browse/SPARK-2377">Python API</a> covers almost all the DStream transformations and output operations. Input sources based on text files and text over sockets are currently supported. Support for Kafka and Flume input streams in Python will be added in the next release. Second, Spark streaming now features H/A driver support through a <a href="https://issues.apache.org/jira/browse/SPARK-3129">write ahead log (WAL)</a>. In Spark 1.1 and earlier, some buffered (received but not yet processed) data can be lost during driver restarts. To prevent this, Spark 1.2 adds an optional WAL, which buffers received data into a fault-tolerant file system (e.g. HDFS). See the <a href="/docs/1.2.0/streaming-programming-guide.html">streaming programming guide</a> for more details.</p>
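Enabling the WAL amounts to a configuration flag plus a fault-tolerant checkpoint directory. A sketch under those assumptions; the property name below is the one documented for receiver write ahead logs, and the HDFS path is made up:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Turn on the optional write ahead log for received data
    val conf = new SparkConf()
      .setAppName("wal-demo")
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")
    val ssc = new StreamingContext(conf, Seconds(10))
    // The WAL needs a fault-tolerant directory (e.g. on HDFS) for its logs
    ssc.checkpoint("hdfs:///checkpoints/wal-demo")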
 
 <h3 id="mllib">MLlib</h3>
 <p>Spark 1.2 previews a new set of machine learning APIs in a package called spark.ml that <a href="https://issues.apache.org/jira/browse/SPARK-3530">supports learning pipelines</a>, where multiple algorithms are run in sequence with varying parameters. This type of pipeline is common in practical machine learning deployments. The new ML package uses Spark’s SchemaRDD to represent <a href="https://issues.apache.org/jira/browse/SPARK-3573">ML datasets</a>, providing direct interoperability with Spark SQL. In addition to the new API, Spark 1.2 extends decision trees with two tree ensemble methods: <a href="https://issues.apache.org/jira/browse/SPARK-1545">random forests</a> and <a href="https://issues.apache.org/jira/browse/SPARK-1547">gradient-boosted trees</a>, among the most successful tree-based models for classification and regression. Finally, MLlib’s Python implementation receives a major update in 1.2 to simplify the process of adding Python APIs, along with better Python API coverage.</p>
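A minimal sketch of the pipeline idea in the new spark.ml package, along the lines of the text-classification example in its documentation; column names and parameter values are illustrative:

    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

    // Three stages run in sequence, each reading and adding dataset columns
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10)
    val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
    // val model = pipeline.fit(training)  // training: a SchemaRDD with "label" and "text"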
Modified: spark/site/releases/spark-release-1-3-0.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-3-0.html?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/site/releases/spark-release-1-3-0.html (original)
+++ spark/site/releases/spark-release-1-3-0.html Mon Jun 27 20:31:41 2016
@@ -191,7 +191,7 @@
 <p>To download Spark 1.3 visit the <a href="/downloads.html">downloads</a> page.</p>
 
 <h3 id="spark-core">Spark Core</h3>
-<p>Spark 1.3 sees a handful of usability improvements in the core engine. The core API now supports <a href="https://issues.apache.org/jira/browse/SPARK-5430">multi level aggregation trees</a> to help speed up expensive reduce operations. <a href="https://issues.apache.org/jira/browse/SPARK-5063">Improved error reporting</a> has been added for certain gotcha operations. Spark’s Jetty dependency is <a href="https://issues.apache.org/jira/browse/SPARK-3996">now shaded</a> to help avoid conflicts with user programs. Spark now supports <a href="https://issues.apache.org/jira/browse/SPARK-3883">SSL encryption</a> for some communication endpoints. Finally, realtime <a href="https://issues.apache.org/jira/browse/SPARK-3428">GC metrics</a> and <a href="https://issues.apache.org/jira/browse/SPARK-4874">record counts</a> have been added to the UI. </p>
+<p>Spark 1.3 sees a handful of usability improvements in the core engine. The core API now supports <a href="https://issues.apache.org/jira/browse/SPARK-5430">multi level aggregation trees</a> to help speed up expensive reduce operations. <a href="https://issues.apache.org/jira/browse/SPARK-5063">Improved error reporting</a> has been added for certain gotcha operations. Spark’s Jetty dependency is <a href="https://issues.apache.org/jira/browse/SPARK-3996">now shaded</a> to help avoid conflicts with user programs. Spark now supports <a href="https://issues.apache.org/jira/browse/SPARK-3883">SSL encryption</a> for some communication endpoints. Finally, realtime <a href="https://issues.apache.org/jira/browse/SPARK-3428">GC metrics</a> and <a href="https://issues.apache.org/jira/browse/SPARK-4874">record counts</a> have been added to the UI.</p>
 
 <h3 id="dataframe-api">DataFrame API</h3>
 <p>Spark 1.3 adds a new <a href="/docs/1.3.0/sql-programming-guide.html#dataframes">DataFrames API</a> that provides powerful and convenient operators when working with structured datasets. The DataFrame is an evolution of the base RDD API that includes named fields along with schema information. It’s easy to construct a DataFrame from sources such as Hive tables, JSON data, a JDBC database, or any implementation of Spark’s new data source API. Data frames will become a common interchange format between Spark components and for importing and exporting data to other systems. Data frames are supported in Python, Scala, and Java.</p>
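A short Scala sketch of the DataFrame operators described above, assuming an existing SparkContext named sc and a made-up JSON file:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    // Build a DataFrame from a JSON source (path illustrative),
    // with the schema inferred automatically
    val df = sqlContext.jsonFile("examples/people.json")
    df.printSchema()

    // Named-field operators on structured data
    df.filter(df("age") > 21).select("name").show()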
@@ -203,7 +203,7 @@
 <p>In this release Spark MLlib introduces several new algorithms: latent Dirichlet allocation (LDA) for <a href="https://issues.apache.org/jira/browse/SPARK-1405">topic modeling</a>, <a href="https://issues.apache.org/jira/browse/SPARK-2309">multinomial logistic regression</a> for multiclass classification, <a href="https://issues.apache.org/jira/browse/SPARK-5012">Gaussian mixture model (GMM)</a> and <a href="https://issues.apache.org/jira/browse/SPARK-4259">power iteration clustering</a> for clustering, <a href="https://issues.apache.org/jira/browse/SPARK-4001">FP-growth</a> for frequent pattern mining, and <a href="https://issues.apache.org/jira/browse/SPARK-4409">block matrix abstraction</a> for distributed linear algebra. Initial support has been added for <a href="https://issues.apache.org/jira/browse/SPARK-4587">model import/export</a> in an exchangeable format, which will be expanded in future versions to cover more model types in Java/Python/Scala. The implementations of k-means and ALS receive <a href="https://issues.apache.org/jira/browse/SPARK-3424, https://issues.apache.org/jira/browse/SPARK-3541">updates</a> that lead to significant performance gains. PySpark now supports the <a href="https://issues.apache.org/jira/browse/SPARK-4586">ML pipeline API</a> added in Spark 1.2, and <a href="https://issues.apache.org/jira/browse/SPARK-5094">gradient boosted trees</a> and <a href="https://issues.apache.org/jira/browse/SPARK-5012">Gaussian mixture model</a>. Finally, the ML pipeline API has been ported to support the new DataFrames abstraction.</p>
 
 <h3 id="spark-streaming">Spark Streaming</h3>
-<p>Spark 1.3 introduces a new <a href="https://issues.apache.org/jira/browse/SPARK-4964"><em>direct</em> Kafka API</a> (<a href="http://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html">docs</a>) which enables exactly-once delivery without the use of write ahead logs. It also adds a <a href="https://issues.apache.org/jira/browse/SPARK-5047">Python Kafka API</a> along with infrastructure for additional Python APIs in future releases. An online version of <a href="https://issues.apache.org/jira/browse/SPARK-4979">logistic regression</a> and the ability to read <a href="https://issues.apache.org/jira/browse/SPARK-4969">binary records</a> have also been added. For stateful operations, support has been added for loading of an <a href="https://issues.apache.org/jira/browse/SPARK-3660">initial state RDD</a>. Finally, the streaming programming guide has been updated to include information about SQL and DataFrame operations within streaming applications, and important clarifications to the fault-tolerance semantics. </p>
+<p>Spark 1.3 introduces a new <a href="https://issues.apache.org/jira/browse/SPARK-4964"><em>direct</em> Kafka API</a> (<a href="http://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html">docs</a>) which enables exactly-once delivery without the use of write ahead logs. It also adds a <a href="https://issues.apache.org/jira/browse/SPARK-5047">Python Kafka API</a> along with infrastructure for additional Python APIs in future releases. An online version of <a href="https://issues.apache.org/jira/browse/SPARK-4979">logistic regression</a> and the ability to read <a href="https://issues.apache.org/jira/browse/SPARK-4969">binary records</a> have also been added. For stateful operations, support has been added for loading of an <a href="https://issues.apache.org/jira/browse/SPARK-3660">initial state RDD</a>. Finally, the streaming programming guide has been updated to include information about SQL and DataFrame operations within streaming applications, and important clarifications to the fault-tolerance semantics.</p>
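A sketch of the direct Kafka API in Scala, assuming an existing StreamingContext named ssc; the broker address and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.streaming.kafka.KafkaUtils

    // The direct API tracks Kafka offsets itself rather than relying on
    // receivers and write ahead logs
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("events")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)
    stream.map(_._2).count().print()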
+ <li><a href="#streaming" id="markdown-toc-streaming">Streaming</a></li> </ul> </li> - <li><a href="#credits">Credits</a></li> + <li><a href="#credits" id="markdown-toc-credits">Credits</a></li> </ul> <h3 id="apis-rdd-dataframe-and-sql">APIs: RDD, DataFrame and SQL</h3> Modified: spark/site/releases/spark-release-1-6-0.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-6-0.html?rev=1750410&r1=1750409&r2=1750410&view=diff ============================================================================== --- spark/site/releases/spark-release-1-6-0.html (original) +++ spark/site/releases/spark-release-1-6-0.html Mon Jun 27 20:31:41 2016 @@ -191,13 +191,13 @@ <p>You can consult JIRA for the <a href="https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12333083&projectId=12315420">detailed changes</a>. We have curated a list of high level changes here:</p> <ul id="markdown-toc"> - <li><a href="#spark-coresql">Spark Core/SQL</a></li> - <li><a href="#spark-streaming">Spark Streaming</a></li> - <li><a href="#mllib">MLlib</a></li> - <li><a href="#deprecations">Deprecations</a></li> - <li><a href="#changes-of-behavior">Changes of behavior</a></li> - <li><a href="#known-issues">Known issues</a></li> - <li><a href="#credits">Credits</a></li> + <li><a href="#spark-coresql" id="markdown-toc-spark-coresql">Spark Core/SQL</a></li> + <li><a href="#spark-streaming" id="markdown-toc-spark-streaming">Spark Streaming</a></li> + <li><a href="#mllib" id="markdown-toc-mllib">MLlib</a></li> + <li><a href="#deprecations" id="markdown-toc-deprecations">Deprecations</a></li> + <li><a href="#changes-of-behavior" id="markdown-toc-changes-of-behavior">Changes of behavior</a></li> + <li><a href="#known-issues" id="markdown-toc-known-issues">Known issues</a></li> + <li><a href="#credits" id="markdown-toc-credits">Credits</a></li> </ul> <h3 id="spark-coresql">Spark Core/SQL</h3> @@ -220,7 +220,7 @@ <ul> <li><a href="https://issues.apache.org/jira/browse/SPARK-10000">SPARK-10000</a> <strong>Unified Memory Management</strong> - Shared memory for execution and caching instead of exclusive division of the regions.</li> <li><a href="https://issues.apache.org/jira/browse/SPARK-11787">SPARK-11787</a> <strong>Parquet Performance</strong> - Improve Parquet scan performance when using flat schemas.</li> - <li><a href="https://issues.apache.org/jira/browse/SPARK-9241">SPARK-9241 </a> <strong>Improved query planner for queries having distinct aggregations</strong> - Query plans of distinct aggregations are more robust when distinct columns have high cardinality. 
+  <li><a href="https://issues.apache.org/jira/browse/SPARK-9241">SPARK-9241 </a> <strong>Improved query planner for queries having distinct aggregations</strong> - Query plans of distinct aggregations are more robust when distinct columns have high cardinality.</li>
   <li><a href="https://issues.apache.org/jira/browse/SPARK-9858">SPARK-9858 </a> <strong>Adaptive query execution</strong> - Initial support for automatically selecting the number of reducers for joins and aggregations.</li>
   <li><a href="https://issues.apache.org/jira/browse/SPARK-10978">SPARK-10978</a> <strong>Avoiding double filters in Data Source API</strong> - When implementing a data source with filter pushdown, developers can now tell Spark SQL to avoid double evaluating a pushed-down filter.</li>
   <li><a href="https://issues.apache.org/jira/browse/SPARK-11111">SPARK-11111</a> <strong>Fast null-safe joins</strong> - Joins using null-safe equality (<code><=></code>) will now execute using SortMergeJoin instead of computing a cartesian product.</li>
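For illustration, a null-safe equality join of the kind SPARK-11111 speeds up, written against an assumed SQLContext; the table and column names are made up:

    // assumes left_table and right_table are registered temporary tables;
    // <=> matches rows even when both key columns are NULL
    val joined = sqlContext.sql(
      "SELECT a.id, b.value FROM left_table a JOIN right_table b ON a.key <=> b.key")
    joined.show()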
@@ -233,7 +233,7 @@
 <h3 id="spark-streaming">Spark Streaming</h3>
 
 <ul>
-  <li><strong>API Updates</strong> 
+  <li><strong>API Updates</strong>
     <ul>
       <li><a href="https://issues.apache.org/jira/browse/SPARK-2629">SPARK-2629 </a> <strong>New improved state management</strong> - <code>mapWithState</code> - a DStream transformation for stateful stream processing, supersedes <code>updateStateByKey</code> in functionality and performance.</li>
       <li><a href="https://issues.apache.org/jira/browse/SPARK-11198">SPARK-11198</a> <strong>Kinesis record deaggregation</strong> - Kinesis streams have been upgraded to use KCL 1.4.0 and support transparent deaggregation of KPL-aggregated records.</li>

@@ -244,7 +244,7 @@
   <li><strong>UI Improvements</strong>
     <ul>
       <li>Made failures visible in the streaming tab, in the timelines, batch list, and batch details page.</li>
-      <li>Made output operations visible in the streaming tab as progress bars. </li>
+      <li>Made output operations visible in the streaming tab as progress bars.</li>
     </ul>
   </li>
 </ul>

Modified: spark/site/sql/index.html
URL: http://svn.apache.org/viewvc/spark/site/sql/index.html?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/site/sql/index.html (original)
+++ spark/site/sql/index.html Mon Jun 27 20:31:41 2016
@@ -295,12 +295,6 @@
 </div>
 -->
 
-
-  </div>
-</div>
-
-
-
 <div class="row">
   <div class="col-md-4 col-padded">
     <h3>Performance & Scalability</h3>
@@ -348,6 +342,8 @@
     </div>
   </div>
 
+  </div>
+</div>

Modified: spark/site/streaming/index.html
URL: http://svn.apache.org/viewvc/spark/site/streaming/index.html?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/site/streaming/index.html (original)
+++ spark/site/streaming/index.html Mon Jun 27 20:31:41 2016
@@ -261,12 +261,6 @@
   </div>
 </div>
 
-
-  </div>
-</div>
-
-
-
 <div class="row">
   <div class="col-md-4 col-padded">
     <h3>Deployment Options</h3>
@@ -326,6 +320,8 @@
     </div>
   </div>
 
+  </div>
+</div>

Modified: spark/sql/index.md
URL: http://svn.apache.org/viewvc/spark/sql/index.md?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/sql/index.md (original)
+++ spark/sql/index.md Mon Jun 27 20:31:41 2016
@@ -117,9 +117,6 @@ subproject: SQL
 </div>
 -->
 
-{% extra %}
-
-
 <div class="row">
   <div class="col-md-4 col-padded">
     <h3>Performance & Scalability</h3>
@@ -166,5 +163,3 @@ subproject: SQL
     </a>
   </div>
 </div>
-
-{% endextra %}

Modified: spark/streaming/index.md
URL: http://svn.apache.org/viewvc/spark/streaming/index.md?rev=1750410&r1=1750409&r2=1750410&view=diff
==============================================================================
--- spark/streaming/index.md (original)
+++ spark/streaming/index.md Mon Jun 27 20:31:41 2016
@@ -84,10 +84,6 @@ subproject: Streaming
   </div>
 </div>
 
-
-{% extra %}
-
-
 <div class="row">
   <div class="col-md-4 col-padded">
     <h3>Deployment Options</h3>
@@ -146,5 +142,3 @@ subproject: Streaming
     </a>
   </div>
 </div>
-
-{% endextra %}