spark git commit: [SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programming Guide

2016-06-29 Thread tdas
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 3134f116a -> c8a7c2305


[SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programming Guide

Author: Tathagata Das 

Closes #13978 from tdas/SPARK-16256-1.

(cherry picked from commit 2c3d96134dcc0428983eea087db7e91072215aea)
Signed-off-by: Tathagata Das 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c8a7c230
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c8a7c230
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c8a7c230

Branch: refs/heads/branch-2.0
Commit: c8a7c23054209db5474d96de2a7e2d8a6f8cc0da
Parents: 3134f11
Author: Tathagata Das 
Authored: Wed Jun 29 23:38:19 2016 -0700
Committer: Tathagata Das 
Committed: Wed Jun 29 23:38:35 2016 -0700

--
 docs/structured-streaming-programming-guide.md | 44 +++--
 1 file changed, 23 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c8a7c230/docs/structured-streaming-programming-guide.md
--
diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 9ed06be..5932566 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -459,7 +459,7 @@ val csvDF = spark
 .readStream
 .option("sep", ";")
 .schema(userSchema)  // Specify schema of the parquet files
-.csv("/path/to/directory")// Equivalent to 
format("cv").load("/path/to/directory")
+.csv("/path/to/directory")// Equivalent to 
format("csv").load("/path/to/directory")
 {% endhighlight %}
 
 
@@ -486,7 +486,7 @@ Dataset[Row] csvDF = spark
 .readStream()
 .option("sep", ";")
 .schema(userSchema)  // Specify schema of the parquet files
-.csv("/path/to/directory");// Equivalent to 
format("cv").load("/path/to/directory")
+.csv("/path/to/directory");// Equivalent to 
format("csv").load("/path/to/directory")
 {% endhighlight %}
 
 
@@ -513,7 +513,7 @@ csvDF = spark \
 .readStream() \
 .option("sep", ";") \
 .schema(userSchema) \
-.csv("/path/to/directory")# Equivalent to 
format("cv").load("/path/to/directory")
+.csv("/path/to/directory")# Equivalent to 
format("csv").load("/path/to/directory")
 {% endhighlight %}
 
 
@@ -522,10 +522,10 @@ csvDF = spark \
 These examples generate streaming DataFrames that are untyped, meaning that the schema of the DataFrame is not checked at compile time, only checked at runtime when the query is submitted. Some operations like `map`, `flatMap`, etc. need the type to be known at compile time. To do those, you can convert these untyped streaming DataFrames to typed streaming Datasets using the same methods as static DataFrame. See the SQL Programming Guide for more details. Additionally, more details on the supported streaming sources are discussed later in the document.
 
 ## Operations on streaming DataFrames/Datasets
-You can apply all kinds of operations on streaming DataFrames/Datasets - ranging from untyped, SQL-like operations (e.g. select, where, groupBy), to typed RDD-like operations (e.g. map, filter, flatMap). See the SQL programming guide for more details. Let’s take a look at a few example operations that you can use.
+You can apply all kinds of operations on streaming DataFrames/Datasets - ranging from untyped, SQL-like operations (e.g. select, where, groupBy), to typed RDD-like operations (e.g. map, filter, flatMap). See the [SQL programming guide](sql-programming-guide.html) for more details. Let’s take a look at a few example operations that you can use.
 
 ### Basic Operations - Selection, Projection, Aggregation
-Most of the common operations on DataFrame/Dataset are supported for streaming. The few operations that are not supported are discussed later in this section.
+Most of the common operations on DataFrame/Dataset are supported for streaming. The few operations that are not supported are [discussed later](#unsupported-operations) in this section.
 
 
 
@@ -618,7 +618,7 @@ df.groupBy("type").count()
 
 
 ### Window Operations on Event Time
-Aggregations over a sliding event-time window are straightforward with Structured Streaming. The key idea to understand about window-based aggregations are very similar to grouped aggregations. In a grouped aggregation, aggregate values (e.g. counts) are maintained for each unique value in the user-specified grouping column. In case of, window-based aggregations, aggregate values are maintained for each window the event-time of a row falls into. Let's understand this with an illustration.
+Aggregations over a sliding event-time window are straightforward with Structured Streaming. The key idea to understand about window-based aggregations are very similar to grouped aggregations. In a gro
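
For context, the hunks above correct the documentation of the CSV file source and the untyped-to-typed conversion that follows it. Below is a minimal, self-contained Scala sketch of that usage (not part of the commit), assuming a hypothetical two-column user schema and input directory:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

object CsvStreamSketch {
  // Hypothetical record layout; the guide's userSchema is not shown in
  // this diff, so these two columns are an assumption.
  case class User(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("CsvStreamSketch")
      .getOrCreate()
    import spark.implicits._

    // Streaming sources do not infer schemas, so one is given up front.
    val userSchema = new StructType()
      .add("name", StringType)
      .add("age", IntegerType)

    // The shorthand whose comment this commit fixes:
    // .csv(dir) is equivalent to .format("csv").load(dir).
    val csvDF = spark.readStream
      .option("sep", ";")
      .schema(userSchema)
      .csv("/path/to/directory")

    // Untyped, SQL-like operation on the streaming DataFrame.
    val counts = csvDF.groupBy("name").count()

    // Typed view, as needed by map/filter/flatMap at compile time.
    val adults = csvDF.as[User].filter(_.age > 21)

    // Nothing runs until a sink is started; the console sink is handy
    // for experiments like this one.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()
    query.awaitTermination()
  }
}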
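The final hunk, truncated in this archive, edits the event-time window discussion. As a rough sketch of the windowed aggregation that section describes, assuming a hypothetical streaming DataFrame with timestamp and word columns (the guide's illustration is not part of this diff):

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.window

object WindowSketch {
  // One count is maintained per (window, word) pair: each row is
  // counted in every 10-minute window its event time falls into,
  // with a new window starting every 5 minutes.
  def windowedCounts(events: DataFrame): DataFrame =
    events.groupBy(
      window(events("timestamp"), "10 minutes", "5 minutes"),
      events("word")
    ).count()
}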

spark git commit: [SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programming Guide

2016-06-29 Thread tdas
Repository: spark
Updated Branches:
  refs/heads/master dedbceec1 -> 2c3d96134


[SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programming Guide

Author: Tathagata Das 

Closes #13978 from tdas/SPARK-16256-1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2c3d9613
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2c3d9613
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2c3d9613

Branch: refs/heads/master
Commit: 2c3d96134dcc0428983eea087db7e91072215aea
Parents: dedbcee
Author: Tathagata Das 
Authored: Wed Jun 29 23:38:19 2016 -0700
Committer: Tathagata Das 
Committed: Wed Jun 29 23:38:19 2016 -0700

--
 docs/structured-streaming-programming-guide.md | 44 +++--
 1 file changed, 23 insertions(+), 21 deletions(-)
--

