[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-21 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4365


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-20 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128656745
  
--- Diff: docs/dev/table/streaming.md ---
@@ -22,21 +22,166 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-**TO BE DONE:** Intro
+Flink's [Table API](tableApi.html) and [SQL support](sql.html) are unified 
APIs for batch and stream processing. This means that Table API and SQL queries 
have the same semantics regardless of whether their input is bounded batch input 
or unbounded stream input. Because the relational algebra and SQL were 
originally designed for batch processing, relational queries on unbounded 
streaming input are not as well understood as relational queries on bounded 
batch input. 
+
+On this page, we explain concepts, practical limitations, and 
stream-specific configuration parameters of Flink's relational APIs on 
streaming data. 
 
 * This will be replaced by the TOC
 {:toc}
 
-Dynamic Table
--
+Relational Queries on Data Streams
+--
+
+SQL and relational algebra were not designed with streaming data 
in mind. As a consequence, there are a few conceptual gaps between relational 
algebra (and SQL) and stream processing.
+
+<table>
+   <tr>
+       <th>Relational Algebra / SQL</th>
+       <th>Stream Processing</th>
+   </tr>
+   <tr>
+       <td>Relations (or tables) are bounded (multi-)sets of tuples.</td>
+       <td>A stream is an infinite sequence of tuples.</td>
+   </tr>
+   <tr>
+       <td>A query that is executed on batch data (e.g., a table in a relational database) has access to the complete input data.</td>
+       <td>A streaming query cannot access all data when it is started and has to "wait" for data to be streamed in.</td>
+   </tr>
+   <tr>
+       <td>A batch query terminates after it has produced a fixed-size result.</td>
+       <td>A streaming query continuously updates its result based on the received records and never completes.</td>
+   </tr>
+</table>
+
+Despite these differences, processing streams with relational queries and 
SQL is not impossible. Advanced relational database systems offer a feature 
called *Materialized Views*. A materialized view is defined as a SQL query, 
just like a regular virtual view. In contrast to a virtual view, a materialized 
view caches the result of the query such that the query does not need to be 
evaluated when the view is accessed. A common challenge for caching is to 
prevent a cache from serving outdated results. A materialized view becomes 
outdated when the base tables of its definition query are modified. *Eager View 
Maintenance* is a technique that updates a materialized view as soon as its 
base tables are updated. 
+
+The connection between eager view maintenance and SQL queries on streams 
becomes obvious if we consider the following:
+
+- A database table is the result of a *stream* of `INSERT`, `UPDATE`, and 
`DELETE` DML statements, often called a *changelog stream*.
+- A materialized view is defined as a SQL query. In order to update the 
view, the query continuously processes the changelog streams of the view's 
base relations.
+- The materialized view is the result of the streaming SQL query.
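
The three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Flink or any database implements view maintenance: the changelog entries, the function name `apply_change`, and the names `base_table` and `count_view` are hypothetical. The view stands in for a query like `SELECT user, COUNT(*) ... GROUP BY user` and is updated eagerly with every change to the base table.

```python
from collections import Counter

def apply_change(base, view, change):
    """Apply one changelog entry to the base table and eagerly update the view.

    base:   dict mapping primary key -> user name (the base table)
    view:   Counter mapping user name -> row count (the materialized view)
    change: tuple (op, key, user); user is None for DELETE
    """
    op, key, user = change
    if op in ("UPDATE", "DELETE"):
        # Retract the contribution of the old row from the view.
        old_user = base.pop(key)
        view[old_user] -= 1
        if view[old_user] == 0:
            del view[old_user]
    if op in ("INSERT", "UPDATE"):
        # Add the contribution of the new row to the view.
        base[key] = user
        view[user] += 1

# Hypothetical changelog stream of DML statements on the base table.
changelog = [
    ("INSERT", 1, "Mary"),
    ("INSERT", 2, "Bob"),
    ("INSERT", 3, "Mary"),
    ("UPDATE", 2, "Liz"),
    ("DELETE", 1, None),
]

base_table, count_view = {}, Counter()
for change in changelog:
    apply_change(base_table, count_view, change)
```

After every call the view is consistent with the base table, so readers of the view never see an outdated count.
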
+
+With these points in mind, we introduce Flink's concept of *Dynamic 
Tables* in the next section.
+
+Dynamic Tables & Continuous Queries
+---
+
+*Dynamic tables* are the core concept of Flink's Table API and SQL support 
for streaming data. In contrast to the static tables that represent batch data, 
dynamic tables change over time. They can be queried like static batch 
tables. Querying a dynamic table yields a *Continuous Query*. A continuous 
query never terminates and produces a dynamic table as its result. The query 
continuously updates its (dynamic) result table to reflect the changes on its 
input (dynamic) table. Essentially, a continuous query on a dynamic table is 
very similar to the definition query of a materialized view. 
+
+It is important to note that the result of a continuous query is always 
semantically equivalent to the result of the same query being executed in batch 
mode on a snapshot of the input tables.
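
The equivalence stated above can be checked on a toy example. The sketch below is plain Python rather than Flink code; the click data and the function names `batch_count` and `continuous_count` are assumptions made for illustration. It compares an incrementally maintained count per user with the same count recomputed from scratch over each snapshot of the input.

```python
from collections import Counter

def batch_count(rows):
    """Batch query: count rows per user over a complete, bounded input."""
    return Counter(user for user, _url in rows)

def continuous_count(stream):
    """Continuous query: yield the updated result after every arriving row."""
    result = Counter()
    for user, _url in stream:
        result[user] += 1
        yield Counter(result)  # copy: the result table at this point in time

# Hypothetical input; each row is (user, url).
clicks = [("Mary", "./home"), ("Bob", "./cart"), ("Mary", "./prod?id=1")]

for n, result in enumerate(continuous_count(clicks), start=1):
    snapshot = clicks[:n]  # the input table as of the n-th row
    assert result == batch_count(snapshot)
```

The loop asserts that, at every point in time, the continuously updated result matches the batch result over the rows received so far.
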
+
+The following figure visualizes the relationship between streams, dynamic 
tables, and continuous queries: 
+
+
+
+
+
+1. A stream is converted into a dynamic table.
+1. A continuous query is evaluated on the dynamic table yielding a new 
dynamic table.
+1. The resulting dynamic table is converted back into a stream.
+
+**Note:** Dynamic tables are first and foremost a logical concept. Dynamic tables 
are not necessarily (fully) materialized during query execution.
+
+In the following, we will explain the concepts of dynamic

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-20 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128566720
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-20 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128513830
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-20 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128512155
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128406783
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128406716
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128406043
  
--- Diff: docs/dev/table/streaming.md ---
@@ -22,21 +22,166 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-**TO BE DONE:** Intro
+Flink's [Table API](tableApi.html) and [SQL support](sql.html) are unified 
APIs for batch and stream processing. This means that Table API and SQL queries 
have the same semantics regardless whether their input is bounded batch input 
or unbounded stream input. Because the relational algebra and SQL were 
originally designed for batch processing, relational queries on unbounded 
streaming input are not as well understood as relational queries on bounded 
batch input. 
+
+On this page, we explain concepts, practical limitations, and 
stream-specific configuration parameters of Flink's relational APIs on 
streaming data. 
 
 * This will be replaced by the TOC
 {:toc}
 
-Dynamic Table
--
+Relational Queries on Data Streams
+--
+
+SQL and the relational algebra have not been designed with streaming data 
in mind. As a consequence, there are few conceptual gaps between relational 
algebra (and SQL) and stream processing.
+
+
+   
+   Relational Algebra / SQL
+   Stream Processing
+   
+   
+   Relations (or tables) are bounded (multi-)sets of 
tuples.
+   A stream is an infinite sequences of tuples.
+   
+   
+   A query that is executed on batch data (e.g., a table in a 
relational database) has access to the complete input data.
+   A streaming query cannot access all data when is started 
and has to "wait" for data to be streamed in.
+   
+   
+   A batch query terminates after it produced a fixed sized 
result.
+   A streaming query continuously updates its result based on 
the received records and never completes.
+   
+
+
+Despite these differences, processing streams with relational queries and 
SQL is not impossible. Advanced relational database systems offer a feature 
called *Materialized Views*. A materialized view is defined as a SQL query, 
just like a regular virtual view. In contrast to a virtual view, a materialized 
view caches the result of the query such that the query does not need to be 
evaluated when the view is accessed. A common challenge for caching is to 
prevent a cache from serving outdated results. A materialized view becomes 
outdated when the base tables of its definition query are modified. *Eager View 
Maintenance* is a technique to update materialized views and updates a 
materialized view as soon as its base tables are updated. 
+
+The connection between eager view maintenance and SQL queries on streams 
becomes obvious if we consider the following:
+
+- A database table is the result of a *stream* of `INSERT`, `UPDATE`, and 
`DELETE` DML statements, often called *changelog stream*.
+- A materialized view is defined as a SQL query. In order to update the 
view, the query is continuously processes the changelog streams of the view's 
base relations.
+- The materialized view is the result of the streaming SQL query.
+
+With these points in mind, we introduce Flink's concept of *Dynamic 
Tables* in the next section.
+
+Dynamic Tables & Continuous Queries
+---
+
+*Dynamic tables* are the core concept of Flink's Table API and SQL support 
for streaming data. In contrast to the static tables that represent batch data, 
dynamic tables change over time. They can be queried like static batch 
tables. Querying a dynamic table yields a *Continuous Query*. A continuous 
query never terminates and produces a dynamic table as its result. The query 
continuously updates its (dynamic) result table to reflect the changes on its 
input (dynamic) table. Essentially, a continuous query on a dynamic table is 
very similar to the definition query of a materialized view. 
+
+It is important to note that the result of a continuous query is always 
semantically equivalent to the result of the same query being executed in batch 
mode on a snapshot of the input tables.
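
This equivalence can be checked with a small Python sketch. It is plain Python rather than the Table API, and the `clicks` data and both function names are hypothetical. After every record, the incrementally maintained result is compared with a batch evaluation over the snapshot of all records seen so far.

```python
from collections import Counter

def batch_query(snapshot):
    """Batch evaluation: count rows per user over a complete snapshot."""
    return dict(Counter(user for user, _url in snapshot))

def continuous_query(stream):
    """Continuous evaluation: yield the updated result after every new row."""
    counts = Counter()
    for user, _url in stream:
        counts[user] += 1
        yield dict(counts)

# Hypothetical click stream of (user, url) records.
clicks = [("Mary", "./home"), ("Bob", "./cart"), ("Mary", "./prod?id=1")]

# At every point in time, the continuous result matches the batch result
# computed on the snapshot of the input up to that point.
consistent = all(
    result == batch_query(clicks[: i + 1])
    for i, result in enumerate(continuous_query(clicks))
)
```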
+
+The following figure visualizes the relationship of streams, dynamic 
tables, and continuous queries: 
+
+
+
+
+
+1. A stream is converted into a dynamic table.
+1. A continuous query is evaluated on the dynamic table yielding a new 
dynamic table.
+1. The resulting dynamic table is converted back into a stream.
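
The three steps can be sketched as a chain of Python generators. This is an illustration only: the function names and the `(user, count)` change encoding are assumptions, not Flink's API, and the sketch fully materializes every table version for clarity, which a real execution would not necessarily do.

```python
def to_dynamic_table(stream):
    """Step 1: interpret every stream record as an insert into a growing table."""
    table = []
    for record in stream:
        table.append(record)
        yield list(table)  # version of the table after this insert

def continuous_count(table_versions):
    """Step 2: derive the per-user count for each version of the input table."""
    for version in table_versions:
        counts = {}
        for user in version:
            counts[user] = counts.get(user, 0) + 1
        yield counts

def to_change_stream(result_versions):
    """Step 3: emit only the rows that changed between result versions."""
    previous = {}
    for current in result_versions:
        for user, count in current.items():
            if previous.get(user) != count:
                yield (user, count)
        previous = current

stream = ["Mary", "Bob", "Mary"]
changes = list(to_change_stream(continuous_count(to_dynamic_table(stream))))
```

Each emitted pair replaces the previous count for that user, so the output stream carries updates rather than only new rows.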
+
+**Note:** Dynamic tables are first and foremost a logical concept. Dynamic tables 
are not necessarily (fully) materialized during query execution.
+
+In the following, we will explain the concepts of dynamic

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128405026
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128350244
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128349657
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128349315
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread fhueske

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128348285
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128245611
  
--- Diff: docs/dev/table/streaming.md ---

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128250087
  
--- Diff: docs/dev/table/streaming.md ---
@@ -22,21 +22,166 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-**TO BE DONE:** Intro
+Flink's [Table API](tableApi.html) and [SQL support](sql.html) are unified 
APIs for batch and stream processing. This means that Table API and SQL queries 
have the same semantics regardless whether their input is bounded batch input 
or unbounded stream input. Because the relational algebra and SQL were 
originally designed for batch processing, relational queries on unbounded 
streaming input are not as well understood as relational queries on bounded 
batch input. 
+
+On this page, we explain concepts, practical limitations, and 
stream-specific configuration parameters of Flink's relational APIs on 
streaming data. 
 
 * This will be replaced by the TOC
 {:toc}
 
-Dynamic Table
--
+Relational Queries on Data Streams
+--
+
+SQL and the relational algebra have not been designed with streaming data 
in mind. As a consequence, there are few conceptual gaps between relational 
algebra (and SQL) and stream processing.
+
+
+   
+   Relational Algebra / SQL
+   Stream Processing
+   
+   
+   Relations (or tables) are bounded (multi-)sets of 
tuples.
+   A stream is an infinite sequences of tuples.
+   
+   
+   A query that is executed on batch data (e.g., a table in a 
relational database) has access to the complete input data.
+   A streaming query cannot access all data when is started 
and has to "wait" for data to be streamed in.
+   
+   
+   A batch query terminates after it produced a fixed sized 
result.
+   A streaming query continuously updates its result based on 
the received records and never completes.
+   
+
+
+Despite these differences, processing streams with relational queries and 
SQL is not impossible. Advanced relational database systems offer a feature 
called *Materialized Views*. A materialized view is defined as a SQL query, 
just like a regular virtual view. In contrast to a virtual view, a materialized 
view caches the result of the query such that the query does not need to be 
evaluated when the view is accessed. A common challenge for caching is to 
prevent a cache from serving outdated results. A materialized view becomes 
outdated when the base tables of its definition query are modified. *Eager View 
Maintenance* is a technique that updates a materialized view as soon as its 
base tables are updated. 
+
+The connection between eager view maintenance and SQL queries on streams 
becomes obvious if we consider the following:
+
+- A database table is the result of a *stream* of `INSERT`, `UPDATE`, and 
`DELETE` DML statements, often called a *changelog stream*.
+- A materialized view is defined as a SQL query. In order to update the 
view, the query continuously processes the changelog streams of the view's 
base relations.
+- The materialized view is the result of the streaming SQL query.
+
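The three points above can be sketched in a few lines of Python. This is only an illustration of eager view maintenance, not Flink code: the base table `clicks`, its columns `user` and `url`, the tuple layout of a changelog entry, and the helper `apply_change` are all assumptions made for this example. The view corresponds to `SELECT user, COUNT(*) FROM clicks GROUP BY user`.

```python
def apply_change(view, change):
    """Apply one changelog entry of the (hypothetical) base table `clicks`
    to the view `SELECT user, COUNT(*) FROM clicks GROUP BY user`."""
    op, old_row, new_row = change
    if op in ("UPDATE", "DELETE"):
        # The old row no longer contributes to its group.
        user = old_row["user"]
        view[user] -= 1
        if view[user] == 0:
            del view[user]
    if op in ("INSERT", "UPDATE"):
        # The new row contributes to its group.
        user = new_row["user"]
        view[user] = view.get(user, 0) + 1
    return view


# Changelog stream of the base table: (operation, old row, new row).
changelog = [
    ("INSERT", None, {"user": "Mary", "url": "./home"}),
    ("INSERT", None, {"user": "Bob", "url": "./cart"}),
    ("INSERT", None, {"user": "Mary", "url": "./prod?id=1"}),
    ("UPDATE", {"user": "Bob", "url": "./cart"}, {"user": "Liz", "url": "./cart"}),
    ("DELETE", {"user": "Mary", "url": "./home"}, None),
]

view = {}
for change in changelog:
    apply_change(view, change)  # the view is up to date after every change

print(view)  # {'Mary': 1, 'Liz': 1}
```

The view is never recomputed from scratch; each change to the base table touches only the affected groups.
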
+With these points in mind, we introduce Flink's concept of *Dynamic 
Tables* in the next section.
+
+Dynamic Tables & Continuous Queries
+---
+
+*Dynamic tables* are the core concept of Flink's Table API and SQL support 
for streaming data. In contrast to the static tables that represent batch data, 
dynamic tables change over time. They can be queried like static batch 
tables. Querying a dynamic table yields a *Continuous Query*. A continuous 
query never terminates and produces a dynamic table as its result. The query 
continuously updates its (dynamic) result table to reflect the changes on its 
input (dynamic) table. Essentially, a continuous query on a dynamic table is 
very similar to the definition query of a materialized view. 
+
+It is important to note that the result of a continuous query is always 
semantically equivalent to the result of the same query being executed in batch 
mode on a snapshot of the input tables.
+
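The equivalence stated above can be checked on a toy example. The following Python sketch is not Flink code; the sample data `clicks` and the functions `batch_query`, `continuous_query`, and `check_equivalence` are hypothetical names chosen for this illustration. Both functions compute `SELECT user, COUNT(*) ... GROUP BY user`, one over a finished snapshot and one incrementally.

```python
from collections import Counter


def batch_query(rows):
    """Batch mode: evaluate the query once over a complete snapshot."""
    return dict(Counter(user for user, _url in rows))


def continuous_query(stream):
    """Streaming mode: update the result for each arriving row and
    yield a copy of the result table after every update."""
    result = {}
    for user, _url in stream:
        result[user] = result.get(user, 0) + 1
        yield dict(result)


def check_equivalence(rows):
    """After n rows, the continuous result must equal the batch result
    computed on the snapshot that contains exactly those n rows."""
    return all(
        incremental == batch_query(rows[:n])
        for n, incremental in enumerate(continuous_query(rows), start=1)
    )


clicks = [
    ("Mary", "./home"),
    ("Bob", "./cart"),
    ("Mary", "./prod?id=1"),
    ("Liz", "./home"),
]

print(check_equivalence(clicks))  # True
```
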
+The following figure visualizes the relationship between streams, dynamic 
tables, and continuous queries: 
+
+
+
+
+
+1. A stream is converted into a dynamic table.
+1. A continuous query is evaluated on the dynamic table yielding a new 
dynamic table.
+1. The resulting dynamic table is converted back into a stream.
+
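A minimal Python sketch of the three steps above, under stated assumptions: it is not Flink code, the names `as_inserts` and `continuous_count` are hypothetical, and encoding result changes as `("add", row)` / `("retract", row)` messages is one possible choice for turning a changing table back into a stream.

```python
def as_inserts(stream):
    """Step 1: interpret every stream record as an INSERT into an
    append-only dynamic table."""
    for record in stream:
        yield record


def continuous_count(inserts):
    """Steps 2 and 3: evaluate `SELECT user, COUNT(url) ... GROUP BY user`
    continuously and emit every change of the result table as a message."""
    counts = {}
    for user, _url in inserts:
        previous = counts.get(user)
        if previous is not None:
            # The old result row is replaced, so it is retracted first.
            yield ("retract", (user, previous))
        counts[user] = (previous or 0) + 1
        yield ("add", (user, counts[user]))


clicks = [("Mary", "./home"), ("Bob", "./cart"), ("Mary", "./prod?id=1")]

output = list(continuous_count(as_inserts(clicks)))
for message in output:
    print(message)
# ('add', ('Mary', 1))
# ('add', ('Bob', 1))
# ('retract', ('Mary', 1))
# ('add', ('Mary', 2))
```

Because both functions are generators, no table is stored in full beyond the per-user counts, which mirrors the note below that dynamic tables are a logical concept.
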
+**Note:** Dynamic tables are first and foremost a logical concept. They 
are not necessarily (fully) materialized during query execution.
+
+In the following, we will explain the concepts of dynamic

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128248216
  

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread sunjincheng121

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4365#discussion_r128242397
  

[GitHub] flink pull request #4365: [FLINK-6747] [docs] Add documentation for dynamic ...

2017-07-19 Thread fhueske

GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/4365

[FLINK-6747] [docs] Add documentation for dynamic tables.

This PR adds documentation about dynamic tables to the Table API / SQL docs.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink tableStreamDocs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4365


commit 56c607b164a488493d69e5cb9eb9fab4b54175cf
Author: Fabian Hueske 
Date:   2017-07-17T17:11:04Z

[FLINK-6747] [docs] Add documentation for dynamic tables.




