[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2017-02-15 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869418#comment-15869418
 ] 

sunjincheng commented on FLINK-4557:


[~fhueske] Thanks for converter the "FLINK-5803" & "FLINK-5804" to FLINK-4557's 
sub-task.

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2017-02-15 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869407#comment-15869407
 ] 

Fabian Hueske commented on FLINK-4557:
--

Hi [~sunjincheng121], I think your last two comment should go into FLINK-5803 
and FLINK-5804, right?
Can you move them and delete them here? Thanks, Fabian

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2017-02-15 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869275#comment-15869275
 ] 

sunjincheng commented on FLINK-4557:


Hi,guys,I made a preliminary implementation of this JIRA.
My approach is:
1. Calcite -> Flink
"LogicalProject with RexOver expression" – (normalize rule) -> "Calcite's 
LogicalWindow" – (opt rule) -> DataStreamRowWindowAggregate
2. datastreamAPI:
a. With partitionBy situation: 
approach1: inputDS.map().keyBy().reduce().map() //we prefer this approach, 
because we can use the current reduce state.
approach2: inputDS.map().keyBy().process()
b. Without partitionBy situation: 
inputDS.map().setParallelism(1), map has implement CheckPointedFunction.
3. About OrderBy:
According to the natural order of elements, procTime () use for generate 
end-time of the window and guaranteed pass the sql validation.
HI,Fabian Hueske IMO. Since all the above design has not been implemented yet 
in flink master, if I put all of my design into one PR, it will be very huge. I 
would like to split the design into the following subtasks:
1. first submit one JIRA. with "#1Calcite -> FlINK part #2dataStreamAPI a. 
rowWindow with partitionBy"
2. then submit #2dataStreamAPI b.rowWindow without partitionBy
Does this make sense to you? It would be very appreciated if you could give 
some advice.

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2017-02-15 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869276#comment-15869276
 ] 

sunjincheng commented on FLINK-4557:


Hi,guys,I made a preliminary implementation of this JIRA.
My approach is:
1. Calcite -> Flink
"LogicalProject with RexOver expression" – (normalize rule) -> "Calcite's 
LogicalWindow" – (opt rule) -> DataStreamRowWindowAggregate
2. datastreamAPI:
a. With partitionBy situation: 
approach1: inputDS.map().keyBy().reduce().map() //we prefer this approach, 
because we can use the current reduce state.
approach2: inputDS.map().keyBy().process()
b. Without partitionBy situation: 
inputDS.map().setParallelism(1), map has implement CheckPointedFunction.
3. About OrderBy:
According to the natural order of elements, procTime () use for generate 
end-time of the window and guaranteed pass the sql validation.
HI,Fabian Hueske IMO. Since all the above design has not been implemented yet 
in flink master, if I put all of my design into one PR, it will be very huge. I 
would like to split the design into the following subtasks:
1. first submit one JIRA. with "#1Calcite -> FlINK part #2dataStreamAPI a. 
rowWindow with partitionBy"
2. then submit #2dataStreamAPI b.rowWindow without partitionBy
Does this make sense to you? It would be very appreciated if you could give 
some advice.

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2016-09-20 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506053#comment-15506053
 ] 

Timo Walther commented on FLINK-4557:
-

[~shijinkui] I will create issues for the subtasks once the FLIP-11 is not in 
"discuss" state anymore.

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4557) Table API Stream Aggregations

2016-09-17 Thread shijinkui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500099#comment-15500099
 ] 

shijinkui commented on FLINK-4557:
--

Does this plan started? 
What's the progress now?

thanks

> Table API Stream Aggregations
> -
>
> Key: FLINK-4557
> URL: https://issues.apache.org/jira/browse/FLINK-4557
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>
> The Table API is a declarative API to define queries on static and streaming 
> tables. So far, only projection, selection, and union are supported 
> operations on streaming tables.
> This issue and the corresponding FLIP proposes to add support for different 
> types of aggregations on top of streaming tables. In particular, we seek to 
> support:
> *Group-window aggregates*, i.e., aggregates which are computed for a group of 
> elements. A (time or row-count) window is required to bound the infinite 
> input stream into a finite group.
> *Row-window aggregates*, i.e., aggregates which are computed for each row, 
> based on a window (range) of preceding and succeeding rows.
> Each type of aggregate shall be supported on keyed/grouped or 
> non-keyed/grouped data streams for streaming tables as well as batch tables.
> Since time-windowed aggregates will be the first operation that require the 
> definition of time, we also need to discuss how the Table API handles time 
> characteristics, timestamps, and watermarks.
> The FLIP can be found here: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)