[ https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022551#comment-17022551 ]

Rui Wang edited comment on CALCITE-3737 at 1/23/20 10:15 PM:
-------------------------------------------------------------

Addressed your comments and have two responses to two of the comments:

> Can HOP and TUMBLE share implementation?

I tried to share most of the code and implement only the windowing part (computing window_start and window_end) separately. I later gave that up because hopping needs to call one function that returns a list of the hopping windows' window_start and window_end values, and we won't know the size of that list in advance, so we cannot really write a for loop in Java. (Note that I need to build a list of linq4j expressions; you can check the discussion here: [link|https://lists.apache.org/thread.html/86e5aa132de0656419843cab6c1f4fbea5941d4401dbde36cc11827e%40%3Cdev.calcite.apache.org%3E].)

Also, considering that I will later add per-key sessionization and bucket_gap_filling table functions, those will need even more complicated code, which is also less sharable. For example, per-key sessionization needs to see all the data first and then sort it to find the window start and window end. Thus I would prefer to implement those the same way hopping is implemented (e.g. by providing an AbstractEnumerable<Object[]> implementation).

As I build more table functions and add support for streaming SQL, if I want a better way to unify the table function implementations, I will add patches for that.

> Changes to reference.md need some copy-editing.

I checked the changes in reference.md and made some edits. However, I am not a native English speaker, so I might not have fixed exactly what you had in mind.


was (Author: amaliujia):
Addressed your comments and have two responses to two of the comments:

> Can HOP and TUMBLE share implementation?

I tried to share most of the code and implement only the windowing part (computing window_start and window_end) separately.
I later gave that up because hopping needs to call one function that returns a list of the hopping windows' window_start and window_end values, and we won't know the size of that list in advance, so we cannot really write a for loop in Java. (Note that I need to build a list of linq4j expressions; you can check the discussion here: [link|https://lists.apache.org/thread.html/86e5aa132de0656419843cab6c1f4fbea5941d4401dbde36cc11827e%40%3Cdev.calcite.apache.org%3E].)

Also, considering that I will later add per-key sessionization and bucket_gap_filling table functions, those will need even more complicated code, so I would prefer to implement them the same way hopping is implemented (e.g. by providing an AbstractEnumerable<Object[]> implementation).

> Changes to reference.md need some copy-editing.

I checked the changes in reference.md and made some edits. However, I am not a native English speaker, so I might not have fixed exactly what you had in mind.


> HOP Table-valued Function
> -------------------------
>
>                 Key: CALCITE-3737
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3737
>             Project: Calcite
>          Issue Type: Sub-task
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Hopping windows place intervals of a fixed size evenly spaced across event
> time. Most importantly, in the most common use a given event time timestamp
> will generally fall into more than one window.
> The table-valued function Hop may produce zero, one, or multiple rows
> corresponding to each row of input. Hop takes four required parameters and
> one optional parameter. All parameters are analogous to those for Tumble
> except for hopsize, which specifies the duration between the starting points
> (and endpoints) of the hopping windows, allowing for overlapping windows
> (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
> {code:java}
> Hop (data, timecol, dur, hopsize)
> {code}
> The return value of Hop is a relation that includes all columns of data as
> well as additional event time columns wstart and wend. Here is an example
> (from https://s.apache.org/streaming-beam-sql ):
> {code:sql}
> SELECT *
> FROM Hop (
>   data => TABLE Bids,
>   timecol => DESCRIPTOR (bidtime),
>   dur => INTERVAL '10' MINUTES,
>   hopsize => INTERVAL '5' MINUTES);
> ------------------------------------------
> | wstart | wend | bidtime | price | item |
> ------------------------------------------
> | 8:00   | 8:10 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:11    | $3    | B    |
> | 8:10   | 8:20 | 8:11    | $3    | B    |
> | 8:00   | 8:10 | 8:05    | $4    | C    |
> | 8:05   | 8:15 | 8:05    | $4    | C    |
> | 8:00   | 8:10 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:17    | $6    | F    |
> | 8:15   | 8:25 | 8:17    | $6    | F    |
> ------------------------------------------
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
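
For readers following the HOP discussion, here is a minimal sketch of how one timestamp can be assigned to its hopping windows. It is not the code in the pull request and does not use linq4j. The class name HopWindows and the method assign are hypothetical, and the sketch assumes that windows are half-open intervals [wstart, wend) aligned to time zero and that all values share one unit (minutes here). Under those assumptions it reproduces the wstart/wend pairs in the Bids example from the issue description.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; a hypothetical helper, not Calcite's implementation. */
final class HopWindows {
  private HopWindows() {}

  /**
   * Returns every window [wstart, wend) that contains ts, in ascending order.
   * Assumes windows are aligned to 0 and all arguments use the same time unit.
   * The result is empty when ts falls into a gap (possible if hopsize > dur).
   */
  static List<long[]> assign(long ts, long dur, long hopsize) {
    if (dur <= 0 || hopsize <= 0) {
      throw new IllegalArgumentException("dur and hopsize must be positive");
    }
    List<long[]> windows = new ArrayList<>();
    // Latest window start that is <= ts; floorMod also handles negative ts.
    long lastStart = ts - Math.floorMod(ts, hopsize);
    // Walk backwards one hop at a time while the window still covers ts.
    for (long start = lastStart; start > ts - dur; start -= hopsize) {
      windows.add(0, new long[] {start, start + dur});
    }
    return windows;
  }

  public static void main(String[] args) {
    long bidtime = 8 * 60 + 7; // 8:07 expressed in minutes
    for (long[] w : assign(bidtime, 10, 5)) {
      System.out.printf("%d:%02d - %d:%02d%n",
          w[0] / 60, w[0] % 60, w[1] / 60, w[1] % 60);
    }
  }
}
```

With dur = 10 and hopsize = 5, the bid at 8:07 lands in the windows 8:00 - 8:10 and 8:05 - 8:15, matching the first two rows of the table. The number of windows per row depends on the input (and can be zero when hopsize > dur), which is the reason given in the comment for returning a list rather than a fixed set of expressions.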