I am also asking for TVF windowing and EMIT syntax support on dev@calcite. See [1].

[1]: https://lists.apache.org/thread.html/71724f8a9079be11c04c70c64097491822323f560a79a7fa1321711d@%3Cdev.calcite.apache.org%3E

-Rui

On Mon, Aug 19, 2019 at 4:40 PM Rui Wang wrote:

> Hi Mingmin,
>
> Thanks for adding "INSERT INTO" (which I missed from the example).
>
> I am not sure if I understand the question:
>
> 1. Multiple GBKs with retraction are solved by [1].
> 2. In terms of SQL and its view, the output is defined by the last GBK.
>
> [1]:
> https://docs.google.com/document/d/14WRfxwk_iLUHGPty3C6ZenddPsp_d6jhmx0vuafXqmE/edit?usp=sharing
>
>
> -Rui
>
> On Mon, Aug 19, 2019 at 4:02 PM Mingmin Xu wrote:
>
>> +1 to support EMIT on the Beam side first if we cannot include it in
>> Calcite in a short time (see #1, #2). I'm open to using any format, the
>> one above or something as below. The tricky question is, what's the
>> expected behavior for a complex query with more than one GBK operator?
>>
>> EMIT <INTERVAL '1' MINUTE> | <INTERVAL '100' ROW> [ACCUMULATE|DISCARD]
>> [INSERT INTO ...]
>> SELECT ...
>>
>> #1.
>> https://sematext.com/opensee/m/Calcite/FR3K9JVAl32VULr6?subj=Towards+a+spec+for+robust+streaming+SQL+Part+1
>> #2
>> https://sematext.com/opensee/m/Beam/gfKHFFDd4i1I3nZc2?subj=Towards+a+spec+for+robust+streaming+SQL+Part+2
>>
>> On Mon, Aug 19, 2019 at 12:02 PM Rui Wang wrote:
>>
>>> To update this idea, I think we can go a step further and support the
>>> EMIT syntax from the one-sql-to-rule-them-all paper [1].
>>>
>>> EMIT will allow periodic, delayed stream materialization. For stream
>>> view, it means we will add support to sinks to keep generating a
>>> changelog table. For view only, it means we will add support to sinks
>>> to generate a compacted table from the changelog table periodically.
>>>
>>> Regarding SQL, a typical query like the following should run:
>>>
>>>
>>> WITH joined_table AS (SELECT * FROM S1 JOIN S2)
>>> SELECT XX FROM HOP(joined_table)
>>> EMIT [STREAM] AFTER DELAY INTERVAL '1' HOUR
>>>
>>>
>>> By doing so, retractions will be much more useful for SQL from a
>>> product standpoint, because we can have a meaningful end-to-end SQL
>>> pipeline.
>>>
>>> [1]: https://arxiv.org/pdf/1905.12133.pdf
>>>
>>> -Rui
>>>
>>> On Mon, Aug 12, 2019 at 11:30 PM Rui Wang wrote:
>>>
>>>> Hi Community,
>>>>
>>>> BeamSQL currently does not support unbounded-unbounded join with a
>>>> non-default trigger. This is because:
>>>>
>>>> - Discarding mode does not work for outer joins because it lacks the
>>>> ability to retract pre-emitted values. As an example, a tuple of
>>>> (left_row, null) needs to be retracted if the matching right_row
>>>> appears after the last trigger fired.
>>>> - Accumulating mode *theoretically* can support unbounded-unbounded
>>>> join because it's supposed to always "overwrite" the previous result.
>>>> However, in practice, such overwriting is too expensive for join use
>>>> cases. It would be much more efficient if small changes in the inputs
>>>> of a join caused only small changes for downstream to compute.
>>>> - Neither discarding mode nor accumulating mode is sufficient to
>>>> refine materialized data.
>>>>
>>>> Meanwhile, [1] has kicked off a discussion on retractions in the Beam
>>>> model. I have been collecting people's feedback, and generally
>>>> speaking people agree that retractions are useful for some use cases.
>>>>
>>>> Thus I propose to combine SQL join with retractions to
>>>> support multiple-triggering SQL join.
>>>>
>>>> I think SQL join is a good start for supporting retraction in Beam,
>>>> with the following considerations:
>>>> 1. Multiple-triggering SQL join is a useful feature.
>>>> 2. SQL join is an opportunity for us to figure out the implementation
>>>> details of retraction by building it for a well-defined use case.
>>>> 3. Supporting retraction should not cause performance regressions on
>>>> existing pipelines, or require changes to existing pipelines.
>>>>
>>>>
>>>> What do you think?
>>>>
>>>> [1]:
>>>> https://lists.apache.org/thread.html/bb2d40b1bea8b21fbbb7caf599fabba823da357768ceca8ea2363789@%3Cdev.beam.apache.org%3E
>>>>
>>>>
>>>> -Rui
>>>>
>>>
>>
>> --
>> ----
>> Mingmin
>>
>
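
The outer-join case raised in the original proposal (a (left_row, null) result that has to be withdrawn once the matching right_row shows up) can be sketched in plain Python. This is only an illustration of the idea, not Beam or BeamSQL code: the function names, the (side, key, value) event layout, and the "+"/"-" changelog markers are all assumptions made for this example.

```python
from collections import defaultdict


def left_outer_join_changelog(events):
    """Process (side, key, value) events in arrival order.

    Hypothetical helper: returns a changelog of (op, (left, right)) entries,
    where "+" emits a joined row and "-" retracts a previously emitted one.
    """
    left = defaultdict(list)   # key -> left values seen so far
    right = defaultdict(list)  # key -> right values seen so far
    changelog = []

    for side, key, value in events:
        if side == "L":
            left[key].append(value)
            if right[key]:
                # Matches already exist: emit one joined row per match.
                for r in right[key]:
                    changelog.append(("+", (value, r)))
            else:
                # No match yet: emit the null-padded outer-join row.
                changelog.append(("+", (value, None)))
        elif side == "R":
            if not right[key]:
                # First match for this key: retract the null-padded rows
                # that were emitted before the right side arrived.
                for l in left[key]:
                    changelog.append(("-", (l, None)))
            right[key].append(value)
            for l in left[key]:
                changelog.append(("+", (l, value)))
        else:
            raise ValueError(f"unknown side: {side!r}")

    return changelog


def materialize(changelog):
    """Apply a changelog to an empty table and return the resulting rows."""
    rows = []
    for op, row in changelog:
        if op == "+":
            rows.append(row)
        else:
            rows.remove(row)
    return rows


# The left row arrives first, the matching right row arrives later.
events = [("L", 1, "a"), ("R", 1, "x")]
changelog = left_outer_join_changelog(events)
# changelog == [("+", ("a", None)), ("-", ("a", None)), ("+", ("a", "x"))]
table = materialize(changelog)
# table == [("a", "x")]
```

Without the "-" entry, a downstream consumer would keep both ("a", None) and ("a", "x"), which is the problem described for discarding mode. The changelog also grows only by the rows that actually changed, in contrast to re-emitting the whole join result as accumulating mode would.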
