I am also asking about TVF windowing and EMIT syntax support on dev@calcite.
See [1].



[1]:
https://lists.apache.org/thread.html/71724f8a9079be11c04c70c64097491822323f560a79a7fa1321711d@%3Cdev.calcite.apache.org%3E

-Rui

On Mon, Aug 19, 2019 at 4:40 PM Rui Wang <ruw...@google.com> wrote:

> Hi Mingmin,
>
> Thanks for adding "INSERT INTO" (which I left out of the example).
>
> I am not sure if I understand the question:
>
> 1. Multiple GBKs with retraction are handled by [1].
> 2. In terms of SQL and its view, the output is defined by the last GBK.
>
> [1]:
> https://docs.google.com/document/d/14WRfxwk_iLUHGPty3C6ZenddPsp_d6jhmx0vuafXqmE/edit?usp=sharing
>
>
> -Rui
>
> On Mon, Aug 19, 2019 at 4:02 PM Mingmin Xu <mingm...@gmail.com> wrote:
>
>> +1 to supporting EMIT on the Beam side first if we cannot get it into
>> Calcite in a short time (see #1, #2). I'm open to any format, the one above
>> or something like the one below. The tricky question is: what is the
>> expected behavior for a complex query with more than one GBK operator?
>>
>> EMIT  <INTERVAL '1' MINUTE> | <INTERVAL '100' ROW> [ACCUMULATE|DISCARD]
>> [INSERT INTO ...]
>> SELECT ...
>>
>> #1.
>> https://sematext.com/opensee/m/Calcite/FR3K9JVAl32VULr6?subj=Towards+a+spec+for+robust+streaming+SQL+Part+1
>> #2.
>> https://sematext.com/opensee/m/Beam/gfKHFFDd4i1I3nZc2?subj=Towards+a+spec+for+robust+streaming+SQL+Part+2
>>
>> On Mon, Aug 19, 2019 at 12:02 PM Rui Wang <ruw...@google.com> wrote:
>>
>>> To update this idea, I think we can go a step further and support the EMIT
>>> syntax from the one-sql-to-rule-them-all paper [1].
>>>
>>> EMIT will allow periodic, delayed stream materialization. For a stream
>>> view, it means we will add support for sinks to keep generating a
>>> changelog table. For a view only, it means we will add support for sinks
>>> to periodically generate a compacted table from the changelog table.
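To illustrate the "compacted table from the changelog table" idea, here is a minimal Python sketch. The changelog encoding (a "+" entry for an insert, a "-" entry for a retraction) and the name `compact` are assumptions made for this example only; they are not Beam or Calcite APIs.

```python
from collections import Counter


def compact(changelog):
    """Fold a changelog into the rows that are currently present.

    changelog: iterable of (op, row), where op is "+" (insert) or
    "-" (retraction). This encoding is hypothetical, for illustration.
    """
    counts = Counter()
    for op, row in changelog:
        if op == "+":
            counts[row] += 1
        elif op == "-":
            counts[row] -= 1
        else:
            raise ValueError(f"unknown op: {op!r}")
    # A row survives once for every insert that was not retracted.
    return sorted(row for row, n in counts.items() for _ in range(n))


# Example: the value for k1 is refined from 1 to 3 via a retraction.
example_changelog = [
    ("+", ("k1", 1)),
    ("-", ("k1", 1)),
    ("+", ("k1", 3)),
    ("+", ("k2", 5)),
]
compacted = compact(example_changelog)
```

In this sketch a sink for the stream view would write `example_changelog` entries as they arrive, while a sink for the view only would periodically write the result of `compact`.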
>>>
>>> Regarding SQL, a typical query like the following should run:
>>>
>>>
>>> *WITH joined_table AS (SELECT * FROM S1 JOIN S2)*
>>> *SELECT XX FROM HOP(joined_table)*
>>> *EMIT [STREAM] AFTER DELAY INTERVAL '1' HOUR*
>>>
>>>
>>> By doing so, retractions will be much more useful for SQL in a product
>>> scenario, in which we can have a meaningful end-to-end SQL pipeline.
>>>
>>> [1]: https://arxiv.org/pdf/1905.12133.pdf
>>>
>>> -Rui
>>>
>>> On Mon, Aug 12, 2019 at 11:30 PM Rui Wang <ruw...@google.com> wrote:
>>>
>>>> Hi Community,
>>>>
>>>> BeamSQL currently does not support unbounded-unbounded joins with a
>>>> non-default trigger. This is because:
>>>>
>>>> - Discarding mode does not work for outer joins because it lacks the
>>>> ability to retract pre-emitted values. For example, a tuple of
>>>> (left_row, null) needs to be retracted if the matching right_row has
>>>> appeared since the last trigger fired.
>>>> - Accumulating mode *theoretically* can support unbounded-unbounded
>>>> joins because it is supposed to always "overwrite" the previous result.
>>>> In practice, however, such overwriting is too expensive for join use
>>>> cases. It would be much more efficient if small changes in the join's
>>>> inputs caused only small changes for downstream to compute.
>>>> - Neither discarding mode nor accumulating mode is sufficient to
>>>> refine materialized data.
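The outer-join case in the first bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event tuples, the "+"/"-" changelog encoding, and the function name are hypothetical, and triggers, windows, and state cleanup are ignored.

```python
def left_outer_join_changelog(events):
    """Emit a changelog for a streaming LEFT OUTER JOIN on a key.

    events: iterable of (side, key, value), side is "L" or "R".
    Returns a list of (op, (left_value, right_value)) where op is
    "+" for an insert and "-" for a retraction (hypothetical encoding).
    """
    left, right = {}, {}  # key -> list of values seen so far
    out = []
    for side, key, value in events:
        if side == "L":
            left.setdefault(key, []).append(value)
            matches = right.get(key, [])
            if matches:
                out.extend(("+", (value, r)) for r in matches)
            else:
                # No match yet: pre-emit the null-padded row.
                out.append(("+", (value, None)))
        elif side == "R":
            earlier = right.setdefault(key, [])
            first_match = not earlier
            earlier.append(value)
            for l in left.get(key, []):
                if first_match:
                    # The null-padded row emitted earlier is now wrong.
                    out.append(("-", (l, None)))
                out.append(("+", (l, value)))
        else:
            raise ValueError(f"unknown side: {side!r}")
    return out


# The left row arrives first, so (a, None) is emitted and later retracted.
late_right = left_outer_join_changelog([("L", "k", "a"), ("R", "k", "x")])
```

Without the "-" entry, a downstream consumer in discarding mode would keep both (a, None) and (a, x), which is the incorrect result the bullet describes.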
>>>>
>>>> Meanwhile, [1] has kicked off a discussion on retractions in the Beam
>>>> model. I have been collecting people's feedback and, generally speaking,
>>>> people agree that retractions are useful for some use cases.
>>>>
>>>> Thus I propose combining SQL join with retractions to
>>>> support multiple-triggering SQL join.
>>>>
>>>> I think SQL join is a good starting point for supporting retraction in
>>>> Beam, with the following considerations:
>>>> 1. Multiple-triggering SQL join is a useful feature.
>>>> 2. SQL join is an opportunity for us to figure out the implementation
>>>> details of retraction by building it for a well-defined use case.
>>>> 3. Supporting retraction should not cause a performance regression in
>>>> existing pipelines, or require changes to existing pipelines.
>>>>
>>>>
>>>> What do you think?
>>>>
>>>> [1]:
>>>> https://lists.apache.org/thread.html/bb2d40b1bea8b21fbbb7caf599fabba823da357768ceca8ea2363789@%3Cdev.beam.apache.org%3E
>>>>
>>>>
>>>> -Rui
>>>>
>>>
>>
>> --
>> ----
>> Mingmin
>>
>
