> On Feb 13, 2016, at 11:38 PM, Wanglan (Lan) <[email protected]> wrote:
> 
> Great discussions!
> 
> It seems some kind of agreement has been reached. In my opinion, the window 
> definitions are the basic concepts we should clearly describe first. How is 
> progress so far? Do we need to create a JIRA or something? 

I think we have a good start in http://calcite.apache.org/docs/stream.html, and 
some extra ideas in my HOP/TUMBLE email[1]. I promised to write them up, but I 
haven’t got around to it yet, and a JIRA would help me remember.

I don’t know whether we can ever say we are “done” with a specification. Of 
course I can write down my opinion, but it would just be my opinion. :) If we 
have a conversation thread (or a JIRA case) about each requirement, and 
representatives of the streaming projects (Fabian for Flink, Milinda for Samza, 
? for Storm) chime in, maybe we can reach consensus.

After we reach consensus on a particular feature, and write it up, I’d also 
like to create a set of sample queries & responses that illustrate that 
feature. Calcite could contain a TCK that any compliant SQL engine could run. 
How do people feel about having tests as a deliverable? Would you use them in 
your project?

> Btw, happy Chinese new year ;) !

Thank you! And you too!

Julian

[1] http://mail-archives.apache.org/mod_mbox/calcite-dev/201506.mbox/%3CCAPSgeETbowxM2TRX0RFxQ_tEAPk2uM=he0arywinbtovgwb...@mail.gmail.com%3E

> 
> Lan
> 
> -----Original Message-----
> From: Fabian Hueske [mailto:[email protected]] 
> Sent: February 6, 2016 17:29
> To: [email protected]
> Subject: Re: About Stream SQL
> 
> Excellent! I missed the punctuations in the todo list.
> 
> What kind of strategies do you have in mind to handle events that arrive too 
> late? I see:
> 1. dropping of late events;
> 2. computing an updated window result for each late-arriving element (implies 
>    that the window state is stored for a certain period before it is 
>    discarded);
> 3. computing a delta to the previous window result for each late-arriving 
>    element (requires window state as well, not applicable to all aggregation 
>    types).
> 
> It would be nice if strategies to handle late-arrivers could be defined in 
> the query.
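
To make the three late-arrival strategies discussed in this thread concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how Calcite, Flink, or any other engine implements this: the function `windowed_sum`, the event tuples, and the constants `WINDOW` and `RETENTION` are all hypothetical names made up for the example. It uses a tumbling-window SUM because SUM happens to admit deltas.

```python
from collections import defaultdict

WINDOW = 10      # tumbling window length (hypothetical time units)
RETENTION = 20   # how long a closed window's state is kept (hypothetical)
STRATEGIES = ("drop", "update", "delta")


def windowed_sum(events, strategy):
    """Compute a per-window SUM over a stream with late-arriving rows.

    events   -- iterable of ("row", rowtime, value) or ("watermark", time)
    strategy -- "drop", "update" or "delta" (see the three options above)
    Returns a list of (kind, window_start, value) records.
    """
    if strategy not in STRATEGIES:
        raise ValueError("unknown strategy: %r" % (strategy,))
    sums = defaultdict(int)        # window start -> running sum
    watermark = float("-inf")      # latest rowtime bound seen so far
    out = []
    for event in events:
        if event[0] == "watermark":
            bound = event[1]
            # Emit each window whose end is newly covered by the bound.
            for start in sorted(sums):
                if watermark < start + WINDOW <= bound:
                    out.append(("result", start, sums[start]))
            # Discard state once the retention period has passed.
            for start in [s for s in sums if s + WINDOW + RETENTION <= bound]:
                del sums[start]
            watermark = max(watermark, bound)
            continue
        _, rowtime, value = event
        start = rowtime - rowtime % WINDOW
        end = start + WINDOW
        if end > watermark:
            sums[start] += value                     # on time
        elif strategy == "drop" or end + RETENTION <= watermark:
            continue                                 # strategy 1, or state is gone
        elif strategy == "update":
            sums[start] += value
            out.append(("update", start, sums[start]))   # strategy 2: new total
        else:
            sums[start] += value
            out.append(("delta", start, value))          # strategy 3: change only
    return out


events = [
    ("row", 1, 5), ("row", 7, 3), ("watermark", 10),
    ("row", 4, 2),                  # late, but state is still retained
    ("row", 12, 1), ("watermark", 20), ("watermark", 40),
    ("row", 3, 9),                  # late, and past the retention period
]
for name in STRATEGIES:
    print(name, windowed_sum(events, name))
```

With the sample stream, the row at rowtime 4 arrives after the watermark for its window: "drop" ignores it, "update" re-emits the window total, and "delta" emits only the change. The row at rowtime 3 arrives after the state was discarded, so all three strategies ignore it. For an aggregate that cannot be adjusted incrementally, the "delta" branch would not carry over as written.
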
> 
> I think the plans of the Flink community are quite well aligned with your 
> ideas for SQL on Streams.
> Should we start by updating / extending the Stream document on the Calcite 
> website to include the new window definitions (TUMBLE, HOP) and a discussion 
> of punctuations/watermarks/time bounds?
> 
> Fabian
> 
> 2016-02-06 2:35 GMT+01:00 Julian Hyde <[email protected]>:
> 
>> Let me rephrase: The *majority* of the literature, of which I cited 
>> just one example, calls them punctuation, and a couple of recent 
>> papers out of Mountain View don't change that.
>> 
>> There are some fine distinctions between punctuation, heartbeats, 
>> watermarks and rowtime bounds, mostly in terms of how they are 
>> generated and propagated, that matter little when planning the query.
>> 
>> On Fri, Feb 5, 2016 at 5:18 PM, Ted Dunning <[email protected]> wrote:
>>> On Fri, Feb 5, 2016 at 5:10 PM, Julian Hyde <[email protected]> wrote:
>>> 
>>>> Yes, watermarks, absolutely. The "to do" list has "punctuation", 
>>>> which is the same thing. (Actually, I prefer to call it "rowtime bound"
>>>> because it feels more like a dynamic constraint than a piece of 
>>>> data, but the literature[1] calls them punctuation.)
>>>> 
>>> 
>>> Some of the literature calls them punctuation, other literature [1] 
>>> calls them watermarks.
>>> 
>>> [1] http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf
>> 
