Hi Yi, As I understand rules and re-writes basically do the same thing (changing/re-writing the operator tree). But in case of rules this happens during planning based on the query planner configuration. And re-writing is done on the planner output, after the query goes through the planner. In Calcite re-write is happening inside the interpreter and in our case it will be inside the query plan to operator router conversion phase.
Thanks Milinda On Mon, Mar 2, 2015 at 2:31 PM, Yi Pan <nickpa...@gmail.com> wrote: > Hi, Milinda, > > +1 on your default window idea. One question: what's the difference between > a rule and a re-write? > > Thanks! > > On Mon, Mar 2, 2015 at 7:14 AM, Milinda Pathirage <mpath...@umail.iu.edu> > wrote: > > > @Chris > > Yes, I was referring to that mail. Actually I was wrong about the ‘Now’ > > window, it should be a ‘Unbounded’ window for most the default scenarios > > (Section 6.4 of https://cs.uwaterloo.ca/~david/cs848/stream-cql.pdf). > > Because > > applying a ‘Now’ window with size of 1 will double the number of events > > generated if we consider insert/delete streams. But ‘Unbounded’ will only > > generate insert events. > > > > @Yi > > 1. You are correct about Calcite.There is no stream-to-relation > conversion > > happening. But as I understand we don’t need Calcite to support this. We > > can add it to our query planner as a rule or re-write. What I am not sure > > is whether to use a rule or a re-write. > > 2. There is a rule in Calcite which extract the Window out from the > > Project. But I am not sure why that didn’t happen in my test. This rule > is > > added to the planner by default. I’ll ask about this in Calcite mailing > > list. > > > > I think we can figure out a way to move the window to the input stream if > > Calcite can move the window out from Project. I’ll see how we can do > this. > > > > Also I’ll go ahead and implement default windows. We can change it later > if > > Julian or someone from Calcite comes up with a better suggestion. > > > > Thanks > > Milinda > > > > On Sun, Mar 1, 2015 at 8:23 PM, Yi Pan <nickpa...@gmail.com> wrote: > > > > > Hi, Milinda, > > > > > > Sorry to reply late on this. Here are some of my comments: > > > 1) In Calcite's model, it seems that there is no stream-to-relation > > > conversion step. In the first example where the window specification is > > > missing, I like your solution to add the default LogicalNowWindow > > operator > > > s.t. it makes the physical operator matches the query plan. However, if > > > Calcite community does not agree to add the default LogicalNowWindow, > it > > > would be fine for us if we always insert a default "now" window on a > > stream > > > when we generate the Samza configuration. > > > 2) I am more concerned on the other cases, where window operator is > used > > in > > > aggregation and join. In your example of windowed aggregation in > Calcite, > > > window spec seems to be a decoration to the LogicalProject operator, > > > instead of defining a data source to the LogicalProject operator. In > the > > > CQL model we followed, the window operator is considered as a query > > > primitive that generate a data source for other relation operators to > > > consume. How exactly is window operator used in Calcite planner? Isn't > it > > > much clear if the following is used? > > > LogicalProject(EXPR$0=[CASE(>(COUNT($2), 0), CAST($SUM0($2)):INTEGER, > > > null)]) > > > LogicalWindow(ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) > > > > > > On Thu, Feb 26, 2015 at 12:18 PM, Milinda Pathirage < > > mpath...@umail.iu.edu > > > > > > > wrote: > > > > > > > Hi devs, > > > > > > > > I ask about $subject in calcite-dev. You can find the archived > > discussion > > > > at [1]. I think your thoughts are also valuable in this discussion in > > > > calcite list. > > > > > > > > I discovered the requirement for a default window operator when I > tried > > > to > > > > integrate streamscan (I was using tablescan prevously) into the > > physical > > > > plan generation logic. Because of the way we have written the > > > > OperatorRouter API, we always need a stream-to-relation operator at > the > > > > input. But Calcite generates a query plan like following: > > > > > > > > LogicalDelta > > > > LogicalProject(id=[$0], product=[$1], quantity=[$2]) > > > > LogicalFilter(condition=[>($2, 5)]) > > > > > > > > StreamScan(table=[[KAFKA, ORDERS]], fields=[[0, 1, 2]]) > > > > > > > > If we consider LogicalFilter as a relation operator, we need > something > > to > > > > convert input stream to a relation before sending the tuples > > downstream. > > > > In addition to this, there is a optimization where we consider filter > > > > operator as a tuple operator and have it between StreamScan and > > > > stream-to-relation operator as a way of reducing the amount of > messages > > > > going downstream. > > > > > > > > Other scenario is windowed aggregates. Currently window spec is > > attached > > > to > > > > the LogicalProject in query plan like following: > > > > > > > > LogicalProject(EXPR$0=[CASE(>(COUNT($2) OVER (ROWS BETWEEN 2 > PRECEDING > > > AND > > > > 2 FOLLOWING), 0), CAST($SUM0($2) OVER (ROWS BETWEEN 2 PRECEDING AND 2 > > > > FOLLOWING)):INTEGER, null)]) > > > > > > > > I wanted to know from them whether it is possible to move window > > > operation > > > > just after the stream scan, so that it is compatible with our > operator > > > > layer. > > > > May be there are better or easier ways to do this. So your comments > are > > > > always welcome. > > > > > > > > Thanks > > > > Milinda > > > > > > > > > > > > [1] > > > > > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/incubator-calcite-dev/201502.mbox/browser > > > > > > > > -- > > > > Milinda Pathirage > > > > > > > > PhD Student | Research Assistant > > > > School of Informatics and Computing | Data to Insight Center > > > > Indiana University > > > > > > > > twitter: milindalakmal > > > > skype: milinda.pathirage > > > > blog: http://milinda.pathirage.org > > > > > > > > > > > > > > > -- > > Milinda Pathirage > > > > PhD Student | Research Assistant > > School of Informatics and Computing | Data to Insight Center > > Indiana University > > > > twitter: milindalakmal > > skype: milinda.pathirage > > blog: http://milinda.pathirage.org > > > -- Milinda Pathirage PhD Student | Research Assistant School of Informatics and Computing | Data to Insight Center Indiana University twitter: milindalakmal skype: milinda.pathirage blog: http://milinda.pathirage.org