Hi Botong,

We haven't heard from you in a while.
Feel free to reach out if you get stuck or need help rebasing the code.

Thanks,
Haisheng

On 2021/05/15 00:54:02, Botong Huang <pku...@gmail.com> wrote: 
> Hi all,
> 
> Thank you all for the interest, and thanks Julian for the update!
> 
> I am having problems uploading the pdf files into the jira CALCITE-4568
> <https://issues.apache.org/jira/browse/CALCITE-4568>, so I attached the
> slides in our code base:
> https://github.com/alibaba/cost-based-incremental-optimizer/blob/main/Tempura_Calcite_presentation.pdf
> 
> The slides contain a walk-through example of how Tempura expands its memo. The
> current version of the code also has two end-to-end unit tests,
> TvrOptimizationTest.java and TvrExecutionTest.java. Please feel free to
> start playing with them, and feel free to reach out and possibly schedule
> another meeting if needed.
> 
> As agreed in the meeting, we will rebase our code to a newer version of
> Calcite.
> 
> Best,
> Botong
> 
> On Thu, May 13, 2021 at 12:47 PM Julian Hyde <jhyde.apa...@gmail.com> wrote:
> 
> > During the meeting we agreed to start progressing this contribution in the
> > usual Apache Way, with conversations on the dev list and in the
> > https://issues.apache.org/jira/browse/CALCITE-4568 JIRA case. So, it
> > should be easy for you to participate.
> >
> > Botong said he would share the slides. (He might be unwilling to make them
> > public, because they are his presentation for a conference that has not
> > happened yet. Reach out to him one-to-one.)
> >
> > Next step is for someone on the Alibaba side to create a PR that is
> > rebased on the latest Calcite master, and add a comment to the JIRA case.
> > Then we can discuss what needs to be done for that PR. Code quality, adding
> > comments, breaking up into smaller commits, additional tests, renaming
> > packages/classes, restructuring into plugins are all possibilities.
> >
> > Our side of the bargain, as committers, is that we should review in a
> > timely manner, and not move the goal posts — if the contributors make the
> > changes we request then we will land this code in master in a reasonable
> > amount of time.
> >
> > We also discussed incremental view maintenance (IVM). Tempura solves a
> > more general problem (finding the optimal K steps to maintain a
> > materialized view as data arrives in K points in time) but if we set K=2,
> > we can generate a plan for how to update a materialized view given a delta
> > table. The plan will be different based on cost - e.g. whether the delta
> > table is small or large. This is a problem that many of our users would
> > like to solve. It will exercise much of Tempura’s code base, and encourage
> > contributions.
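
A rough illustration of the K=2 case described above, as a minimal Python sketch. Nothing here is Tempura or Calcite code: the `Plan` class, the `candidate_plans` and `choose_plan` helpers, and the cost formulas (in arbitrary "rows touched" units) are assumptions invented for illustration. The sketch only shows how the chosen maintenance plan can flip with the size of the delta table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Plan:
    name: str
    cost: float  # hypothetical cost, in "rows touched"


def candidate_plans(base_rows: int, delta_rows: int, view_rows: int) -> List[Plan]:
    """Enumerate two ways to bring a materialized view up to date (K=2)."""
    # Assumed cost: read each delta row, then probe and update the view
    # (modelled here as 4 extra row touches per delta row).
    apply_delta = Plan("apply-delta", delta_rows + 4.0 * delta_rows)
    # Assumed cost: rescan base plus delta and rewrite the whole view.
    full_recompute = Plan("full-recompute", base_rows + delta_rows + view_rows)
    return [apply_delta, full_recompute]


def choose_plan(base_rows: int, delta_rows: int, view_rows: int) -> Plan:
    """Pick the cheapest candidate under the assumed cost formulas."""
    return min(candidate_plans(base_rows, delta_rows, view_rows),
               key=lambda plan: plan.cost)


small = choose_plan(base_rows=1_000_000, delta_rows=1_000, view_rows=10_000)
large = choose_plan(base_rows=1_000_000, delta_rows=900_000, view_rows=10_000)
print(small.name)  # apply-delta
print(large.name)  # full-recompute
```

With these made-up numbers the small delta is applied incrementally and the large one triggers a full recompute. A real optimizer would derive both costs from its own cost model and statistics rather than from fixed formulas.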
> >
> > In my opinion, we should do IVM at launch. It should be the main example
> > we use in conference talks, blog posts, etc. When people understand that
> > case, we can explain how we generalize from K=2 to arbitrary K.
> >
> > Julian
> >
> >
> > > On May 13, 2021, at 9:51 AM, Rui Wang <amaliu...@apache.org> wrote:
> > >
> > > I apologize, I had the wrong impression of the meeting time (I thought it
> > > was on Thursday, but it was on Wednesday). I can follow up from your meeting
> > > records if you have any.
> > >
> > >
> > > -Rui
> > >
> > > On Tue, May 11, 2021 at 8:17 PM Botong Huang <pku...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> This is a reminder that we are going to have our second discussion meeting
> > >> tomorrow at 10-11pm PST. Please find the link below; everyone is welcome to
> > >> join!
> > >>
> > >> Join Zoom Meeting
> > >> https://uci.zoom.us/j/91986206610
> > >>
> > >> Meeting ID: 919 8620 6610
> > >> One tap mobile
> > >> +16699006833,,91986206610# US (San Jose)
> > >> +12532158782,,91986206610# US (Tacoma)
> > >>
> > >> Dial by your location
> > >>        +1 669 900 6833 US (San Jose)
> > >>        +1 253 215 8782 US (Tacoma)
> > >>        +1 346 248 7799 US (Houston)
> > >>        +1 301 715 8592 US (Washington DC)
> > >>        +1 312 626 6799 US (Chicago)
> > >>        +1 646 558 8656 US (New York)
> > >> Meeting ID: 919 8620 6610
> > >> Find your local number: https://uci.zoom.us/u/acyXcc43Cd
> > >>
> > >> Join by Skype for Business
> > >> https://uci.zoom.us/skype/91986206610
> > >>
> > >> Thanks,
> > >> Botong
> > >>
> > >> On Wed, May 5, 2021 at 9:55 AM Botong Huang <pku...@gmail.com> wrote:
> > >>
> > >>> Hi Stamatis and all,
> > >>>
> > >>> Thanks for the interest! Let's tentatively schedule the next meeting for
> > >>> next Wednesday, May 12, 10pm-11pm PST, then. Please let us know if any new
> > >>> needs come up.
> > >>>
> > >>> Best,
> > >>> Botong
> > >>>
> > >>> On Sun, May 2, 2021 at 2:59 PM Stamatis Zampetakis <zabe...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Hello,
> > >>>>
> > >>>> I really regret missing the first meeting, sorry about that. I added my
> > >>>> preferences in the document.
> > >>>> I will make sure to attend the next one and help as much as I can.
> > >>>>
> > >>>> I didn't have the chance yet to go over the paper but will try to do it
> > >>>> before the next meeting.
> > >>>>
> > >>>> For me the following dates are more convenient than others so it would be
> > >>>> nice if we could arrange it then.
> > >>>>
> > >>>> Thu, May 6, 10pm PST
> > >>>> Tue, May 12, 10pm PST
> > >>>>
> > >>>> Best,
> > >>>> Stamatis
> > >>>>
> > >>>> On Sat, May 1, 2021 at 9:42 PM Julian Hyde <jh...@apache.org> wrote:
> > >>>>
> > >>>>> I have added my time preferences to the doc [1]. I am generally
> > >>>>> available any evening Mon - Thu. How about we meet Monday 10th May?
> > >>>>>
> > >>>>> Stamatis, Jesus, Given the complexity of this work, I would very much
> > >>>>> appreciate your insight, as experts in optimizer theory. Could one of
> > >>>>> you join the next meeting? Of course we should choose a time that
> > >>>>> works for everyone's schedule.
> > >>>>>
> > >>>>> Julian
> > >>>>>
> > >>>>> [1]
> > >>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>
> > >>>>> On Wed, Apr 28, 2021 at 9:32 AM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>
> > >>>>>> We didn't record it; we will try to record the following meetings. Please
> > >>>>>> add your time preference in the doc, so that we can find a meeting time
> > >>>>>> that works for more people.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Botong
> > >>>>>>
> > >>>>>> On Wed, Apr 28, 2021 at 12:23 AM Viliam Durina <vil...@hazelcast.com> wrote:
> > >>>>>>
> > >>>>>>> Is there a recording available?
> > >>>>>>> Viliam
> > >>>>>>>
> > >>>>>>> On Wed, 28 Apr 2021 at 00:15, Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> The meeting yesterday was fun and productive. As discussed, this is the
> > >>>>>>>> call to schedule our second meeting.
> > >>>>>>>>
> > >>>>>>>> We encourage everyone to add their time preferences during 05/01 - 05/15
> > >>>>>>>> here:
> > >>>>>>>>
> > >>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Botong
> > >>>>>>>>
> > >>>>>>>> On Wed, Apr 21, 2021 at 5:19 PM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi all,
> > >>>>>>>>> We've created a zoom meeting below for our meeting next Monday
> > >>>>>>>>> (9pm-10:30pm PST on 04/26).
> > >>>>>>>>> Talk to you all soon!
> > >>>>>>>>>
> > >>>>>>>>> Join Zoom Meeting
> > >>>>>>>>> https://uci.zoom.us/j/91279732686
> > >>>>>>>>>
> > >>>>>>>>> Meeting ID: 912 7973 2686
> > >>>>>>>>> One tap mobile
> > >>>>>>>>> +16699006833,,91279732686# US (San Jose)
> > >>>>>>>>> +12532158782,,91279732686# US (Tacoma)
> > >>>>>>>>>
> > >>>>>>>>> Dial by your location
> > >>>>>>>>> +1 669 900 6833 US (San Jose)
> > >>>>>>>>> +1 253 215 8782 US (Tacoma)
> > >>>>>>>>> +1 346 248 7799 US (Houston)
> > >>>>>>>>> +1 301 715 8592 US (Washington DC)
> > >>>>>>>>> +1 312 626 6799 US (Chicago)
> > >>>>>>>>> +1 646 558 8656 US (New York)
> > >>>>>>>>> Meeting ID: 912 7973 2686
> > >>>>>>>>> Find your local number: https://uci.zoom.us/u/aykHTkJBh
> > >>>>>>>>>
> > >>>>>>>>> Join by Skype for Business
> > >>>>>>>>> https://uci.zoom.us/skype/91279732686
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Botong
> > >>>>>>>>>
> > >>>>>>>>> On Tue, Apr 13, 2021 at 10:16 PM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi all,
> > >>>>>>>>>>
> > >>>>>>>>>> According to the preferences collected, we are tentatively scheduling our
> > >>>>>>>>>> meeting at 9pm-10:30pm PST on 04/26 Monday.
> > >>>>>>>>>>
> > >>>>>>>>>> We will give a presentation about Tempura, followed by a free discussion.
> > >>>>>>>>>>
> > >>>>>>>>>> Please let us know if there are any other requests. A few days before
> > >>>>>>>>>> the meeting, I will send out a Zoom meeting link.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Botong
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Apr 7, 2021 at 2:46 PM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Julian and all,
> > >>>>>>>>>>>
> > >>>>>>>>>>> We've posted the Tempura code base below. Feel free to take a quick peek
> > >>>>>>>>>>> at the last five commits.
> > >>>>>>>>>>>
> > >>>>>>>>>>> https://github.com/alibaba/cost-based-incremental-optimizer/commits/main
> > >>>>>>>>>>>
> > >>>>>>>>>>> I've also opened a Jira (CALCITE-4568
> > >>>>>>>>>>> <https://issues.apache.org/jira/browse/CALCITE-4568>), which will serve
> > >>>>>>>>>>> as the umbrella Jira for the feature.
> > >>>>>>>>>>>
> > >>>>>>>>>>> In the meantime, we encourage everyone to enter the time preferences for
> > >>>>>>>>>>> our first meeting here:
> > >>>>>>>>>>>
> > >>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> Botong
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, Apr 5, 2021 at 3:59 PM Julian Hyde <jhyde.apa...@gmail.com>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> I have added my time preferences to the doc.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Before we meet, could you publish a PR for us to review?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Initial discussions will need to be about architecture and high-level
> > >>>>>>>>>>>> design. So I would ask Calcite reviewers not to review the PR line-by-line
> > >>>>>>>>>>>> (or to leave comments in GitHub) but try to understand the design
> > >>>>>>>>>>>> holistically, and prepare questions/comments before the meeting.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Botong, can you please create a Calcite JIRA case for this task? JIRA is
> > >>>>>>>>>>>> how we track long-running tasks such as this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Julian
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> On Apr 3, 2021, at 5:15 PM, Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Apologies for the delay. It took us some time to clean up our code base
> > >>>>>>>>>>>>> and publicly release it (which will be out soon) for a quick peek.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We are ready to present our work. Let's schedule a time for a Zoom
> > >>>>>>>>>>>>> meeting and discuss how to integrate Tempura into Calcite.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Since some of our team members are in China, we prefer the time slot of
> > >>>>>>>>>>>>> 7:00pm-11:30pm PST any day. I've added our time preference in the shared
> > >>>>>>>>>>>>> doc below.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We encourage everyone to add their time preferences (during 04/15-04/30) in
> > >>>>>>>>>>>>> this doc. In a week or so, we will try to settle on a time that works for
> > >>>>>>>>>>>>> most.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Sat, Jan 30, 2021 at 9:19 PM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi Julian and Rui,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Sounds good to us. Please give us some time to prepare some slides for the
> > >>>>>>>>>>>>>> meeting.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I've created a doc below for discussion. Please feel free to add more in
> > >>>>>>>>>>>>>> here:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> https://docs.google.com/document/d/1wyNjB94uSGwHtVvGYDwaLlCghUJE-7aDLnCdKKXJN1o/edit?usp=sharing
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 11:18 AM Julian Hyde <jhyde.apa...@gmail.com>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> PS The “editable doc” that Rui refers to is also a good idea. I think we
> > >>>>>>>>>>>>>>> should create it to continue discussion after the first meeting.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Jan 28, 2021, at 11:16 AM, Julian Hyde <jhyde.apa...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I think good next steps would be a PR and a meeting. The PR will allow
> > >>>>>>>>>>>>>>>> us to read the code, but I think we should do the first round of questions
> > >>>>>>>>>>>>>>>> at the meeting. The meeting could perhaps start with a presentation of the
> > >>>>>>>>>>>>>>>> paper (do you have some slides you are planning to present at VLDB,
> > >>>>>>>>>>>>>>>> Botong?) and then move on to questions about the concepts, which
> > >>>>>>>>>>>>>>>> alternatives were considered, and how the concepts map onto other current
> > >>>>>>>>>>>>>>>> and future concepts in Calcite.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I don’t think we should start “reviewing” the PR line-by-line at this
> > >>>>>>>>>>>>>>>> point. We need to understand the high-level concepts and design choices. If
> > >>>>>>>>>>>>>>>> we start reviewing the PR we will get lost in the details.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I know that integrating a major change is hard; I doubt that we will be
> > >>>>>>>>>>>>>>>> able to integrate everything, but we can build understanding about where
> > >>>>>>>>>>>>>>>> Calcite needs to go, and I hope integrate a good amount of code to help us
> > >>>>>>>>>>>>>>>> get there.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> As I said before, after the integration I would like people to be able
> > >>>>>>>>>>>>>>>> to experiment with it and use it in their production systems. That way, it
> > >>>>>>>>>>>>>>>> will not be an experiment that withers, but a feature set that integrates
> > >>>>>>>>>>>>>>>> with other Calcite features and gets stronger over time.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Jan 28, 2021, at 10:54 AM, Rui Wang <amaliu...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> For me to participate in the discussion of the above questions, I will
> > >>>>>>>>>>>>>>>>> need to read a lot more to learn the relevant context and will likely ask
> > >>>>>>>>>>>>>>>>> lots of questions :-). An editable doc is probably good for questions and
> > >>>>>>>>>>>>>>>>> back-and-forth discussion.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> -Rui
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:50 AM Rui Wang <amaliu...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I am also happy to help push this work into Calcite (review code and doc,
> > >>>>>>>>>>>>>>>>>> etc.).
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> While you can share your code so people can get a better idea of how it is
> > >>>>>>>>>>>>>>>>>> implemented, I think it would also be nice to have a doc to discuss the
> > >>>>>>>>>>>>>>>>>> open questions above. Some points that I copy here:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> 1. Can this solution be compatible with existing solutions in Calcite for
> > >>>>>>>>>>>>>>>>>> streaming, materialized view maintenance, and multi-query optimization
> > >>>>>>>>>>>>>>>>>> (Sigma and Delta relational operators, lattice, and Spool operator)?
> > >>>>>>>>>>>>>>>>>> 2. Did you find that you needed two separate cost models - one for “view
> > >>>>>>>>>>>>>>>>>> maintenance” and another for “user queries” - since the objectives of each
> > >>>>>>>>>>>>>>>>>> activity are so different?
> > >>>>>>>>>>>>>>>>>> 3. Will this work hasten the arrival of multi-objective parametric
> > >>>>>>>>>>>>>>>>>> query optimization [1] in Calcite?
> > >>>>>>>>>>>>>>>>>> 4. Probably SQL shell support.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> [1]:
> > >>>>>>>>>>>>>>>>>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> -Rui
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Wed, Jan 27, 2021 at 6:52 PM Albert <zinki...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> It would be very nice to see a POC of your work.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Thu, Jan 28, 2021 at 10:21 AM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi Julian,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Just wondering if there are any updates? We are wondering if it would help
> > >>>>>>>>>>>>>>>>>>>> to post our code for a quick preview.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On Fri, Jan 1, 2021 at 11:04 AM Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Hi Julian,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks for your interest! Sure, let's figure out a plan that best benefits
> > >>>>>>>>>>>>>>>>>>>>> the community. Here are some clarifications that hopefully answer your
> > >>>>>>>>>>>>>>>>>>>>> questions.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In our work (Tempura), users specify the set of time points to consider
> > >>>>>>>>>>>>>>>>>>>>> running and a cost function that expresses their preference over time.
> > >>>>>>>>>>>>>>>>>>>>> Tempura then generates the best incremental plan, that is, the one that
> > >>>>>>>>>>>>>>>>>>>>> minimizes the overall cost function.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In this incremental plan, the sub-plans at different time points can be
> > >>>>>>>>>>>>>>>>>>>>> different from each other, as opposed to identical plans in all delta runs
> > >>>>>>>>>>>>>>>>>>>>> as in streaming or IVM. As mentioned in Section 2.1 of the Tempura paper,
> > >>>>>>>>>>>>>>>>>>>>> we can mimic the current streaming implementation by specifying two
> > >>>>>>>>>>>>>>>>>>>>> (logical) time points in Tempura, representing the initial run and later
> > >>>>>>>>>>>>>>>>>>>>> delta runs respectively. In general, note that Tempura supports various
> > >>>>>>>>>>>>>>>>>>>>> forms of incremental computing, not only the small-delta append-only data
> > >>>>>>>>>>>>>>>>>>>>> model in streaming systems. That's why we believe Tempura subsumes the
> > >>>>>>>>>>>>>>>>>>>>> current streaming support, as well as any IVM implementations.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> About the cost model, we did not come up with a separate cost model, but
> > >>>>>>>>>>>>>>>>>>>>> rather extended the existing one. Similar to multi-objective optimization,
> > >>>>>>>>>>>>>>>>>>>>> costs incurred at different time points are considered different
> > >>>>>>>>>>>>>>>>>>>>> dimensions. Tempura lets users supply a function that converts this cost
> > >>>>>>>>>>>>>>>>>>>>> vector into a final cost. So under this function, any two incremental plans
> > >>>>>>>>>>>>>>>>>>>>> are still comparable and there is an overall optimum. I guess we can go
> > >>>>>>>>>>>>>>>>>>>>> down the route of multi-objective parametric query optimization instead if
> > >>>>>>>>>>>>>>>>>>>>> there is a need.
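
To illustrate the cost-vector idea in the paragraph above, here is a minimal Python sketch. It is not Tempura's API: the names `weighted_sum` and `best_plan`, the two example plans, and all the numbers are assumptions chosen for illustration. Each plan carries one cost per time point, and a user-supplied function collapses that vector into a single comparable number.

```python
from typing import Callable, Dict, Sequence

CostVector = Sequence[float]  # one entry per time point
CostFunction = Callable[[CostVector], float]


def weighted_sum(weights: Sequence[float]) -> CostFunction:
    """Build a user preference: weight the cost at each time point."""
    def cost_fn(costs: CostVector) -> float:
        if len(costs) != len(weights):
            raise ValueError("cost vector and weights must have the same length")
        return sum(w * c for w, c in zip(weights, costs))
    return cost_fn


def best_plan(plans: Dict[str, CostVector], cost_fn: CostFunction) -> str:
    """Return the name of the plan with the lowest final (scalar) cost."""
    return min(plans, key=lambda name: cost_fn(plans[name]))


# Hypothetical per-time-point costs for two incremental plans over 3 time points.
plans = {
    "eager": (40.0, 40.0, 20.0),  # does work early, little left at the end
    "lazy": (0.0, 0.0, 90.0),     # defers all work to the last time point
}

total_work = weighted_sum((1.0, 1.0, 1.0))         # every time point counts equally
low_final_latency = weighted_sum((1.0, 1.0, 5.0))  # work at the last point is 5x as costly

print(best_plan(plans, total_work))         # lazy
print(best_plan(plans, low_final_latency))  # eager
```

The same two plans rank differently under the two preference functions, which is the sense in which a single scalar cost function keeps plans comparable while still reflecting the user's preference over time.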
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Next, on materialized views and multi-query optimization: since our
> > >>>>>>>>>>>>>>>>>>>>> multi-time-point plan naturally involves materializing intermediate results
> > >>>>>>>>>>>>>>>>>>>>> for later time points, we need to solve the problem of choosing
> > >>>>>>>>>>>>>>>>>>>>> materializations and include the cost of saving and reusing the
> > >>>>>>>>>>>>>>>>>>>>> materializations when costing and comparing plans. We borrowed the
> > >>>>>>>>>>>>>>>>>>>>> multi-query optimization techniques to solve this problem even though we
> > >>>>>>>>>>>>>>>>>>>>> are looking at a single query. As a result, we think our work is orthogonal
> > >>>>>>>>>>>>>>>>>>>>> to Calcite's facilities around utilizing existing views, lattice etc. We do
> > >>>>>>>>>>>>>>>>>>>>> feel that the multi-query optimization component can be adopted to wider
> > >>>>>>>>>>>>>>>>>>>>> use, but probably need more suggestions from the community.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Lastly, our current implementation is set up in Java code; it should be
> > >>>>>>>>>>>>>>>>>>>>> straightforward to hook it up with a SQL shell.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On Mon, Dec 28, 2020 at 6:44 PM Julian Hyde <jhyde.apa...@gmail.com>
> > >>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Botong,
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> This is very exciting; congratulations on this research, and thank you
> > >>>>>>>>>>>>>>>>>>>>>> for contributing it back to Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> The research touches several areas in Calcite: streaming, materialized
> > >>>>>>>>>>>>>>>>>>>>>> view maintenance, and multi-query optimization. As we have already some
> > >>>>>>>>>>>>>>>>>>>>>> solutions in those areas (Sigma and Delta relational operators, lattice,
> > >>>>>>>>>>>>>>>>>>>>>> and Spool operator), it will be interesting to see whether we can make them
> > >>>>>>>>>>>>>>>>>>>>>> compatible, or whether one concept can subsume others.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Your work differs from streaming queries in that your relations are used
> > >>>>>>>>>>>>>>>>>>>>>> by “external” user queries, whereas in pure streaming queries, the only
> > >>>>>>>>>>>>>>>>>>>>>> activity is the change propagation. Did you find that you needed two
> > >>>>>>>>>>>>>>>>>>>>>> separate cost models - one for “view maintenance” and another for “user
> > >>>>>>>>>>>>>>>>>>>>>> queries” - since the objectives of each activity are so different?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I wonder whether this work will hasten the arrival of multi-objective
> > >>>>>>>>>>>>>>>>>>>>>> parametric query optimization [1] in Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> I will make time over the next few days to read and digest your paper.
> > >>>>>>>>>>>>>>>>>>>>>> Then I expect that we will have a back-and-forth process to create
> > >>>>>>>>>>>>>>>>>>>>>> something that will be useful for the broader community.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> One thing will be particularly useful: making this functionality
> > >>>>>>>>>>>>>>>>>>>>>> available from a SQL shell, so that people can experiment with this
> > >>>>>>>>>>>>>>>>>>>>>> functionality without writing Java code or setting up complex databases and
> > >>>>>>>>>>>>>>>>>>>>>> metadata. I have in mind something like the simple DDL operations that are
> > >>>>>>>>>>>>>>>>>>>>>> available in Calcite’s ’server’ module. I wonder whether we could devise
> > >>>>>>>>>>>>>>>>>>>>>> some kind of SQL syntax for a “multi-query”.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Julian
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>>> https://cacm.acm.org/magazines/2017/10/221322-multi-objective-parametric-query-optimization/fulltext
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Dec 23, 2020, at 8:55 PM, Botong Huang <pku...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Thanks Aron for pointing this out. To see the figure, please refer to Fig
> > >>>>>>>>>>>>>>>>>>>>>>> 3(a) in our paper: https://kai-zeng.github.io/papers/tempura-vldb2021.pdf
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Wed, Dec 23, 2020 at 7:20 PM JiaTao Tao <taojia...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Seems interesting. The picture cannot be seen in the mail; could you
> > >>>>>>>>>>>>>>>>>>>>>>>> open a JIRA for this, so that people who are interested can subscribe
> > >>>>>>>>>>>>>>>>>>>>>>>> to it?
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Regards!
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Aron Tao
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Botong Huang <bot...@apache.org>
> > >> wrote on Thu, Dec 24, 2020
> > >>>>>>>> at 3:18 AM:
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> This is a proposal to extend the Calcite
> > >>>> optimizer
> > >>>>>>> into
> > >>>>>>>> a
> > >>>>>>>>>>>>>>> general
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental query optimizer, based on our
> > >>>> research
> > >>>>>>> paper
> > >>>>>>>>>>>>>>>>>>> published
> > >>>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>>>> VLDB
> > >>>>>>>>>>>>>>>>>>>>>>>>> 2021:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Tempura: a general cost-based optimizer
> > >>>> framework
> > >>>>> for
> > >>>>>>>>>>>>>>> incremental
> > >>>>>>>>>>>>>>>>>>>> data
> > >>>>>>>>>>>>>>>>>>>>>>>>> processing
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> We also have a demo in SIGMOD 2020
> > >> illustrating
> > >>>>> how
> > >>>>>>>>>>>> Alibaba’s
> > >>>>>>>>>>>>>>>>>>> data
> > >>>>>>>>>>>>>>>>>>>>>>>>> warehouse is planning to use this incremental
> > >>>>> query
> > >>>>>>>>>>>> optimizer
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>>>>>>>> alleviate
> > >>>>>>>>>>>>>>>>>>>>>>>>> cluster-wise resource skewness:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Grosbeak: A Data Warehouse Supporting
> > >>>>> Resource-Aware
> > >>>>>>>>>>>>>>> Incremental
> > >>>>>>>>>>>>>>>>>>>>>>>> Computing
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> To the best of our knowledge, this is the first
> > >>>> general
> > >>>>>>>>>>>> cost-based
> > >>>>>>>>>>>>>>>>>>>>>> incremental
> > >>>>>>>>>>>>>>>>>>>>>>>>> optimizer that can find the best plan across
> > >>>>> multiple
> > >>>>>>>>>>>> families
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental computing methods, including IVM,
> > >>>>>>> Streaming,
> > >>>>>>>>>>>>>>>>>>> DBToaster,
> > >>>>>>>>>>>>>>>>>>>>>> etc.
> > >>>>>>>>>>>>>>>>>>>>>>>>> Experiments (in the paper) show that the
> > >>>>> generated
> > >>>>>>> best
> > >>>>>>>>>>>> plan
> > >>>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>>>>>>>>> consistently much better than the plans from
> > >>>> each
> > >>>>>>>>>>>> individual
> > >>>>>>>>>>>>>>>>>>> method
> > >>>>>>>>>>>>>>>>>>>>>>>> alone.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> In general, incremental query planning is
> > >>>> central
> > >>>>> to
> > >>>>>>>>>>>> database
> > >>>>>>>>>>>>>>>>>>> view
> > >>>>>>>>>>>>>>>>>>>>>>>>> maintenance and stream processing systems,
> > >> and
> > >>>> is
> > >>>>>>> being
> > >>>>>>>>>>>>>>> adopted
> > >>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>>>> active
> > >>>>>>>>>>>>>>>>>>>>>>>>> databases, resumable query execution,
> > >>>> approximate
> > >>>>>>> query
> > >>>>>>>>>>>>>>>>>>> processing,
> > >>>>>>>>>>>>>>>>>>>>>> etc.
> > >>>>>>>>>>>>>>>>>>>>>>>> We
> > >>>>>>>>>>>>>>>>>>>>>>>>> are hoping that this feature can help
> > >> widen
> > >>>> the
> > >>>>>>>>>>>> spectrum of
> > >>>>>>>>>>>>>>>>>>>>>> Calcite,
> > >>>>>>>>>>>>>>>>>>>>>>>>> solicit more use cases and adoption of
> > >> Calcite.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Below is a brief description of the technical
> > >>>>> details.
> > >>>>>>>>>>>> Please
> > >>>>>>>>>>>>>>>>>>> refer
> > >>>>>>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> Tempura paper for more details. We are also
> > >>>>> working
> > >>>>>>> on a
> > >>>>>>>>>>>>>>> journal
> > >>>>>>>>>>>>>>>>>>>>>> version
> > >>>>>>>>>>>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>>>>>>>> the paper with more implementation details.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Currently the query plan generated by Calcite
> > >>>> is
> > >>>>> meant
> > >>>>>>>> to
> > >>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>> executed
> > >>>>>>>>>>>>>>>>>>>>>>>>> altogether at once. In the proposal,
> > >> Calcite’s
> > >>>>> memo
> > >>>>>>> will
> > >>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>> extended
> > >>>>>>>>>>>>>>>>>>>>>> with
> > >>>>>>>>>>>>>>>>>>>>>>>>> temporal information so that it is capable of
> > >>>>>>> generating
> > >>>>>>>>>>>>>>>>>>> incremental
> > >>>>>>>>>>>>>>>>>>>>>>>> plans
> > >>>>>>>>>>>>>>>>>>>>>>>>> that include multiple sub-plans to execute at
> > >>>>>>> different
> > >>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>>>> points.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> The main idea is to view each table as one
> > >> that
> > >>>>>>> changes
> > >>>>>>>>>>>> over
> > >>>>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>>>>>>> (Time
> > >>>>>>>>>>>>>>>>>>>>>>>>> Varying Relations (TVR)). To achieve that we
> > >>>>>>> introduced
> > >>>>>>>>>>>>>>>>>>> TvrMetaSet
> > >>>>>>>>>>>>>>>>>>>>>> into
> > >>>>>>>>>>>>>>>>>>>>>>>>> Calcite’s memo besides RelSet and RelSubset
> > >> to
> > >>>>> track
> > >>>>>>>>>>>> related
> > >>>>>>>>>>>>>>>>>>> RelSets
> > >>>>>>>>>>>>>>>>>>>>>> of a
> > >>>>>>>>>>>>>>>>>>>>>>>>> changing table (e.g. snapshot of the table at
> > >>>>> certain
> > >>>>>>>>>>>> time,
> > >>>>>>>>>>>>>>>>>>> delta of
> > >>>>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> table between two time points, etc.).
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> [image: image.png]
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> For example in the above figure, each
> > >> vertical
> > >>>>> line
> > >>>>>>> is a
> > >>>>>>>>>>>>>>>>>>> TvrMetaSet
> > >>>>>>>>>>>>>>>>>>>>>>>>> representing a TVR (S, R, S left outer join
> > >> R,
> > >>>>> etc.).
> > >>>>>>>>>>>>>>> Horizontal
> > >>>>>>>>>>>>>>>>>>>> lines
> > >>>>>>>>>>>>>>>>>>>>>>>>> represent time. Each black dot in the grid
> > >> is a
> > >>>>>>> RelSet.
> > >>>>>>>>>>>> Users
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>>>>> write
> > >>>>>>>>>>>>>>>>>>>>>>>> TVR
> > >>>>>>>>>>>>>>>>>>>>>>>>> Rewrite Rules to describe valid
> > >> transformations
> > >>>>>>> between
> > >>>>>>>>>>>> these
> > >>>>>>>>>>>>>>>>>>> dots.
> > >>>>>>>>>>>>>>>>>>>>>> For
> > >>>>>>>>>>>>>>>>>>>>>>>>> example, the blue lines are inter-TVR rules
> > >>>> that
> > >>>>>>>>>>>> describe how
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>>>>>> compute
> > >>>>>>>>>>>>>>>>>>>>>>>>> a certain RelSet of a TVR from RelSets of other
> > >>>>> TVRs.
> > >>>>>>> The
> > >>>>>>>>>>>> red
> > >>>>>>>>>>>>>>> lines
> > >>>>>>>>>>>>>>>>>>>> are
> > >>>>>>>>>>>>>>>>>>>>>>>>> intra-TVR rules that describe transformations
> > >>>>> within a
> > >>>>>>>>>>>> TVR. All
> > >>>>>>>>>>>>>>>>>>> TVR
> > >>>>>>>>>>>>>>>>>>>>>>>> rewrite
> > >>>>>>>>>>>>>>>>>>>>>>>>> rules are logical rules. All existing Calcite
> > >>>>> rules
> > >>>>>>>> still
> > >>>>>>>>>>>> work
> > >>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>> new
> > >>>>>>>>>>>>>>>>>>>>>>>>> volcano system without modification.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> All changes in this feature will consist of
> > >>>> four
> > >>>>>>> parts:
> > >>>>>>>>>>>>>>>>>>>>>>>>> 1. Memo extension with TvrMetaSet
> > >>>>>>>>>>>>>>>>>>>>>>>>> 2. Rule engine upgrade, capable of matching
> > >>>>> TvrMetaSet
> > >>>>>>>> and
> > >>>>>>>>>>>>>>>>>>> RelNodes,
> > >>>>>>>>>>>>>>>>>>>>>> as
> > >>>>>>>>>>>>>>>>>>>>>>>>> well as links in between the nodes.
> > >>>>>>>>>>>>>>>>>>>>>>>>> 3. A basic set of TvrRules, written using the
> > >>>>> upgraded
> > >>>>>>>>>>>> rule
> > >>>>>>>>>>>>>>>>>>> engine
> > >>>>>>>>>>>>>>>>>>>>>> API.
> > >>>>>>>>>>>>>>>>>>>>>>>>> 4. Multi-query optimization, used to find the
> > >>>> best
> > >>>>>>>>>>>> incremental
> > >>>>>>>>>>>>>>>>>>> plan
> > >>>>>>>>>>>>>>>>>>>>>>>>> involving multiple time points.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Note that this feature is an extension in
> > >>>> nature
> > >>>>> and
> > >>>>>>>> thus
> > >>>>>>>>>>>> when
> > >>>>>>>>>>>>>>>>>>>>>> disabled,
> > >>>>>>>>>>>>>>>>>>>>>>>>> does not change any existing Calcite
> > >> behavior.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Other than scenarios in the paper, we also
> > >>>> applied
> > >>>>>>> this
> > >>>>>>>>>>>>>>>>>>>>>> Calcite-extended
> > >>>>>>>>>>>>>>>>>>>>>>>>> incremental query optimizer to a type of
> > >>>> periodic
> > >>>>>>> query
> > >>>>>>>>>>>> called
> > >>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>> ‘‘range
> > >>>>>>>>>>>>>>>>>>>>>>>>> query’’ in Alibaba’s data warehouse. It
> > >>>> achieved
> > >>>>> cost
> > >>>>>>>>>>>> savings
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>>> 80%
> > >>>>>>>>>>>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>>>>>>>>>>>> total CPU and memory consumption, and 60% on
> > >>>>>>> end-to-end
> > >>>>>>>>>>>>>>> execution
> > >>>>>>>>>>>>>>>>>>>>>> time.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> All comments and suggestions are welcome.
> > >>>> Thanks
> > >>>>> and
> > >>>>>>>> happy
> > >>>>>>>>>>>>>>>>>>> holidays!
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>>>>>>> Botong
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~
> > >>>>>>>>>>>>>>>>>>> no mistakes
> > >>>>>>>>>>>>>>>>>>> ~~~~~~~~~~~~~~~~~~
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Viliam Durina
> > >>>>>>> Jet Developer
> > >>>>>>>      hazelcast®
> > >>>>>>>
> > >>>>>>>  <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo,
> > >> CA
> > >>>>> 94402 |
> > >>>>>>> USA
> > >>>>>>> +1 (650) 521-5453 | hazelcast.com <
> > >> https://www.hazelcast.com>
> > >>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> >
> 
