My abstract for the meetup is below. 20 minutes should be plenty of time for me.
*Title:* Federated Query Planning w/ Calcite & Substrait

*Abstract*

Substrait [0] is a cross-language serialization format and specification for communicating relational plans across systems. It is currently under active development, and systems such as DataFusion [1] and DuckDB [2] have started to support consuming and producing Substrait plans. Another system that has support for Substrait is Calcite, via the Isthmus [3] library.

With Isthmus, it’s possible to parse SQL queries with Calcite, perform planning and then delegate execution to external systems via Substrait plans. It’s also possible to forgo SQL entirely, and submit Substrait plans directly to Calcite for planning.

This talk aims to provide an introduction to Substrait, and showcase the capabilities of Isthmus in the context of generating plans for execution across multiple data systems.

[0] https://substrait.io/
[1] https://github.com/apache/datafusion/tree/main/datafusion/substrait
[2] https://duckdb.org/docs/extensions/substrait.html
[3] https://github.com/substrait-io/substrait-java/tree/main/isthmus

On Mon, 20 Jan 2025 at 01:10, Stamatis Zampetakis <[email protected]> wrote:

> Hey everyone,
>
> The 20' proposal was just a rough suggestion based on the number of talks and in the absence of other information. Some people may want to talk more and some less so if you have a concrete duration in mind please share it and we can try to accommodate that. The goal is to enjoy the event and not feel rushed or constrained; this goes for both the audience and the speakers.
>
> There will be a QA session after each talk and at least a 5' break every 45 minutes or so (depending on the duration of the talks).
>
> The idea of doing a second meetup in 2-3 months (e.g., May 2025) is also feasible (and relatively easy if it is fully virtual). Like that we could also pick a time more convenient for people in Europe (such as 17:00 UTC).
> I could easily move my talk for the second meetup since I am located in Europe. Having many presentations and meetups is a good problem to have 😊
>
> As usual the talks will be recorded if the speakers are OK with it and it will be made available online some days after the event.
>
> Best,
> Stamatis
>
> On Mon, Jan 20, 2025 at 5:40 AM Walaa Eldin Moustafa <[email protected]> wrote:
>
> > Should we accommodate for Q&A? I think it would be reasonable to set the talk time to 20 minutes with additional 5 minutes for Q&A.
> >
> > (I am fine with making some talks longer - the above is just a general suggestion for the Q&A part).
> >
> > Thanks,
> > Walaa.
> >
> > On Sun, Jan 19, 2025 at 8:25 PM Mihai Budiu <[email protected]> wrote:
> >
> > > Julian is way too generous with his praise of my work.
> > > Moreover, we have a lot to learn from Julian as well.
> > > And his talk is a very timely subject.
> > > It would also be unfair to give me a bigger time share; I think the time should be divided evenly between the talks.
> > > I will work to squeeze the essential ideas in 20 minutes.
> > > If there is so much enthusiasm for the meetup maybe we can do another one sooner? We could have a queue with talk proposals, and when it's long enough we could meet again?
> > >
> > > Mihai
> > >
> > > ________________________________
> > > From: Julian Hyde <[email protected]>
> > > Sent: Sunday, January 19, 2025 9:40 AM
> > > To: [email protected] <[email protected]>
> > > Subject: Re: [DISCUSS] Apache Calcite Meetup February 2025
> > >
> > > Since we have so many talks, I’m happy to remove myself from the agenda to give others more time. I can give my talk at a later meetup.
> > >
> > > I would favor giving Mihai more than twenty minutes. There is a lot of depth to his work - he won best paper at VLDB last year - and much for us all to learn.
> > >
> > > > On Jan 15, 2025, at 3:45 PM, Mihai Budiu <[email protected]> wrote:
> > > >
> > > > Hello.
> > > >
> > > > I guess the time is fixed at 20 minutes now. My talk needs a fair amount of background, so it won't fit easily (depending on whether I can assume people have seen my previous meetup/CoC presentations).
> > > >
> > > > Here is my abstract:
> > > >
> > > > Streaming, incremental, finite-memory computations in SQL over unbounded streams
> > > >
> > > > SQL is the standard language for expressing computations on collections. Using modern incremental view maintenance techniques, SQL can also be adopted as the standard language for computing on *changes* to collections. In previous presentations we have shown how to automatically convert any SQL program that defines views into an *incremental* program: the inputs of an incremental program are insertions, deletions, and updates to data tables, and the outputs of the incremental program are insertions, deletions, and updates of the maintained views.
> > > >
> > > > Whereas SQL queries are stateless systems, the incremental programs are stateful streaming systems that maintain complex *indexes* for performing efficient updates. The indexes enable computing all updates in time proportional to the size of the changes.
> > > >
> > > > In this presentation we discuss the problem of computing over data that grows unbounded (e.g., event streams), leading to potentially unbounded indexes. We present the design and implementation of a fully automatic mechanism which enables many such computations to use only finite memory by garbage-collecting the indexes at runtime. The mechanism requires users to specify bounds on the amount of "out-of-orderness" of the input data, using annotations on input tables.
> > > >
> > > > Thank you for organizing this!
> > > > Mihai
> > > >
> > > > ________________________________
> > > > From: Stamatis Zampetakis <[email protected]>
> > > > Sent: Wednesday, January 15, 2025 7:29 AM
> > > > To: [email protected] <[email protected]>
> > > > Subject: Re: [DISCUSS] Apache Calcite Meetup February 2025
> > > >
> > > > I can also give a talk based on some recent work that we did in Apache Hive around CTEs.
> > > >
> > > > With Walaa's and mine we now have five talks for the meetup.
> > > >
> > > > 1. Improving testing of JDBC adapter's dialects - Julian Hyde (Google)
> > > > 2. Federated Query Planning w/ Substrait - Victor Barua (Datadog)
> > > > 3. Combining streaming and incremental computation in SQL - Mihai Budiu (Feldera)
> > > > 4. Revolutionizing Data Lakes: A Dive into Coral, the SQL Translation, Analysis, and Rewrite Engine - Walaa Eldin Moustafa (LinkedIn)
> > > > 5. Optimizing Common Table Expressions in Apache Hive with Calcite - Stamatis Zampetakis (Cloudera)
> > > >
> > > > It would be nice to keep the duration of each talk around 20 minutes so that we finish with the presentations in ~2hrs. If more or less time is needed for some talks we can adapt.
> > > >
> > > > I now created the event on meetup [1] based on the information that I have so far. It would be nice to fill in the description/abstract part for each talk so please share the necessary details when you get a chance.
> > > >
> > > > Best,
> > > > Stamatis
> > > >
> > > > [1] https://www.meetup.com/apache-calcite/events/305627349
> > > >
> > > >> On Fri, Jan 10, 2025 at 8:16 PM Walaa Eldin Moustafa <[email protected]> wrote:
> > > >>
> > > >> That is a great idea. I would love to join in person as well. I can talk about Coral [1].
> > > >>
> > > >> Revolutionizing Data Lakes: A Dive into Coral, the SQL Translation, Analysis, and Rewrite Engine.
> > > >>
> > > >> [1] https://github.com/linkedin/coral
> > > >>
> > > >> Thanks,
> > > >> Walaa.
> > > >>
> > > >> On Fri, Jan 10, 2025 at 11:05 AM Stamatis Zampetakis <[email protected]> wrote:
> > > >>
> > > >>> I now have some more clarity regarding the physical venue of the meetup. It will be held in Cloudera's offices in Santa Clara, California. More details will come over the next few weeks.
> > > >>>
> > > >>> Cloudera is also going to help with promoting the event to various channels and other meetup groups.
> > > >>>
> > > >>> Now with Mihai's proposal we have three topics for the meetup:
> > > >>> * Improving testing of our JDBC adapter's dialects
> > > >>> * Federated Query Planning w/ Substrait
> > > >>> * Combining streaming and incremental computation in SQL
> > > >>>
> > > >>> In order to put up the agenda and publish the event to a wider audience we need the title, abstract, and expected duration for each talk. The sooner we publish the event the better it will be for people who would like to join.
> > > >>>
> > > >>> Best,
> > > >>> Stamatis
> > > >>>
> > > >>> On Fri, Jan 3, 2025 at 9:02 PM Mihai Budiu <[email protected]> wrote:
> > > >>>>
> > > >>>> I can give a talk about combining streaming and incremental computation in SQL.
> > > >>>>
> > > >>>> Mihai
> > > >>>>
> > > >>>> ________________________________
> > > >>>> From: Stamatis Zampetakis <[email protected]>
> > > >>>> Sent: Thursday, January 2, 2025 2:22 AM
> > > >>>> To: [email protected] <[email protected]>
> > > >>>> Subject: Re: [DISCUSS] Apache Calcite Meetup February 2025
> > > >>>>
> > > >>>> So far we have two topics:
> > > >>>> * Improving testing of our JDBC adapter's dialects
> > > >>>> * Federated Query Planning w/ Substrait
> > > >>>>
> > > >>>> Both of them would be quite interesting and relevant for the meetup.
> > > >>>>
> > > >>>> For people who are willing to give a talk, please share the title, abstract, and expected duration as soon as possible. Given that February is not too far away, I would like to put up a tentative agenda on meetup [1] so that people who don't follow the dev list can arrange their schedules for the event.
> > > >>>>
> > > >>>> Best,
> > > >>>> Stamatis
> > > >>>>
> > > >>>> [1] https://www.meetup.com/Apache-Calcite/
> > > >>>>
> > > >>>> On Tue, Dec 31, 2024 at 9:00 PM Victor Barua <[email protected]> wrote:
> > > >>>>>
> > > >>>>> If folks are interested, I could give a talk along the lines of "Federated Query Planning w/ Substrait" which would be about how we're using Isthmus <https://github.com/substrait-io/substrait-java/tree/main/isthmus>, the Substrait <https://substrait.io/> <-> Calcite bridge, to be able to use Calcite as a query planner for our distributed execution system.
> > > >>>>>
> > > >>>>> On Tue, 24 Dec 2024 at 03:38, Ruben Q L <[email protected]> wrote:
> > > >>>>>
> > > >>>>>> Great idea, Stamatis!
> > > >>>>>> I'll do my best to try to participate remotely.
> > > >>>>>>
> > > >>>>>> On Tue, Dec 24, 2024 at 10:21 AM Stamatis Zampetakis <[email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> Regarding the time it will be difficult to find a slot that is suitable for everyone worldwide. I proposed 5pm PST to facilitate those that will participate in person.
> > > >>>>>>>
> > > >>>>>>> If there are people interested in giving a talk in other regions and the timing does not work for them then we can try to see if there is another option that could accommodate those. Please let me know if that's the case.
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Stamatis
> > > >>>>>>>
> > > >>>>>>> On Mon, Dec 23, 2024 at 7:58 PM Julian Hyde <[email protected]> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Excellent idea. I'm always happy to give a talk about Calcite, and could travel to Santa Clara in person. One possible topic is "Improving testing of our JDBC adapter's dialects".
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Dec 23, 2024 at 9:46 AM Mihai Budiu <[email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Great idea, I would participate in person or in the hybrid format.
> > > >>>>>>>>>
> > > >>>>>>>>> Mihai
> > > >>>>>>>>>
> > > >>>>>>>>> ________________________________
> > > >>>>>>>>> From: Stamatis Zampetakis <[email protected]>
> > > >>>>>>>>> Sent: Monday, December 23, 2024 12:47 AM
> > > >>>>>>>>> To: [email protected] <[email protected]>
> > > >>>>>>>>> Subject: [DISCUSS] Apache Calcite Meetup February 2025
> > > >>>>>>>>>
> > > >>>>>>>>> Hi all,
> > > >>>>>>>>>
> > > >>>>>>>>> It's been quite a while since our last meetup [1, 2] so I was thinking it may be a good idea to organize one around the beginning of 2025.
> > > >>>>>>>>>
> > > >>>>>>>>> I am considering the possibility of a hybrid event with few presentations followed by open discussion and socializing. In terms of location, I am discussing something around Santa Clara, California. For those who cannot attend physically, we will use zoom or another app so that people can participate remotely.
> > > >>>>>>>>>
> > > >>>>>>>>> The tentative date that I have in mind is Thursday, 20 February, 5:00pm PST but nothing is fixed and we can adapt to accommodate more people.
> > > >>>>>>>>>
> > > >>>>>>>>> I created a small anonymous google form [3] to see if there is enough interest to hold an event on the proposed date and if it is worth organising a hybrid event instead of a fully virtual one. Please take a few seconds to submit your responses.
> > > >>>>>>>>>
> > > >>>>>>>>> Are there people willing to give a talk around Calcite?
> > > >>>>>>>>>
> > > >>>>>>>>> Best,
> > > >>>>>>>>> Stamatis
> > > >>>>>>>>>
> > > >>>>>>>>> [1] https://www.meetup.com/Apache-Calcite/
> > > >>>>>>>>> [2] https://lists.apache.org/thread/8vbdtbjxt9zx4k93dof085yj3v5z820s
> > > >>>>>>>>> [3] https://forms.gle/iuAB6hoDHuLdy3BC7
