Cool! I'll take a look for sure.

On Thu, Sep 5, 2024 at 12:42, Alessandro Solimando (<alessandro.solima...@gmail.com>) wrote:

> Hi Gonzalo,
> you might want to check this WIP PR from Stamatis for Hive:
> https://github.com/apache/hive/pull/5249
>
> Hive has its own physical plan (execution is based on Tez). The PR also
> introduces a Hive Spool operator.
>
> I am not familiar with the Pinot side of things but I think the PR might be
> relevant.
>
> Best regards,
> Alessandro
>
> On Thu, Sep 5, 2024, 12:26 Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>
> wrote:
>
> > You are right Julian, I was referring to Re*l*Nodes, not Re*x*Nodes.
> >
> > I didn't know about Spool. I've read that very educational Jira ticket and
> > the code using Spool in Calcite. It is very interesting to see the problems
> > you have already faced with Spools.
> > Given the current state of Spools in Calcite, I don't feel confident enough
> > to implement our solution on top of them.
> > It would be great to be able to work on having a complete Spool solution in
> > Calcite, but I don't think I will be able to do so, especially because in
> > Pinot we convert logical plans to physical plans without using Calcite
> > (something I would love to change, but there are always other priorities!).
> >
> > Therefore I think we are going to implement our own solution that looks for
> > subtrees once the logical planning is done. Hopefully in the future, with
> > the experience we gain, we can try to contribute our solution to Calcite.
> >
> > Gonzalo
> >
> > On Wed, Sep 4, 2024 at 19:48, Julian Hyde (<jhyde.apa...@gmail.com>) wrote:
> >
> > > Do you mean ‘common RelNodes’ rather than RexNodes?
> > >
> > > Are you aware of https://issues.apache.org/jira/browse/CALCITE-481 ? The
> > > Spool operator (and related cases) is the starting point for discussions
> > > about DAGs. It isn’t fully implemented, but at least we’d be using the same
> > > terminology.
> > >
> > > Julian
> > >
> > > > On Sep 3, 2024, at 6:51 AM, Gonzalo Ortiz Jaureguizar <golthir...@gmail.com> wrote:
> > > >
> > > > Hi there!
> > > >
> > > > In Pinot we want to work on a new optimization that lets us reuse some
> > > > parts of the query plan.
> > > > Basically what we want is to change our nodes to be able to send the
> > > > same data to multiple parent operators, transforming our trees into DAGs
> > > > as shown in this diagram from Vladimir Ozerov's post on the Querify Labs blog
> > > > <https://www.querifylabs.com/blog/data-shuffling-in-distributed-sql-engines>:
> > > >
> > > > I've looked for older messages in the dev mailing list and I found some
> > > > threads saying that the Calcite model is tree based and DAGs are not
> > > > supported. If that is the case I will have to implement this optimization
> > > > after the Calcite plan is generated, but I would like to avoid this because
> > > > we are trying to move more and more logic into Calcite procedures and this
> > > > would be a step back.