Cool! I'll take a look for sure

On Thu, Sep 5, 2024 at 12:42, Alessandro Solimando (<
alessandro.solima...@gmail.com>) wrote:

> Hi Gonzalo,
> you might want to check this WIP PR from Stamatis for Hive:
> https://github.com/apache/hive/pull/5249
>
> Hive has its own physical plan (execution is based on Tez). The PR also
> introduces a Hive Spool operator.
>
> I am not familiar with the Pinot side of things but I think the PR might be
> relevant.
>
> Best regards,
> Alessandro
>
> On Thu, Sep 5, 2024, 12:26 Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>
> wrote:
>
> > You are right Julian, I was referring to Re*l*Nodes, not Re*x*Nodes.
> >
> > I didn't know about Spool. I've read that very educational Jira ticket and
> > the code using Spool in Calcite. It is very interesting to see the problems
> > you have already faced with Spools.
> > Given the current state of Spools in Calcite, I don't feel confident enough
> > to implement our solution on top of them.
> > It would be great to be able to work on having a complete Spool solution in
> > Calcite, but I don't think I will be able to do so, especially because in
> > Pinot we convert logical plans to physical plans without using Calcite
> > (something I would love to change, but there are always other priorities!).
> >
> > Therefore I think we are going to implement our own solution that looks for
> > common subtrees once the logical planning is done. Hopefully in the future,
> > with the experience we gain, we could try to contribute our solution to
> > Calcite.
> >
> > Gonzalo
> >
> > On Wed, Sep 4, 2024 at 19:48, Julian Hyde (<jhyde.apa...@gmail.com>)
> > wrote:
> >
> > > Do you mean ‘common RelNodes’ rather than RexNodes?
> > >
> > > Are you aware of https://issues.apache.org/jira/browse/CALCITE-481 ? The
> > > Spool operator (and related cases) is the starting point for discussions
> > > about DAGs. It isn’t fully implemented, but at least we’d be using the same
> > > terminology.
> > >
> > > Julian
> > >
> > >
> > > > On Sep 3, 2024, at 6:51 AM, Gonzalo Ortiz Jaureguizar <
> > > > golthir...@gmail.com> wrote:
> > > >
> > > > Hi there!
> > > >
> > > > In Pinot we want to work on a new optimization that lets us reuse some
> > > > parts of the query plan.
> > > > Basically, what we want is to change our nodes so they can send the
> > > > same data to multiple parent operators, transforming our trees into DAGs,
> > > > as shown in this diagram from Vladimir Ozerov's post on Querify Labs
> > > > <https://www.querifylabs.com/blog/data-shuffling-in-distributed-sql-engines>:
> > > >
> > > >
> > > >
> > > > I've looked for older messages in the dev mailing list and I found some
> > > > threads saying that the Calcite model is tree based and DAGs are not
> > > > supported. If that is the case I will have to implement this optimization
> > > > after the Calcite plan is generated, but I would like to avoid this because
> > > > we are trying to move more and more logic into Calcite procedures and this
> > > > would be a step back.
> > >
> > >
> >
>
