We could create serializers and deserializers for the logical plan.
It looks like we can resolve the costing through metadata providers, unless
I misunderstood what Julian was suggesting.



--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Mon, Nov 23, 2015 at 5:12 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:

> @Jacques,
>
> Every DrillLogicalRel has to override computeSelfCost() and implement the
> implement() method. The latter produces the Logical Plan, which is one of
> the three input types Drill should accept (SQL, Logical Plan, Physical
> Plan).
>
> So, for now, we do have to override/extend all DrillLogicalRel classes.
>
>
> On Mon, Nov 23, 2015 at 4:55 PM, Julian Hyde <jh...@apache.org> wrote:
> > I’m not sure what properties / behavior you want to override, but
> > remember that Calcite specifies a lot of things as traits or metadata.
> >
> > For example, “double RelNode.getRows()” is deprecated and these days you
> > would use RelMetadataQuery.getRowCount(). You would not need to
> > sub-class a RelNode to override its row-count estimate, just supply a
> > different metadata provider.
> >
> > Julian
> >
> >
> >> On Nov 23, 2015, at 4:50 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> >>
> >> Yes, my suggestion is removal of DRILL_LOGICAL. @Hsuan, this is
> >> independent from the number of phases and I'm not suggesting changing
> >> that.
> >>
> >> My main thought was: if we only need to override one or two rels, do
> >> only that rather than having a wholesale copy of every operator with a
> >> bunch of basic noop rules.
> >>
> >> --
> >> Jacques Nadeau
> >> CTO and Co-Founder, Dremio
> >>
> >> On Mon, Nov 23, 2015 at 4:37 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
> >>
> >>> @Jacques, are you talking about removing the convention DRILL_LOGICAL?
> >>>
> >>> DrillRel extends Calcite's LogicalRel. It overrides some of LogicalRel's
> >>> methods, and adds new methods. Therefore, even if we remove the
> >>> DRILL_LOGICAL convention, we still have to maintain a set of classes
> >>> extended from Calcite Logical. I'm not clear what benefit we would get by
> >>> removing the DRILL_LOGICAL convention.
> >>>
> >>> If we want to remove the complete set of DrillLogical classes, then
> >>> I'm not sure where we would put the Drill-specific logic. For instance,
> >>> Drill Join has certain restrictions that differ from Calcite Join.
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Nov 23, 2015 at 4:11 PM, Hsuan Yi Chu <hyi...@maprtech.com> wrote:
> >>>> My understanding is:
> >>>> In logical planning, we determine the "structure" of the tree (e.g.,
> >>>> join order).
> >>>> And then in physical, we determine the implementation (e.g., merge vs
> >>>> hash join).
> >>>>
> >>>> This staging seems clean to me. So what is the motivation to merge
> >>>> them all together?
> >>>>
> >>>>
> >>>> On Mon, Nov 23, 2015 at 2:51 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> >>>>
> >>>>> Anybody think we should just get rid of Drels (Rel > Drel > Prel) and
> >>>>> use Calcite's logical representation directly (Rel > Prel)?
> >>>>>
> >>>>> --
> >>>>> Jacques Nadeau
> >>>>> CTO and Co-Founder, Dremio
> >>>>>
> >>>>> On Mon, Nov 23, 2015 at 1:57 PM, Mehant Baid <baid.meh...@gmail.com> wrote:
> >>>>>
> >>>>>> Currently all rules based on Calcite logical rels and Drill logical
> >>>>>> rels are put together and are fired together. As part of DRILL-3996,
> >>>>>> Jinfeng will break them down into different phases. I should be able
> >>>>>> to take advantage of this and move the directory based partition
> >>>>>> pruning to fire based on Calcite rels.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Mehant
> >>>>>>
> >>>>>>
> >>>>>> On 11/23/15 10:58 AM, Hanifi GUNES wrote:
> >>>>>>
> >>>>>>> The general idea of multi-phase pruning makes sense to me. I am
> >>>>>>> wondering, though, are we referring to introducing a new planning
> >>>>>>> phase before the logical or separating out the logic so as to make
> >>>>>>> directory pruning kick off ahead of column partitioning?
> >>>>>>>
> >>>>>>> 2015-11-23 10:33 GMT-08:00 Mehant Baid <baid.meh...@gmail.com>:
> >>>>>>>
> >>>>>>>> As part of DRILL-3996
> >>>>>>>> <https://issues.apache.org/jira/browse/DRILL-3996>, Jinfeng mentioned
> >>>>>>>> that he plans to move the directory based pruning rule earlier than
> >>>>>>>> column based pruning. I want to expand on that a little, provide the
> >>>>>>>> motivation and gather thoughts/feedback.
> >>>>>>>>
> >>>>>>>> Currently both the directory based pruning and the column based
> >>>>>>>> pruning are fired in the same planning phase and are based on Drill
> >>>>>>>> logical rels. This is not optimal in the case where data is organized
> >>>>>>>> in such a way that both directory based pruning and column based
> >>>>>>>> pruning can be applied (when the data is organized with a nested
> >>>>>>>> directory structure and the individual files contain partition
> >>>>>>>> columns). As part of creating the Drill logical scan we read the
> >>>>>>>> footers of all the files involved. If the directory based pruning
> >>>>>>>> rule is fired earlier (rule to fire based on Calcite logical rels)
> >>>>>>>> then we will be able to prune out unnecessary directories and save
> >>>>>>>> the work of reading the footers of these files.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Mehant
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >
>
