You can carry on using your own formula, but move the formula into a metadata provider. You no longer need to create a subclass in order for it to get called. For example, if you’ve written

    public class DrillLogicalFilter extends LogicalFilter {
      public double getRows() {
        return <<my formula>>;
      }
    }

and getRows() is its only method, you can retire the subclass and register the following metadata provider instead:

    public class DrillMdRowCount {
      public Double getRowCount(LogicalFilter filter) {
        return <<my formula>>;
      }
    }

Calcite uses double dispatch (dispatching to a method based on the provider AND its first argument type), so the method will be called automatically.

Julian

> On Nov 23, 2015, at 5:56 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
>
> My understanding is RelMetadataProvider gives the estimation of row
> count, distinct row count, etc. But it's still up to each Rel node to
> decide how to estimate its own cost, given the row count, distinct
> row count etc from MetadataProvider. Are you suggesting we completely
> remove the Drill's costing estimation method, and use Calcite's
> default one?
>
> On Mon, Nov 23, 2015 at 5:35 PM, Julian Hyde <jh...@apache.org> wrote:
>> Yes. You don’t need an “implement” method (or yours can just throw).
>>
>> You could use your own serialization to/from JSON or you could use
>> RelJsonWriter/RelJsonReader.
>>
>> Julian
>>
>>> On Nov 23, 2015, at 5:31 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>>
>>> We could create serializers and deserializers for the logical plan stuff.
>>> It looks like we can resolve the costing through metadata providers unless
>>> I misunderstood what Julian was suggesting.
>>>
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>>
>>> On Mon, Nov 23, 2015 at 5:12 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
>>>
>>>> @Jacques,
>>>>
>>>> Every DrillLogicalRel has to override computeSelfCost(), and implement
>>>> implement() method. The latter is to get Logical Plan, which is one of
>>>> three input types Drill should accept (SQL, Logical Plan, Physical
>>>> Plan).
>>>>
>>>> So, for now, we do have to override/extend all DrillLogicalRel.
>>>>
>>>> On Mon, Nov 23, 2015 at 4:55 PM, Julian Hyde <jh...@apache.org> wrote:
>>>>> I’m not sure what properties / behavior you want to override but
>>>>> remember that Calcite specifies a lot of things as traits or metadata.
>>>>>
>>>>> For example, “double RelNode.getRows()” is deprecated and you would
>>>>> these days use RelMetadataQuery.getRowCount(). You would not need to
>>>>> sub-class a RelNode to override its row-count estimate, just supply a
>>>>> different metadata provider.
>>>>>
>>>>> Julian
>>>>>
>>>>>> On Nov 23, 2015, at 4:50 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>>>>>
>>>>>> Yes, my suggestion is removal of DRILL_LOGICAL. @Hsuan, this is
>>>>>> independent from the number of phases and I'm not suggesting
>>>>>> changing that.
>>>>>>
>>>>>> My main thought was: if we only need to override one or two rels, do
>>>>>> only that rather than having a wholesale copy of every operator with
>>>>>> a bunch of basic noop rules.
>>>>>>
>>>>>> --
>>>>>> Jacques Nadeau
>>>>>> CTO and Co-Founder, Dremio
>>>>>>
>>>>>> On Mon, Nov 23, 2015 at 4:37 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
>>>>>>
>>>>>>> @Jacques, are you talking about removing the convention DRILL_LOGICAL?
>>>>>>>
>>>>>>> DrillRel extends Calcite's LogicalRel. It overrides some of LogicalRel's
>>>>>>> methods, and adds new methods. Therefore, even if we remove the
>>>>>>> DRILL_LOGICAL convention, we still have to maintain a set of classes
>>>>>>> extended from Calcite Logical. I'm not clear what benefit we would get
>>>>>>> by removing the DRILL_LOGICAL convention.
>>>>>>>
>>>>>>> If we want to remove the complete set of DrillLogical classes, then
>>>>>>> I'm not sure where we put the Drill specific logic, for instance,
>>>>>>> Drill Join has certain restrictions different from Calcite Join.
>>>>>>>
>>>>>>> On Mon, Nov 23, 2015 at 4:11 PM, Hsuan Yi Chu <hyi...@maprtech.com> wrote:
>>>>>>>> My understanding is:
>>>>>>>> In logical planning, we determine the "structure" of the tree (e.g.,
>>>>>>>> join order)
>>>>>>>> And then in physical, we determine the implementation (e.g., merge vs
>>>>>>>> hash join).
>>>>>>>>
>>>>>>>> This staging seems clean to me. So what is the motivation to merge
>>>>>>>> them all together?
>>>>>>>>
>>>>>>>> On Mon, Nov 23, 2015 at 2:51 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>>>>>>>
>>>>>>>>> Anybody think we should just get rid of Drels (Rel > Drel > Prel) and
>>>>>>>>> use Calcite's logical representation directly (Rel > Prel)?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jacques Nadeau
>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>>
>>>>>>>>> On Mon, Nov 23, 2015 at 1:57 PM, Mehant Baid <baid.meh...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Currently all rules based on Calcite logical rels and Drill logical
>>>>>>>>>> rels are put together and are fired together. As part of DRILL-3996,
>>>>>>>>>> Jinfeng will break it down into different phases. I should be able
>>>>>>>>>> to take advantage of this and move the directory based partition
>>>>>>>>>> pruning to fire based on Calcite rels.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Mehant
>>>>>>>>>>
>>>>>>>>>> On 11/23/15 10:58 AM, Hanifi GUNES wrote:
>>>>>>>>>>
>>>>>>>>>>> The general idea of multi-phase pruning makes sense to me. I am
>>>>>>>>>>> wondering, though, are we referring to introducing a new planning
>>>>>>>>>>> phase before the logical or separating out the logic so as to make
>>>>>>>>>>> directory pruning kick off ahead of column partitioning?
>>>>>>>>>>>
>>>>>>>>>>> 2015-11-23 10:33 GMT-08:00 Mehant Baid <baid.meh...@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> As part of DRILL-3996 <https://issues.apache.org/jira/browse/DRILL-3996>
>>>>>>>>>>>> Jinfeng mentioned that he plans to move the directory based pruning
>>>>>>>>>>>> rule earlier than column based pruning. I want to expand on that a
>>>>>>>>>>>> little, provide the motivation and gather thoughts/feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> Currently both the directory based pruning and the column based
>>>>>>>>>>>> pruning are fired in the same planning phase and are based on Drill
>>>>>>>>>>>> logical rels. This is not optimal in the case where data is
>>>>>>>>>>>> organized in such a way that both directory based pruning and
>>>>>>>>>>>> column based pruning can be applied (when the data is organized
>>>>>>>>>>>> with a nested directory structure plus the individual files contain
>>>>>>>>>>>> partition columns). As part of creating the Drill logical scan we
>>>>>>>>>>>> read the footers of all the files involved. If the directory based
>>>>>>>>>>>> pruning rule is fired earlier (rule to fire based on Calcite
>>>>>>>>>>>> logical rels) then we will be able to prune out unnecessary
>>>>>>>>>>>> directories and save the work of reading the footers of these
>>>>>>>>>>>> files.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Mehant
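
For readers who want to see the dispatch idea from the top of this message in isolation, here is a minimal, self-contained sketch. It is not Calcite code and does not use Calcite's API: Node, Filter, Project, MyRowCountProvider and rowCount() are hypothetical stand-ins, the 0.25 selectivity is a made-up placeholder for "my formula", and the reflective lookup is only an assumption about how such dispatch could be done. The point it illustrates is that the node classes carry no estimation logic, and the handler is chosen from the runtime type of the first argument.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Toy model only: every name here is hypothetical, none of it is Calcite API. */
final class RowCountDispatchDemo {

  /** Stand-in for a relational node; it has no row-count logic of its own. */
  static class Node {
    final double inputRows;
    Node(double inputRows) { this.inputRows = inputRows; }
  }

  /** Stand-in for a filter node. */
  static class Filter extends Node {
    Filter(double inputRows) { super(inputRows); }
  }

  /** Stand-in for a node type the provider has no specific handler for. */
  static class Project extends Node {
    Project(double inputRows) { super(inputRows); }
  }

  /** Provider: one public getRowCount overload per node type it handles. */
  static class MyRowCountProvider {
    public Double getRowCount(Node node) {
      return node.inputRows;            // default: pass the input estimate through
    }
    public Double getRowCount(Filter filter) {
      return filter.inputRows * 0.25;   // made-up selectivity standing in for "my formula"
    }
  }

  /**
   * Picks the provider method whose parameter type matches the node's runtime
   * class, walking up the superclass chain until a handler is found.
   */
  static Double rowCount(Object provider, Node node) {
    for (Class<?> c = node.getClass(); c != null; c = c.getSuperclass()) {
      try {
        Method m = provider.getClass().getMethod("getRowCount", c);
        return (Double) m.invoke(provider, node);
      } catch (NoSuchMethodException e) {
        // No handler declared for this exact class; try its superclass.
      } catch (IllegalAccessException | InvocationTargetException e) {
        throw new IllegalStateException(e);
      }
    }
    throw new IllegalArgumentException("no getRowCount handler for " + node.getClass());
  }

  public static void main(String[] args) {
    MyRowCountProvider provider = new MyRowCountProvider();
    System.out.println(rowCount(provider, new Filter(100.0)));   // uses getRowCount(Filter)
    System.out.println(rowCount(provider, new Project(100.0)));  // falls back to getRowCount(Node)
  }
}
```

In this sketch, changing the estimate for filters means editing or swapping the provider; no Filter subclass is involved, which is the property the reply above relies on.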