You can carry on using your own formula, but move the formula into a metadata 
provider. You just don’t need to create a subclass in order for it to get 
called. For example, if you’ve written 

  public class DrillLogicalFilter extends LogicalFilter {
    public double getRows() {
      return <<my formula>>;
    }
  }

and getRows() is its only method, you can retire the subclass and register the 
following metadata provider instead:

  public class DrillMdRowCount {
    public Double getRowCount(LogicalFilter filter) {
      return <<my formula>>;
    }
  }

Calcite uses double dispatch (dispatching to a method based on both the provider 
and the type of its first argument), so the method will be called automatically.
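
To illustrate the dispatch idea only, here is a minimal self-contained sketch. 
It does not use Calcite, and it is not how Calcite implements it: RelNode, 
Filter, Project, RowCountProvider and dispatch() below are hypothetical 
stand-ins, and the row counts are made-up constants. It assumes that a handler 
is selected by the runtime class of the first argument, falling back to 
handlers for its superclasses.

```java
import java.lang.reflect.Method;

public class DispatchSketch {
  // Hypothetical stand-ins for relational node classes (not Calcite's classes).
  static class RelNode {}
  static class Filter extends RelNode {}
  static class Project extends RelNode {}

  // A provider holds one handler per node type; the nodes are not subclassed.
  public static class RowCountProvider {
    public Double getRowCount(RelNode rel) {
      return 100.0; // fallback estimate (made-up constant)
    }

    public Double getRowCount(Filter filter) {
      return 25.0; // filter-specific estimate (made-up constant)
    }
  }

  // Picks the handler whose parameter type exactly matches the runtime class
  // of rel, walking up the superclass chain until a handler is found.
  static Double dispatch(Object provider, String name, RelNode rel)
      throws Exception {
    for (Class<?> c = rel.getClass(); c != null; c = c.getSuperclass()) {
      try {
        Method m = provider.getClass().getMethod(name, c);
        return (Double) m.invoke(provider, rel);
      } catch (NoSuchMethodException e) {
        // No handler declared for this class; try its superclass.
      }
    }
    throw new IllegalArgumentException("no handler for " + rel.getClass());
  }

  public static void main(String[] args) throws Exception {
    RowCountProvider provider = new RowCountProvider();
    RelNode filter = new Filter(); // static type is RelNode
    System.out.println(dispatch(provider, "getRowCount", filter));        // 25.0
    System.out.println(dispatch(provider, "getRowCount", new Project())); // 100.0
  }
}
```

The Filter handler is chosen even though the variable's static type is RelNode, 
which is why no subclass of the node is needed just to change its estimate.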

Julian



> On Nov 23, 2015, at 5:56 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
> 
> My understanding is RelMetadataProvider gives the estimation of row
> count, distinct row count, etc. But it's still up to each Rel node to
> decide how to estimate its own cost, given the row count, distinct
> row count etc from MetadataProvider. Are you suggesting we completely
> remove the Drill's costing estimation method, and use Calcite's
> default one?
> 
> 
> 
> On Mon, Nov 23, 2015 at 5:35 PM, Julian Hyde <jh...@apache.org> wrote:
>> Yes. You don’t need an “implement” method (or yours can just throw).
>> 
>> You could use your own serialization to/from JSON or you could use 
>> RelJsonWriter/RelJsonReader.
>> 
>> Julian
>> 
>> 
>>> On Nov 23, 2015, at 5:31 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>> 
>>> We could create serializers and deserializers for the logical plan stuff.
>>> It looks like we can resolve the costing through metadata providers unless
>>> I misunderstood what Julian was suggesting.
>>> 
>>> 
>>> 
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>> 
>>> On Mon, Nov 23, 2015 at 5:12 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
>>> 
>>>> @Jacques,
>>>> 
>>>> Every DrillLogicalRel has to override computeSelfCost() and implement the
>>>> implement() method. The latter is to get the Logical Plan, which is one of
>>>> three input types Drill should accept (SQL, Logical Plan, Physical
>>>> Plan).
>>>> 
>>>> So, for now, we do have to override/extend all DrillLogicalRel.
>>>> 
>>>> 
>>>> On Mon, Nov 23, 2015 at 4:55 PM, Julian Hyde <jh...@apache.org> wrote:
>>>>> I’m not sure what properties / behavior you want to override but
>>>> remember that Calcite specifies a lot of things as traits or metadata.
>>>>> 
>>>>> For example, “double RelNode.getRows()” is deprecated and you would
>>>> these days use RelMetadataQuery.getRowCount(). You would not need to
>>>> sub-class a RelNode to override its row-count estimate, just supply a
>>>> different metadata provider.
>>>>> 
>>>>> Julian
>>>>> 
>>>>> 
>>>>>> On Nov 23, 2015, at 4:50 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>>>>> 
>>>>>> Yes, my suggestion is removal of DRILL_LOGICAL. @Hsuan, this is
>>>> independent
>>>>>> from the number of phases and I'm not suggesting changing that.
>>>>>> 
>>>>>> My main thought was: if we only need to override one or two rels, do
>>>> only
>>>>>> that rather than having a wholesale copy of every operator with a bunch
>>>> of
>>>>>> basic noop rules.
>>>>>> 
>>>>>> --
>>>>>> Jacques Nadeau
>>>>>> CTO and Co-Founder, Dremio
>>>>>> 
>>>>>> On Mon, Nov 23, 2015 at 4:37 PM, Jinfeng Ni <jinfengn...@gmail.com>
>>>> wrote:
>>>>>> 
>>>>>>> @Jacques, are you talking about removing the convention DRILL_LOGICAL?
>>>>>>> 
>>>>>>> DrillRel extends Calcite's LogicalRel. It overrides some of LogicalRel's
>>>>>>> methods, and adds new methods.  Therefore, even if we remove the
>>>>>>> DRILL_LOGICAL convention, we still have to maintain a set of classes
>>>>>>> extended from Calcite Logical. I'm not clear what benefit we would get by
>>>>>>> removing the DRILL_LOGICAL convention.
>>>>>>> 
>>>>>>> If we want to remove the complete set of DrillLogical classes, then
>>>>>>> I'm not sure where we put the Drill specific logic, for instance,
>>>>>>> Drill Join has certain restrictions that differ from Calcite Join.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Nov 23, 2015 at 4:11 PM, Hsuan Yi Chu <hyi...@maprtech.com>
>>>> wrote:
>>>>>>>> My understanding is:
>>>>>>>> In logical planning, we determine the "structure" of the tree (e.g.,
>>>> join
>>>>>>>> order)
>>>>>>>> And then in physical, we determine the implementation (e.g., merge vs
>>>>>>> hash
>>>>>>>> join).
>>>>>>>> 
>>>>>>>> This staging seems clean to me. So what is the motivation to merge
>>>> them
>>>>>>> all
>>>>>>>> together?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 23, 2015 at 2:51 PM, Jacques Nadeau <jacq...@dremio.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Anybody think we should just get rid of Drels (Rel > Drel > Prel) and
>>>>>>> use
>>>>>>>>> Calcite's logical representation directly (Rel > Prel)?
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Jacques Nadeau
>>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>>> 
>>>>>>>>> On Mon, Nov 23, 2015 at 1:57 PM, Mehant Baid <baid.meh...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Currently all rules based on Calcite logical rels and Drill logical
>>>>>>> rels
>>>>>>>>>> are put together and are fired together. As part of DRILL-3996,
>>>>>>> Jinfeng
>>>>>>>>>> will break it down into different phases. I should be able to take
>>>>>>>>>> advantage of this and move the directory based partition pruning to
>>>>>>> fire
>>>>>>>>>> based on Calcite rels.
>>>>>>>>>> 
>>>>>>>>>> Thanks
>>>>>>>>>> Mehant
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 11/23/15 10:58 AM, Hanifi GUNES wrote:
>>>>>>>>>> 
>>>>>>>>>>> The general idea of multi-phase pruning makes sense to me. I am
>>>>>>>>> wondering,
>>>>>>>>>>> though, are we referring to introducing a new planning phase before
>>>>>>> the
>>>>>>>>>>> logical or separating out the logic so as to make directory pruning
>>>>>>> kick
>>>>>>>>>>> off ahead of column partitioning?
>>>>>>>>>>> 
>>>>>>>>>>> 2015-11-23 10:33 GMT-08:00 Mehant Baid <baid.meh...@gmail.com>:
>>>>>>>>>>> 
>>>>>>>>>>> As part of DRILL-3996 <
>>>>>>> https://issues.apache.org/jira/browse/DRILL-3996
>>>>>>>>>> 
>>>>>>>>>>>> Jinfeng mentioned that he plans to move the directory based
>>>> pruning
>>>>>>>>> rule
>>>>>>>>>>>> earlier than column based pruning. I want to expand on that a
>>>>>>> little,
>>>>>>>>>>>> provide the motivation and gather thoughts/ feedback.
>>>>>>>>>>>> 
>>>>>>>>>>>> Currently both the directory based pruning and the column based
>>>>>>> pruning
>>>>>>>>>>>> are
>>>>>>>>>>>> fired in the same planning phase and are based on Drill logical
>>>>>>> rels.
>>>>>>>>>>>> This
>>>>>>>>>>>> is not optimal in the case where data is organized in such a way
>>>>>>> that
>>>>>>>>>>>> both
>>>>>>>>>>>> directory based pruning and column based pruning can be applied
>>>>>>> (when
>>>>>>>>> the
>>>>>>>>>>>> data is organized with a nested directory structure plus the
>>>>>>> individual
>>>>>>>>>>>> files contain partition columns). As part of creating the Drill
>>>>>>> logical
>>>>>>>>>>>> scan we read the footers of all the files involved. If the
>>>> directory
>>>>>>>>>>>> based
>>>>>>>>>>>> pruning rule is fired earlier (rule to fire based on calcite
>>>> logical
>>>>>>>>>>>> rels)
>>>>>>>>>>>> then we will be able to prune out unnecessary directories and save
>>>>>>> the
>>>>>>>>>>>> work
>>>>>>>>>>>> of reading the footers of these files.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Mehant
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>> 
