That does look like a bug. Can you log it, please?

The reason for the “if (false)” is that we found that costs don’t work well
if they form only a partial order, not a total order. If you have two
RelNodes R1 and R2 in the same equivalence set, and they have costs C1 and
C2, and neither C1 <= C2 nor C2 <= C1 is true, which should the Volcano
planner pick? It will tend to pick the one that it saw first, and that is
bad news because it is arbitrary and non-deterministic.

So, we should probably find a way to convert a RelOptCost to a
totally-ordered value, such as by applying weights to the cpu, io and memory
costs and returning a double. (Anyone have a better idea?) Can you log a bug
for that also?
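
For illustration only (nothing below is existing Calcite API, and the
weights are made up), the idea would be roughly:

  // Hypothetical sketch: collapse (rowCount, cpu, io) into a single double
  // so that any two costs are always comparable, i.e. a total order.
  class ScalarizedCost implements Comparable<ScalarizedCost> {
    private static final double ROW_WEIGHT = 1.0;  // made-up weights
    private static final double CPU_WEIGHT = 1.0;
    private static final double IO_WEIGHT = 4.0;

    final double rowCount;
    final double cpu;
    final double io;

    ScalarizedCost(double rowCount, double cpu, double io) {
      this.rowCount = rowCount;
      this.cpu = cpu;
      this.io = io;
    }

    // Weighted sum that imposes the total order.
    double scalar() {
      return ROW_WEIGHT * rowCount + CPU_WEIGHT * cpu + IO_WEIGHT * io;
    }

    @Override public int compareTo(ScalarizedCost other) {
      return Double.compare(this.scalar(), other.scalar());
    }
  }

With something like this, two costs such as (cpu = 10, io = 1) and
(cpu = 1, io = 10), which are incomparable component-wise, always compare
one way or the other.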

Julian


> On Jun 13, 2017, at 10:47 AM, JD Zheng <[email protected]> wrote:
> 
> Hi, 
> 
> Our team currently uses the calcite druid-adaptor to query data in druid. We 
> found that in some cases, when the limit is over 10 and the data set has lots 
> of dimensions, the limit is not pushed down to druid.
> 
> We looked further at the cost calculation of the different plans, and found 
> that the following code in Sort.java looks suspicious:
> 
>  
>   @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
>       RelMetadataQuery mq) {
>     // Higher cost if rows are wider discourages pushing a project through a
>     // sort.
>     double rowCount = mq.getRowCount(this);
>     double bytesPerRow = getRowType().getFieldCount() * 4;
>     return planner.getCostFactory().makeCost(
>         Util.nLogN(rowCount) * bytesPerRow, rowCount, 0);
>   }
> 
> 
> And the definition of makeCost is:
> 
> public interface RelOptCostFactory {
>   /**
>    * Creates a cost object.
>    */
>   RelOptCost makeCost(double rowCount, double cpu, double io);
> }
> 
> 
> So, the first parameter should be rowCount, and the second should be cpu.
> 
> It seems that the caller is feeding the parameters in the wrong order.
> 
> Once we switch these two parameters, it works out fine: the limit is pushed 
> down to the druid query.
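> 
> That is, a sketch of the change we tried:
> 
>     return planner.getCostFactory().makeCost(
>         rowCount, Util.nLogN(rowCount) * bytesPerRow, 0);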
> 
> 
> Are we doing the right thing by switching the parameters? Is this a bug, or 
> is there a reason the parameters are fed this way?
> 
> 
> 
> 
> By the way, we found some dead code in VolcanoCost.java
> [pasted screenshot of the dead code in VolcanoCost.java]
> 
> 
> Does it mean that we don’t need to bother feeding in the cpu cost and io 
> cost, and that these costs should somehow be modeled in the row count?
> 
> 
> Thanks,
> 
> -Jiandan
