Hi,
Our team currently uses the Calcite Druid adapter to query data in Druid. We found
that in some cases, when the limit is greater than 10 and the data set has many
dimensions, the limit is not pushed down to Druid.
We looked further into the cost calculation of the different plans and found
that the following code in Sort.java looks suspicious:
@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
    RelMetadataQuery mq) {
  // Higher cost if rows are wider discourages pushing a project through a
  // sort.
  double rowCount = mq.getRowCount(this);
  double bytesPerRow = getRowType().getFieldCount() * 4;
  return planner.getCostFactory().makeCost(
      Util.nLogN(rowCount) * bytesPerRow, rowCount, 0);
}
And the definition of makeCost is:
public interface RelOptCostFactory {
  /**
   * Creates a cost object.
   */
  RelOptCost makeCost(double rowCount, double cpu, double io);
}
So the first parameter should be the row count and the second the CPU cost.
It seems that the caller is feeding the arguments in the wrong order.
Once we swapped these two parameters it worked out fine: the limit was pushed
down into the Druid query.
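Concretely, our local edit just swaps the first two arguments of makeCost in
Sort.computeSelfCost (shown here as a sketch, not a proposed patch):

@Override public RelOptCost computeSelfCost(RelOptPlanner planner,
    RelMetadataQuery mq) {
  // Same formula as before, but the n*log(n) term now goes into cpu and
  // the plain row count into rowCount, matching the factory signature.
  double rowCount = mq.getRowCount(this);
  double bytesPerRow = getRowType().getFieldCount() * 4;
  return planner.getCostFactory().makeCost(
      rowCount,                             // rowCount
      Util.nLogN(rowCount) * bytesPerRow,   // cpu
      0);                                   // io
}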
Are we doing the right thing by switching the parameters? Is this a bug, or is
there a reason the parameters are fed this way?
By the way, we also found some dead code in VolcanoCost.java. Does this mean
that we don't need to bother feeding in the CPU cost and IO cost, and that
these costs should somehow be modeled in the row count instead?
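The dead code we mean is the unreachable branch in isLe, which suggests that
only the row count matters when comparing plans. Roughly (paraphrased from our
checkout, so the field names may not be exact):

public boolean isLe(RelOptCost other) {
  VolcanoCost that = (VolcanoCost) other;
  if (true) {
    // Only row counts are compared; cpu and io are ignored.
    return this == that
        || this.rowCount <= that.rowCount;
  }
  // Dead code: the full comparison of rowCount, cpu and io below is
  // never reached because of the if (true) above.
  return (this == that)
      || ((this.rowCount <= that.rowCount)
          && (this.cpu <= that.cpu)
          && (this.io <= that.io));
}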
Thanks,
-Jiandan