Sure, I’ll log them. As for making the cost model totally ordered, can we add some kind of signature that uniquely identifies a RelNode to break ties? The order would then be <cost(rowCount, cpuCost, ioCost), RelNode.signature>.
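Very roughly, I’m picturing a comparator along the following lines. This is only a sketch: Candidate and its signature field are hypothetical stand-ins, not existing Calcite classes, and the signature could be, say, a digest of the RelNode’s string form.

    import java.util.Comparator;

    /** Hypothetical holder for a RelNode's cost components and signature. */
    class Candidate {
      final double rowCount;
      final double cpuCost;
      final double ioCost;
      final String signature;

      Candidate(double rowCount, double cpuCost, double ioCost,
          String signature) {
        this.rowCount = rowCount;
        this.cpuCost = cpuCost;
        this.ioCost = ioCost;
        this.signature = signature;
      }
    }

    /**
     * Compares candidates by their cost components, falling back to a
     * deterministic signature to break ties, so that the planner's choice
     * never depends on the order in which it happened to see RelNodes.
     */
    class CandidateComparator implements Comparator<Candidate> {
      @Override public int compare(Candidate a, Candidate b) {
        int c = Double.compare(a.rowCount, b.rowCount);
        if (c != 0) {
          return c;
        }
        c = Double.compare(a.cpuCost, b.cpuCost);
        if (c != 0) {
          return c;
        }
        c = Double.compare(a.ioCost, b.ioCost);
        if (c != 0) {
          return c;
        }
        return a.signature.compareTo(b.signature);
      }
    }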
-jiandan

> On Jun 14, 2017, at 11:41 AM, Julian Hyde <[email protected]> wrote:
>
> That does look like a bug. Can you log it, please?
>
> The reason for the “if (false)” is that we found that costs don’t work well
> if they form only a partial order, not a total order. If you have two
> RelNodes R1 and R2 in an equivalence set, and they have costs C1 and C2, and
> neither C1 <= C2 nor C2 <= C1 is true, which should the Volcano planner
> pick? It will tend to pick the one that it saw first, and that is bad news
> because it is arbitrary and non-deterministic.
>
> So, we should probably find a way to convert a RelOptCost to a
> totally-ordered value, such as by applying weights to the cpu, io and memory
> costs and returning a double. (Anyone have a better idea?) Can you log a bug
> for that also?
>
> Julian
>
>
>> On Jun 13, 2017, at 10:47 AM, JD Zheng <[email protected]> wrote:
>>
>> Hi,
>>
>> Our team currently uses the Calcite Druid adapter to query data in Druid.
>> We found that in some cases, when the limit is over 10 and the data set has
>> many dimensions, the limit is not pushed down to Druid.
>>
>> We looked further at the cost calculation of the different plans, and found
>> that the following code in Sort.java looks suspicious:
>>
>>   @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
>>       RelMetadataQuery mq) {
>>     // Higher cost if rows are wider discourages pushing a project through
>>     // a sort.
>>     double rowCount = mq.getRowCount(this);
>>     double bytesPerRow = getRowType().getFieldCount() * 4;
>>     return planner.getCostFactory().makeCost(
>>         Util.nLogN(rowCount) * bytesPerRow, rowCount, 0);
>>   }
>>
>> And the definition of makeCost is:
>>
>>   public interface RelOptCostFactory {
>>     /**
>>      * Creates a cost object.
>>      */
>>     RelOptCost makeCost(double rowCount, double cpu, double io);
>>   }
>>
>> So, the first parameter should be rowCount and the second should be cpu.
>> It seems that the caller is passing the arguments in the wrong order.
>>
>> Once we switch these two arguments, it works out fine: the limit is pushed
>> down to the Druid query.
>>
>> Are we doing the right thing by switching the arguments? Is this a bug, or
>> is there a reason the parameters are passed this way?
>>
>> By the way, we found some dead code in VolcanoCost.java:
>> [attachment: PastedGraphic-1.tiff]
>>
>> Does it mean that we don’t need to bother feeding in the cpu cost and io
>> cost, and that these costs should somehow be modeled in the row count?
>>
>> Thanks,
>>
>> -Jiandan
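
For the record, the switch described in the quoted message amounts to the following corrected version of Sort.computeSelfCost, so that the arguments line up with makeCost(rowCount, cpu, io). This is just the local fix we tried, not a committed patch:

    @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
        RelMetadataQuery mq) {
      // Higher cost if rows are wider discourages pushing a project through
      // a sort.
      double rowCount = mq.getRowCount(this);
      double bytesPerRow = getRowType().getFieldCount() * 4;
      // rowCount first, then the n*log(n) sort work as cpu, then io.
      return planner.getCostFactory().makeCost(
          rowCount, Util.nLogN(rowCount) * bytesPerRow, 0);
    }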

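And just to make Julian’s weighted-sum idea concrete, it might look something like the sketch below. The weights are illustrative assumptions only; choosing them (and whether memory should be a fourth component) is exactly what the bug would need to settle.

    /**
     * Sketch of collapsing a cost's components into a single double via
     * fixed weights, so that any two costs become comparable. The weights
     * are illustrative assumptions, not tuned or agreed values.
     */
    final class WeightedCost {
      private static final double ROW_WEIGHT = 0.5;
      private static final double CPU_WEIGHT = 1.0;
      private static final double IO_WEIGHT = 4.0; // assume one io ~ 4 cpu

      private WeightedCost() {}

      /** Returns a totally ordered scalar for a (rowCount, cpu, io) cost. */
      static double toScalar(double rowCount, double cpu, double io) {
        return ROW_WEIGHT * rowCount + CPU_WEIGHT * cpu + IO_WEIGHT * io;
      }
    }

Comparing two costs would then reduce to Double.compare on the two scalars, which is a total order, so the planner’s pick no longer depends on visit order.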