That does look like a bug. Can you log it please. The reason for the “if (false)” is that we found that costs don’t work well if they form only a partial order, not a total order. If you have two RelNodes R1 and R2 in an equivalent set, and they have costs C1 and C2, and neither C1 <= C2 nor C2 <= C1 is true, which is the Volcano planner to pick? It will tend to pick the one that it saw first, and that is bad news because it is arbitrary and non-deterministic.
So, we should probably find a way to convert a RelOptCost to a totally-ordered value, such as by applying weights to cpu, io and memory cost and returning a double. (Anyone have a better idea?) Can you log a bug for that also. Julian > On Jun 13, 2017, at 10:47 AM, JD Zheng <[email protected]> wrote: > > Hi, > > Our team currently uses calcite druid-adaptor to query data in druid. We > found that at some cases when the limit is over 10 and the data set has lots > of dimensions, the limit is not pushed down to druid. > > We looked further at the cost calculation of the different plans, and found > that the following code in Sort.java looks suspicious: > > > @Override public RelOptCost computeSelfCost(RelOptPlanner planner, > RelMetadataQuery mq) { > // Higher cost if rows are wider discourages pushing a project through a > // sort. > double rowCount = mq.getRowCount(this); > double bytesPerRow = getRowType().getFieldCount() * 4; > return planner.getCostFactory().makeCost( > Util.nLogN(rowCount) * bytesPerRow, rowCount, 0); > > > > And the definition of makeCost is: > > public interface RelOptCostFactory { > /** > * Creates a cost object. > */ > RelOptCost makeCost(double rowCount, double cpu, double io); > > > > So, the first parameter should be rowCount, the second is cpu. > > It seems that caller is feeding the wrong parameters. > > Once we switch these two parameters, it works out fine: the limit is pushed > down to the druid query. > > > Are we doing the right thing by switching the parameters? Is it a bug here or > there’s any reason we feed the parameters this way? > > > > > By the way, we found some dead code in VolcanoCost.java > <PastedGraphic-1.tiff> > > > Does it mean that we don’t need to bother feed in the cpu cost and io cost, > these costs should be somehow modeled in rowcounts? > > > Thanks, > > -Jiandan
