Yes. Optiq separates metadata (e.g. cardinality, i.e. how many rows a RelNode
is expected to return) from cost. The default cost model is based on
cardinality, but you can define other cost models.
You can see this in action in EnumerableJoinRel.computeSelfCost [1]. Even
though joins commute (i.e. X join Y returns the same results as Y join X), an
EnumerableJoinRel will be cheaper if the input with fewer rows is on the left.
We chose a cost model to reflect that.
    // Cheaper if the smaller number of rows is coming from the LHS.
    // Model this by adding L log L to the cost.
    final double rightRowCount = right.getRows();
    final double leftRowCount = left.getRows();
    if (Double.isInfinite(leftRowCount)) {
      rowCount = leftRowCount;
    } else {
      rowCount += Util.nLogN(leftRowCount);
    }
    if (Double.isInfinite(rightRowCount)) {
      rowCount = rightRowCount;
    } else {
      rowCount += rightRowCount;
    }
    return planner.getCostFactory().makeCost(rowCount, 0, 0);
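To see the asymmetry in numbers, here is a minimal standalone sketch of the formula in the excerpt. It does not use Optiq: JoinCostSketch and selfCost are hypothetical names, nLogN is only assumed to behave like Util.nLogN (n times the natural log of n, or n itself when n < 1), and the starting row count that the real method already holds in rowCount is passed in as a parameter.

```java
final class JoinCostSketch {
  /** Assumed stand-in for Util.nLogN: d * ln(d), or d itself when d < 1. */
  static double nLogN(double d) {
    return d < 1.0 ? d : d * Math.log(d);
  }

  /**
   * Mirrors the excerpt: start from a base row count, then add
   * L log L for the left input and R for the right input.
   * An infinite input row count makes the whole cost infinite.
   */
  static double selfCost(double baseRowCount, double leftRowCount,
      double rightRowCount) {
    double rowCount = baseRowCount;
    if (Double.isInfinite(leftRowCount)) {
      rowCount = leftRowCount;
    } else {
      rowCount += nLogN(leftRowCount);
    }
    if (Double.isInfinite(rightRowCount)) {
      rowCount = rightRowCount;
    } else {
      rowCount += rightRowCount;
    }
    return rowCount;
  }

  public static void main(String[] args) {
    // Same two inputs (10 rows and 1000 rows), in both orders.
    double smallOnLeft = selfCost(1000, 10, 1000);
    double smallOnRight = selfCost(1000, 1000, 10);
    System.out.println("10-row input on the left:  " + smallOnLeft);
    System.out.println("10-row input on the right: " + smallOnRight);
  }
}
```

With the same pair of inputs, the ordering that puts the 10-row input on the left gets the lower cost, which is what lets the planner choose between two otherwise equivalent joins.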
The simple way to modify the cost model is to modify (or override)
RelNode.computeSelfCost. Metadata and cost are also pluggable [2].
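For the tie in your dump (EXPR$0=6 versus EXPR$0=+(4, 2), both at 200.0 rows), the idea might look roughly like the sketch below. This is an illustration of the override pattern only, not Optiq code: Project, LiteralPreferringProject, isLiteral and the 1% penalty are all hypothetical, and costs are plain doubles here, whereas a real override would build its result with planner.getCostFactory().makeCost(...) as in the excerpt above.

```java
import java.util.Arrays;
import java.util.List;

final class TieBreakSketch {
  /** Hypothetical stand-in for a project node; not an Optiq class. */
  static class Project {
    final double rowCount;
    final List<String> exprs;

    Project(double rowCount, List<String> exprs) {
      this.rowCount = rowCount;
      this.exprs = exprs;
    }

    /** Default model: self cost depends only on cardinality. */
    double computeSelfCost() {
      return rowCount;
    }
  }

  /** Override that makes unreduced expressions slightly more expensive. */
  static class LiteralPreferringProject extends Project {
    LiteralPreferringProject(double rowCount, List<String> exprs) {
      super(rowCount, exprs);
    }

    @Override
    double computeSelfCost() {
      double cost = super.computeSelfCost();
      for (String expr : exprs) {
        if (!isLiteral(expr)) {
          // Hypothetical penalty: 1% of the row count per non-literal.
          cost += 0.01 * rowCount;
        }
      }
      return cost;
    }
  }

  /** Hypothetical check: treats a bare integer such as "6" as a literal. */
  static boolean isLiteral(String expr) {
    return expr.matches("-?\\d+");
  }

  public static void main(String[] args) {
    Project folded = new LiteralPreferringProject(100, Arrays.asList("6"));
    Project unfolded =
        new LiteralPreferringProject(100, Arrays.asList("+(4, 2)"));
    System.out.println("EXPR$0=6       -> " + folded.computeSelfCost());
    System.out.println("EXPR$0=+(4, 2) -> " + unfolded.computeSelfCost());
  }
}
```

Under the default model both projects cost the same; with the override, the one whose expression is already a literal is strictly cheaper, so it can win as the best expression. Whether a small penalty actually breaks the tie depends on which components of the cost your planner compares.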
Julian
[1] https://github.com/apache/incubator-optiq/blob/5d209a509b6e2ca895626105950906cad154fac9/core/src/main/java/net/hydromatic/optiq/rules/java/JavaRules.java#L167
[2] https://issues.apache.org/jira/browse/OPTIQ-362
On Aug 11, 2014, at 2:45 AM, Ravi Nallappan <[email protected]> wrote:
> Hi,
>
>
>
> Is there a way in Optiq to promote an expression (based on our custom rule)
> as the best expression even if the two have the same cost (in terms of
> cardinality)?
>
>
>
> ------------
>
> Sets:
>
> Set#0, type: RecordType(JavaType(class java.lang.Integer) EMPNO,
> JavaType(class java.lang.String) NAME, JavaType(class java.lang.Integer)
> DEPTNO, JavaType(class java.lang.String) GENDER, JavaType(class
> java.lang.String) CITY, JavaType(class java.lang.Integer) EMPID,
> JavaType(class java.lang.Integer) AGE, JavaType(class java.lang.Boolean)
> SLACKER, JavaType(class java.lang.Boolean) MANAGER, JavaType(class
> java.lang.String) JOINEDAT)
>
> rel#4:Subset#0.ENUMERABLE.[], best=rel#0,
> importance=0.7290000000000001
>
> rel#0:CsvTableScan.ENUMERABLE.[](table=[SALES,
> EMPS],fields=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), rowcount=100.0, cumulative
> cost={100.0 rows, 101.0 cpu, 0.0 io}
>
> Set#1, type: RecordType()
>
> rel#6:Subset#1.NONE.[], best=null, importance=0.81
>
>
> rel#5:ProjectRel.NONE.[](child=rel#4:Subset#0.ENUMERABLE.[]),
> rowcount=100.0, cumulative cost={inf}
>
> rel#13:Subset#1.ENUMERABLE.[], best=rel#21, importance=0.9
>
>
> rel#14:AbstractConverter.ENUMERABLE.[](child=rel#6:Subset#1.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=1.7976931348623157E308, cumulative cost={inf}
>
>
> rel#17:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[]),
> rowcount=100.0, cumulative cost={200.0 rows, 101.0 cpu, 0.0 io}
>
> rel#21:CsvTableScan.ENUMERABLE.[](table=[SALES,
> EMPS],fields=[]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu,
> 0.0 io}
>
> Set#2, type: RecordType(INTEGER EXPR$0)
>
> rel#8:Subset#2.NONE.[], best=null, importance=0.9
>
>
> rel#7:ProjectRel.NONE.[](child=rel#6:Subset#1.NONE.[],EXPR$0=+(4,
> 2)), rowcount=1.7976931348623157E308, cumulative cost={inf}
>
>
> rel#18:ProjectRel.NONE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=+(4,
> 2)), rowcount=100.0, cumulative cost={inf}
>
> rel#11:Subset#2.ENUMERABLE.[], best=rel#19, importance=1.0
>
>
> rel#12:AbstractConverter.ENUMERABLE.[](child=rel#8:Subset#2.NONE.[],convention=ENUMERABLE,sort=[]),
> rowcount=1.7976931348623157E308, cumulative cost={inf}
>
>
> rel#15:EnumerableProjectRel.ENUMERABLE.[](child=rel#13:Subset#1.ENUMERABLE.[],EXPR$0=+(4,
> 2)), importance=0.0, rowcount=100.0, cumulative cost={200.0 rows, 201.0
> cpu, 0.0 io}
>
>
> rel#16:EnumerableProjectRel.ENUMERABLE.[](child=rel#13:Subset#1.ENUMERABLE.[],EXPR$0=6),
> rowcount=100.0, cumulative cost={200.0 rows, 201.0 cpu, 0.0 io}
>
>
> rel#19:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=+(4,
> 2)), importance=0.0, rowcount=100.0, cumulative cost={200.0 rows, 201.0
> cpu, 0.0 io}
>
>
> rel#20:EnumerableProjectRel.ENUMERABLE.[](child=rel#4:Subset#0.ENUMERABLE.[],EXPR$0=6),
> rowcount=100.0, cumulative cost={200.0 rows, 201.0 cpu, 0.0 io} <- wanted
> this
>
> -----
>
>
>
> Feel free to ask me for more details that I missed out here.
>
>
>
> PS: This is based on Optiq version 0.5.
>
>
>
> Thanks & Regards,
>
> Ravi Nallappan