Re: [DISCUSS] CALCITE-3656, 3657, 1842: cost improvements, cost units

Vladimir Sitnikov Sat, 04 Jan 2020 12:11:04 -0800

Technically speaking, single-block read time for HDDs is pretty much
stable, so using seconds might not be that bad.
However, seconds might be complicated to use for measuring CPU-like
activity (e.g. different machines might execute EnumerableJoin at
different rates :( )



What if we benchmark a trivial EnumerableCalc(EnumerableTableScan) for a
table of 100 rows and 10 columns
and call it a single cost unit?

In other words, we could have a reference ("etalon") benchmark that takes
X seconds on a given machine, and we could call it a single cost unit.

For instance, org.apache.calcite.rel.core.Sort#computeSelfCost returns a
cost.
Of course, it assumes N*log(N) complexity, but which multiplier should it
use?

One could measure the wall-clock time for the sort, and divide it by the
time it takes to execute the etalon cost benchmark.
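
To make the arithmetic concrete, here is a rough sketch of that division.
Everything in it is an assumption for illustration: the class
CostUnitCalibration and its methods are hypothetical and not part of
Calcite, the "etalon" is a plain loop over a 100x10 array standing in for
EnumerableCalc(EnumerableTableScan), and the sort is Arrays.sort rather
than a Calcite operator. A real measurement would more likely use a
harness such as JMH to deal with warm-up and JIT effects.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper for illustration only; not part of Calcite. */
public final class CostUnitCalibration {
  // Written by the benchmarked tasks so the JIT cannot discard their work.
  private static volatile long sink;

  private CostUnitCalibration() {}

  /** Expresses a measured time as a multiple of the etalon benchmark time. */
  public static double toCostUnits(long measuredNanos, long etalonNanos) {
    if (etalonNanos <= 0) {
      throw new IllegalArgumentException("etalon time must be positive");
    }
    return (double) measuredNanos / etalonNanos;
  }

  /** Solves costUnits = c * n * log2(n) for the multiplier c. */
  public static double sortMultiplier(double costUnits, long rowCount) {
    if (rowCount < 2) {
      throw new IllegalArgumentException("need at least 2 rows");
    }
    double nLogN = rowCount * (Math.log(rowCount) / Math.log(2));
    return costUnits / nLogN;
  }

  /** Runs the task several times and returns the median elapsed time. */
  public static long medianNanos(Runnable task, int runs) {
    long[] samples = new long[runs];
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      task.run();
      samples[i] = System.nanoTime() - start;
    }
    Arrays.sort(samples);
    return samples[runs / 2];
  }

  public static void main(String[] args) {
    // Stand-in for the etalon: scan 100 rows x 10 columns, trivial calc.
    int[][] table = new int[100][10];
    Runnable etalon = () -> {
      long sum = 0;
      for (int[] row : table) {
        for (int v : row) {
          sum += v + 1;
        }
      }
      sink = sum;
    };

    // Stand-in for the operator being calibrated: sort n random ints.
    int n = 100_000;
    int[] data = new Random(42).ints(n).toArray();
    Runnable sort = () -> {
      int[] copy = data.clone();
      Arrays.sort(copy);
      sink = copy[0];
    };

    // Guard against a zero reading on coarse timers.
    long etalonNanos = Math.max(1, medianNanos(etalon, 1001));
    long sortNanos = medianNanos(sort, 101);

    double units = toCostUnits(sortNanos, etalonNanos);
    System.out.printf("sort of %d rows = %.1f cost units, multiplier = %.6f%n",
        n, units, sortMultiplier(units, n));
  }
}
```

The printed multiplier is what a Sort-like cost formula would plug in
front of N*log(N); the absolute times differ per machine, and the open
question is how stable the ratio stays across machines.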

WDYT?

Vladimir
