Hi David,
You may see a non-zero startup cost in the summary even though the
measurements for an individual operation show a cost of 0.
This is because the startup cost in the summary is averaged over all
monadic operations or over all dyadic operations.
The zero startup cost for the products is most likely due to a
reorganization of the counter numbers: I forgot to update
ScalarBenchmark.apl accordingly. This is fixed in SVN 489.
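A minimal sketch of why an averaged startup cost can be non-zero for an operation whose own measurements are all zero. The operation names and cycle counts are made up for illustration, and `average_startup_cost` is a hypothetical helper, not code from ScalarBenchmark.apl:

```python
# Hypothetical per-operation startup costs (in cycles) for dyadic operations.
# The two products measure 0 here, as they would with a stale counter number.
dyadic_startup = {
    "A + B": 700,
    "A × B": 736,
    "A +.× B": 0,
    "A ∘.× B": 0,
}


def average_startup_cost(costs):
    """Return the integer mean startup cost over all operations in a group."""
    return sum(costs.values()) // len(costs)


# Every dyadic operation is reported with the same group-wide average,
# so the products show a non-zero value despite measuring 0 themselves.
group_average = average_startup_cost(dyadic_startup)
summary = {op: group_average for op in dyadic_startup}
```

With these made-up inputs, every entry in `summary` carries the same group average, including the two products.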
In general, the OP and STAT columns in ScalarBenchmark.apl
should match the output of the ]PSTAT command. For example, if:
]pstat 38
╔═════════════════╦════════════╤══════════╤══════════╤══════════╤══════════╗
║ A f.g B         ║          0 │        0 │        0 │        0 │        0 ║
╚═════════════════╩════════════╧══════════╧══════════╧══════════╧══════════╝
then the STAT number for f.g should
be 38 in ScalarBenchmark.apl.
/// Jürgen
On 10/17/2014 05:58 PM, David B. Lamkins wrote:
I'm seeing zero start-up costs for inner and outer products when
running ScalarBenchmark.apl.
===================== Mat1_IRC +.× Mat1_IRC ===============================
Benchmarking start-up cost for Mat1_IRC +.× Mat1_IRC ...
Length Sequ Cycles Para Cycles Linear Sequ Linear Para
====== =========== =========== =========== ===========
25 0 0 0 0
25 0 0 0 0
25 0 0 0 0
25 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
4 0 0 0 0
4 0 0 0 0
4 0 0 0 0
1 0 0 0 0
regression line sequential: 0 + 0×N cycles
regression line parallel: 0 + 0×N cycles
===================== Vec1_IRC ∘.× Vec1_IRC ===============================
Benchmarking start-up cost for Vec1_IRC ∘.× Vec1_IRC ...
Length Sequ Cycles Para Cycles Linear Sequ Linear Para
====== =========== =========== =========== ===========
25 0 0 0 0
25 0 0 0 0
25 0 0 0 0
25 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
16 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
9 0 0 0 0
4 0 0 0 0
4 0 0 0 0
4 0 0 0 0
1 0 0 0 0
regression line sequential: 0 + 0×N cycles
regression line parallel: 0 + 0×N cycles
But then in the summary section -- just above ]PSTAT -- I see:
-------------- Mat1_IRC +.× Mat1_IRC --------------
average sequential startup cost: 359 cycles
average parallel startup cost: 832 cycles
per item cost sequential: 0 cycles
per item cost parallel: 0 cycles
parallel break-even length: not reached
-------------- Vec1_IRC ∘.× Vec1_IRC --------------
average sequential startup cost: 359 cycles
average parallel startup cost: 832 cycles
per item cost sequential: 0 cycles
per item cost parallel: 0 cycles
parallel break-even length: not reached
Here the startup costs are nonzero, but the per-item costs are all
zero.
This doesn't look right... Or am I missing something?
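For reference, a sketch of how regression lines and a break-even length like those in the report could be derived. An ordinary least-squares fit of cycles against length gives the startup cost (intercept) and the per-item cost (slope), and the break-even length is where the sequential and parallel lines cross. This is an assumption about the method, not taken from ScalarBenchmark.apl; `fit_line` and `break_even_length` are hypothetical names.

```python
def fit_line(lengths, cycles):
    """Least-squares fit cycles ≈ a + b×N; returns (a, b)."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        return mean_y, 0.0  # all lengths equal: no slope can be estimated
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, cycles))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def break_even_length(seq, par):
    """Smallest length N >= 0 at which parallel cost <= sequential cost.

    seq and par are (startup, per_item) pairs. Returns None when the
    parallel line never catches up ("not reached").
    """
    (a_seq, b_seq), (a_par, b_par) = seq, par
    if a_par <= a_seq:
        return 0.0  # parallel is already no slower at length 0
    if b_seq <= b_par:
        return None  # higher startup and no per-item advantage
    return (a_par - a_seq) / (b_seq - b_par)


# All-zero measurements, as in the tables above, fit to 0 + 0×N.
lengths = [25, 25, 16, 16, 9, 9, 4, 4, 1]
zero_fit = fit_line(lengths, [0] * len(lengths))

# Summary values from above: startup 359 vs 832 cycles, per-item cost 0 for both.
summary_break_even = break_even_length((359, 0), (832, 0))
```

Under these assumptions, a higher parallel startup cost combined with equal per-item costs means the lines never cross, which is consistent with the "not reached" in the summary.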
In case it might shed some additional light, here's the final
section of the ]PSTAT output. The rest looks reasonable except for
epsilon-underbar, which reports all zeroes.
╔═════════════════╦════════════╤══════════╤══════════╤══════════╤══════════╗
║ Function ║ │ N │ ⌀ VLEN │ ⌀ cycles │ cyc÷VLEN ║
╟─────────────────╫────────────┼──────────┼──────────┼──────────┼──────────╢
║ f B overhead ║ 18446744003448130869 │ 283 │ 1993 │ 34818579233229 │ 17466187239 ║
║ A f B overhead ║ 18446743954621671206 │ 1114 │ 84 │ 1447585256996 │ 17221844259 ║
║ scalar B ║ 130198460 │ 283 │ 3873 │ 460065 │ 118 ║
║ A scalar B ║ 91680403 │ 1114 │ 949 │ 82298 │ 86 ║
║ clone B ║ 233950109373 │ 75391125 │ 131 │ 3103 │ 23 ║
║ A f.g B ║ 911702656227 │ 40046 │ 163 │ 22766385 │ 139671 ║
║ A ∘.g B ║ 9809803882 │ 121 │ 1000000 │ 81072759 │ 81 ║
║ A ⍴ B ║ 9071 │ 3 │ 27 │ 3023 │ 111 ║
║ PrintBuffer(B) ║ 135760049 │ 1168 │ 25 │ 116232 │ 4649 ║
╚═════════════════╩════════════╧══════════╧══════════╧══════════╧══════════╝