Also, the step on M1 is slightly above 40 ns (41.7 ns), but exactly
40 ns on Ampere Altra.
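
Those two step sizes are just the reciprocal of the counter frequency reported in the outputs below (about 24000175 ticks/second on M1, 25000000 ticks/second on Altra). A minimal sketch of that arithmetic; `tick_period_ns` is a hypothetical helper name, not something from the attached code:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Duration of one counter tick in nanoseconds.
 * ticks_per_second is the measured counter frequency; the helper name
 * is made up for this illustration.
 */
double tick_period_ns(uint64_t ticks_per_second)
{
    assert(ticks_per_second > 0);
    return 1e9 / (double) ticks_per_second;
}
```

With the frequencies from the outputs, `tick_period_ns(24000175)` is roughly 41.67 and `tick_period_ns(25000000)` is 40.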

## M1 on MacBook Air

Testing timing overhead for 3 seconds.
Total 24000177 ticks in 1000000056 ns, 24000175.655990 ticks per ns
This CPU is running at 24000175 ticks / second, will run test for 72000525 ticks
loop_count 1407639953Per loop time including overhead: 2.13 ns, min: 0
ticks (0.0 ns), same: 1335774969
Total ticks in: 72000525, in: 3000002260 nr
Log2(x+1) histogram of timing durations:
<= ticks ( <= ns) % of total running % count
0 ( 41.7) 94.8946 94.8946 1335774969
2 ( 83.3) 5.1051 99.9997 71861227
6 ( 166.7) 0.0001 99.9998 757
14 ( 333.3) 0.0000 99.9998 0
30 ( 666.7) 0.0002 99.9999 2193
62 ( 1333.3) 0.0000 100.0000 274
126 ( 2666.6) 0.0000 100.0000 446
254 ( 5333.3) 0.0000 100.0000 87
First 64 ticks --
0 ( 0.0) 94.8946 94.8946 1335774969
1 ( 41.7) 5.1032 99.9997 71834980
2 ( 83.3) 0.0019 99.9998 26247
3 ( 125.0) 0.0001 99.9998 757
15 ( 625.0) 0.0000 100.0000 1

## Ampere Altra

Testing timing overhead for 3 seconds.
Total 25000002 ticks in 1000000074 ns, 25000000.150000 ticks per ns
This CPU is running at 25000000 ticks / second, will run test for 75000000 ticks
loop_count 291630863Per loop time including overhead: 10.29 ns, min: 0
ticks (0.0 ns), same: 217288944
Total ticks in: 75000000, in: 3000000542 nr
Log2(x+1) histogram of timing durations:
<= ticks ( <= ns) % of total running % count
0 ( 40.0) 74.5082 74.5082 217288944
2 ( 80.0) 25.4886 99.9968 74332703
6 ( 160.0) 0.0000 99.9968 5
14 ( 320.0) 0.0000 99.9968 0
30 ( 640.0) 0.0000 99.9968 31
62 ( 1280.0) 0.0011 99.9979 3123
126 ( 2560.0) 0.0020 99.9999 5848
254 ( 5120.0) 0.0001 100.0000 149
510 ( 10240.0) 0.0000 100.0000 38
1022 ( 20480.0) 0.0000 100.0000 21
2046 ( 40960.0) 0.0000 100.0000 1
First 64 ticks --
0 ( 0.0) 74.5082 74.5082 217288944
1 ( 40.0) 25.4886 99.9968 74332699
2 ( 80.0) 0.0000 99.9968 4
3 ( 120.0) 0.0000 99.9968 1
4 ( 160.0) 0.0000 99.9968 3
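
The "<= ticks" bounds in the histograms above (0, 2, 6, 14, 30, ...) are consistent with bucket k collecting every duration x for which floor(log2(x+1)) = k, so that bucket k ends at 2^(k+1) - 2 ticks. For example, the M1 bucket "<= 2" count (71861227) is the sum of the 1-tick and 2-tick counts (71834980 + 26247). A rough sketch of that bucketing, with hypothetical function names and not necessarily how the attached code does it:

```c
#include <stdint.h>

/*
 * Bucket index floor(log2(ticks + 1)) for a duration in ticks.
 * Assumes ticks < UINT64_MAX, so that ticks + 1 does not wrap to 0.
 */
int log2p1_bucket(uint64_t ticks)
{
    uint64_t v = ticks + 1;
    int bucket = 0;

    /* Count how many times v can be halved before reaching 1. */
    while (v > 1)
    {
        v >>= 1;
        bucket++;
    }
    return bucket;
}

/*
 * Largest tick count that still lands in the given bucket:
 * 2^(bucket + 1) - 2. Valid for bucket in 0..62.
 */
uint64_t bucket_upper_bound(int bucket)
{
    return ((uint64_t) 1 << (bucket + 1)) - 2;
}
```

A duration of 0 ticks goes to bucket 0, durations of 1 and 2 ticks go to bucket 1, and 3 through 6 ticks go to bucket 2, which matches the row labels in both outputs.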

On Tue, Jul 2, 2024 at 7:31 PM Hannu Krosing <han...@google.com> wrote:
>
> Hi Tom,
>
> On various Intel CPUs I got steps either close to a single nanosecond
> or sometimes a little more on older ones.
>
> One specific CPU moved in 2-tick increments while the ratio to ns
> was 2.1/1, or 2100 ticks per microsecond.
>
> On Zen4 AMD the step seems to be 10 ns, even though the tick-to-ns
> ratio is 2.6/1, so reading ticks directly gives 26, 54, ...
>
> Also, reading directly in ticks on M1 gave "loop time including
> overhead: 2.13 ns" (attached code works on Clang, not sure about GCC)
>
>
> I'll also take a look at the docs and try to propose something
>
> Do we also need tests for this one?
>
> ----
> Hannu
>
>
>
> On Tue, Jul 2, 2024 at 7:20 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >
> > BTW, getting back to the original point of the thread: I duplicated
> > Hannu's result showing that on Apple M1 the clock tick seems to be
> > about 40ns.  But look at what I got with the v2 patch on my main
> > workstation (full output attached):
> >
> > $ ./pg_test_timing
> > ...
> > Per loop time including overhead: 16.60 ns
> > ...
> > Timing durations less than 128 ns:
> >       ns   % of total  running %      count
> >       15       3.2738     3.2738    5914914
> >       16      49.0772    52.3510   88668783
> >       17      36.4662    88.8172   65884173
> >       18       9.5639    98.3810   17279249
> >       19       1.5746    99.9556    2844873
> >       20       0.0416    99.9972      75125
> >       21       0.0004    99.9976        757
> > ...
> >
> > It sure looks like this is exact-to-the-nanosecond results,
> > since the modal values match the overall per-loop timing,
> > and there are no zero measurements.
> >
> > This is a Dell tower from 2021, running RHEL8 on an Intel Xeon W-2245.
> > Not exactly top-of-the-line stuff.
> >
> >                         regards, tom lane
> >
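
The reasoning quoted above (no zero measurements, modal value equal to the per-loop time) can be checked with a loop that reads the clock back to back and counts how often two consecutive readings are identical. This is only a sketch assuming POSIX `clock_gettime` with `CLOCK_MONOTONIC`; it is not the pg_test_timing code, and the struct and function names are made up for illustration:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical result type for this illustration. */
typedef struct
{
    uint64_t loops;          /* number of back-to-back readings taken */
    uint64_t zero_diffs;     /* consecutive readings that were identical */
    int64_t  min_nonzero_ns; /* smallest nonzero step seen, -1 if none */
} clock_step_stats;

clock_step_stats measure_clock_step(uint64_t loops)
{
    clock_step_stats s = { loops, 0, -1 };
    struct timespec prev, cur;

    clock_gettime(CLOCK_MONOTONIC, &prev);
    for (uint64_t i = 0; i < loops; i++)
    {
        clock_gettime(CLOCK_MONOTONIC, &cur);

        /* Difference between consecutive readings in nanoseconds. */
        int64_t diff = (int64_t) (cur.tv_sec - prev.tv_sec) * 1000000000
            + (int64_t) (cur.tv_nsec - prev.tv_nsec);

        if (diff == 0)
            s.zero_diffs++;
        else if (s.min_nonzero_ns < 0 || diff < s.min_nonzero_ns)
            s.min_nonzero_ns = diff;

        prev = cur;
    }
    return s;
}
```

On a clock whose tick is much longer than the loop time (as with the roughly 40 ns ticks and 2 ns loop on M1), most differences should come out as zero and the smallest nonzero step should be about one tick. On a clock that really resolves nanoseconds, zero differences should be rare or absent.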

