Re: logging

2019-04-14 Thread Hal Murray via devel
> No, that description only holds for what are called "coarse" clocks.

Do you understand this area?

I think the term I've been missing is "dither".  I don't understand that area 
well enough to explain it to anybody.  Interesting timing.  I was at a talk a 
few weeks ago that covered dithering.  The context was audio processing and I 
wasn't smart enough to notice the NTP connection.

-

The idea with dithering is to add noise in bits below what you can measure.
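
Something like this is the shape of it (just a sketch of the idea, not the
actual ntpd code; floor_ns stands in for whatever the measurement floor
turns out to be):

  #include <stdlib.h>
  #include <time.h>

  /* Sketch: dither a raw timestamp by adding uniform noise in the
   * bits below the smallest step we can resolve (floor_ns > 0). */
  static void dither(struct timespec *ts, long floor_ns)
  {
      ts->tv_nsec += random() % floor_ns;   /* 0 .. floor_ns-1 */
      if (ts->tv_nsec >= 1000000000) {      /* carry into seconds */
          ts->tv_nsec -= 1000000000;
          ts->tv_sec++;
      }
  }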

There are several interesting quirks with the current code.

There isn't a sensible way to measure the step size of the system clock.  The 
current code fuzzes below the time it takes to read the clock.  That has 
nothing to do with the clock tick size; it's just a pipeline delay.  And the 
time-to-read that it uses includes a lot more than the raw clock read: 
~150 ns vs ~30 ns on a reasonably modern Intel CPU.

You can see the actual clock step size in the histogram output of attic/clocks.
I'm not sure how to automate that.
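
One way you might automate it (a sketch; note the catch from above -- on a
fine clock the minimum nonzero delta is the read time, not the tick):

  #include <time.h>

  /* Sketch: estimate the step size of a coarse clock as the minimum
   * nonzero delta between back-to-back readings.  On a fine clock
   * this measures the pipeline delay instead, which is the problem. */
  static long min_step_ns(clockid_t clk, int samples)
  {
      struct timespec prev, now;
      long min = 1000000000;
      clock_gettime(clk, &prev);
      for (int i = 0; i < samples; i++) {
          clock_gettime(clk, &now);
          long d = (now.tv_sec - prev.tv_sec) * 1000000000L
                 + (now.tv_nsec - prev.tv_nsec);
          if (d > 0 && d < min)
              min = d;
          prev = now;
      }
      return min;
  }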

I haven't studied what ntpd does with coarse clocks.  I don't have a sample to 
test with.  The step size on the Intel HPET is ~70 ns.  The time to read it is 
500-600 ns.

Step size on a Pi 1 is 1000 ns; ~50 ns on a Pi 2 and Pi 3.


With the current code, get_systime() is fuzzed.  It's called from quite a few 
places.  The only ones that need fuzzing are the ones used for timekeeping.  
There are only 2 of those, one for sending requests and the other for sending 
replies.  The 2 packet receive time stamps don't get fuzzed.  Neither do the 
PPS time stamps.

There are several calls in the NTS code - just measuring elapsed times for 
KE (key exchange).  The API is convenient.  We should set up something to 
avoid fuzzing those.
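
Something like this would do for those call sites (a sketch, assuming a
plain CLOCK_MONOTONIC read is acceptable there):

  #include <time.h>

  /* Sketch: measure an elapsed interval with CLOCK_MONOTONIC directly,
   * bypassing any fuzzed wrapper.  Fine for "how long did KE take";
   * not for time stamps that go on the wire. */
  static double time_it(void (*work)(void))
  {
      struct timespec a, b;
      clock_gettime(CLOCK_MONOTONIC, &a);
      work();
      clock_gettime(CLOCK_MONOTONIC, &b);
      return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) * 1e-9;
  }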




-- 
These are my opinions.  I hate spam.





Re: logging

2019-04-14 Thread Hal Murray via devel
> Also the PLL goes up and offsets rise. (just like before)

Another way to maybe learn something.

Can you grab a copy of
  http://users.megapathdsl.net/~hmurray/time-nuts/60Hz/60Hz.py

It's a hack to measure line frequency using the PPS capture logic.  The idea 
is to turn that inside out and use it to measure the CPU frequency by watching 
a known-good PPS.

You may have to change the assert to clear - ntpd disables the one it isn't 
using.

If you stop ntpd, nobody should change the drift.  You can use ntptime -f 0 to 
clear it.

-

You don't actually need the program.  It's just what I use to log stuff so I 
can feed it to gnuplot.

You can get the data with:
  cat /sys/class/pps/pps0/assert
It will show something like:
  1555283247.999730528#84947
The number after the # is the number of pulses.  The number before is the time 
stamp of the last pulse.
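
If you'd rather collect that in C than with the Python hack, a minimal
sketch of reading the file (same path and format as above; assumes the
kernel prints a fixed 9-digit nanosecond field):

  #include <stdio.h>

  /* Sketch: parse /sys/class/pps/pps0/assert, which looks like
   *   1555283247.999730528#84947
   * i.e. seconds.nanoseconds '#' pulse-count. */
  static int read_pps_assert(long *sec, long *nsec, long *count)
  {
      FILE *f = fopen("/sys/class/pps/pps0/assert", "r");
      if (f == NULL)
          return -1;
      int n = fscanf(f, "%ld.%ld#%ld", sec, nsec, count);
      fclose(f);
      return (n == 3) ? 0 : -1;
  }

Polling that once a second and comparing the time stamp deltas against the
pulse-count deltas gives the local clock's frequency error directly.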


-- 
These are my opinions.  I hate spam.





Re: logging

2019-04-14 Thread Hal Murray via devel



> Reading the HPET is a trip out to ACPI system registers mapped into memory;
> that should never be cached.

Yes.  But there is still the cache for code and data.

This sort of code is amazingly delicate.  Minor changes can make interesting 
changes in the results.

For example:
  for (i = 0; i < BATCHSIZE; i++) {
      clock_gettime(type, &start);  /* warm up cache */
      clock_gettime(type, &stop);
      clock_gettime(type, &start);
      clock_gettime(type, &stop);
      ...

The extra pair of clock_gettime calls cleaned things up a lot.  At least on 
one of my systems at some point in time.

At one point in time, I got different results when I reran a test.  Sometimes 
it would get duplicate samples.  Try again and they didn't happen.  That was 
on a Pi 1.  Ah.  I just got another example.

  res   avg   min   dups  CLOCK
    1  1069   999  52564  CLOCK_REALTIME
 1000   820   898    -84  CLOCK_REALTIME_COARSE
    1  1075   999  12920  CLOCK_MONOTONIC
    1  1042  1000  62481  CLOCK_MONOTONIC_RAW
    1  1521   999         CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
   999    9577
  1000  930001
  1999     149
  2000    6856
  2999       2
  3000     152
49105 samples were duplicated.
4158 samples were bigger than 3498.


  res   avg   min  dups  CLOCK
    1  1347   999        CLOCK_REALTIME
 1000  1028   890  -109  CLOCK_REALTIME_COARSE
    1  1349   999        CLOCK_MONOTONIC
    1  1279  1000        CLOCK_MONOTONIC_RAW
    1  1838   999        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
   999    8357
  1000  765081
  1999    4814
  2000  215894
  2999      21
  3000     632
5201 samples were bigger than 3498.

I'm guessing that it's something like the cache conflicts changing because of 
address space randomization.




-- 
These are my opinions.  I hate spam.





Re: logging

2019-04-14 Thread Achim Gratz via devel
Hal Murray via devel writes:
> devel@ntpsec.org said:
>> That's a fantastically weird distribution.  Here's what my old single core
>> Athlon64 does:
>
> Your sample is what I would expect from a system that isn't doing much.  If 
> there is other activity going on, the clean bell curve gets spread out due to 
> cache reloads and such.

Reading the HPET is a trip out to ACPI system registers mapped into memory;
that should never be cached.  That's why the HPET is really slow to read,
but it should be fairly consistent.  I've just tested it with (more) load
and the histogram stayed essentially the same.  It's only a single core,
of course.

Here's what my (unloaded) Haswell does via TSC:

--8<---cut here---start->8---
ntpsec/attic> ./clocks
  res  avg  min  dups  CLOCK
    1   22   12        CLOCK_REALTIME
  400    4  458    -2  CLOCK_REALTIME_COARSE
    1   16   12        CLOCK_MONOTONIC
    1  242  232        CLOCK_MONOTONIC_RAW
    1  241  232        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 5 ns per bucket, 1000000 samples.
    ns    hits
    12  799739
    17  200210
    22       2
    27       2
    57       1
46 samples were bigger than 62.
--8<---cut here---end--->8---

That also doesn't change much with load.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf Blofeld V1.15B11:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada



Re: logging

2019-04-14 Thread Hal Murray via devel


devel@ntpsec.org said:
> That's a fantastically weird distribution.  Here's what my old single core
> Athlon64 does:

Your sample is what I would expect from a system that isn't doing much.  If 
there is other activity going on, the clean bell curve gets spread out due to 
cache reloads and such.



-- 
These are my opinions.  I hate spam.





Re: logging

2019-04-14 Thread Achim Gratz via devel
Udo van den Heuvel via devel writes:
> hpet:

That's a fantastically weird distribution.  Here's what my old single
core Athlon64 does:

--8<---cut here---start->8---
ntpsec/attic> ./clocks
  res   avg   min  dups  CLOCK
    1  1498   977        CLOCK_REALTIME
  400    42   738    -2  CLOCK_REALTIME_COARSE
    1  1467   908        CLOCK_MONOTONIC
    1  1592   977        CLOCK_MONOTONIC_RAW
    1  2558  1047        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
   977    9496
   978   30575
  1047  325172
  1048  498724
  1117   68805
  1118   55456
  1187    7804
  1188    3121
  1257      91
  1258      14
742 samples were bigger than 1258.
--8<---cut here---end--->8---

Can you fix the clocks on all cores to a single frequency (using
cpufreq-utils) and try that same test again?  If that gives you a
narrower distribution, please try another frequency as well to see if
the peak shifts.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf Blofeld V1.15B11:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada



Re: logging

2019-04-14 Thread Udo van den Heuvel via devel
On 14-04-19 14:01, Hal Murray wrote:
> backwards runs forever.  ^C when you get bored.

No output after running `backwards` for over 30 minutes.

Udo


Re: logging

2019-04-14 Thread Udo van den Heuvel via devel
On 14-04-19 14:01, Hal Murray wrote:
> I just pushed some tweaks.  Would you please try attic/clocks and 

hpet:

# ./a.out
  res   avg  min  dups  CLOCK
    1  1666  419        CLOCK_REALTIME
  400     5  444    -1  CLOCK_REALTIME_COARSE
    1  1657  419        CLOCK_MONOTONIC
    1  1659  419        CLOCK_MONOTONIC_RAW
    1  1560  419        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
   419      15
   488      81
   489     666
   838     835
   839      76
   907    1440
   908   23469
   977    8036
   978   28887
  1326      52
  1327    8072
  1396   31944
  1397  161152
  1466   48599
  1467  101238
  1815   12310
  1816   92305
  1885   90201
  1886  238744
  1955   56259
  1956   74512
  2234       7
  2235     163
  2304     666
  2305    2327
17944 samples were bigger than 2305.



Re: logging

2019-04-14 Thread Udo van den Heuvel via devel
On 14-04-19 14:01, Hal Murray wrote:
> 
> udo...@xs4all.nl said:
>> ntpsec 1.1.3's ntpd from 
>> ftp://ftp.ntpsec.org/pub/releases/ntpsec-1.1.3.tar.gz
>>  gives me after startup:
> 
>> Apr 13 15:53:50 bla ntpd[12382]: CLOCK: ts_prev 1555163630 s + 594156272 ns,
>> ts_min 1555163630 s + 594155713 ns 
> ...
> 
> Thanks.
> 
> So now we know it wasn't a recent change to ntpsec.
> 
> Were there more clusters like that, or only one?

Multiple came later.
I can try a cold boot later this week to see if that helps.

> I just pushed some tweaks.  Would you please try attic/clocks and 
> attic/backwards from a recent git?  clocks should print some stuff and exit; 
> backwards runs forever.  ^C when you get bored.

Thanks!
I'll see what I can do here.

Kind regards,
Udo


Re: logging

2019-04-14 Thread Hal Murray via devel


udo...@xs4all.nl said:
> ntpsec 1.1.3's ntpd from ftp://ftp.ntpsec.org/pub/releases/ntpsec-1.1.3.tar.gz
>  gives me after startup:

> Apr 13 15:53:50 bla ntpd[12382]: CLOCK: ts_prev 1555163630 s + 594156272 ns,
> ts_min 1555163630 s + 594155713 ns 
...

Thanks.

So now we know it wasn't a recent change to ntpsec.

Were there more clusters like that, or only one?

I just pushed some tweaks.  Would you please try attic/clocks and 
attic/backwards from a recent git?  clocks should print some stuff and exit; 
backwards runs forever.  ^C when you get bored.

clocks should show something like this for tsc:
  res  avg  min  dups  CLOCK
    1   25   17        CLOCK_REALTIME
  100    7  112    -6  CLOCK_REALTIME_COARSE
    1   18   17        CLOCK_MONOTONIC
    1  292  283        CLOCK_MONOTONIC_RAW
    1  298  289        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
    18      13
    19       7
    20   71131
    21  232969
    22  651647
    23   38228
    24    3609
    25     130
    26     164
    27     149
1953 samples were bigger than 27.


and this for hpet:
  res  avg  min  dups  CLOCK
    1  888  558        CLOCK_REALTIME
  100    7   99    -6  CLOCK_REALTIME_COARSE
    1  878  558        CLOCK_MONOTONIC
    1  874  558        CLOCK_MONOTONIC_RAW
    1  888  489        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
    ns    hits
   558       2
   559      13
   628      68
   629      75
   698     178
   699     101
   768     272
   769      96
   838   77929
   839    9227
   907   47895
   908  858449
5695 samples were bigger than 908.


-- 
These are my opinions.  I hate spam.


