Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-27 Thread Bryan Christianson
Hi Miroslav

> On 27/08/2020, at 8:10 PM, Miroslav Lichvar  wrote:
> 
> I guess it's also possible that the ntp_adjtime() call doesn't
> actually do anything. You could try changing the frequency in larger
> steps, e.g. 100 ppm every minute and see if it has any effect on the
> offset reported by chronyd -x. If it looks the same as when you don't
> change the frequency, ntp_adjtime() isn't doing anything.
> 

It looks like Apple may have messed up signed/unsigned again in Big Sur. I 
played around with variations of the test code without much success in 
reproducing the problem. I then tried setting the frequency in the range 0 to 
-500 ppm in steps of 50 ppm, with a 30 second sleep each iteration.

The frequency returned in the timex buffer was incorrect. Instead of seeing 
"-50 ppm =>  -50 ppm : ok" as expected I see "-50 ppm =>  65486 ppm : failed". 
On macOS 10.15 I see the expected result.
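
For what it's worth, 65486 is exactly 65536 - 50. That is the value that falls out if the scaled frequency (timex.freq carries ppm with a 16-bit fractional part) is read back as an unsigned 32-bit quantity. The sketch below only illustrates that arithmetic. That the kernel actually does this is an assumption, not something confirmed from Apple's source, and the helper names are made up for illustration.

```python
SCALE = 1 << 16  # timex.freq carries ppm with a 16-bit fractional part


def ppm_to_scaled(ppm):
    """Convert ppm to the scaled integer form used in timex.freq."""
    return int(round(ppm * SCALE))


def scaled_as_unsigned32(scaled):
    """Reinterpret a signed 32-bit value as unsigned (the assumed fault)."""
    return scaled & 0xFFFFFFFF


def readback_ppm(ppm, buggy):
    """Frequency in ppm as it would be reported back, with or without the fault."""
    scaled = ppm_to_scaled(ppm)
    if buggy:
        scaled = scaled_as_unsigned32(scaled)
    return scaled / SCALE


for ppm in (50, -50):
    print(f"{ppm} ppm => {readback_ppm(ppm, buggy=True):.0f} ppm")
# prints:
# 50 ppm => 50 ppm
# -50 ppm => 65486 ppm
```

Positive values survive the round trip unchanged, which would explain why a test that only uses positive frequencies passes.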

I was still unable to reproduce an abrupt change in offset, but I think there 
is a definite 'sign' problem in the Apple code.

If Apple don't respond to my bug report it may be possible to work around this 
in chrony.

Maybe the test case in chrony should also look at -ve frequencies?
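
A rough sketch of what a check covering both signs could look like. The function that sets the frequency and reads it back is passed in, so the two backends here are stand-ins for illustration only; a real test would call ntp_adjtime(). The wrapping backend simply mimics the -50 => 65486 result reported above and is an assumption about how other negative values would behave.

```python
def check_frequencies(set_and_read, ppms, tolerance=0.5):
    """Return (ppm, returned, ok) for each requested frequency.

    set_and_read is a caller-supplied function (hypothetical here) that
    sets the frequency in ppm and returns what the kernel reports back.
    """
    results = []
    for ppm in ppms:
        returned = set_and_read(ppm)
        results.append((ppm, returned, abs(returned - ppm) <= tolerance))
    return results


def format_result(ppm, returned, ok):
    """Format one result line in the style used in this thread."""
    return f"{ppm} ppm => {returned:.0f} ppm : {'ok' if ok else 'failed'}"


# Sweep both signs: 0 to -500 ppm, then 50 to 500 ppm, in 50 ppm steps
sweep = list(range(0, -501, -50)) + list(range(50, 501, 50))


def good_backend(ppm):
    """Stand-in for a kernel that returns what was set."""
    return float(ppm)


def wrapping_backend(ppm):
    """Stand-in that wraps negative values modulo 2**16 (assumed fault)."""
    return float(ppm % 65536)


for row in check_frequencies(wrapping_backend, [50, -50]):
    print(format_result(*row))
# prints:
# 50 ppm => 50 ppm : ok
# -50 ppm => 65486 ppm : failed
```

With a sweep like this, a backend that mishandles the sign fails on every negative step while still passing all the positive ones.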

Bryan Christianson
br...@whatroute.net




--
To unsubscribe email chrony-dev-requ...@chrony.tuxfamily.org with "unsubscribe" 
in the subject.
For help email chrony-dev-requ...@chrony.tuxfamily.org with "help" in the 
subject.
Trouble?  Email listmas...@chrony.tuxfamily.org.



Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-27 Thread Miroslav Lichvar
On Thu, Aug 27, 2020 at 07:52:28PM +1200, Bryan Christianson wrote:
> 
> > On 27/08/2020, at 6:52 PM, Miroslav Lichvar  wrote:
> > 
> > You could start with the test/kernel/ntpadjtime.c program, modified to
> > very slowly change the frequency of the clock, e.g. 1 ppm per minute.
> > If you run at the same time chronyd with the -x option and a short
> > polling interval, you should see in the tracking log when the
> > offset jumped.
> 
> Thank you Miroslav - I'll get on to this tomorrow.

I guess it's also possible that the ntp_adjtime() call doesn't
actually do anything. You could try changing the frequency in larger
steps, e.g. 100 ppm every minute and see if it has any effect on the
offset reported by chronyd -x. If it looks the same as when you don't
change the frequency, ntp_adjtime() isn't doing anything.
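
As a rough guide to the size of the effect to look for: a frequency change of f ppm held for t seconds moves the clock by f * 1e-6 * t seconds, so 100 ppm for one minute is about 6 ms. The sketch below works that out, and also the accumulated figure if the steps add up (100, 200, 300 ppm and so on). The function names are just for illustration, and only magnitudes are shown since the sign depends on the direction of the change.

```python
def expected_drift(ppm, seconds):
    """Offset change (seconds) from a frequency change of ppm held for seconds."""
    return ppm * 1e-6 * seconds


def staircase_drift(step_ppm, step_seconds, steps):
    """Accumulated offset when the frequency grows by step_ppm at every step."""
    total = 0.0
    for n in range(1, steps + 1):
        total += expected_drift(n * step_ppm, step_seconds)
    return total


print(f"{expected_drift(100, 60) * 1e3:.1f} ms")      # one minute at 100 ppm
print(f"{staircase_drift(100, 60, 5) * 1e3:.1f} ms")  # 100, 200, ... 500 ppm
# prints:
# 6.0 ms
# 90.0 ms
```

Changes of this size should be easy to see in the offsets chronyd -x reports, so a flat trace would point to ntp_adjtime() having no effect.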

-- 
Miroslav Lichvar





Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-27 Thread Bryan Christianson


> On 27/08/2020, at 6:52 PM, Miroslav Lichvar  wrote:
> 
> You could start with the test/kernel/ntpadjtime.c program, modified to
> very slowly change the frequency of the clock, e.g. 1 ppm per minute.
> If you run at the same time chronyd with the -x option and a short
> polling interval, you should see in the tracking log when the
> offset jumped.

Thank you Miroslav - I'll get on to this tomorrow.

-- 
Bryan Christianson
br...@whatroute.net







Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-26 Thread Miroslav Lichvar
On Thu, Aug 27, 2020 at 11:41:44AM +1200, Bryan Christianson wrote:
> It worked!!! I left the daemon running for over 30 minutes and it held the 
> system time to +/- 5 usecs of NTP, (mostly better than +/- 1usec). The timed 
> daemon did NOT change the clock at all during this test and chronyd behaved 
> as expected.

That's good to hear.
> 
> I repeated the test with ntp_adjtime() enabled and kept a log of debug 
> messages (attached). A simple analysis of the debug trace seems to say that 
> the reference clock is leaping about, but I proved that it isn't with my 
> first adjtime() test. Also the reference clock works perfectly with earlier 
> versions of macOS, my linux machines etc.
> 
> My current conclusion is that the Darwin kernel in Big Sur has a bad 
> implementation of ntp_adjtime() that somehow causes the clock to jump by 
> random intervals. I would really appreciate help in building a *simple* 
> program to be able to demonstrate this to Apple in a new bug report.

You could start with the test/kernel/ntpadjtime.c program, modified to
very slowly change the frequency of the clock, e.g. 1 ppm per minute.
If you run at the same time chronyd with the -x option and a short
polling interval, you should see in the tracking log when the
offset jumped.
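
One way to pick the jump out of the logged offsets, sketched under assumptions: the samples are taken to be (time, offset) pairs already extracted from the tracking log (parsing is left out because it depends on the log layout), and the threshold is an arbitrary illustrative value. A slow frequency ramp gives small, smooth changes between samples, so a step stands out.

```python
def find_jumps(samples, threshold=0.01):
    """Return (time, delta) for each sample whose offset moved by more
    than threshold seconds relative to the previous sample."""
    jumps = []
    for (_, prev), (t, cur) in zip(samples, samples[1:]):
        delta = cur - prev
        if abs(delta) > threshold:
            jumps.append((t, delta))
    return jumps


# Made-up samples: (seconds since start, offset in seconds)
samples = [(0, 0.000002), (16, 0.000004), (32, 0.250003), (48, 0.250006)]
for t, delta in find_jumps(samples):
    print(f"t={t}s offset jumped by {delta:+.3f} s")
# prints:
# t=32s offset jumped by +0.250 s
```

The time of each reported jump can then be matched against the time of the frequency change that preceded it.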

-- 
Miroslav Lichvar





Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-26 Thread Bryan Christianson

> On 24/08/2020, at 12:29 AM, David Bohman  wrote:
> 
> Are you certain that timed is remaining suspended? That launchd is not 
> resuming it? I don't have a machine which will run Big Sur, so I cannot 
> investigate it myself.
> 

I think I have ruled out timed as being the cause of the problem and have left 
it in the same state as earlier versions of macOS. It still runs every 5 
minutes (visible in the system log) but is not causing the clock to jump.

When timed was first introduced by Apple on 10.13, they managed to completely 
break adjtime() because of inconsistent treatment of signed/unsigned integers. 
I wondered if they might have done something similar in Big Sur so I compiled 
chronyd with the ntp_adjtime() calls disabled, forcing use of adjtime() to slew 
the clock.

It worked!!! I left the daemon running for over 30 minutes and it held the 
system time to +/- 5 usecs of NTP, (mostly better than +/- 1usec). The timed 
daemon did NOT change the clock at all during this test and chronyd behaved as 
expected.

I repeated the test with ntp_adjtime() enabled and kept a log of debug messages 
(attached). A simple analysis of the debug trace seems to say that the 
reference clock is leaping about, but I proved that it isn't with my first 
adjtime() test. Also the reference clock works perfectly with earlier versions 
of macOS, my linux machines etc.

My current conclusion is that the Darwin kernel in Big Sur has a bad 
implementation of ntp_adjtime() that somehow causes the clock to jump by random 
intervals. I would really appreciate help in building a *simple* program to be 
able to demonstrate this to Apple in a new bug report.

Thanks, Bryan

-- 
Bryan Christianson
br...@whatroute.net




chronyd-debug.txt.gz
Description: GNU Zip compressed data


Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-23 Thread David Bohman
Bryan,

On Thu, Aug 20, 2020 at 10:32 PM Bryan Christianson 
wrote:
>
> > On 29/07/2020, at 12:28 AM, David Bohman  wrote:
> >
> > Why is disabling SIP not an option, at least temporarily? It seems to
> > me that you need to replace the timed service in order to use chronyd at
> > all. You cannot have two daemons running at the same time who both think
> > that they are disciplining the clock.
> >
>
> I am reluctant to remove timed with SIP disabled because it is
> not something I would like to do on an ongoing basis - i.e. after every OS
> update. Also, I don't want to support end users in disabling SIP and
> removing timed if they have installed chrony via my ChronyControl
> application. If there was some way to do it without booting into recovery
> then I would be happy with that.

Yes, I can understand that. I run all my macOS systems with SIP disabled.

>
> Please note, on macOS 10.13, 10.14 and 10.15 chronyd works very well when:
> 1. the specified time server in System Preferences/Date & Time is
> set to a non-existent host.
> 2. automatic update is disabled in the same System Preferences
> pane.

I prefer to have the Preference UI continue to work. I do that by having my
own wrapper script in /usr/local which generates the appropriate pool
command from macOS configuration into a file which is included by the main
chrony.conf. Of course, the chrony daemon must continue to be known to
launchd as  so the Preference Pane can find it.
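
A minimal sketch of the generating step, not the actual wrapper: it assumes the configured server names have already been obtained from the macOS settings by some other means, and the function names, the example host, and the choice of the iburst option are illustrative only. The main chrony.conf would pull the generated file in with an include directive.

```python
def render_pool_conf(servers, options="iburst"):
    """Build chrony.conf 'pool' lines for the given host names."""
    hosts = [h.strip() for h in servers if h.strip()]
    return "".join(f"pool {host} {options}".rstrip() + "\n" for host in hosts)


def write_include(path, servers):
    """Write the generated fragment to path, for use with an 'include'
    directive in the main chrony.conf pointing at the same path."""
    with open(path, "w") as f:
        f.write(render_pool_conf(servers))


print(render_pool_conf(["time.example.com"]), end="")
# prints:
# pool time.example.com iburst
```

Keeping the generated lines in a separate file means the main configuration never has to be rewritten when the setting changes.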

>
> I had an idea that if I was to send a SIGSTOP to timed, then that would
> solve the problem. However, when I tried this from the command line (pkill
> -STOP timed) something is still messing with the system clock and every
> minute or so, chronyd is emitting messages about the clock being wrong.
> This indicates that simply removing timed still won't solve the problem.

Are you certain that timed is remaining suspended? That launchd is not
resuming it? I don't have a machine which will run Big Sur, so I cannot
investigate it myself.

David Bohman


Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-08-20 Thread Bryan Christianson
David, thanks for the feedback

> On 29/07/2020, at 12:28 AM, David Bohman  wrote:
> 
> Why is disabling SIP not an option, at least temporarily? It seems to me that 
> you need to replace the timed service in order to use chronyd at all. You 
> cannot have two daemons running at the same time who both think that they are 
> disciplining the clock.
> 

I am reluctant to remove timed with SIP disabled because it is not 
something I would like to do on an ongoing basis - i.e. after every OS update. 
Also, I don't want to support end users in disabling SIP and removing timed if 
they have installed chrony via my ChronyControl application. If there was some 
way to do it without booting into recovery then I would be happy with that.

Please note, on macOS 10.13, 10.14 and 10.15 chronyd works very well when:
1. the specified time server in System Preferences/Date & Time is set 
to a non-existent host.
2. automatic update is disabled in the same System Preferences pane.

I had an idea that if I was to send a SIGSTOP to timed, then that would solve 
the problem. However, when I tried this from the command line (pkill -STOP 
timed) something is still messing with the system clock and every minute or so, 
chronyd is emitting messages about the clock being wrong. This indicates that 
simply removing timed still won't solve the problem.

I've had zero feedback from Apple in response to my bug report and I'm not sure 
where to proceed from here. If there are any Mac developers on the list, it 
might be a good idea to each submit a feedback report in an attempt to get 
Apple to take notice of the issue. It just doesn't feel right to me that they 
can take on timekeeping based on, according to Wikipedia 
https://en.wikipedia.org/wiki/Timed , a standard from the mid 1980s that as far 
as I am aware is used by no-one else.


Bryan Christianson
br...@whatroute.net





Re: [chrony-dev] chronyd broken on macOS Big Sur

2020-07-28 Thread David Bohman
Why is disabling SIP not an option, at least temporarily? It seems to me
that you need to replace the timed service in order to use chronyd at all.
You cannot have two daemons running at the same time who both think that
they are disciplining the clock.

On Fri, Jul 17, 2020 at 3:05 PM Bryan Christianson 
wrote:

> In testing macOS Big Sur, I am finding that chronyd has been broken by
> Apple's aggressive use of their (undocumented) timed daemon.
>
> Even when automatic updating is disabled in the Date&Time control panel,
> timed will still make adjustments to the system clock, adding spurious
> offsets of up to 0.5 secs. It appears to be doing this based on some internal
> idea of the current time - it is not sending any packets to port 123.
>
> I have been unable to find a way of preventing timed from launching as it
> is under the control of System Integrity Protection (SIP). Disabling SIP is
> NOT an option.
>
> I have submitted a 'Feedback' report to Apple about this, but they are
> notoriously unresponsive. If anyone on the list is able to suggest a
> workaround I'd be most grateful.
>
> Bryan Christianson
> br...@whatroute.net


[chrony-dev] chronyd broken on macOS Big Sur

2020-07-17 Thread Bryan Christianson
In testing macOS Big Sur, I am finding that chronyd has been broken by Apple's 
aggressive use of their (undocumented) timed daemon.

Even when automatic updating is disabled in the Date&Time control panel, timed 
will still make adjustments to the system clock, adding spurious offsets of up 
to 0.5 secs. It appears to be doing this based on some internal idea of the 
current time - it is not sending any packets to port 123.

I have been unable to find a way of preventing timed from launching as it is 
under the control of System Integrity Protection (SIP). Disabling SIP is NOT an 
option.

I have submitted a 'Feedback' report to Apple about this, but they are 
notoriously unresponsive. If anyone on the list is able to suggest a workaround 
I'd be most grateful.

Bryan Christianson
br...@whatroute.net



