Hi,
we were about to switch on LTO by default on Ubuntu 21.04.
While doing so I was made aware of a related test breakage on s390x.
Components:
Chrony: 4.0-5ubuntu2
Gcc: 4:10.2.0-1ubuntu1
Glibc: 2.33-0ubuntu2
For the test, chrony is rebuilt, and that rebuild now picks up LTO being enabled.
Most of the tests still pass, so this is not a general all-goes-to-h... problem.
But the simulation tests fail in case 110.
Reproduced in an s390x LXD container:
$ apt update
$ apt build-dep chrony
$ apt source chrony
$ cd chrony-4.0
$ ./debian/rules build
$ mkdir /tmp/mytest/
$ export AUTOPKGTEST_TMP=/tmp/mytest/
$ export CLKNETSIM_PATH=/tmp/mytest/
$ cd test/simulation
$ ./110-chronyc
Looking into that, I found that 110 consists of quite a number of subtests.
All of them passed until this one:
network with 1*0 servers and 1 clients:
non-default settings:
chronyc_conf=dns -n dns +n dns -4 dns -6 dns -46 timeout 200
retries 1 keygen keygen 10 MD5 128 keygen 11 MD5 40 help quit
nosuchcommand
chronyc_start=0
dns=1
limit=1
refclock_jitter=1e-4
server_conf=server 192.168.123.1
server_strata=0
starting node 1:OK
starting node 2:OK
running simulation:clknetsim failed
ERROR
FAIL
After eliminating a bunch of red herrings, I found that whenever I
trigger the failure I get a user space fault in dmesg like:
[2680568.517156] Last Breaking-Event-Address:
[2680568.517159] [<02aa2e603214>] 0x2aa2e603214
[2684266.007865] User process fault: interruption code 003b ilc:2 in libc-2.33.so[3ff8f48+1c5000]
[2684266.007878] Failing address: d123 TEID: d1230800
[2684266.007880] Fault in primary space mode while using user ASCE.
[2684266.007882] AS:0006dd8341c7 R3:0024
[2684266.007887] CPU: 21 PID: 798868 Comm: chronyc Tainted: P O 5.4.0-66-generic #74-Ubuntu
[2684266.007888] Hardware name: IBM 2964 N63 400 (LPAR)
[2684266.007890] User PSW : 070510018000 03ff8f5b4620
[2684266.007892] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:1 PM:0 RI:0 EA:3
[2684266.007894] User GPRS: 0015 0200
[2684266.007895] 0001 0014 d1230123 03fff087c930
[2684266.007896] 0014 0014 d1230123 0001
[2684266.007897] 03ff8fa2cf98 02aa2b3971f0 02aa2b387efc 03fff087c7d0
[2684266.007905] User Code: 03ff8f5b460e: ec1200762065 clgrj %r1,%r2,2,03ff8f5b46fa
                            03ff8f5b4614: ec93007f2065 clgrj %r9,%r3,2,03ff8f5b4712
                           #03ff8f5b461a: ec980050007c cgij %r9,0,8,03ff8f5b46ba
                           >03ff8f5b4620: 9180a002 tm 2(%r10),128
                            03ff8f5b4624: a7740025 brc 7,03ff8f5b466e
                            03ff8f5b4628: b24f0060 ear %r6,%a0
                            03ff8f5b462c: e310a0880004 lg %r1,136(%r10)
                            03ff8f5b4632: eb66002d sllg %r6,%r6,32
[2684266.007920] Last Breaking-Event-Address:
[2684266.007924] [<02aa2b383214>] 0x2aa2b383214
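To pick such records out of a busy kernel log, something like the following might help. It is only a minimal sketch: the function name faults_for is made up for illustration, and it assumes each fault record has a "User process fault:" line followed later by a "CPU: ... Comm: <name> ..." line, as in the trace above.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of chrony or clknetsim):
# print the "User process fault" header of every fault record whose
# "Comm:" field matches the given command name.
# Reads dmesg-style text on stdin.
#
# Usage (dmesg may need root):
#   dmesg | faults_for chronyc
faults_for() {
    awk -v comm="$1" '
        # Remember the most recent fault header until its Comm: line arrives
        /User process fault:/ { header = $0; next }
        /Comm: / && header != "" {
            # Assumes the command name is followed by a space (taint info)
            if (index($0, "Comm: " comm " ") > 0) print header
            header = ""
        }
    '
}
```

Running it once before and once after ./110-chronyc and comparing the output would show whether the test run added a new fault.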
I was able to confirm that disabling LTO resolves the issue, so for
now it will stay disabled on s390x. But I wanted to ask here whether
the chrony experts could have a look at reproducing this, whether it
makes any sense to you, and whether there might be a fix that would
then allow LTO to be enabled.
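For anyone trying to compare the two builds, a quick way to see whether a given set of build flags carries LTO could look like the sketch below. The helper name has_lto is made up for illustration. The usage lines assume that dpkg-buildflags on the build host exposes LTO through the "optimize" feature area, so that DEB_BUILD_MAINT_OPTIONS=optimize=-lto switches it off; please verify that against the dpkg version in use.

```shell
#!/bin/sh
# Sketch (hypothetical helper): succeed if any word in the given
# compiler flag string enables LTO (-flto or -flto=<something>).
# It does not account for a later -fno-lto overriding an earlier -flto.
#
# Usage (assumes dpkg-buildflags with the optimize/lto feature):
#   has_lto "$(dpkg-buildflags --get CFLAGS)" && echo "LTO on"
#   has_lto "$(DEB_BUILD_MAINT_OPTIONS=optimize=-lto \
#              dpkg-buildflags --get CFLAGS)" || echo "LTO off"
has_lto() {
    # Unquoted $* on purpose: split the flag string into words
    for flag in $*; do
        case "$flag" in
            -flto|-flto=*) return 0 ;;
        esac
    done
    return 1
}
```

Note that a debian/rules file which assigns DEB_BUILD_MAINT_OPTIONS itself will override a value passed in through the environment, so for an actual package build the option likely has to be set in debian/rules.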
FYI: See [1] for the related Ubuntu bug and the strange trip until I
found it indeed was LTO.
[1]: https://bugs.launchpad.net/ubuntu/+source/chrony/+bug/1921377
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd