OK, going back version by version didn't lead anywhere:
the oldest version I got to panic was 7.2-install, after
10 hours of testing. The issue is much older than that.

There's an update to sys/dev/pv/xen.c
(Revision 1.84, Fri Jul 14 20:08:46 2017 UTC)
reading:

> Reduce the number of CAS loops from ludicrous to ridiculous
>
> Now that the source of the delay with releasing grant table entries has
> been identified and fixed the number of attempts to CAS entry flags can
> be substantially reduced and while it's decreased by a factor of 100000,
> it should go down at least a 100 more in the future.

Where ludicrous==100000000 and ridiculous==1000.

Over time, that magic number was reduced to 10.
I believe whether that number is sufficient depends
on the workload, the concurrency (number of CPUs),
their speed, and the scheduling characteristics of
the Xen host.

There's probably no "correct" number to choose,
and I have no idea whether it's possible at all to fix the
problem from the OpenBSD side alone.

However, I have determined that the number 10 is too low.

I modified the 7.7-install kernel as follows:

sys/dev/pv/xen.c:
        loop = 0;
        while (atomic_cas_uint(ptr, flags, GTF_invalid) != flags) {
                if (loop == 10) {
                        printf("loop=10 grant table reference %u is held "
                            "by domain %d: frame %#x flags %#x",
                            ref + ge->ge_start, ge->ge_table[ref].domid,
                            ge->ge_table[ref].frame, ge->ge_table[ref].flags);
                }
                if (loop == 100) {
                        printf("loop=100 grant table reference %u is held "
                            "by domain %d: frame %#x flags %#x",
                            ref + ge->ge_start, ge->ge_table[ref].domid,
                            ge->ge_table[ref].frame, ge->ge_table[ref].flags);
                }
                if (loop++ > 1000) {
                        panic("grant table reference %u is held "
                            "by domain %d: frame %#x flags %#x",
                            ref + ge->ge_start, ge->ge_table[ref].domid,
                            ge->ge_table[ref].frame, ge->ge_table[ref].flags);
                }
#if (defined(__amd64__) || defined(__i386__))
                __asm volatile("pause": : : "memory");
#endif
        }

After 5 hours of running rpki-client, I got some printf output, but no
subsequent panic:

Sep 16 10:36:51 obsd77 /bsd: loop=10 grant table reference 1226 is held by domain 0: frame 0x944d9 flags 0xd
Sep 16 11:43:49 obsd77 /bsd: loop=10 grant table reference 1314 is held by domain 0: frame 0xa82fb flags 0xd
Sep 16 11:44:20 obsd77 /bsd: loop=10 grant table reference 1204 is held by domain 0: frame 0xb3225 flags 0xd


So, in my case, the number of loops needed for the host
to release the grant reference was larger than 10 but smaller than 100.
(Maybe the time needed to do the printf did the trick.)
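
To illustrate why a fixed retry limit depends on how long the other
side holds the entry, here is a small userland sketch of the same
bounded CAS loop. It is not kernel code: struct fake_entry,
release_entry(), the busy_for counter and GTF_INVALID are made-up
stand-ins, and the "backend" is simulated inside the loop rather than
running on another CPU.

```c
#include <assert.h>
#include <stdatomic.h>

#define GTF_INVALID 0u	/* hypothetical stand-in for GTF_invalid */

/* Hypothetical model of one grant table entry. */
struct fake_entry {
	_Atomic unsigned int flags;	/* current entry flags */
	int busy_for;	/* failed attempts until the fake backend lets go */
};

/*
 * Try to swap the entry from `flags` to GTF_INVALID, giving up after
 * max_tries attempts. Returns the number of failed attempts before
 * the swap succeeded, or -1 if the limit was hit (the point where the
 * kernel code would panic).
 */
static int
release_entry(struct fake_entry *e, unsigned int flags, int max_tries)
{
	int loop;

	assert(max_tries > 0);
	for (loop = 0; loop < max_tries; loop++) {
		unsigned int expected = flags;

		if (atomic_compare_exchange_strong(&e->flags, &expected,
		    GTF_INVALID))
			return loop;
		/*
		 * Simulated backend: after busy_for failed attempts it
		 * drops its extra bits, so the entry matches `flags`.
		 */
		if (e->busy_for > 0 && --e->busy_for == 0)
			atomic_store(&e->flags, flags);
	}
	return -1;
}
```

With an entry that stays busy for 42 attempts, a limit of 10 gives up
while a limit of 1000 succeeds after 42 failed attempts, which matches
the "more than 10, fewer than 100" behaviour in the log above.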

I'd be happy if (preferably) a solid fix or (at least) a
code change like the one above found its way into the kernel.

As motivation: rpki-client(8) is a standard OpenBSD tool
needed as support for bgpd(8) to determine the validity
of routing announcements on the internet. Running an
internet router on OpenBSD in a virtualised environment
should be feasible.

Thanks,

--korni

