Re: Intermittent t_timerfd failures

2023-07-09 Thread Andreas Gustafsson
Taylor R Campbell wrote:
> sys/lib/libc/sys/t_timerfd.c timerfd_block is intermittently failing
> on the i386 test bed.

Here are some random observations from grepping the logs on various
testbeds.

1. The failures have been happening for as long as the timerfd test
case has existed.  The test case was first run with source date
2021.09.19.15.52.55, and the first failure was soon thereafter,
using source date 2021.09.21.09.24.15.

2. The execution time recorded by ATF on i386 indicates that the
failed test cases completed in less than a second, while the
successful ones took longer.  For example:

2023/2023.07.08.15.32.58/test.log.gz:timerfd_block: [1.211653s] Passed.
2023/2023.07.08.16.13.00/test.log.gz:timerfd_block: [1.540414s] Passed.
2023/2023.07.08.17.43.13/test.log.gz:timerfd_block: [1.649699s] Passed.
2023/2023.07.08.19.10.00/test.log.gz:timerfd_block: [0.864084s] Failed: 
/tmp/build/2023.07.08.19.10.00-i386/src/tests/lib/libc/sys/t_timerfd.c:198: 
then=1368.605876566 now=1369.462708087 delta=0.856831521
2023/2023.07.08.20.02.10/test.log.gz:timerfd_block: [1.203076s] Passed.
2023/2023.07.08.23.42.48/test.log.gz:timerfd_block: [1.699977s] Passed.
2023/2023.07.09.00.01.55/test.log.gz:timerfd_block: [1.231773s] Passed.

3. The TNF amd64 testbed shows failures similar to the i386 one.
Both are using "qemu -accel kvm" hosted on NetBSD 9.

4. The TNF sparc testbed shows no failures in 3456 runs (using
plain qemu, no acceleration).

5. The TNF macppc testbed (also plain qemu) shows frequent failures,
but unlike i386 and amd64, the execution time indicated by ATF is more
than a second:

./2023/2023.07.06.07.59.00/test.log.gz:timerfd_block: [1.132080s] Passed.
./2023/2023.07.07.00.20.39/test.log.gz:timerfd_block: [1.062924s] Passed.
./2023/2023.07.07.04.43.15/test.log.gz:timerfd_block: [1.029156s] Failed: 
/tmp/build/2023.07.07.04.43.15-macppc/src/tests/lib/libc/sys/t_timerfd.c:198: 
check_value_against_bounds(delta.tv_sec, 1, 1) not met
./2023/2023.07.07.08.39.23/test.log.gz:timerfd_block: [1.056571s] Passed.
./2023/2023.07.07.20.19.08/test.log.gz:timerfd_block: [1.081552s] Passed.

6. My own amd64 testbed running on real hardware shows no failures in
205 runs.

7. My own i386 testbed using "qemu -accel kvm" on Linux shows no
failures in 359 runs.
-- 
Andreas Gustafsson, g...@netbsd.org


Re: ZFS: time to drop Big Scary Warning

2021-03-25 Thread Andreas Gustafsson
Greg Troxel wrote:
> > That's a good test, but how does zfs compare in for the same test with lets
> > say ffs or ext2fs (filesystems that offer persistence)?
> 
> With the same system, booted in the  same way, but with 3 different
> filesystems mounted on /tmp, I get similar numbers of failures:
> 
> tmpfs 12
> ffs2  13
> zfs   18
> 
> So tmpfs/ffs2 are ~equal and zfs has a few more failures (but it all
> looks a bit random and non-repeatable).So it's hard to sort out "zfs
> is buggy" vs "some tests fail in timing-related hard-to-understand ways
> and that seems provoked slightly more with /tmp on zfs".

Since I'm all too familiar with the randomly failing tests in the
NetBSD test suite, let me sort them out for you.  Of the test cases in
the zfs run that didn't match the releng results, these two are known
random failures reported in PR 55770:

>   ./usr.bin/cc/t_tsan_data_race:data_race_pie
>   ./usr.bin/c++/t_tsan_data_race:data_race_pie

This is probably the known random failure reported in PR 55692:

>   ./fs/nfs/t_rquotad:get_nfs_be_1_group

This failure is already reported in PR 55603, though your report is
the first of it failing on real hardware:

>   ./modules/t_x86_pte:svs_g_bit_set

This one is reported as randomly failing in PR 55331, but since that
problem appears to be tmpfs related, it may be the case that tmpfs and
zfs are both buggy:

>   ./lib/libarchive/t_libarchive:libarchive

The remaining four failed test cases are _not_ known to fail randomly,
and as far as I know, do not have existing PRs.  Since they all
involve file system operations, it seems likely that they are in fact
zfs related in some way:

>   ./bin/cp/t_cp:file_to_file
>   ./lib/libc/stdlib/t_mktemp:mktemp_large_template
>   ./lib/libc/sys/t_stat:stat_chflags
>   ./usr.bin/ztest/t_ztest:assert

-- 
Andreas Gustafsson, g...@gson.org


Re: boot -d

2020-11-12 Thread Andreas Gustafsson
Edgar Fuß wrote:
> I had a look at the relevant commits
>   src/sys/arch/x86/include/pmap.h 1.100
>   src/sys/arch/x86/x86/pmap.c 1.330
>   src/sys/arch/xen/x86/xen_pmap.c 1.31
> but unfortunately am unable to back-port the second one to -8.
> I know nothing about pmap, and the -current version uses PTE_P and PTE_PS 
> while the -8 version uses PG_V/nothing.

It's probably easier to revert src/sys/arch/x86/x86/db_memrw.c 1.6.
-- 
Andreas Gustafsson, g...@gson.org


Re: boot -d

2020-11-12 Thread Andreas Gustafsson
Edgar Fuß wrote:
> Real hardware (AMD64), 8.2_STABLE from yesterday, custom config.

This looks like PR 53311.  The commit where that problem started
(src/sys/arch/x86/x86/db_memrw.c 1.6) was pulled up to to the -8
branch, and apparently the commits that fixed it were not.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-29 Thread Andreas Gustafsson
Hauke Fath wrote:
> On Tue, 29 Sep 2020 11:09:25 +0300, Andreas Gustafsson wrote:
> > I have now committed the code to log the message, without rate
> > limiting.  If a consensus should arise that rate limiting the message
> > is a good idea, the code to do that should be committed by someone in
> > favor of it.
> 
> I have seen kernel subsystems spam the console to the point of making 
> it unusable, which can get quite awkward on production machines. 
>
> While I don't know if this is a relevant scenario for your change, 
> declaring it "somebody else's problem" has a funny ring.

I have already explained why I believe it's not a relevant scenario
for this change.  And I don't mean to make it someone else's problem -
I take full responsibility for my commit, and if it turns out that it
actually causes spammage, I will fix that one way or another.

And similarly, if rate limiting is added and causes problems, I expect
the developer adding it to take responsibility for that commit.  I just
don't want to end up being held responsible for problems caused by rate
limiting I never wanted in the first place.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-29 Thread Andreas Gustafsson
I have now committed the code to log the message, without rate
limiting.  If a consensus should arise that rate limiting the message
is a good idea, the code to do that should be committed by someone in
favor of it.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-24 Thread Andreas Gustafsson
nia wrote:
> Unless you buy one of the snake oil devices Mr. Gustafsson sells, of
> course.

The devices you call "snake oil" and me pushing for NetBSD to stop
generating predictable keys are both aspects of me working to address
entropy related vulnerabilities ever since recognizing them as a major
weak link in Internet security while working on BIND 9 some 20 years
ago.  But of course I have no way of proving my sincere intent to you,
just like you have no way of proving to me that you are not paid by
the NSA to undermine the security of NetBSD.  So please, let's stop
with the ad hominem attacks and focus on the technical issues.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-24 Thread Andreas Gustafsson
nia wrote:
> On Tue, Sep 22, 2020 at 02:32:23PM +0200, Manuel Bouyer wrote:
> > OK, so the printf should never happen when the system has been properly
> > configured. In this case I have no objection.
> 
> No, it will happen frequently in VMs and on non-recent-x86 hardware.

The cases of processes blocking on randomness we tend to see reports
about are pkgsrc builds and web browsers.  Since entropy does not run
out in -current as already discussed on this list, each hanging pkgsrc
build or web browser will cause the message to be logged exactly once.
How often do they hang for you?

If PR 55659 is fixed, there will be more cases where the message is
logged (at least ssh-keyegn), but still only one per blocking process.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-23 Thread Andreas Gustafsson
Taylor R Campbell wrote:
> I think this is fine, but it should be rate-limited the same way other
> messages in kern_entropy.c are rate-limited.
> 
>   if (ratecheck(, ))
>   printf("...");

I would prefer them not to be rate limited.  Repeated messages are
useful because they will typically contain different PIDs and program
names.

Consider a system logging the following message to the console at boot
time:

[  22.8406587] entropy: pid 188 (syslogd) blocking due to lack of entropy

and then hanging.  You hit control-c to recover, the boot continues
but then immediately hangs again as another process also blocks
on entropy.  In this case, I would like a second message to be logged
so that it is immediately apparent that another process has blocked,
and which process that is.

Or if you start two different daemons in parallel and both block
immediately, it would be helpful to get messages identifying both of
them.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-22 Thread Andreas Gustafsson
Manuel Bouyer wrote:
> > In -current, entropy does not run out.
> 
> So, how can it block ?

When there's too little entropy to begin with.  Once you have
gathered enough, it unblocks, and never blocks again.

This is assuming default settings.  If you actually want entropy
to run out, you can do "sysctl -w kern.entropy.depletion=1", but
there's no good reason to ever do that outside of testing.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-22 Thread Andreas Gustafsson
Manuel Bouyer wrote:
> If you run a dd on /dev/random I guess the system will run out of
> entropy pretty fast.

In -current, entropy does not run out.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-22 Thread Andreas Gustafsson
Manuel Bouyer wrote:
> I think we should find and remove theses (or make them conditional)
> instead of adding unconditional new ones

It would already be conditional in the sense that it's only printed if
the system has no entropy.  If a multi-user system is lacking entropy,
a user spamming the console about it is doing the administrator a
favor.
-- 
Andreas Gustafsson, g...@gson.org


Re: Logging a kernel message when blocking on entropy

2020-09-22 Thread Andreas Gustafsson
Manuel Bouyer wrote:
> I'm not sure we want a user-triggerable kernel printf enabled by default.
> This could be used to DOS the system (especially on serial consoles)

You can already trigger kernel printfs as an unprivileged user.
The first one that comes to mind is "sorry, pid %d was killed:
orphaned traced process", but I'm sure there are many others.
-- 
Andreas Gustafsson, g...@gson.org


Logging a kernel message when blocking on entropy

2020-09-22 Thread Andreas Gustafsson
All,

The following patch will cause a kernel message to be logged when a
process blocks on /dev/random or some other randomness API.  It may
help some users befuddled by pkgsrc builds blocking on /dev/random,
and I'm finding it useful when testing changes aimed at fixing PR
55659.

OK to commit?

Index: src/sys/kern/kern_entropy.c
===
RCS file: /bracket/repo/src/sys/kern/kern_entropy.c,v
retrieving revision 1.23
diff -u -r1.23 kern_entropy.c
--- src/sys/kern/kern_entropy.c 14 Aug 2020 00:53:16 -  1.23
+++ src/sys/kern/kern_entropy.c 20 Sep 2020 13:53:46 -
@@ -1297,6 +1297,9 @@
 
/* Wait for some entropy to come in and try again.  */
KASSERT(E->stage >= ENTROPY_WARM);
+   printf("entropy: pid %d (%s) blocking due to lack of entropy\n",
+  curproc->p_pid, curproc->p_comm);
+
if (ISSET(flags, ENTROPY_SIG)) {
error = cv_wait_sig(>cv, >lock);
    if (error)

-- 
Andreas Gustafsson, g...@gson.org


Fix for slow run(4) configuration on OHCI/UHCI

2020-05-29 Thread Andreas Gustafsson
Hi all,

When I connect a USB WiFi adapter based on a Ralink RT5370 chip to a
USB port that uses an OHCI or UHCI host controller, running "ifconfig
run0 up" takes a very long time, about 20-30 seconds.

This is because the run(4) driver writes a large amount of data to the
device just two bytes at a time using the WRITE_2 command, and each
write takes two full USB frames of 1 millisecond each on UHCI, or
three frames on OCHI.  With an EHCI or XHCI controller, the large
number of transfers is not a problem as these controllers can perform
multiple control transfers within a single frame.

The driver already contains code to do the transfers in larger blocks
using the WRITE_REGION_1 command, but it's #if'ed out with a comment
saying it is "not stable on RT2860".  The FreeBSD driver is similar,
except that its version of the #if'ed-out code limits the transfers
to a maximum of 64 bytes at a time.

I have verified that enabling the use of WRITE_REGION_1 with the
64-bit limit from FreeBSD works with my RT5370 based adapter, and
makes it configure about ten times faster on OHCI and UHCI.

I'm now using the following patch, which enables the use of
WRITE_REGION_1 using blocks of up to 64 bytes for the RT5370 only:

  https://www.gson.org/netbsd/patches/run-faster.patch

OK to commit?
-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-04-05 Thread Andreas Gustafsson
Christos Zoulas wrote:
> It could be due to tcsh doing its file descriptor dance differently...
> What shell are you using?

The default shell of the root user.  The notion of changing the shell
according to a personal preference doesn't really apply when running
automated tests on a fresh automated install.

The shell command used to run the tests is:

  mkdir /tmp/tests && \
  cd /usr/tests && \
  { atf-run; echo $? >/tmp/tests/test.status; } | tee /tmp/tests/test.tps | 
atf-report -o ticker:- -o xml:/tmp/tests/test.xml

which may also matter as file descrptors are allocated for log files
and pipes.

But rather than manually doing a fresh install, logging in as root,
and running that command, it may be easier to just run the anita
command starting on line 3 of the log from the first failed test run
on b5:

  http://releng.netbsd.org/b5reports/i386/2020/2020.03.22.00.56.45/test.log

-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-04-04 Thread Andreas Gustafsson
Robert Elz wrote:
> Not an idea, but a possibility - the change to route.c (1.167) was
> unimportant - it doesn't really matter (to the tests) if it does
> anything useful or not - it is possible that it just happened that the
> fd that the setsockopt() was being performed on was a socket (a suitable
> socket) prior to the openssl update, but after that, the rump fd's
> shifted around, and what the setsockopt() was operating upon was no
> longer a socket.
> 
> No idea if that is really what happened or not, but something like that
> is at least plausible (even though it would seem that the changes of the
> sys call having worked by accident seem to be not very high).

I agree that this sounds plausible.  Also, the tests never failing for
Christos might then be explained by him running them in an environment
that has a different number of file descriptors already in use.
-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-04-04 Thread Andreas Gustafsson
Martin Husemann wrote:
> I analyzed this particular one (202 steps back because rump.netstat dumps
> core) - will fix it soon.

With martin's changes, the number of unexpected test failures
went down from 413 to 6 on my bare metal testbed:

  
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2020.04.html#2020.04.03.16.20.52

The remaining 6 or so failures are unrelated.  Thanks to everyone who
helped get this fixed.

Does anyone have an idea why the tests didn't start failing
immediately when route.c 1.167 was committed, but only after the
seemingly unrelated openssl update?
-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-04-03 Thread Andreas Gustafsson
Christos Zoulas wrote:
> >That should take care of the failing network related tests that contain
> >rump.route commands, but that's not all of the failing tests.
> 
> Thanks! I fixed that now. Let's see how many break after this...

413 on my bare metal testbed (20 steps forward, 202 steps back):

  
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/commits-2020.04.html#2020.04.02.21.36.03

There have also been other commits since the previous run, so these
changes in test outcomes may not all be due to your sbin/route commits.
-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-03-30 Thread Andreas Gustafsson
Christos Zoulas wrote:
> All the tests are failing for you the same way:
> 
> rump.route: SO_RERROR: Socket operation on non-socket
> 
> I doubt that my gif change affected that. This smells to me like the rump fd
> hijack is not
> working either because we have some new system call involved or something is
> messing
> up the file descriptors.  What is your build host?

The tests are failing on both the TNF testbed and my own.  The
respective OS versions are:

  NetBSD babylon5.netbsd.org 8.1_STABLE NetBSD 8.1_STABLE (BABYLON5) #1: Fri 
Jan 24 21:50:18 UTC 2020  
s...@franklin.netbsd.org:/home/netbsd/8/amd64/obj/sys/arch/amd64/compile/BABYLON5
 amd64
  NetBSD guido.araneus.fi 9.0 NetBSD 9.0 (GENERIC) #0: Fri Feb 14 00:06:28 UTC 
2020  mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64

-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-03-30 Thread Andreas Gustafsson
Christos Zoulas wrote:
> I've been looking into this:
> 1. The libcrypto/bn test just needs more time

That may be.  That one never failed on real hardware for me (it just
went from taking 3 seconds to 14), but 200+ other test cases did fail,
and still do.

> 2. The gif related tests are failing because of a recent change to record mac 
> addresses
> I committed a fix for that.

Your fix didn't work; the gif tests are still failing with
src/tests/net/net_common.sh 1.40:

  
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2020/2020.03.30.13.01.39/test.html#net_if_gif_t_gif_gif_basic_ipv4overipv4

> 3. The rest of the tests (I've sampled 5 of them) don't fail for me.

If you do a full release build from scratch, install the release, and
run the tests in the installed release like the testbeds do, I bet
they will fail for you, too.
-- 
Andreas Gustafsson, g...@gson.org


Re: All (?) network tests failing

2020-03-30 Thread Andreas Gustafsson
Martin Husemann wrote:
> -current just had a serious regression in test results, it seems like
> ~all networking tests are failing now:

Many (most?) of these have been failing for more than a week now, as
reported on current-users in

  http://mail-index.netbsd.org/current-users/2020/03/23/msg038127.html

which identified both the commits and the developer responsible for
the breakage.
-- 
Andreas Gustafsson, g...@gson.org


re: usbhist support for urtwn

2019-11-25 Thread Andreas Gustafsson
matthew green wrote:
> this looks fine, but if you feel up to it, you can re-combine
> the single DPRINTF() -> two calls using URTWNHIST_CALLARGS().
> if_aue.c is a recent file converted to this method.  it has
> the advantage of consuming fewer kernhist slots -- 1 line vs 2.

There is a separate flag bit DBG_FN in urtwn_debug that enables
logging of function calls, and I have already translated the sequence
URTWNHIST_CALLED(); DPRINTFN(DBG_FN, ...) into URTWNHIST_CALLARGS(...).
I have deliberately avoided folding DPRINTFN() calls enabled by other
flag bits into URTWNHIST_CALLARGS() so that they can still be enabled
and disabled separately from the logging of function calls.

> nit:  the new DPRINTFN(), URTWNHIST_CALLED(), and
> URTWNHIST_CALLARGS() macros should use the do { .. } while(0)
> idiom the old DPRINTFN() did so they remain/are safe for us in
> all contexts.

They already do, they're just formatted differently.
-- 
Andreas Gustafsson, g...@gson.org


usbhist support for urtwn

2019-11-24 Thread Andreas Gustafsson
Hi all,

I have this patch to replace the debug printfs in sys/dev/usb/if_urtwn.c
by usbhist calls, roughly modelled after the use of usbhist in if_axe.c:

  https://www.gson.org/netbsd/patches/urtwn-usbhist.patch

OK to commit?
-- 
Andreas Gustafsson, g...@gson.org


Re: __{read,write}_once

2019-11-22 Thread Andreas Gustafsson
Maxime Villard wrote:
> How about _onestep()? Or _nosplit()?

Or just conclude that there are no good options and we might as well
call it _once() to be compatible with other OSes.
-- 
Andreas Gustafsson, g...@gson.org


Re: x86 bootstrap features

2019-09-25 Thread Andreas Gustafsson
Michael van Elst wrote:
> This assumes that code and data segment is the same and assigns
> the following 64k block to the stack. We switch to protected
> mode soon after, but the memory layout seems to stay the same.
> Maybe changing to
> 
>   addw$0x2001, %ax
> 
> is all that is needed.

I guess we also need to bump these definitions in the Makefile?

  SAMISCCPPFLAGS+= -DHEAP_START=0x1 -DHEAP_LIMIT=0x3

-- 
Andreas Gustafsson, g...@gson.org


re: x86 bootstrap features

2019-09-24 Thread Andreas Gustafsson
matthew green wrote:
> more research with more systems would be good here, but perhaps
> we simply need to provide multiple versions of each style (eg,
> netboot + something, biosboot + something, etc.)

Does anyone know the reason for the pxeboot 64k limit, and if it might
be possible to increase it?
-- 
Andreas Gustafsson, g...@gson.org


Plentiful unpredictable randomness

2019-07-22 Thread Andreas Gustafsson
Taylor R Campbell wrote:
> It has become popular to redefine the traditional semantics of
> /dev/random or /dev/urandom so that one or both will block once at
> boot until the OS thinks the entropy pool may have been seeded, and
> then never block again.

IMO, those are the most useful semantics for a /dev/*random device,
and NetBSD ought to provide a device that works this way.

This would combine the advantages of the existing /dev/random and
/dev/urandom by providing randomness that is both unpredictable (like
/dev/random) and plentiful (like /dev/urandom) once seeded.  It would
not solve the problem of /dev/random blocking forever when a system
has no entropy available at all, but it would solve the more common
problem of it blocking due to becoming "exhausted" from use.

I expect there will be strong differences in opinion as to whether
/dev/random and/or /dev/urandom should be changed to work like this,
but we could start by implementing it under a third name so that at
least those of us who want it can use it.  An implementation might
also help clarify the debate by serving as a concrete example of what
the new semantics might be.

> I don't want to do this because code paths that may block but only
> in extreme circumstances, like early at boot on an embedded system,
> are likely never to be exercised even during what might otherwise be
> extensive testing, and as noted blocking when not expected can have
> severe consequences.

That's only an issue with changing /dev/urandom to block until seeded
(I happen to think that ought to be done regardless, but reasonable
people may disagree).  There is no such issue with changing
/dev/random to no longer block once seeded.

Full disclosure: I sell hardware random number generators for a
living.  What I'm proposing here could in theory either increase or
decrease the demand for my products - I'm not sure which.
-- 
Andreas Gustafsson, g...@gson.org


Re: Understanding PR kern/43997 (kernel timing problems / qemu)

2017-07-30 Thread Andreas Gustafsson
Frank Kardel wrote:
> Fixing that requires some more work. But I am surprised that the qemu 
> interrupt rate is seemingly somewhat around 50Hz.
> Could it be a bug in qemu getting the frequeny not right. qemu should 
> read the clock to get the frequencies right an possibly skip
> usleeps less than i/HZ possibly managing an error-budget. I haven't 
> looked into qemu at all, but an error of a factor 2 looks suspicious.

I fully agree.
-- 
Andreas Gustafsson, g...@gson.org


Re: Understanding PR kern/43997 (kernel timing problems / qemu)

2017-07-30 Thread Andreas Gustafsson
Robert Elz wrote:
> I want to leave /bin/sh to percolate for a while, make sure there are
> no issues with it as it is, before starting on the next round of
> cleanups and bug fixes, so I was looking for something else to poke
> my nose into ...
> 
> [Aside: the people I added to the cc of this message are those who have
>  added text to PR kern/43997 and so I thought might be interested, if you're
>  not, just say...]
> 
> kern/43997 is the "qemu is too slow, clock interrupts get lost, timing
> gets all messed up" problem that plagues many of the ATF tests that kind
> of expect time to be maintained rationally.

Thank you for looking into this.

> Now there's no question that qemu is slow, for example, on my amd64 Xen
> DomU test system, the shell arithmetic test of ++x (etc) takes:
>   var_preinc: [0.077617s] Passed.
> whereas from the latest completed b5 (qemu) test run (as of this e-mail)
>   var_preinc   Passed   N/A   6.200489s
> 
> That's about 80 times slower (and most of the other tests show similar
> factors).   I don't think we can blame qemu for that, given what it is
> doing.
> 
> So, it is hardly surprising that, to borrow Paul's words from the PR:
>   On (at least) amd64 architecture, qemu cannot simulate clock
>   interrupts at 100Hz.

I don't think the slowness of qemu's emulation is the actual cause of
its inability to simulate clock interrupts at 100 Hz.  Rather, I think
it is more likely caused by the inability of qemu to sleep for periods
shorter than 10 ms due to limitations of the underlying host OS, such
as that documented in the BUGS section of nanosleep(2).

That this is at least partly a host system issue is supported by the
observation that when qemu is hosted on a Linux system, the timing in
the NetBSD guest is much more accurate than when qemu is hosted on
NetBSD, on similar hardware:

  NetBSD-on-qemu-on-NetBSD# time sleep 10
 13.00 real 0.00 user 0.03 sys

  NetBSD-on-qemu-on-Linux# time sleep 10
 10.13 real 0.02 user 0.02 sys

If my theory is correct, there are at least three ways the problem
could be fixed:

 - Improve the time resolution of sleeps on the host system, as
   recently discussed on tech-kern in a thread starting with
   http://mail-index.netbsd.org/tech-kern/2017/07/02/msg022024.html

 - Make qemu deal better with hosts unable to sleep for short
   periods of time, or

 - Make the guest system deal better with missed timer interrupts.

-- 
Andreas Gustafsson, g...@gson.org


Re: Restricting rdtsc [was: kernel aslr]

2017-03-31 Thread Andreas Gustafsson
Maxime Villard wrote:
> Having read several papers on the exploitation of cache latency to defeat
> aslr (kernel or not), it appears that disabling the rdtsc instruction is a
> good mitigation on x86. However, some applications can legitimately use it,
> so I would rather suggest restricting it to root instead.

It's ASLR that's broken, not rdtsc, and I strongly object to
restricting the latter just to that people can continue to gain
a false sense of security from the former.
-- 
Andreas Gustafsson, g...@gson.org


Re: netbsd-7 XEN3PAE_DOM0 crash on agp

2017-01-14 Thread Andreas Gustafsson
Emmanuel Dreyfus wrote:
> In case it rings a bell for someone: Trying latest netbsd-7 XEN3PAE_DOM0
> build for i386, I get a reproductible crash when attaching agp:

PR 50446 may be related.
-- 
Andreas Gustafsson, g...@gson.org


Re: CVS commit: src/sys/kern

2015-08-25 Thread Andreas Gustafsson
Martin Husemann wrote:
 I have another evbarm machine now hanging as well, so it is not just alpha.

The sparc test runs on babylon5 are affected, too.  I just committed a
fix, let's see if it works.
-- 
Andreas Gustafsson, g...@netbsd.org


Re: KGDB/i386 broken/supposed to work?

2015-06-22 Thread Andreas Gustafsson
Timo Buhrmester wrote:
 Using a GENERIC kernel with only the modifications required to enable KGDB 
 (see bottom for config diff), I get the following behavior on the TARGET 
 machine:
 |  boot netbsd -d
 | 15741968+590492+466076 [689568+730405]=0x1161fd4
 | kernel text is mapped with 4 large pages and 5 normal pages
 | Loaded initial symtab at 0xc110750c, strtab at 0xc11afaac, # entries 43075
 | kgdb waiting...fatal breakpoint trap in supervisor mode
 | trap type 1 code 0 eip c02a6744 cs 8 eflags 202 cr2 0 ilevel 8 esp c1265ea0
 | curlwp 0xc1078900 pid 0 lid 1 lowest kstack 0xc12632c0
 
 There is no delay between ``kgdb waiting...'' and ``fatal breakpoint trap in 
 supervisor mode''.
 I'm not sure whether or not this is the expected behavior, because eip 
 c02a6744 is in the `breakpoint` function so that would make sense; but the 
 documentation makes it sound like it should just say ``kgdb waiting...''.

Looks like that behavior was introduced by the following commit
to src/sys/arch/i386/i386/trap.c:

  
  revision 1.239
  date: 2008-05-30 13:38:21 +0300;  author: ad;  state: Exp;  lines: +10 -11;
  Since breakpoints don't work, dump basic info about the trap before
  entering the debugger. Sometimes ddb only makes the situation worse.
  

 Any idea whether a) KGDB is tested/supposed to work and b) what I
 might be doing wrong?

KGDB over serial on i386 worked for me in January 2008.  I don't know
if it is still working now; the kernel debugging I've done since then
has been using qemu's built-in gdb stub instead, as described in
https://wiki.netbsd.org/kernel_debugging_with_qemu/.
-- 
Andreas Gustafsson, g...@gson.org


Re: recent sysctl changes

2014-03-07 Thread Andreas Gustafsson
David Laight wrote:
  Both of these aspects of the API are now broken: reading a 4-byte
  CTLTYPE_INT variable now works for any buffer size = 4 *except* 8,
 
 That wasn't the intent of the change.
 The intent was that if the size was 8 then the code would return
 a numeric value of size 8, otherwise the size would be chnaged to
 4 and/or ENOMEM (stupid errno choice) returned.

I understand the intent; I'm talking about the unintended consequences.

  and attempting to read an 8-byte CTLTYPE_QUAD variable into a buffer
  of less than 8 bytes is now guaranteed to yield ENOMEM *except* if the
  buffer size happens to be 4.
 
 A request to read a CTLTYPE_QUAD variable into a buffer that is shorter
 than 8 bytes has always been a programming error.

An application could, for example, maintain a single, shared,
malloc'ed buffer that is reused for multiple sysctl() calls and only
resized on ENOMEM returns.  IMO, this is allowed by the API, but with
your change, a read of a CTLTYPE_QUAD variable will return the wrong
result if the buffer happened to be left with a size of 4 by a
previous read of a CTLTYPE_INT variable.

 The intent of the change was to relax that is the length happened to be 4.
 
  IMO, this behavior violates both the
  letter of the sysctl() man page and the principle of least astonishment.
 
 I'm not sure about the latter.
 I run 'sysctl -a' and find the name of the sysctl I'm interested in.
 The result is a small number so I pass the address and size of a integer
 variable and then print the result.
 (Or the value is rather large and I think it might exceed 2^31 so I
 use an int64.)
 The 'principle of least astonishment' would mean that I get the value
 that 'sysctl -a' printed.

If you use a C variable of a size different than the documented size
of the sysctl variable in case, your code is incorrect, and the
resulting undefined behavior should not astonish you.

 On a BE system I have to be extremely careful with the return values
 from sysctl() or I see garbage.

Yes, you do have to be careful, and we should not encourage
carelessness by making incorrect code work in more cases.

 Note that code calling systctl() has to either know whether the value
 it is expecting is a string, structure, or number, or use the API calls
 that expose the kernel internals in order to find out.

Not only that, but in the case of a number, it also has to know
whether it's an int or a quad, or use the appropriate API calls to
find out.

  Also, the work-around is ineffective in the case of a caller that
  allocates the buffer dynamically using the size given by an initial
  sysctl() call with oldp = NULL.
 
 Code that does that for a numeric value will be quite happy with
 either a 32bit of 64bit result.

Not if it is expecting the other one.
-- 
Andreas Gustafsson, g...@netbsd.org


Recent sysctl changes

2014-03-05 Thread Andreas Gustafsson
I posted this on source-changes-d earlier today, but Jeff Rizzo asked
me to take it to tech-kern.

On Feb 27, David Laight said:
 Module Name:  src
 Committed By: dsl
 Date: Thu Feb 27 22:50:52 UTC 2014
 
 Modified Files:
   src/sys/kern: kern_sysctl.c
 
 Log Message:
 Allow CTLTYPE_INT and CTLTYPE_QUAD to be read and written as either 4 or 8
   byte values regardless of the type.
 64bit writes to 32bit variables must be valid (signed) values.
 32bit reads of large values return -1.
 Amongst other things this should fix libm's code that reads machdep.sse
   as a 32bit int, but I'd changed it to 64bit (to common up some code).
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.246 -r1.247 src/sys/kern/kern_sysctl.c

In private correspondence with David, I have objected to this commit
as well as his earlier change to the type of the machdep.sse
sysctl variable.

Regrettably, David and I have not been able to reach an agreement, and
he has requested that I ask for opinions on a public list.  So...

1. I object to the earlier change of the following sysctl variables
from type CTLTYPE_INT to type CTLTYPE_QUAD:

machdep.fpu_present
machdep.osfxsr
machdep.sse
machdep.sse2
machdep.biosbasemem
machdep.biosextmem

My understanding of David's rationale for changing them is that it
facilitated sharing code with the machdep.xsave_features variable
which actually needs to be of type CTLTYPE_QUAD.  By my reckoning, 
the savings from this code sharing amount to eight lines of code.

Changing the types of existing sysctl variables breaks both source 
and binary compatibility and should not be done lightly; that's
why we have both both hw.physmem and hw.physmem64, for example.
Now, the types of multiple variables have been incompatibly changed
for no reason other than saving a few lines of code.

I believe the harm caused by the incompatible type change far
outweighs the cost of a few added lines of code, and would like 
the original types to be restored.  I am attaching a patch to do so;
it also changes the newly added variables machdep.fpu_save and
machdep.fpu_save_size to CTLTYPE_INT, but if David really thinks
those should be CTLTYPE_QUAD, I will not argue that point.


2. I also object to the change of kern_sysctl.c 1.247.

This change attempts to work around the problems caused by the changes
to the variable types by making sysctl() return different types
depending on the value of the *oldlenp argument.

IMO, this is a bad idea.  The *oldlenp argument does *not* specify the
size of the data type expected by the caller, but rather the size of a
buffer.  The sysctl() API allows the caller to pass a buffer larger
than the variable being read, and conversely, guarantees that passing
a buffer that is too small results in ENOMEM.

Both of these aspects of the API are now broken: reading a 4-byte
CTLTYPE_INT variable now works for any buffer size = 4 *except* 8,
and attempting to read an 8-byte CTLTYPE_QUAD variable into a buffer
of less than 8 bytes is now guaranteed to yield ENOMEM *except* if the
buffer size happens to be 4.  IMO, this behavior violates both the
letter of the sysctl() man page and the principle of least astonishment.

Also, the work-around is ineffective in the case of a caller that
allocates the buffer dynamically using the size given by an initial
sysctl() call with oldp = NULL.

If the original types of the sysctl variables are restored, this
work-around will no longer serve a purpose, and I'm asking for it
to be removed.

Opinions?
-- 
Andreas Gustafsson, g...@netbsd.org

Index: x86_machdep.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/x86_machdep.c,v
retrieving revision 1.63
diff -u -r1.63 x86_machdep.c
--- x86_machdep.c   23 Feb 2014 22:38:40 -  1.63
+++ x86_machdep.c   3 Mar 2014 18:41:06 -
@@ -1056,7 +1056,17 @@
 }
 
 static void
-const_sysctl(struct sysctllog **clog, const char *name, u_quad_t value, int 
tag)
+const_sysctl_int(struct sysctllog **clog, const char *name, int value, int tag)
+{
+   sysctl_createv(clog, 0, NULL, NULL,
+  CTLFLAG_PERMANENT | CTLFLAG_IMMEDIATE,
+  CTLTYPE_INT, name, NULL,
+  NULL, (u_quad_t)value, NULL, 0,
+  CTL_MACHDEP, tag, CTL_EOL);
+}
+
+static void
+const_sysctl_quad(struct sysctllog **clog, const char *name, u_quad_t value, 
int tag)
 {
sysctl_createv(clog, 0, NULL, NULL,
   CTLFLAG_PERMANENT | CTLFLAG_IMMEDIATE,
@@ -1115,17 +1125,17 @@
   CTL_MACHDEP, CTL_CREATE, CTL_EOL);
 
/* None of these can ever change once the system has booted */
-   const_sysctl(clog, fpu_present, i386_fpu_present, CPU_FPU_PRESENT);
-   const_sysctl(clog, osfxsr, i386_use_fxsave, CPU_OSFXSR);
-   const_sysctl(clog, sse, i386_has_sse, CPU_SSE);
-   const_sysctl(clog, sse2, i386_has_sse2, CPU_SSE2

Re: Recent sysctl changes

2014-03-05 Thread Andreas Gustafsson
Thor Lancelot Simon wrote:
 I don't actually know of any code that hands over a wrong-size
 buffer and will therefore break, though.  Do you?

No, but I think we should aim for correctness, not just works for me.
-- 
Andreas Gustafsson, g...@netbsd.org