Re: PSA: Clock drift and pkgin
On 12/30/2023 3:42 PM, Johnny Billquist wrote:
> On 2023-12-31 00:11, Michael van Elst wrote:
>> Better than 100Hz is possible and still precise. Something around
>> 1000Hz is necessary for human interaction. Modern hardware could
>> easily do 100kHz.
>
> ? If I remember right, anything less than 200ms is immediate response
> for a human brain. Which means you can get away with much coarser than
> even 100Hz. And there are certainly lots of examples of older computers
> with clocks running in the 10s of ms, where human interaction feels
> perfect.

I'm not sure about visual and auditory sensation, but haptic VR requires position updates >= 1000Hz to get texture right. The timing of two impulses that close together may not be felt as two separate events, but the frequency of vibrations within the skin when it interacts with a surface (even through a tool, such as a stylus) is encoded by the nerve endings in the skin itself.

We used to use PHANTOM haptic arms at $WORK, driven by an Indigo2. If the control loop operated at less than 1000Hz, for example if the Indigo2 was under load, it introduced noticeable differences in the sensation of running the pen over a virtual object. The simulation was much more sensitive to that than it was to the timing of the video output, for which anything greater than 72Hz was wasted.

Take care,
-Konrad
Re: Perceivable time differences [was Re: PSA: Clock drift and pkgin]
On Sun, Dec 31, 2023 at 02:54:50AM +0100, Johnny Billquist wrote:
> Ok. I oversimplified.
>
> If I remember right, the point was that something sub 200ms is perceived
> by the brain as being an "instantaneous" response. It doesn't mean that
> one cannot discern shorter times, just that from an action-reaction
> point of view, anything below 200ms is "good enough".

The usual figure cited is 100 ms, not 200, but yeah. It is instructive to look at the stopwatch function on a digital watch; you can easily see the tenths counting but not the hundredths.

-- 
David A. Holland
dholl...@netbsd.org
Re: Perceivable time differences [was Re: PSA: Clock drift and pkgin]
Ok. I oversimplified.

If I remember right, the point was that something sub 200ms is perceived by the brain as being an "instantaneous" response. It doesn't mean that one cannot discern shorter times, just that from an action-reaction point of view, anything below 200ms is "good enough".

My point was merely that I don't believe you need to have something down to ms resolution when it comes to human interaction, which was the claim I reacted to.

  Johnny

On 2023-12-31 02:47, Mouse wrote:
>> ? If I remember right, anything less than 200ms is immediate response
>> for a human brain.
>
> "Response"? For some purposes, it is. But under the right conditions
> humans can easily discern time deltas in the sub-200ms range.
>
> I just did a little psychoacoustics experiment on myself.
>
> First, I generated (44.1kHz) soundfiles containing two single-sample
> ticks separated by N samples for N being 1, 101, 201, 401, 801, and
> going up by 800 from there to 6401, with a second of silence before and
> after (see notes below for the commands used):
>
> for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
> do
> (	count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
> 	echo 0 128 0 128
> 	count from 0 to $d | sed -e "s/.*/0 0 0 0/"
> 	echo 0 128 0 128
> 	count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
> ) | code-to-char > zz.$d
> done
>
> I don't know stock NetBSD analogs for count and code-to-char. count,
> as used here, just counts as the command line indicates; given what
> count's output is piped into, the details don't matter much.
> code-to-char converts numbers 0..255 into single bytes with the same
> values, with non-digits ignored except that they serve to separate
> numbers. (The time delta between the beginnings of the two ticks is of
> course one more than the number of samples between the two ticks.)
>
> After listening to them, I picked the 800 and 1600 files and did the
> test. I grabbed 128 bits from /dev/urandom and used them to play,
> randomly, either one file or the other, letting me guess which one it
> was in each case:
>
> dd if=/dev/urandom bs=1 count=16 | char-to-code | cvtbase -m8 d b |
> sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
> tr \  \\n |
> ( exec 3>zz.list 4>zz.guess 5</dev/tty
>   while read n
>   do
> 	echo $n 1>&3
> 	audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
> 	skipcat 0 1 0<&5 1>&4
>   done
> )
>
> char-to-code is the inverse of code-to-char: for each byte of input, it
> produces one line of output containing the ASCII decimal for that
> byte's value, 0..255. cvtbase -m8 d b converts decimal to binary,
> generating a minimum of 8 "digits" (bits) of output for each input
> number. skipcat, as used here, has the I/O behaviour of "dd bs=1
> count=1" but without the blather on stderr: it skips no bytes and
> copies one byte, then exits. (The use of /dev/urandom is to ensure
> that I have no a priori hint which file is being played which time.)
>
> I then typed "s" when I thought it was a short-gap file and "l" when I
> thought it was a long-gap file. I got tired of it after 83 data
> samples and killed it. I then postprocessed zz.guess and compared it
> to zz.list:
>
> < zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -
>
> I got exactly two wrong out of 83 (and the stats are about evenly
> balanced, 39 short files played and 44 long). So I think it's fair to
> say that, in the right context (an important caveat!), a time
> difference as short as (1602-802)/44.1 = 18.14+ milliseconds is clearly
> discernible to me.
>
> This is, of course, a situation designed to perceive a very small
> difference. I'm sure there are plenty of contexts in which I would
> fail to notice even 200ms of delay.
>
> /~\ The ASCII Mouse
> \ / Ribbon Campaign
>  X  Against HTML                mo...@rodents-montreal.org
> / \ Email!           7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Perceivable time differences [was Re: PSA: Clock drift and pkgin]
> ? If I remember right, anything less than 200ms is immediate response
> for a human brain.

"Response"? For some purposes, it is. But under the right conditions humans can easily discern time deltas in the sub-200ms range.

I just did a little psychoacoustics experiment on myself.

First, I generated (44.1kHz) soundfiles containing two single-sample ticks separated by N samples for N being 1, 101, 201, 401, 801, and going up by 800 from there to 6401, with a second of silence before and after (see notes below for the commands used):

for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
do
(	count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
	echo 0 128 0 128
	count from 0 to $d | sed -e "s/.*/0 0 0 0/"
	echo 0 128 0 128
	count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
) | code-to-char > zz.$d
done

I don't know stock NetBSD analogs for count and code-to-char. count, as used here, just counts as the command line indicates; given what count's output is piped into, the details don't matter much. code-to-char converts numbers 0..255 into single bytes with the same values, with non-digits ignored except that they serve to separate numbers. (The time delta between the beginnings of the two ticks is of course one more than the number of samples between the two ticks.)

After listening to them, I picked the 800 and 1600 files and did the test. I grabbed 128 bits from /dev/urandom and used them to play, randomly, either one file or the other, letting me guess which one it was in each case:

dd if=/dev/urandom bs=1 count=16 | char-to-code | cvtbase -m8 d b |
sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
tr \  \\n |
( exec 3>zz.list 4>zz.guess 5</dev/tty
  while read n
  do
	echo $n 1>&3
	audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
	skipcat 0 1 0<&5 1>&4
  done
)

char-to-code is the inverse of code-to-char: for each byte of input, it produces one line of output containing the ASCII decimal for that byte's value, 0..255. cvtbase -m8 d b converts decimal to binary, generating a minimum of 8 "digits" (bits) of output for each input number. skipcat, as used here, has the I/O behaviour of "dd bs=1 count=1" but without the blather on stderr: it skips no bytes and copies one byte, then exits. (The use of /dev/urandom is to ensure that I have no a priori hint which file is being played which time.)

I then typed "s" when I thought it was a short-gap file and "l" when I thought it was a long-gap file. I got tired of it after 83 data samples and killed it. I then postprocessed zz.guess and compared it to zz.list:

< zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -

I got exactly two wrong out of 83 (and the stats are about evenly balanced, 39 short files played and 44 long). So I think it's fair to say that, in the right context (an important caveat!), a time difference as short as (1602-802)/44.1 = 18.14+ milliseconds is clearly discernible to me.

This is, of course, a situation designed to perceive a very small difference. I'm sure there are plenty of contexts in which I would fail to notice even 200ms of delay.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
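[Since count and code-to-char are Mouse's private tools, here is a self-contained sketch that produces the same zz.<d> files: stereo 16-bit slinear_le frames at 44.1kHz, per the audioplay flags above, with one second of silence on each side and the two full-scale single-sample ticks. The function names and the byte-count return value are illustrative, not part of the original tools.]

```c
#include <stdio.h>
#include <stdlib.h>

/* Write n frames of stereo 16-bit silence (4 zero bytes per frame). */
static void
put_silence(FILE *f, int n)
{
	static const unsigned char z[4] = { 0, 0, 0, 0 };
	int i;

	for (i = 0; i < n; i++)
		fwrite(z, 1, 4, f);
}

/* One single-sample tick: bytes 0 128 0 128, i.e. -32768 in both
 * channels when read as slinear_le, matching "echo 0 128 0 128". */
static void
put_tick(FILE *f)
{
	static const unsigned char t[4] = { 0, 128, 0, 128 };
	fwrite(t, 1, 4, f);
}

/* Generate one zz.<d>-style file at `path'; returns the number of
 * bytes written, or -1 on error. */
static long
write_tick_file(const char *path, int d)
{
	FILE *f = fopen(path, "wb");
	long sz;

	if (f == NULL)
		return -1;
	put_silence(f, 44101);	/* count from 0 to 44100: ~1s lead-in */
	put_tick(f);
	put_silence(f, d + 1);	/* count from 0 to d: the gap */
	put_tick(f);
	put_silence(f, 44101);	/* ~1s tail */
	sz = ftell(f);
	fclose(f);
	return sz;
}
```

For d=800 this yields (44101 + 1 + 801 + 1 + 44101) frames of 4 bytes each, and the tick-to-tick delta is d+2 samples, consistent with the (1602-802)/44.1 arithmetic above.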
Re: Proposal: Restore malloc(9) interface
On Sat, Dec 30, 2023 at 10:44:52PM +, Taylor R Campbell wrote:
> Note: I am NOT proposing any substantive changes to the implementation
> of the allocator -- I'm just proposing that we go back to the old
> _interface_, using the new pool-cache-based _implementation_, and to
> add lightweight per-CPU, per-tag usage counting to the malloc and free
> paths.

Can we just add tags to kmem(9)? At this point that seems like a path of lesser resistance, and it also avoids having a standard function name with a nonstandard interface.

(Also, let's make the tags be typed as pointers instead of an enum.)

-- 
David A. Holland
dholl...@netbsd.org
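[As a rough illustration of the "tags typed as pointers instead of an enum" idea: a tag is just a pointer to a per-subsystem statistics record, so new tags need no central registry. All names here are hypothetical, and plain malloc stands in for the real kmem(9) allocator.]

```c
#include <stdlib.h>

/* Hypothetical sketch of tagged kmem(9). Because the tag is a pointer,
 * a subsystem can define its own tag without touching a shared header,
 * unlike the old enum-based malloc types. */
struct kmemtag {
	const char *kt_name;	/* shown by a vmstat -m style report */
	size_t	    kt_bytes;	/* bytes currently allocated under this tag */
	size_t	    kt_allocs;	/* lifetime allocation count */
};

static void *
kmem_tag_alloc(struct kmemtag *tag, size_t size)
{
	void *p = malloc(size);	/* stands in for kmem_alloc(size, KM_SLEEP) */

	if (p != NULL) {
		tag->kt_bytes += size;
		tag->kt_allocs++;
	}
	return p;
}

static void
kmem_tag_free(struct kmemtag *tag, void *p, size_t size)
{
	tag->kt_bytes -= size;	/* caller passes size, as kmem_free(9) does */
	free(p);
}
```

A leak report would then just walk the registered tag records and print kt_name and kt_bytes, recovering the attribution the thread says `vmstat -m' lost.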
Re: PSA: Clock drift and pkgin
mo...@rodents-montreal.org (Mouse) writes:

>> Modern hardware could easily do 100kHz.
>
> Not with curren^Wat least one moderately recent NetBSD version!
>
> At work, I had occasion to run 9.1/amd64 with HZ=8000. This was to get
> 8-bit data pushed out a parallel port at 8kHz; I added special-case
> hooks between the relevant driver and the clock (I forget whether
> softclock or hardclock). It worked for its intended use fairly
> nicely...but when I tried one of my SIGALRM testers on it, instead of
> the 100Hz it asked for, I got signals at, IIRC, about 77Hz.

Scheduling and switching userland processes is heavy. For a test, try to schedule kernel callouts with high HZ values. That still generates lots of overhead with the current design, but you should be able to go faster than 8kHz.
Re: PSA: Clock drift and pkgin
On Sun, Dec 31, 2023 at 12:42:29AM +0100, Johnny Billquist wrote:
>> Better than 100Hz is possible and still precise. Something around 1000Hz
>> is necessary for human interaction. Modern hardware could easily do 100kHz.
>
> ? If I remember right, anything less than 200ms is immediate response for a
> human brain. Which means you can get away with much coarser than even 100Hz.
> And there are certainly lots of examples of older computers with clocks
> running in the 10s of ms, where human interaction feels perfect.

You may not be able to react faster than 200ms, but you can notice shorter time periods.

> I think that is a separate question/problem/issue. That we fail when guest
> and host run at the same rate is something I consider a flaw in the system.

With a fixed tick, they cannot run at the same speed. This becomes obvious when you try to run at different speeds that aren't just integer multiples.

N.B. my m68k emulator runs a HZ=100 guest without a problem. But that's a fake: in reality it only runs 100 ticks per second on average, in particular when the guest becomes idle.

Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
                "A potential Snark may lurk in every tree."
Re: PSA: Clock drift and pkgin
On 2023-12-31 00:11, Michael van Elst wrote:
> On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
>> Right. But if you expect high precision on delays and scheduling, then
>> you start also having issues with just random unpredictable delays
>> because of other interrupts, paging, and whatnot. So in the end, your
>> high precision delays and scheduling become very imprecise again. So,
>> is there really that much value in that higher resolution?
>
> Better than 100Hz is possible and still precise. Something around 1000Hz
> is necessary for human interaction. Modern hardware could easily do
> 100kHz.

? If I remember right, anything less than 200ms is immediate response for a human brain. Which means you can get away with much coarser than even 100Hz. And there are certainly lots of examples of older computers with clocks running in the 10s of ms, where human interaction feels perfect.

> Another advantage is that you can use independent timing (that's what
> bites in the emulator case where guest and host clocks run at the same
> rate).

I think that is a separate question/problem/issue. That we fail when guest and host run at the same rate is something I consider a flaw in the system. It's technically perfectly possible to run such a combo well, and the fact that we didn't (don't) is just sad (in my opinion).

Not sure what you mean by independent timing here. For me, that would be if you had two different clock sources independent of each other.

  Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: PSA: Clock drift and pkgin
> Better than 100Hz is possible and still precise. Something around
> 1000Hz is necessary for human interaction.

That doesn't sound right. I've had good HCI experiences with HZ=100. Why do you see a higher HZ as necessary for human interaction?

> Modern hardware could easily do 100kHz.

Not with curren^Wat least one moderately recent NetBSD version!

At work, I had occasion to run 9.1/amd64 with HZ=8000. This was to get 8-bit data pushed out a parallel port at 8kHz; I added special-case hooks between the relevant driver and the clock (I forget whether softclock or hardclock). It worked for its intended use fairly nicely...but when I tried one of my SIGALRM testers on it, instead of the 100Hz it asked for, I got signals at, IIRC, about 77Hz.

I never investigated. I think I still have access to the work machine in question if anyone wants me to try any other quick tests, but trying to figure out an issue on a version I don't use except at work is something I am unmotivated to do on my own time, and using work time to dig after an issue that doesn't affect work's use case isn't an appropriate use of work resources.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: PSA: Clock drift and pkgin
On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
> Right. But if you expect high precision on delays and scheduling, then you
> start also having issues with just random unpredictable delays because of
> other interrupts, paging, and whatnot. So in the end, your high precision
> delays and scheduling become very imprecise again. So, is there really that
> much value in that higher resolution?

Better than 100Hz is possible and still precise. Something around 1000Hz is necessary for human interaction. Modern hardware could easily do 100kHz.

Another advantage is that you can use independent timing (that's what bites in the emulator case where guest and host clocks run at the same rate).

-- 
Michael van Elst
Internet: mlel...@serpens.de
                "A potential Snark may lurk in every tree."
Proposal: Restore malloc(9) interface
I propose to deprecate the kmem(9) interface and go back to the malloc(9) interface.

1. The main difference is that the malloc(9) interface enables attribution of memory usage: how many bytes have been used for this purpose vs that purpose, partitioned by named malloc tags like M_MBUF or M_ACPI.

   The conversion to the kmem(9) interface lost all this valuable diagnostic information. I've personally spent probably dozens of hours over the last year or two puzzling over `vmstat -m' output to guess which subsystem might be leaking memory based on allocation sizes and which kmem-N pool looks fishy. This is extremely frustrating and a huge waste of time to recover information we used to gather and report systematically.

2. A secondary difference is reduced diffs from FreeBSD and OpenBSD drivers if we use malloc(9).

3. A small difference is that kmem(9) distinguishes legacy allocation from interrupt context, kmem_intr_alloc/free, from front ends that forbid that, kmem_alloc/free. I'm not sure this has provided much valuable diagnostic information, but it has provided a lot of frustrating crashes. If we want the same frustrating crashes we could introduce an M_INTR flag which is mandatory when calling malloc from interrupt context.

Note: I am NOT proposing any substantive changes to the implementation of the allocator -- I'm just proposing that we go back to the old _interface_, using the new pool-cache-based _implementation_, and to add lightweight per-CPU, per-tag usage counting to the malloc and free paths.

Nor am I suggesting changing anything about uvm_km(9), pool_cache(9), or anything else -- just changing kmem_alloc(N, KM_[NO]SLEEP) back to malloc(N, T, M_NOWAIT/WAITOK) and kmem_free(P, N) back to free(P, T), or possibly free(P, T, N) like OpenBSD does.

Thoughts?
I asked for rationale for the kmem(9) interface last year, and none of the answers gave any compelling reason to have changed interfaces in the first place or to finish the conversion now:

https://mail-index.netbsd.org/tech-kern/2022/10/29/msg028498.html

As far as I can tell, we just spent umpteen hundreds of hours of engineering effort over the last decade to convert various drivers and subsystems from malloc(9) to kmem(9), in exchange for the loss of valuable diagnostic information about leaks, for increased cost of porting drivers, and for crashes when old subsystems newly converted to kmem(9) still allocate from interrupt context.
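[A userland sketch of the "lightweight per-CPU, per-tag usage counting" the proposal describes: each tag keeps a counter slot per CPU, updated without cross-CPU synchronization on the hot path and summed only when a report is generated. Names are illustrative, curcpu() is stubbed to CPU 0, and plain malloc stands in for the pool-cache-based implementation.]

```c
#include <stdlib.h>

#define NCPU_MAX 8

/* One statistics record per malloc tag (M_MBUF, M_ACPI, ...).
 * In a real kernel each per-CPU slot would be cache-line padded. */
struct malloc_tag {
	const char *mt_name;
	struct {
		long mtc_bytes;
		long mtc_allocs;
	} mt_cpu[NCPU_MAX];
};

static int
cur_cpu_index(void)
{
	return 0;	/* stand-in for curcpu()->ci_index */
}

static void *
malloc_tagged(size_t size, struct malloc_tag *tag)
{
	void *p = malloc(size);	/* the pool-cache path in the real kernel */

	if (p != NULL) {
		int c = cur_cpu_index();
		tag->mt_cpu[c].mtc_bytes += (long)size;
		tag->mt_cpu[c].mtc_allocs++;
	}
	return p;
}

static void
free_tagged(void *p, struct malloc_tag *tag, size_t size)
{
	/* May go negative on one CPU if freed elsewhere than allocated;
	 * the cross-CPU sum below is still correct. */
	tag->mt_cpu[cur_cpu_index()].mtc_bytes -= (long)size;
	free(p);
}

/* Aggregate across CPUs, as a vmstat -m style report would. */
static long
malloc_tag_bytes(const struct malloc_tag *tag)
{
	long total = 0;

	for (int c = 0; c < NCPU_MAX; c++)
		total += tag->mt_cpu[c].mtc_bytes;
	return total;
}
```

The point of the per-CPU split is that the alloc/free paths touch only local counters, so the attribution costs little more than the old malloc-type bookkeeping did.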
Re: PSA: Clock drift and pkgin
On 2023-12-30 22:10, Michael van Elst wrote:
> b...@softjar.se (Johnny Billquist) writes:
>> Being able to measure time with high precision is desirable, but we can
>> already do that without being tickless.
>
> We cannot delay with high precision. You can increase HZ to some degree,
> but that comes at a price.

Right. But if you expect high precision on delays and scheduling, then you start also having issues with just random unpredictable delays because of other interrupts, paging, and whatnot. So in the end, your high precision delays and scheduling become very imprecise again. So, is there really that much value in that higher resolution?

But of course, this all becomes a question of tradeoffs, preferences and desires. Not sure if we need to have an argument about it. I don't know if anyone is working on a tickless design, or how far it has come. I will certainly not complain if someone does it. But I'm personally not feeling much of a lack that we don't have it.

  Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: PSA: Clock drift and pkgin
b...@softjar.se (Johnny Billquist) writes:

> Being able to measure time with high precision is desirable, but we can
> already do that without being tickless.

We cannot delay with high precision. You can increase HZ to some degree, but that comes at a price.
Re: PSA: Clock drift and pkgin
On 2023-12-30 19:43, Martin Husemann wrote:
> On Sat, Dec 30, 2023 at 06:25:29PM +, Jonathan Stone wrote:
>> You can only do tickless if you can track how much time is elapsing
>> when no ticks fire, or none are pending. I don't see how to do that
>> without a high-res timer like a CPU cycle counter, or I/O bus cycle
>> counter, or what-have-you. Going fully tickless would therefore end
>> support for machines without such a timer. Is NetBSD ready to do that?
>
> Kernels on those machines just would not run fully tickless.

Right. There is no reason to assume that all platforms would have to go tickless just because it becomes a possibility.

However, I also am not sure how much value tickless adds here. The main reason I know of for tickless systems is power consumption. Not having to wake up just to count time can make a big difference. Sure, you can get higher precision for some scheduling with tickless, but I'm not sure it generally makes any actual significant difference. Being able to measure time with high precision is desirable, but we can already do that without being tickless.

  Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: PSA: Clock drift and pkgin
On Sat, Dec 30, 2023 at 06:25:29PM +, Jonathan Stone wrote:
> You can only do tickless if you can track how much time is elapsing when
> no ticks fire, or none are pending. I don't see how to do that without a
> high-res timer like a CPU cycle counter, or I/O bus cycle counter, or
> what-have-you. Going fully tickless would therefore end support for
> machines without such a timer. Is NetBSD ready to do that?

Kernels on those machines just would not run fully tickless.

Martin
Re: PSA: Clock drift and pkgin
On Saturday, December 23, 2023 at 10:19:53 PM PST, Simon Burge wrote:
> I have a grotty hack that attempted to spin if the requested timeout
> was less than a tick based on what DragonflyBSD does. It mostly
> worked for simple tests but I haven't tested it seriously. It's at
> https://www.NetBSD.org/~simonb/pollfixhack.diff .

Is that really viable on uniprocessor machines?

> This is potentially another direction until we get a pure tickless
> kernel...

You can only do tickless if you can track how much time is elapsing when no ticks fire, or none are pending. I don't see how to do that without a high-res timer like a CPU cycle counter, or I/O bus cycle counter, or what-have-you. Going fully tickless would therefore end support for machines without such a timer. Is NetBSD ready to do that?