Re: PSA: Clock drift and pkgin

2023-12-30 Thread Konrad Schroder

On 12/30/2023 3:42 PM, Johnny Billquist wrote:

> On 2023-12-31 00:11, Michael van Elst wrote:
> 
>> Better than 100Hz is possible and still precise. Something around 1000Hz
>> is necessary for human interaction. Modern hardware could easily do
>> 100kHz.
> 
> ? If I remember right, anything less than 200ms is immediate response
> for a human brain. Which means you can get away with much coarser than
> even 100Hz.
> And there are certainly lots of examples of older computers with
> clocks running in the 10s of ms, where human interaction feels perfect.


I'm not sure about visual and auditory sensation, but haptic VR requires 
position updates >= 1000Hz to get texture right.  The timing of two 
impulses that close together may not be felt as two separate events, but 
the frequency of vibrations within the skin when it interacts with a 
surface (even through a tool, such as a stylus) is encoded by the nerve 
endings in the skin itself.  We used to use PHANTOM haptic arms at 
$WORK, driven by an Indigo2.  If the control loop operated at less than 
1000Hz---for example, if the Indigo2 was under load---it introduced 
noticeable differences in the sensation of running the pen over a 
virtual object.  The simulation was much more sensitive to that than it 
was to the timing of the video output, for which anything greater than 
72Hz was wasted.


Take care,

-Konrad



Re: Perceivable time differences [was Re: PSA: Clock drift and pkgin]

2023-12-30 Thread David Holland
On Sun, Dec 31, 2023 at 02:54:50AM +0100, Johnny Billquist wrote:
 > Ok. I oversimplified.
 > 
 > If I remember right, the point was that something sub-200ms is perceived by
 > the brain as being an "instantaneous" response. It doesn't mean that one
 > cannot discern shorter times, just that from an action-reaction point of
 > view, anything below 200ms is "good enough".

The usual figure cited is 100 ms, not 200, but yeah.

It is instructive to look at the stopwatch function on a digital
watch; you can easily see the tenths counting but not the 100ths.

-- 
David A. Holland
dholl...@netbsd.org


Re: Perceivable time differences [was Re: PSA: Clock drift and pkgin]

2023-12-30 Thread Johnny Billquist

Ok. I oversimplified.

If I remember right, the point was that something sub-200ms is perceived 
by the brain as being an "instantaneous" response. It doesn't mean that 
one cannot discern shorter times, just that from an action-reaction point 
of view, anything below 200ms is "good enough".


My point was merely that I don't believe you need to have something down 
to ms resolution when it comes to human interaction, which was the claim 
I reacted to.


  Johnny

On 2023-12-31 02:47, Mouse wrote:

>> ? If I remember right, anything less than 200ms is immediate response
>> for a human brain.
> 
> "Response"?  For some purposes, it is.  But under the right conditions
> humans can easily discern time deltas in the sub-200ms range.
> 
> I just did a little psychoacoustics experiment on myself.
> 
> First, I generated (44.1kHz) soundfiles containing two single-sample
> ticks separated by N samples for N being 1, 101, 201, 401, 801, and
> going up by 800 from there to 6401, with a second of silence before and
> after (see notes below for the commands used):
> 
> for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
> do
> (count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
> echo 0 128 0 128
> count from 0 to $d | sed -e "s/.*/0 0 0 0/"
> echo 0 128 0 128
> count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
> ) | code-to-char > zz.$d
> done
> 
> I don't know stock NetBSD analogs for count and code-to-char.  count,
> as used here, just counts as the command line indicates; given what
> count's output is piped into, the details don't matter much.
> code-to-char converts numbers 0..255 into single bytes with the same
> values, with non-digits ignored except that they serve to separate
> numbers.  (The time delta between the beginnings of the two ticks is of
> course one more than the number of samples between the two ticks.)
> 
> After listening to them, I picked the 800 and 1600 files and did the
> test.  I grabbed 128 bits from /dev/urandom and used them to play,
> randomly, either one file or the other, letting me guess which one it
> was in each case:
> 
> dd if=/dev/urandom bs=1 count=16 |
>    char-to-code |
>    cvtbase -m8 d b |
>    sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
>    tr \  \\n |
>    ( exec 3>zz.list 4>zz.guess 5</dev/tty
>      while read n
>      do echo $n 1>&3
>         audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
>         skipcat 0 1 0<&5 1>&4
>      done
>    )
> 
> char-to-code is the inverse of code-to-char: for each byte of input, it
> produces one line of output containing the ASCII decimal for that
> byte's value, 0..255.  cvtbase -m8 d b converts decimal to binary,
> generating a minimum of 8 "digits" (bits) of output for each input
> number.  skipcat, as used here, has the I/O behaviour of "dd bs=1
> count=1" but without the blather on stderr: it skips no bytes and
> copies one byte, then exits.  (The use of /dev/urandom is to ensure
> that I have no a priori hint which file is being played which time.)
> 
> I then typed "s" when I thought it was a short-gap file and "l" when I
> thought it was a long-gap file.  I got tired of it after 83 data
> samples and killed it.  I then postprocessed zz.guess and compared it
> to zz.list:
> 
> < zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -
> 
> I got exactly two wrong out of 83 (and the stats are about evenly
> balanced, 39 short files played and 44 long).  So I think it's fair to
> say that, in the right context (an important caveat!), a time
> difference as short as (1602-802)/44.1=18.14+ milliseconds is clearly
> discernible to me.
> 
> This is, of course, a situation designed to perceive a very small
> difference.  I'm sure there are plenty of contexts in which I would
> fail to notice even 200ms of delay.
> 
> /~\ The ASCII Mouse
> \ / Ribbon Campaign
>   X  Against HTML   mo...@rodents-montreal.org
> / \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Perceivable time differences [was Re: PSA: Clock drift and pkgin]

2023-12-30 Thread Mouse
> ? If I remember right, anything less than 200ms is immediate response
> for a human brain.

"Response"?  For some purposes, it is.  But under the right conditions
humans can easily discern time deltas in the sub-200ms range.

I just did a little psychoacoustics experiment on myself.

First, I generated (44.1kHz) soundfiles containing two single-sample
ticks separated by N samples for N being 1, 101, 201, 401, 801, and
going up by 800 from there to 6401, with a second of silence before and
after (see notes below for the commands used):

for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
do
(count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
echo 0 128 0 128
count from 0 to $d | sed -e "s/.*/0 0 0 0/"
echo 0 128 0 128
count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
) | code-to-char > zz.$d
done

I don't know stock NetBSD analogs for count and code-to-char.  count,
as used here, just counts as the command line indicates; given what
count's output is piped into, the details don't matter much.
code-to-char converts numbers 0..255 into single bytes with the same
values, with non-digits ignored except that they serve to separate
numbers.  (The time delta between the beginnings of the two ticks is of
course one more than the number of samples between the two ticks.)
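
For anyone wanting to reproduce this with stock tools: something like
the following C program should generate the same raw data as the
pipeline above.  (A sketch only, on the assumption that each "0 0 0 0"
line is one stereo frame of 16-bit little-endian silence and
"0 128 0 128" is one frame with both channels at 0x8000; the gen_tick
name is made up.  Run as ./gen_tick 800 > zz.800.)

/* gen_tick.c - hypothetical stand-in for count | sed | code-to-char */
#include <stdio.h>
#include <stdlib.h>

static void
frames(const unsigned char *f, long n)
{
    while (n-- > 0)
        fwrite(f, 1, 4, stdout);    /* one 4-byte stereo frame */
}

int
main(int argc, char **argv)
{
    static const unsigned char sil[4] = { 0, 0, 0, 0 };
    static const unsigned char tick[4] = { 0, 128, 0, 128 };
    long d = (argc > 1) ? strtol(argv[1], NULL, 10) : 800;

    frames(sil, 44101);   /* "count from 0 to 44100" emits 44101 lines */
    frames(tick, 1);      /* first single-sample tick */
    frames(sil, d + 1);   /* "count from 0 to $d" emits d+1 lines */
    frames(tick, 1);      /* second tick */
    frames(sil, 44101);   /* trailing second of silence */
    return 0;
}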

After listening to them, I picked the 800 and 1600 files and did the
test.  I grabbed 128 bits from /dev/urandom and used them to play,
randomly, either one file or the other, letting me guess which one it
was in each case:

dd if=/dev/urandom bs=1 count=16 |
  char-to-code |
  cvtbase -m8 d b |
  sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
  tr \  \\n |
  ( exec 3>zz.list 4>zz.guess 5</dev/tty
    while read n
    do echo $n 1>&3
       audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
       skipcat 0 1 0<&5 1>&4
    done
  )

char-to-code is the inverse of code-to-char: for each byte of input, it
produces one line of output containing the ASCII decimal for that
byte's value, 0..255.  cvtbase -m8 d b converts decimal to binary,
generating a minimum of 8 "digits" (bits) of output for each input
number.  skipcat, as used here, has the I/O behaviour of "dd bs=1
count=1" but without the blather on stderr: it skips no bytes and
copies one byte, then exits.  (The use of /dev/urandom is to ensure
that I have no a priori hint which file is being played which time.)

I then typed "s" when I thought it was a short-gap file and "l" when I
thought it was a long-gap file.  I got tired of it after 83 data
samples and killed it.  I then postprocessed zz.guess and compared it
to zz.list:

< zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -

I got exactly two wrong out of 83 (and the stats are about evenly
balanced, 39 short files played and 44 long).  So I think it's fair to
say that, in the right context (an important caveat!), a time
difference as short as (1602-802)/44.1=18.14+ milliseconds is clearly
discernible to me.

This is, of course, a situation designed to perceive a very small
difference.  I'm sure there are plenty of contexts in which I would
fail to notice even 200ms of delay.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Proposal: Restore malloc(9) interface

2023-12-30 Thread David Holland
On Sat, Dec 30, 2023 at 10:44:52PM +0000, Taylor R Campbell wrote:
 > Note: I am NOT proposing any substantive changes to the implementation
 > of the allocator -- I'm just proposing that we go back to the old
 > _interface_, using the new pool-cache-based _implementation_, and to
 > add lightweight per-CPU, per-tag usage counting to the malloc and free
 > paths.

Can we just add tags to kmem(9)? At this point that seems like a path
of lesser resistance, and also it avoids having a standard function
name with a nonstandard interface.

(Also, let's make the tags be typed as pointers instead of an enum)
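
Roughly this shape, say -- a sketch only, none of these names exist
today:

/* Hypothetical pointer-typed tags for kmem(9); struct kmem_tag and
 * kmem_tag_alloc/free are illustrative, not existing API. */
#include <sys/kmem.h>

struct kmem_tag {
    const char *kt_name;        /* e.g. "mbuf", "acpi" */
    /* per-CPU usage counters would hang off here */
};

#define KMEM_TAG_DEFINE(var, name) \
    struct kmem_tag var = { .kt_name = name }

void *kmem_tag_alloc(struct kmem_tag *, size_t, km_flag_t);
void  kmem_tag_free(struct kmem_tag *, void *, size_t);

A pointer gives you a natural place to hang the counters, and new tags
don't require touching a central enum.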

-- 
David A. Holland
dholl...@netbsd.org


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>> Modern hardware could easily do 100kHz.

>Not with curren^Wat least one moderately recent NetBSD version!

>At work, I had occasion to run 9.1/amd64 with HZ=8000.  This was to get
>8-bit data pushed out a parallel port at 8kHz; I added special-case
>hooks between the relevant driver and the clock (I forget whether
>softclock or hardclock).  It worked for its intended use fairly
>nicely...but when I tried one of my SIGALRM testers on it, instead of
>the 100Hz it asked for, I got signals at, IIRC, about 77Hz.


Scheduling and switching userland processes is heavy. For a test, try
scheduling kernel callouts with high HZ values. That still generates
lots of overhead with the current design, but you should be able to go
faster than 8kHz.
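
For example (a sketch against the existing callout(9) API; the test_*
names are made up), a self-rearming callout that fires every tick:

#include <sys/callout.h>

static struct callout test_ch;
static volatile unsigned long test_count;

static void
test_tick(void *arg)
{
    test_count++;                    /* count callout invocations */
    callout_schedule(&test_ch, 1);   /* re-arm one tick out */
}

static void
test_start(void)
{
    callout_init(&test_ch, CALLOUT_MPSAFE);
    callout_setfunc(&test_ch, test_tick, NULL);
    callout_schedule(&test_ch, 1);
}

Letting that run for a second of wall time and reading test_count shows
how fast the callout path actually cycles, without any userland
scheduling in the way.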



PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
On Sun, Dec 31, 2023 at 12:42:29AM +0100, Johnny Billquist wrote:
> > Better than 100Hz is possible and still precise. Something around 1000Hz
> > is necessary for human interaction. Modern hardware could easily do 100kHz.
> 
> ? If I remember right, anything less than 200ms is immediate response for a
> human brain. Which means you can get away with much coarser than even 100Hz.
> And there are certainly lots of examples of older computers with clocks
> running in the 10s of ms, where human interaction feels perfect.

You may not be able to react faster than 200ms, but you can notice
shorter time periods.


> I think that is a separate question/problem/issue. That we fail when guest
> and host run at the same rate is something I consider a flaw in the system.

With a fixed tick, they cannot run at the same speed. This becomes
obvious when you try to run at different speeds that aren't just
integer multiples: e.g. a HZ=60 guest on a HZ=100 host needs a guest
tick every 1.67 host ticks, which a fixed tick can only approximate
with jitter.

N.B. my m68k emulator runs a HZ=100 guest without a problem. But that's
a fake; in reality it only runs 100 ticks per second on average, in
particular when the guest becomes idle.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Johnny Billquist

On 2023-12-31 00:11, Michael van Elst wrote:

> On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
> 
>> Right. But if you expect high precision on delays and scheduling, then you
>> start also having issues with just random unpredictable delays because of
>> other interrupts, paging, and whatnot. So in the end, your high precision
>> delays and scheduling becomes very imprecise again. So, is there really that
>> much value in that higher resolution?
> 
> Better than 100Hz is possible and still precise. Something around 1000Hz
> is necessary for human interaction. Modern hardware could easily do 100kHz.


? If I remember right, anything less than 200ms is immediate response 
for a human brain. Which means you can get away with much coarser than 
even 100Hz.
And there are certainly lots of examples of older computers with clocks 
running in the 10s of ms, where human interaction feels perfect.



> Another advantage is that you can use independent timing (that's what
> bites in the emulator case where guest and host clocks run at the same
> rate).


I think that is a separate question/problem/issue. That we fail when 
guest and host run at the same rate is something I consider a flaw in 
the system. It's technically perfectly possible to make such a combo 
run well, and the fact that we didn't (don't) is just sad (in my opinion).


Not sure what you mean by independent timing here. For me, that would be 
if you had two different clock sources independent of each other.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Mouse
> Better than 100Hz is possible and still precise.  Something around
> 1000Hz is necessary for human interaction.

That doesn't sound right.  I've had good HCI experiences with HZ=100.
Why do you see a higher HZ as necessary for human interaction?

> Modern hardware could easily do 100kHz.

Not with curren^Wat least one moderately recent NetBSD version!

At work, I had occasion to run 9.1/amd64 with HZ=8000.  This was to get
8-bit data pushed out a parallel port at 8kHz; I added special-case
hooks between the relevant driver and the clock (I forget whether
softclock or hardclock).  It worked for its intended use fairly
nicely...but when I tried one of my SIGALRM testers on it, instead of
the 100Hz it asked for, I got signals at, IIRC, about 77Hz.

I never investigated.  I think I still have access to the work machine
in question if anyone wants me to try any other quick tests, but trying
to figure out an issue on a version I don't use except at work is
something I am unmotivated to do on my own time, and using work time to
dig after an issue that doesn't affect work's use case isn't an
appropriate use of work resources.
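
For reference, the testers are all variations on something like this
sketch (not the actual program): ask for 100Hz with setitimer() and
count how many SIGALRMs actually arrive over a stretch of wall time.

#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t nalarms;

static void
on_alrm(int sig)
{
    (void)sig;
    nalarms++;
}

int
main(void)
{
    struct itimerval it;
    time_t t0;

    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 10000;   /* 10ms period = 100Hz */
    it.it_value = it.it_interval;
    signal(SIGALRM, on_alrm);
    setitimer(ITIMER_REAL, &it, NULL);

    t0 = time(NULL);
    while (time(NULL) - t0 < 10)
        pause();                      /* returns on each signal */
    printf("%d alarms in ~10s => ~%.1fHz\n",
        (int)nalarms, nalarms / 10.0);
    return 0;
}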

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
> 
> Right. But if you expect high precision on delays and scheduling, then you
> start also having issues with just random unpredictable delays because of
> other interrupts, paging, and whatnot. So in the end, your high precision
> delays and scheduling becomes very imprecise again. So, is there really that
> much value in that higher resolution?

Better than 100Hz is possible and still precise. Something around 1000Hz
is necessary for human interaction. Modern hardware could easily do 100kHz.

Another advantage is that you can use independent timing (that's what
bites in the emulator case where guest and host clocks run at the same
rate).

-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Proposal: Restore malloc(9) interface

2023-12-30 Thread Taylor R Campbell
I propose to deprecate the kmem(9) interface and go back to the
malloc(9) interface.


1. The main difference is that the malloc(9) interface enables
   attribution of memory usage: how many bytes have been used for this
   purpose vs that purpose, partitioned by named malloc tags like
   M_MBUF or M_ACPI.  The conversion to the kmem(9) interface lost all
   this valuable diagnostic information.

   I've personally spent probably dozens of hours over the last year
   or two puzzling over `vmstat -m' output to guess which subsystem
   might be leaking memory based on allocation sizes and which
   kmem-N pool looks fishy.  This is extremely frustrating and a
   huge waste of time to recover information we used to gather and
   report systematically.

2. A secondary difference is reduced diffs from FreeBSD and OpenBSD
   drivers if we use malloc(9).

3. A small difference is that kmem(9) distinguishes legacy allocation
   from interrupt context (kmem_intr_alloc/free) from front ends that
   forbid it (kmem_alloc/free).

   I'm not sure this has provided much valuable diagnostic
   information, but it has provided a lot of frustrating crashes.  If
   we want the same frustrating crashes we could introduce an M_INTR
   flag which is mandatory when calling malloc from interrupt context.

Note: I am NOT proposing any substantive changes to the implementation
of the allocator -- I'm just proposing that we go back to the old
_interface_, using the new pool-cache-based _implementation_, and to
add lightweight per-CPU, per-tag usage counting to the malloc and free
paths.

Nor am I suggesting changing anything about uvm_km(9), pool_cache(9),
or anything else -- just changing kmem_alloc(N, KM_[NO]SLEEP) back to
malloc(N, T, M_NOWAIT/WAITOK) and kmem_free(P, N) back to free(P, T),
or possibly free(P, T, N) like OpenBSD does.
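
Concretely, the change at a call site would look like this (an
illustrative fragment only; M_FOO stands in for whatever tag a
subsystem defines):

#include <sys/kmem.h>       /* current: kmem(9) interface */
#include <sys/malloc.h>     /* proposed: malloc(9) interface */

MALLOC_DEFINE(M_FOO, "foo", "example subsystem allocations");

void
example(size_t len)
{
    char *p;

    /* today, with kmem(9) -- anonymous, uncounted: */
    p = kmem_alloc(len, KM_SLEEP);
    kmem_free(p, len);

    /* as proposed, with malloc(9) -- counted against M_FOO: */
    p = malloc(len, M_FOO, M_WAITOK);
    free(p, M_FOO);         /* or free(p, M_FOO, len) like OpenBSD */
}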

Thoughts?


I asked for rationale for the kmem(9) interface last year, and none of
the answers gave any compelling reason to have changed interfaces in
the first place or to finish the conversion now:

https://mail-index.netbsd.org/tech-kern/2022/10/29/msg028498.html

As far as I can tell, we just spent umpteen hundreds of hours on
engineering effort over the last decade to convert various drivers and
subsystems from malloc(9) to kmem(9), in exchange for the loss of
valuable diagnostic information about leaks, for increased cost to
porting drivers, and for crashes when old subsystems newly converted
to kmem(9) still allocate from interrupt context.


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Johnny Billquist

On 2023-12-30 22:10, Michael van Elst wrote:

> b...@softjar.se (Johnny Billquist) writes:
> 
>> Being able to measure time with high precision is desirable, but we can
>> already do that without being tickless.
> 
> We cannot delay with high precision. You can increase HZ to some degree,
> but that comes at a price.


Right. But if you expect high precision on delays and scheduling, then 
you start also having issues with just random unpredictable delays 
because of other interrupts, paging, and whatnot. So in the end, your 
high precision delays and scheduling becomes very imprecise again. So, 
is there really that much value in that higher resolution?


But of course, this all becomes a question of tradeoffs, preferences and 
desires. Not sure if we need to have an argument about it. I don't know 
if anyone is working on a tickless design, or how far it has come. I 
will certainly not complain if someone does it. But I'm personally not 
feeling much of a lack that we don't have it.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>Being able to measure time with high precision is desirable, but we can 
>already do that without being tickless.

We cannot delay with high precision. You can increase HZ to some degree,
but that comes at a price.
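
Easy to demonstrate -- a quick sketch: request a 1ms sleep and measure
what you actually get.  On a HZ=100 ticking kernel the answer is
typically 10-20ms, even though the same clock can *measure* far finer.

#include <stdio.h>
#include <time.h>

int
main(void)
{
    struct timespec req = { 0, 1000000 };   /* request a 1ms sleep */
    struct timespec a, b;

    clock_gettime(CLOCK_MONOTONIC, &a);
    nanosleep(&req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &b);

    printf("slept %.3f ms\n",
        (b.tv_sec - a.tv_sec) * 1e3 + (b.tv_nsec - a.tv_nsec) / 1e6);
    return 0;
}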



Re: PSA: Clock drift and pkgin

2023-12-30 Thread Johnny Billquist

On 2023-12-30 19:43, Martin Husemann wrote:

> On Sat, Dec 30, 2023 at 06:25:29PM +0000, Jonathan Stone wrote:
> 
>> You can only do tickless if you can track how much time is elapsing
>> when no ticks fire, or none are pending.  I don't see how to do that
>> without a high-res timer like a CPU cycle counter, or I/O bus cycle
>> counter, or what-have-you.  Going fully tickless would therefore end
>> support for machines without such a timer.  Is NetBSD ready to do that?
> 
> Kernels on those machines just would not run fully tickless.


Right. There is no reason to assume that all platforms would have to go 
tickless just because it becomes a possibility.
However, I also am not sure how much value tickless adds here. The main 
reason I know of for tickless systems is power consumption. Not having 
to wake up just to count time can make a big difference.
Sure, you can get higher precision for some scheduling with tickless, 
but I'm not sure it generally makes any actual significant difference.
Being able to measure time with high precision is desirable, but we can 
already do that without being tickless.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Martin Husemann
On Sat, Dec 30, 2023 at 06:25:29PM +0000, Jonathan Stone wrote:
> You can only do tickless if you can track how much time is elapsing
> when no ticks fire, or none are pending.  I don't see how to do that
> without a high-res timer like a CPU cycle counter, or I/O bus cycle
> counter, or what-have-you.  Going fully tickless would therefore end
> support for machines without such a timer.  Is NetBSD ready to do that?

Kernels on those machines just would not run fully tickless.

Martin


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Jonathan Stone
 

On Saturday, December 23, 2023 at 10:19:53 PM PST, Simon Burge wrote:



> I have a grotty hack that attempted to spin if the requested timeout
> was less than a tick based on what DragonflyBSD does.  It mostly
> worked for simple tests but I haven't tested it seriously.  It's at
> https://www.NetBSD.org/~simonb/pollfixhack.diff . 

Is that really viable on uniprocessor machines?

> This is potentially
> another direction until we get a pure tickless kernel...

You can only do tickless if you can track how much time is elapsing
when no ticks fire, or none are pending.  I don't see how to do that
without a high-res timer like a CPU cycle counter, or I/O bus cycle
counter, or what-have-you.  Going fully tickless would therefore end
support for machines without such a timer.  Is NetBSD ready to do that?