Re: How does PCIe appear to the host?

2024-10-04 Thread Mouse
>>> It's supposed to negotiate down to x1?
>> Yes.
> Okay.  I've dropped the manufacturer an email asking whether this
> device is supposed to work in a mechanically x16 slot which has only
> one lane available to it.

It's not.  Vantec wrote back, saying

> I'm sorry, but this card won't work with Any PCIe slot with only 1 lane.
> Yes, technically it should be able to run in 1 lane, but we designed it not
> to support using 1 lane for 5 devices, it will be slow.

I offered my opinion that "slow" is better than "not working at all",
but, obviously, that's not relevant to the existing hardware.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
>> Here's what ACPI has to say.  As I mentioned, dmesg is identical
>> with or without the card.
> There are devices out there that require a relatively recent host (
> '3rd generation PCIE' or somesuch ).

I am inclined to doubt that's what's up here.  The Q1900M fails; the
one that works is an Asus M3A78-CM, which still has PCI slots.
Certainly I've owned the Asus for years, whereas the Q1900M was bought
this summer - not that how long I've possessed it necessarily tracks
how long it's existed (or been designed).  Certainly the Q1900M
*looks* like a newer design.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
>> It's supposed to negotiate down to x1?
> Yes.

Okay.  I've dropped the manufacturer an email asking whether this
device is supposed to work in a mechanically x16 slot which has only
one lane available to it.

The primary chip on the board is indeed marked with the JMicron logo
and "JMB585" (it's also marked "2143 QHBA1 A" and "E771C0011"), but
there are numerous other components, including at least two other
chips, there as well.

>> Then either Vantec or ASRock has done something odd or my particular
>> Q1900M has a duff "x16" slot, because it doesn't work.
> I once had a PCIe network card in a x16 slot that didn't work
> reliably and wasn't recognized now and then.  Reason was that the
> edge connector wasn't correctly aligned and I had to shape it with a
> file.  Some things are just too cheap.

I may have to do some such.  I'll see what Vantec has to say.  (In
particular, I don't want to do anything permanent to this card while
there's still some chance I may need to RMA it.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
>> ahcisata0 at pci2 dev 0 function 0: vendor 0x197b product 0x0585
> That's a JMicron JMB585 which has a PCIe Gen3 x2 interface and
> provides five 6Gbps SATA ports.

That sounds right; the card has five SATA connectors and its ppb
reports "link is x2 @ 5.0GT/s".

> If your board has eight SATA ports,

No, the card has only five ports, both physically (only five SATA
connectors) and digitally (five atabus instances appear in autoconf).

> A JMB585 should have no problems to work in a x1 slot.

It's supposed to negotiate down to x1?  Then either Vantec or ASRock
has done something odd or my particular Q1900M has a duff "x16" slot,
because it doesn't work.  I think my quad wm is a x1 card; if I can
find the silly thing I'll try it in the ASRock to see if the slot
itself works.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
 0xd0: 0xc000 0x0842 0xc9118000 0x
0xe0: 0x 0x 0x0004 0x
0xf0: 0x0050 0x00c0 0x010e0f1a 0x0100

When I put the card into the other machine, into the x16 slot that
actually _is_ x16, the one it works in, it shows up as ahcisata0,
attached via

ppb1 at pci0 dev 2 function 0: vendor 0x1022 product 0x9603 (rev. 0x00)
ppb1: PCI Express capability version 2  x16 @ 5.0GT/s
ppb1: link is x2 @ 5.0GT/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ahcisata0 at pci2 dev 0 function 0: vendor 0x197b product 0x0585
ahcisata0: interrupting at ioapic0 pin 18
ahcisata0: AHCI revision 0x10301, 5 ports, 32 command slots, features 0xef33e080

and pcictl dump /dev/pci2 -d 0 shows (114 lines)

PCI configuration registers:
  Common header:
0x00: 0x0585197b 0x00100107 0x01060100 0x0010

Vendor Name: JMicron Technology (0x197b)
Device ID: 0x0585
Command register: 0x0107
  I/O space accesses: on
  Memory space accesses: on
  Bus mastering: on
  Special cycles: off
  MWI transactions: off
  Palette snooping: off
  Parity error checking: off
  Address/data stepping: off
  System error (SERR): on
  Fast back-to-back transactions: off
  Interrupt disable: off
Status register: 0x0010
  Capability List support: on
  66 MHz capable: off
  User Definable Features (UDF) support: off
  Fast back-to-back capable: off
  Data parity error detected: off
  DEVSEL timing: fast (0x0)
  Slave signaled Target Abort: off
  Master received Target Abort: off
  Master received Master Abort: off
  Asserted System Error (SERR): off
  Parity error detected: off
Class Name: mass storage (0x01)
Subclass Name: SATA (0x06)
Interface: 0x01
Revision ID: 0x00
BIST: 0x00
Header Type: 0x00 (0x00)
Latency Timer: 0x00
Cache Line Size: 0x10

  Type 0 ("normal" device) header:
0x10: 0xdc01 0xd881 0xd801 0xd481
0x20: 0xd401 0xfbefe000 0x 0x197b
0x30: 0xfbee 0x0080 0x 0x010a

Base address register at 0x10
  type: i/o
  base: 0xdc00, not sized
Base address register at 0x14
  type: i/o
  base: 0xd880, not sized
Base address register at 0x18
  type: i/o
  base: 0xd800, not sized
Base address register at 0x1c
  type: i/o
  base: 0xd480, not sized
Base address register at 0x20
  type: i/o
  base: 0xd400, not sized
Base address register at 0x24
  type: 32-bit nonprefetchable memory
  base: 0xfbefe000, not sized
Cardbus CIS Pointer: 0x
Subsystem vendor ID: 0x197b
Subsystem ID: 0x
Expansion ROM Base Address: 0xfbee
Capability list pointer: 0x80
Reserved @ 0x38: 0x
Maximum Latency: 0x00
Minimum Grant: 0x00
Interrupt pin: 0x01 (pin A)
Interrupt line: 0x0a

  Capability register at 0x80
type: 0x01 (Power Management, rev. 1.0)
  Capability register at 0x90
type: 0x05 (MSI)
  Capability register at 0xc0
type: 0x10 (PCI Express)

  PCI Power Management Capabilities Register
Capabilities register: 0x4003
  Version: 1.2
  PME# clock: off
  Device specific initialization: off
  3.3V auxiliary current: self-powered
  D1 power management state support: off
  D2 power management state support: off
  PME# support: 0x08
Control/status register: 0x0008
  Power state: D0
  PCI Express reserved: off
  No soft reset: on
  PME# assertion disabled
  PME# status: off

  PCI Express Capabilities Register
Capability version: 2
Device type: Legacy PCI Express Endpoint device
Interrupt Message Number: 0

  Device-dependent header:
0x40: 0x 0x 0x 0x
0x50: 0x 0x 0x 0x
0x60: 0x 0x 0x 0x
0x70: 0x 0x 0x 0x
0x80: 0x40039001 0x0008 0x 0x
0x90: 0x0086c005 0x 0x 0x
0xa0: 0x 0x 0x 0x
0xb0: 0xc011 0x 0x0008 0x
0xc0: 0x00120010 0x10008102 0x00092810 0x0041a023
0xd0: 0x00220040 0x 0x 0x
0xe0: 0x 0x00140392 0x 0x000e
0xf0: 0x00010003 0x 0x 0x

I hope the above includes whatever you'd be looking for!

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
> [...].  I just today picked up a 5-port PCIe SATA card and tried it.

In case it matters to anyone, the card is identified, on the box and on
a sticker on the card itself, as a UGT-ST655, and it comes from Vantec.
As one of my messages quoted autoconf as saying, it shows up as vendor
0x197b product 0x0585.

/~\ The ASCII     Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
>> If you're plugging in a card and it isn't seen, I'd check the BIOS
>> and look for any pcie settings it might have.
> I suspect it's worse than that in this case; see mlelstv's mail,
> explaining that there are only four lanes available total, so my
> "x16" slot is [actually x1 electrically]

I tried the card in another machine, in a "x16" slot.

It appears _that_ marking is honest in that respect:

ppb1 at pci0 dev 2 function 0: vendor 0x1022 product 0x9603 (rev. 0x00)
ppb1: PCI Express capability version 2  x16 @ 5.0GT/s
ppb1: link is x2 @ 5.0GT/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ahcisata0 at pci2 dev 0 function 0: vendor 0x197b product 0x0585
ahcisata0: interrupting at ioapic0 pin 18
ahcisata0: AHCI revision 0x10301, 5 ports, 32 command slots, features 0xef33e080
atabus0 at ahcisata0 channel 0
atabus1 at ahcisata0 channel 1
atabus2 at ahcisata0 channel 2
atabus3 at ahcisata0 channel 3
atabus4 at ahcisata0 channel 4
...
ahcisata0 port 2: device present, speed: 6.0Gb/s
wd0 at atabus2 drive 0: 
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: HPA enabled, protected area present
wd0: limit not raised (not enabled in configuration)
wd0: 1863 GB, 3876021 cyl, 16 head, 63 sec, 512 bytes/sect x 3907029168 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0: block sizes: medium 4096, interface 512, alignment 0
wd0(ahcisata0:2:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA)

Note (a) the ppb1: line says x16 (though it also says "link is x2"; if
I'm reading the code right this might mean the card actually requires
only x2, not x4, though it's x4 mechanically) and (b) ahcisata0 is
found, including the drive I had connected as a test.  So the card
appears fine.  I think mlelstv's explanation is (not surprisingly, in
view of whom it's from) right about why it doesn't work on the ASRock.

I also tried it in a "x8" slot, in a Dell PowerEdge 840.  This one is
more confusing, at least to me:

pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x8086 product 0x2778 (rev. 0x00)
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x2779 (rev. 0x00)
ppb0: PCI Express capability version 1  x8 @ 2.5GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
vendor 0x197b product 0x0585 (SATA mass storage, interface 0x01) at pci1 dev 0 function 0 not configured

It says it actually is x8, but then...not configured?  This is using
the same kernel as the above (each one was using my PXE boot setup,
just on different hardware), so I really want to figure out why the
same hardware - reported as same vendor/product, even recognized as
"SATA mass storage" - would fail to match on the second machine.  I
took that machine out of service, but I can't now recall why; maybe
it's very slightly broken, returning the right vendor/product numbers
but trash in other fields?  I'll have to investigate.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: How does PCIe appear to the host?

2024-10-03 Thread Mouse
First, my thanks to everyone; while the news is not great from the
perspective of getting things working, it has greatly improved my
understanding of PCIe, so in that sense it was a success.  You people
are a marvelous resource!

> It's been my experience that pcie busses show up as pci busses from
> the software perspective

My experience is _very_ limited - I've used PCIe in all of one other
case - but that's how it worked in that case.

> If you're plugging in a card and it isn't seen, I'd check the BIOS
> and look for any pcie settings it might have.

I suspect it's worse than that in this case; see mlelstv's mail,
explaining that there are only four lanes available total, so my "x16"
slot is x16 only mechanically - it seems to me it would have been more
honest of ASRock to use a x1 socket that's open at the end, so that any
size card fits mechanically but the slot doesn't pretend to be more
than it is.

> You don't say which version of NetBSD you're running, but I've used
> pcie as early as NetBSD-3 and quite extensively on NetBSD-5.

I didn't?  *check*  Ah, yes, just "relatively old".  It's my (somewhat
mutant) 5.2, and, yes, I have pcictl.

> You might experiment by booting without ACPI to see how the PCI
> busses probe in that case.

I may try that.  But I suspect the PCI(e) busses are doing the best
they can.  The ppbs are visible with pcictl list

000:28:0: Intel product 0x0f48 (PCI bridge, revision 0x0e)
000:28:1: Intel product 0x0f4a (PCI bridge, revision 0x0e)
000:28:2: Intel product 0x0f4c (PCI bridge, revision 0x0e)
000:28:3: Intel product 0x0f4e (PCI bridge, revision 0x0e)

which matches what autoconf reports at boot time.  pcictl dump on the
ppbs reports a lot of stuff, but nothing interesting that the
backported dump of the XCAP/LCAP values doesn't also show; indeed, it
doesn't report the max width value from PCIE_LCAP as far as I can see.

Here's what ACPI has to say.  As I mentioned, dmesg is identical with
or without the card.

ACPI Warning (tbfadt-0327): FADT (revision 5) is longer than ACPI 2.0 version, truncating length 0x10C to 0xF4 [20080321]
[...]
acpi0 at mainbus0: Intel ACPICA 20080321
acpi0: X/RSDT: OemId , AslId 
acpi0: SCI interrupting at int 9
acpi0: fixed-feature power button present
timecounter: Timecounter "ACPI-Safe" frequency 3579545 Hz quality 900 ACPI-Safe 24-bit timer
hpet0 at acpi0 (HPET, PNP0103-0): mem 0xfed0-0xfed003ff
timecounter: Timecounter "hpet0" frequency 14318179 Hz quality 2000
FWHD (INT0800) at acpi0 not configured
attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0
LPTE (PNP0400) at acpi0 not configured
UAR1 (PNP0501) at acpi0 not configured
ADMA (DMA0F28) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C): ACPI Power Button
acpibut1 at acpi0 (SLPB, PNP0C0E): ACPI Sleep Button
BTH0 (BCM2E1A) at acpi0 not configured
GPS0 (BCM4752) at acpi0 not configured
CAM0 (INTCF0B) at acpi0 not configured
CAM1 (INTCF1A) at acpi0 not configured
STRA (INTCF1C) at acpi0 not configured
SHUB (SMO91D0) at acpi0 not configured
FAN0 (PNP0C0B) at acpi0 not configured
acpitz0 at acpi0 (TZ01): active cooling level 0: 50.0C critical 90.0C hot 85.0C passive 26.8C

That first line looks possibly worrisome; if I knew ACPI I'd have a
better idea whether it's anything to be concerned over.

> If the PCIE card you're using is working properly, it should be seen
> by the BIOS as additional SATA ports.  If you have a drive plugged
> into it when you boot the BIOS, you might even be rewarded with a
> listing of the make and model of the drive you have connected.  If
> you see that, you then know it's not hardware trouble.

Hard to tell.  The BIOS setup facility is horrible; it was designed by
someone who thinks GUI glitz is more important than functionality.  But
I *think* I didn't get any additional drives listed (not surprising in
view of the above).

I'll experiment a bit more.  I think I have another machine with a PCIe
slot (I once had a quad wm in there, but I can't recall whether it was
x1 or x4 or what); if I can find it I may try the SATA card there.  If
I can find _it_, I may also try the quad wm in the ASRock.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


How does PCIe appear to the host?

2024-10-02 Thread Mouse
How does PCIe differ from PCI from the CPU's point of view?  I'm
running into an issue and it seems to me this is important.

I've been having hardware (partially) fail on me recently, which is
breaking my backups.  In particular, I'm having trouble finding a
machine to connect the backup disks to - they're SATA, and I don't have
very many machines with SATA, and some of those SATA ports appear to be
broken (one machine, for example, has six, of which I've been able to
make only two work).

One of these machines is an ASRock Q1900M.  It has only two SATA ports
onboard; it has two PCIe x1 slots and a PCIe x16 slot.  I just today
picked up a 5-port PCIe SATA card and tried it.

The reason I'm asking about PCIe is that, as far as I can tell from the
host, it isn't there at all.  While this is a relatively old kernel,
I'd expect at least a "not configured" line - but dmesg is identical
between a boot with it and a boot without it.  So now I'm wondering
whether I've got a DOA card, or a duff slot, or I need to backport
something, or whether this is somehow expected, or what.

I've already backported the printout of PCIe capability in ppb.c.  The
ppbs report as

ppb0: PCI Express capability version 2  x1 @ 5.0GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci0 dev 28 function 1: vendor 0x8086 product 0x0f4a (rev. 0x0e)
ppb1: PCI Express capability version 2  x1 @ 5.0GT/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci0 dev 28 function 2: vendor 0x8086 product 0x0f4c (rev. 0x0e)
ppb2: PCI Express capability version 2  x1 @ 5.0GT/s
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 28 function 3: vendor 0x8086 product 0x0f4e (rev. 0x0e)
ppb3: PCI Express capability version 2  x1 @ 5.0GT/s

I note a possible conflict between the "x1" and the presence of a x16
slot; that 1 is coming from the PCIE_LCAP_MAX_WIDTH bits in PCIE_LCAP,
which makes me wonder whether something needs configuring to run the
x16 slot at more than x1.  The card does say it needs a x4 or higher
slot to work, so if the x16 slot is running x1 (is that even possible?)
that might be responsible.
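
For concreteness, here is roughly where that "x1" comes from - a sketch
modeled on the backported printout, using register and macro names from
a newer sys/dev/pci/pcireg.h (treat the exact spellings as assumptions;
older trees differ):

#include <sys/types.h>
#include <dev/pci/pcireg.h>
#include <dev/pci/pcivar.h>

/* Return the advertised maximum link width of a PCIe device or port,
 * i.e. the PCIE_LCAP_MAX_WIDTH field (bits 9:4 of the Link
 * Capabilities register); 0 if there is no PCIe capability. */
static unsigned
pcie_max_width(pci_chipset_tag_t pc, pcitag_t tag)
{
	int off;
	pcireg_t reg, lcap;

	if (!pci_get_capability(pc, tag, PCI_CAP_PCIEXPRESS, &off, &reg))
		return 0;
	lcap = pci_conf_read(pc, tag, off + PCIE_LCAP);
	return __SHIFTOUT(lcap, PCIE_LCAP_MAX_WIDTH);
}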

Any thoughts?  Any pointers to where I might usefully look?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: vio9p vs. GENERIC.local vs. XEN3_DOM[0U]

2024-08-12 Thread Mouse
>> If virtio were declared normally in the kernels that provide it and
>> declared as valid but specifically absent in XEN3_DOM* kernels?
>> Then I think that's what I'd want (to my limited understanding, this
>> is close to what "no virtio" does at present).

> A fair point, but are you suggesting that every bus that could ever
> exist be declared and all other kernels have "no", as a general
> approach?

Well, every bus that you'd want to support this feature for.

Alternatively, perhaps add some declaration that says "this is a valid
bus name" without declaring any instances of it, then do the
silent-drop thing for any attachment which is at a bus declared as
valid but without instances.  Then all - well, most - configs would
include all-buses.conf or some such to get those declarations, then
declare normally the buses they want to actually have.

Of course, then you have people wondering why the new device attachment
line they added isn't working.  But you'll have that potential for
_any_ design with a "silently drop this line" semantic.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: vio9p vs. GENERIC.local vs. XEN3_DOM[0U]

2024-08-11 Thread Mouse
> The right level of abstraction is to do something that says

>   if there is a virtio bus, add vio9p* at virtio*

> and this is true of pretty much anything that attaches to a bus that
> may or may not be present.

> I wonder if there are good reasons to avoid "just skip lines that
> refer to a bus that doesn't exist".

My answer is, error-checking.  If I, say, typo "pci" as "cpi" in

mydev* at cpi?

I'd want an error rather than having the line silently ignored.  (That
particular typo is not all that plausible.  It's just an example.)

Now, if virtio were specifically declared as "this name is valid but
may or may not be present"?  I'm on the fence.

If virtio were declared normally in the kernels that provide it and
declared as valid but specifically absent in XEN3_DOM* kernels?  Then I
think that's what I'd want (to my limited understanding, this is close
to what "no virtio" does at present).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Moving memory in VM?

2024-07-19 Thread Mouse
>> Is there any way to move memory within a process?  That is, [...]
> mremap?  Or what am I missing?

No, thank you; it was I who was missing something (specifically, the
existence of mremap - not something I've run into before).  I was
hoping it would be that simple!
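
For the archives, a minimal sketch of how that looks - assuming
NetBSD's five-argument mremap(2) (its signature differs from Linux's)
and assuming a MAP_FIXED relocation replaces the target range
mmap()-style, as my original message asked for:

#include <sys/mman.h>
#include <err.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	size_t len = 16 * 4096;	/* assumes 4k pages, for the example */
	char *a, *c;

	/* [A..B): existing anonymous pages with live contents */
	a = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (a == MAP_FAILED)
		err(1, "mmap A");
	memset(a, 0x5a, len);

	/* pick a destination range C... */
	c = mmap(NULL, len, PROT_NONE, MAP_ANON | MAP_PRIVATE, -1, 0);
	if (c == MAP_FAILED)
		err(1, "mmap C");

	/* ...and ask the kernel to move the pages there, no byte copy */
	if (mremap(a, len, c, len, MAP_FIXED) == MAP_FAILED)
		err(1, "mremap");

	printf("moved %p -> %p, c[0] = %#x\n", (void *)a, (void *)c,
	    (unsigned char)c[0]);
	return 0;
}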

/~\ The ASCII     Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Moving memory in VM?

2024-07-19 Thread Mouse
Is there any way to move memory within a process?  That is, I have a
bunch of valid pages at addresses [A..B) and I want to move those pages
to [C..C+(B-A)).

The principal use case I have at the moment has the initial mapping
mapped MAP_ANON, with C fixed and mmap()-style semantics for anything
that was in the target range to start with.  It's OK if the [A..B)
range is unmapped in the process, and it's OK to require that the old
and new address ranges not overlap.  (I'm trying to implement an
alternative memory allocator and want realloc() for multi-page blocks
to be faster than copying all the bytes.)

/~\ The ASCII     Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: strange zfs size allocation data

2024-07-08 Thread Mouse
>> Is bup zfs-specific?
> No, it is a general backup program.  I just happen to have sources
> for it on zfs.  Which people tell me is a great filesystem and is now
> not odd on NetBSD

Well, great for some use cases.  I have, for example, seen it said that
it's ludicrously RAM-hungry (as in, you need multiple gigs of RAM or
you shouldn't even think about using it).  This is fine if you have a
machine so overspecced that you have that much RAM to burn; it's less
great if you're looking for something more general-purpose.  (To be
fair, it also has upsides; for example, I think I've seen it described
as having the ability to add and remove partitions live, and as keeping
integrity checksums.)

>> Because, if you're not doing something filesystem-specific, I
>> actually think you will have trouble even _defining_ what "100%
>> right" is for this test, since everything about sparseness, right
>> down to whether it's even a thing, is filesystem-dependent.

> True.  The point is to try to verify that the backup program, when
> restoring a sparse file, writes it in such a way that the normal
> implementation of sparse files works, meaning results in a file
> without blocks storing all the zeros.

Fair point.  You might want to have the test verify that the filesystem
in use does support sparseness in the form you're looking for before
doing the rest of the testing.

> What you are missing, and everybody else too, is that the fact that
> this is theoretically impossible is irrelevant to it being useful in
> the real world, to detect regressions, even if it also occasionally
> detects bizarre behavior.

I, at least, haven't been missing that.  When you talk about getting a
test "100% right", though, I read that as including "...even in
relatively unlikely circumstances".

Running on msdosfs strikes me as unlikely enough to not care about.
tmpfs, though, is relatively plausible as a filesystem for tests.

> A better test would be 'fuse-sparsetest' that makes metadata
> available for inspection later about the writes it sees.  But that's
> hard to write.

You could get much of that information by ktracing and looking for the
relevant calls - {,p}write{,v} and lseek seem to me to be the most
likely candidates.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: strange zfs size allocation data

2024-07-07 Thread Mouse
> This is a test case, to see if backing up and restoring a sparse file
> results in a sparse file.  I realize that this probably requires a
> logging fuse driver and a lot of complexity to do 100% right.

Is bup zfs-specific?  Because, if you're not doing something
filesystem-specific, I actually think you will have trouble even
_defining_ what "100% right" is for this test, since everything about
sparseness, right down to whether it's even a thing, is
filesystem-dependent.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: NetBSD-10.0/i386 spurious SIGSEGV

2024-06-11 Thread Mouse
>> pid 853.853 (nagios): signal 11 code=1 (trap 6) @eip 0xbb61057a addr 0x7513f5a9 error=14
> Anyone has ideas of things to investigate?

I don't have anything specific to suggest; I don't work with either
10.* or Xen myself.  But...

> I am about to upgrade the offending domU to amd64, in order to work
> around the problem.  If that works (and I hope it will), I will have
> no way left to test for this bug.

...if you have the space and can set up a test environment you don't
mind giving a copy of to others, it might be useful to help someone
else track it down.  Ideally, that would be a copy of the whole dom0
and domU in question, but I recognize that might involve a prohibitive
amount of space - though if you can strip the test setup down to
essentials, it'll both save on space and reduce what people have to
look at.

But, of course, that depends on you having the resources (disk space,
time, and energy/motivation) to do that.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: NetBSD-10.0/i386 spurious SIGSEGV

2024-06-09 Thread Mouse
>> I have seen many crashes on system call returns.  [...]

> I would suggest printing the siginfo, but apparently our gdb doesn't
> support it (so I filed PR 58325): [...]

Okay, this is strictly a debugging workaround: how about building a
kernel with code added so that, whenever a SIGSEGV is delivered, the
siginfo is printed on the console?  It would at least let you get the
information, and I suspect SEGVs are rare enough you wouldn't have to
sift through too many false positives.
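
To sketch what I mean (assuming the delivery path goes through
something like trapsignal() in sys/kern/kern_sig.c; the ksiginfo_t
field names here are from memory, so check them against your tree):

	/* debugging hack: near where the ksiginfo_t is posted to the
	 * process, dump SIGSEGV details to the console */
	if (ksi->ksi_signo == SIGSEGV) {
		printf("pid %d (%s): SIGSEGV code=%d addr=%p trap=%d\n",
		    p->p_pid, p->p_comm, ksi->ksi_code,
		    ksi->ksi_addr, ksi->ksi_trap);
	}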

It does, though, assume you're comfortable adding code to your kernel
and rebuilding it.  (If trying to build a new kernel SEGVs, maybe
cross-build it?)

/~\ The ASCII     Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: NetBSD-10.0/i386 spurious SIGSEGV

2024-06-08 Thread Mouse
> After upgrading i386 XEN3PAE_DOMU to NetBSD 10.0, various daemons on
> multiple machines get SIGSEGV at places I could not figure any reason
> why it happens.  [...]

> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0xbb610579 in __gettimeofday50 () from /lib/libc.so.12
> (gdb) bt
> #0  0xbb610579 in __gettimeofday50 () from /lib/libc.so.12
> #1  0xbb60ca82 in __time50 (t=t@entry=0xbf7fde88)
> at /usr/src/lib/libc/gen/time.c:52
> #2  0x0808afdd in update_check_stats (check_type=3, check_time=1717878817)
> at utils.c:3015

First thing I'd look at is the userland instruction(s) around the crash
point, maybe look at instructions starting at 0xbb610480 or something
and then disassemble forwards looking for 0xbb610579.  In particular,
I'd be interested in whether it's a store instruction that failed or
whether this happened during a syscall trap.

Are all the failures in __gettimeofday50?  All in trap-to-the-kernel
calls?

You say "multiple machines"; are those multiple domUs on a single dom0,
or are they spread across multiple underlying hardware machines?  If
the latter, how similar are those underlying machines?  I'm wondering
if perhaps something is broken in a subtle way such that it manifests
on only certain hardware (I'm talking about something along the lines
of "this tickles erratum #2188 in stepping 478 of Intel CPUs from the
Forest Lawn family").

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: TCP vs urgent data [was Re: poll(): IN/OUT vs {RD,WR}NORM]

2024-05-29 Thread Mouse
>> I might rip out the OOB stuff just to find and fix anything trying
>> to use it, though.
> I think ripping it out would be the right thing.  And I suspect very
> little would notice.

I just did a first pass, doing

find . -type f -print0 | xargs -0 mcgrep -H -l MSG_OOB

in /usr/src.

Most of it is no surprise, or is irrelevant here: telnet and ftp,
documentation, the kernel support that backends it, sys/compat, those
are expected.

But I got one surprise: rlogin{,d}.  And I had a quick look at the
code - it actually _uses_ it.  It is not a case where completely
ignoring URG will work.  (It actually uses it as out-of-band data, too.
You'd almost think it came from the same people who tried to turn the
urgent pointer into out-of-band data in the first place.)

Fortunately or unfortunately, I don't care about rlogin.  I would ditch
it when eliminating MSG_OOB.  In theory, eliminating MSG_OOB is wrong,
because TCP may not be the only protocol that uses it.  My sweep found
sys/netiso/tp_usrreq.c, and searching for SS_RCVATMARK under sys/
finds hits in netiso as well as netinet.  But I care about netiso about
as much as I care about rlogin; I certainly don't mind losing it for
long enough to find everything using TCP "OOB".

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: TCP vs urgent data [was Re: poll(): IN/OUT vs {RD,WR}NORM]

2024-05-29 Thread Mouse
> But reading RFC 959, there is no mention of using urgent data under
> any circumstances.

No _explicit_ mention.  It's there by reference.

> What that RFC says about aborting is:

> [...]
>2. User system sends the Telnet "Synch" signal.

This involves setting URG.  Read the telnet spec's description of the
SYNCH operation (the top of page 9 of RFC854 is a good starting point).

See also SO_OOBINLINE, SIOCATMARK, and SS_RCVATMARK.  I am sorely
tempted to try to rip out the OOB botch and design a socket interface
to the urgent pointer that isn't so badly broken, but I doubt I'd
actually find any use for the latter.  I might rip out the OOB stuff
just to find and fix anything trying to use it, though.
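
For reference, the receive side of the botch looks roughly like this -
a hedged sketch of the classic BSD pattern (approximately what the
telnet code does), written from memory:

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* telnet-style SYNCH handling: discard normal data up to the urgent
 * mark, then pull the single byte Berkeley bent into "OOB" data */
static void
drain_to_mark(int s)
{
	char buf[512];
	int atmark;

	for (;;) {
		if (ioctl(s, SIOCATMARK, &atmark) == -1 || atmark)
			break;
		/* read() will not return data spanning the mark */
		if (read(s, buf, sizeof(buf)) <= 0)
			break;
	}
	(void)recv(s, buf, 1, MSG_OOB);	/* fails if SO_OOBINLINE is set */
}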

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


TCP vs urgent data [was Re: poll(): IN/OUT vs {RD,WR}NORM]

2024-05-29 Thread Mouse
Should we maybe move this to tech-net?  It's no longer about poll().

>> I question whether it actually works except by accident; see RFC
>> 6093.
> I hadn't seen that one before,

Neither had I until Johnny Billquist mentioned it upthread.  (I tend to
share your reaction to the modern IETF, though I have additional
reasons.)

>> But the facility it provides is of little-to-no use.  I can't recall
>> anything other than TELNET that actually uses it,
> TELNET and those protocols based upon it (SMTP and FTP command at
> least).

FTP command, yes.  SMTP I'm moderately sure doesn't do TELNET in any
meaningful sense; for example, I'm fairly sure octet 0xff is not
special.  I find no mention of TELNET in 5321.

> SMTP has no actual use for urgent data, and never sends any, but FTP
> can in some circumstances I believe (very ancient unreliable memory).

Yes.  It should, according to the spec, be done when sending an ABOR to
abort a transfer in progress.  But, unlike TELNET's specification that
data is to be dropped while looking for IAC DM, the urgent bit can be
completely ignored by an FTP server which is capable of paying
attention to the control channel while a data transfer is in progress.

>> then botched it further by pointing the urgent sequence number to
>> the wrong place,
> In fairness, when that was done, it wasn't clear it was wrong - that
> all long predated anyone even being aware that there were two
> different meanings in the TCP spec, people just used whichever of
> them was most convenient (in terms of how it was expressed, not which
> is easier to implement) and ignored the other completely.   That's
> why it took decades to get fixed - no-one knew that the spec was
> broken for a long time.

So...I guess next to nothing depended on it even then, or someone would
have noticed the interoperability fail sooner than decades.

> Further, if used properly, it really doesn't matter much, the
> application is intended to recognise the urgent data by its content
> in the data stream, all the U bit (& urgent pointer) should be doing
> is giving it a boot up the read stream to suggest that it should
> consume more quickly than it otherwise would.

Right.  But...

> Whether that indication stops one byte earlier or later should not
> really matter.

That depends.  Consider TELNET, which is defined to drop data while
searching for IAC DM.  If the sender considers the urgent pointer to
point _after_ the last urgent octet but the receiver considers it to
point _to_ the last urgent octet, the receiver will get the IAC DM and
notice the urgent pointer points past it and continue reading and
dropping, looking for another IAC DM, dropping at least one data octet
the sender didn't expect.

> The text in that RFC about multiple urgent sequences also misses that
> I think -

I thought that was probably there for clarity, clarifying what
logically follows from the rest.

> all that matters is that as long as there is urgent data coming, the
> application should be aware of that and modify its behaviour to read
> more rapidly than it otherwise might (if it never delays reading from
> the network, always receives & processes packets as soon as they
> arrive, which for example, systems which do remote end echo need to
> do) then it doesn't need to pay attention to the U bit at all).

Well, there are correctness issues in some cases.  For example, in
TELNET's case, it is defined to drop data while searching for the IAC DM
that makes up part of a synch; ignoring the urgent bit means that
dropping won't happen.  (Does that matter in practice?  Probably not,
especially given how little TELNET is used outside walled gardens.  But
it still is a correctness issue.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-28 Thread Mouse
>>> However, the urgent pointer is close to useless in today's network,
>>> in that there are few-to-no use cases that it is actually useful
>>> for.
> That's probably correct too.  It is however still used (and still
> works) in telnet - though that is not a frequently used application
> any more.

I question whether it actually works except by accident; see RFC 6093.

> [That's where the off by one occurred, there were two references to
> it, one suggested that the urgent pointer would reference the final
> byte of what is considered urgent, the other that it would reference
> one beyond that, that is, the first byte beyond the urgent data.
> This was corrected in the Hosts Requirements RFCs, somewhere in the
> mid 80's if I remember them, roughly.]

But only a few implementors paid any attention, it appears.

> That is all very simple, and works very well, particularly on high
> latency or lossy networks, as long as you're not expecting "urgent"
> to mean "out of band" or "arrive quickly" or anything else like that.

But the facility it provides is of little-to-no use.  I can't recall
anything other than TELNET that actually uses it, though I am by no
stretch familiar with more than some of the commonest protocols out
there.

Furthermore, given that probably the most popular API to TCP, sockets,
botched it by trying to turn it into an out-of-band data stream, then
botched it further by pointing the urgent sequence number to the wrong
place, I'd say it is questionable whether it is good for _anything_ any
longer.

> If an application needs a mechanism like this, it works well.

That's a bit like saying that car hand crank starter handles are useful
if you need them: strictly true, but to a first and even second
approximation both the statement and the thing stated about are
irrelevant to everyone.

Also, it's true only provided you don't use sockets for your API (or
fixed sockets - has anyone done a TCP socket interface that exposes the
urgent popinter _properly_?), and provided your and the peer's
implementations agree on which sequence number goes in the urgent
field.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-27 Thread Mouse
>> Where do we attach 3 priority levels to data?
> [I]n the context of poll itself, it's undefined.  But it's easy to
> think that the TCP urgent data would be something usable in this
> context.  But as you note, the urgent data is a somewhat broken thing
> that noone ever really figured out how it was meant to be used or
> anything about it at all.

TCP's urgent pointer is well defined.  It is not, however, an
out-of-band data stream, nor, despite Berkeley's attempts, can it
really be twisted and bent into one, unless you are on a network which
is high bandwidth, low latency, and low loss (as compared to the
"out-of-band" data rate).  Even then, the receiving process has to
handle data promptly.  Which, probably not coincidentally, describes
Berkeley's network and most of their network programs at the time.

However, the urgent pointer is close to useless in today's network, in
that there are few-to-no use cases that it is actually useful for.

> It's not really even oob.

No, and it never has been, despite Berkeley's hallucinations.

> But poll isn't really getting into any details about this.  Just that
> if you have some sort of [data stream], where you can assign multiple
> priorities to the data, then poll can make use of it.

For a few priorities, yes.  It's a poor design in that it doesn't
provide any good way to handle more than two or maybe three different
priorities.

> I have no idea if any such device or file ever have had such a
> distinction.

Possibly nobody except System V (or possibly its direct ancestors) ever
did.  My impression is that it's a STREAMS thing, but that's a fuzzy
impression, mostly coming from manpage notes ("The distinction between
some of the fields in the events and revents bitmasks is really not
useful without STREAMS.").

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-27 Thread Mouse
>>  POLLPRIHigh priority data may be read without blocking.
>>  POLLRDBAND Priority data may be read without blocking.
>>  POLLRDNORM Normal data may be read without blocking.

> Is this related to the "oob data" scheme in TCP (which is a hack that
> doesn't work)?

I really wish BSD hadn't tried to turn the urgent pointer into an
out-of-band data stream, because, as you say, it doesn't work for that.

It doesn't really work very well even as an API to TCP's urgent
pointer.

> Where do we attach 3 priority levels to data?

That's part of what I was wondering.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-27 Thread Mouse
> Also, I suspect mouse was thinking of the TCP URG concept, and not
> PUSH when he wrote what he did, but I don't know for sure.

Ouch.  Yes, you are entirely correct in that regard.  Total braino on
my part.  My apologies.

/~\ The ASCII     Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: poll(): IN/OUT vs {RD,WR}NORM

2024-05-27 Thread Mouse
>> I can understand treating POLLWRNORM as identical to POLLOUT.  But
>> why the distinction between POLLRDNORM and POLLIN?  Might anyone
>> know the reason and be willing to explain?
> I'd hazard a guess that you'll likely not get an explanation.

Quite possibly not, but I figured it was still worth asking.

>> ...  I'd still be curious where it came from.
> Those answers are CVS:

Not where it came from in the sense of who committed it or what it
looked like at the time, but rather where that person got the various
distinctions from.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


poll(): IN/OUT vs {RD,WR}NORM

2024-05-27 Thread Mouse
In sys/sys/poll.h, I see (in 1.4T, -current according to cvsweb, and
everything I've checked in between):

#define POLLIN  0x0001
#define POLLPRI 0x0002
#define POLLOUT 0x0004
#define POLLRDNORM  0x0040
#define POLLWRNORM  POLLOUT
#define POLLRDBAND  0x0080
#define POLLWRBAND  0x0100

I can understand treating POLLWRNORM as identical to POLLOUT.  But why
the distinction between POLLRDNORM and POLLIN?  Might anyone know the
reason and be willing to explain?

Not that it matters tremendously.  But I'm curious, because it
indicates there's something I don't understand somewhere there.  The
wording is a bit confusing.  In -current's poll(2), POLLIN is said to
be synonymous with POLLRDNORM (and POLLOUT with POLLWRNORM); in 1.4T
and 5.2, there is confusing wording about "high priority data" and
"data with a non-zero priority", which may or may not be talking about
the same thing and may or may not map onto other concepts such as TCP's
push semantics, and the wording is different in the two directions.
-current's manpage's BUGS entry implies, to me, that this distinction
is something of a historical accident that should be fixed, but, even
if that reading is correct, I'd still be curious where it came from.
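
A quick way to see what any particular kernel does with the bits is a
trivial tester like this one (run it and type a line at it):

#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct pollfd pfd;

	pfd.fd = STDIN_FILENO;
	pfd.events = POLLIN | POLLRDNORM | POLLPRI | POLLRDBAND;
	if (poll(&pfd, 1, -1) > 0)
		printf("revents: IN=%d RDNORM=%d PRI=%d RDBAND=%d\n",
		    (pfd.revents & POLLIN) != 0,
		    (pfd.revents & POLLRDNORM) != 0,
		    (pfd.revents & POLLPRI) != 0,
		    (pfd.revents & POLLRDBAND) != 0);
	return 0;
}

If POLLIN and POLLRDNORM ever diverge on some descriptor type, that
would be exactly the distinction I'm asking about.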

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Using mmap(2) in sort(1) instead of temp files

2024-04-03 Thread Mouse
Why is this on tech-kern?  It seems to me it belongs on tech-userlevel.

> I'm trying to speed up sort(1) by using mmap(2) instead of temp
> files.

If you're going to sort in RAM, why bother with temporaries at all?
Just slurp it all in and sort it in core.

But.

Part of the point of using temp files, it seems to me, is to be able to
sort datasets larger than will fit in memory.

Unless NetBSD is prepared to completely desupport small machines like
the MicroVAX-II or shark, I think this might be a misguided thing to
do.  Given the way swap works, I suspect it will work better to use
temp files instead of mmap()ed memory to sort datasets larger than will
fit in RAM, even if VM is available.  Furthermore, VM can be limited;
sorting input bigger than 3G on i386 shouldn't break (and, from a
usability standpoint, shouldn't even require any special options).
Even on 64-bit, VM can be comparatively small; on a 9.1 amd64 machine
at work, proc.$$.rlimit.datasize.hard is only 8 gigs.

At the very least, I would strongly recommend adding an option to
disable this, to continue to use real files for temporaries.

> ftmp() (see code below) is called in the sort functions to create and
> return a temp file.  mkstemp() is used to create the temp file, then
> the file pointer (returned by fdopen) is returned to the sort
> functions for use.  I'm trying to understand where and how mmap
> should come into the picture here, and how to implement this feature.
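
(For readers without the tree handy, the shape being described is
roughly this - a hypothetical sketch, not the actual sort(1) code:)

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <err.h>

/* create an anonymous temp file and hand back a stdio stream on it */
static FILE *
ftmp(void)
{
	char path[] = "/tmp/sort.XXXXXXXXXX";
	int fd;
	FILE *fp;

	if ((fd = mkstemp(path)) == -1)
		err(2, "mkstemp");
	(void)unlink(path);	/* lives only as long as the stream */
	if ((fp = fdopen(fd, "w+")) == NULL)
		err(2, "fdopen");
	return fp;
}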

I think the biggest issue you'll have (aside the ones raised above) is
that an mmap()ed memory block has a fixed size, set at map time.  Files
are sized much more dynamically.  I suspect you'll end up
(re)implementing a ramfs (a simplified one, because the application
needs are relatively simple, but still.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Fullfs file system

2024-03-22 Thread Mouse
> Thanks Mouse and Martin. I got past that error.  But now I'm running
> into another problem - unable to determine when write op occurs (to
> be able to return ENOSPC error).

> I want to return ENOSPC error whenever write occurs.  Which variable
> contains this info?  I'm confused which structs contain what
> information.

I don't know.  What I'd do in this case is to trace down through what
you call (VCALL in this case) until you get to a point where it
determines what operation to call, then see what it uses to select
the vnode operation to be performed.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Fullfs file system

2024-03-20 Thread Mouse
> Hi, when I try to run `mount_full /root/xyz/a /root/xyz/b`
> I get the following error:
> `mount_full: /root/xyz/a on /root/xyz/b: Operation
> not supported by device`
> Any tips for debugging this?

Add printfs in the kernel codepaths?  That's what I'd start with.
(Well, actually, I'd start by rereading the code, but I assume you've
already done that.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Perceivable time differences [was Re: PSA: Clock drift and pkgin]

2023-12-30 Thread Mouse
> ? If I remember right, anything less than 200ms is immediate response
> for a human brain.

"Response"?  For some purposes, it is.  But under the right conditions
humans can easily discern time deltas in the sub-200ms range.

I just did a little psychoacoustics experiment on myself.

First, I generated (44.1kHz) soundfiles containing two single-sample
ticks separated by N samples for N being 1, 101, 201, 401, 801, and
going up by 800 from there to 6401, with a second of silence before and
after (see notes below for the commands used):

for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
do
(count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
echo 0 128 0 128
count from 0 to $d | sed -e "s/.*/0 0 0 0/"
echo 0 128 0 128
count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
) | code-to-char > zz.$d
done

I don't know stock NetBSD analogs for count and code-to-char.  count,
as used here, just counts as the command line indicates; given what
count's output is piped into, the details don't matter much.
code-to-char converts numbers 0..255 into single bytes with the same
values, with non-digits ignored except that they serve to separate
numbers.  (The time delta between the beginnings of the two ticks is of
course one more than the number of samples between the two ticks.)
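
For anyone who wants to replicate the experiment without my tools, a
stand-in for code-to-char might look like this (hypothetical; my own
version differs in details), and count is just a line counter
(something like seq(1) or jot(1) would do):

#include <stdio.h>

/* read decimal numbers 0..255, with non-digits acting only as
 * separators, and emit one byte per number */
int
main(void)
{
	int c, have = 0, v = 0;

	while ((c = getchar()) != EOF) {
		if (c >= '0' && c <= '9') {
			v = v * 10 + (c - '0');
			have = 1;
		} else if (have) {
			putchar(v & 0xff);
			v = have = 0;
		}
	}
	if (have)
		putchar(v & 0xff);
	return 0;
}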

After listening to them, I picked the 800 and 1600 files and did the
test.  I grabbed 128 bits from /dev/urandom and used them to play,
randomly, either one file or the other, letting me guess which one it
was in each case:

dd if=/dev/urandom bs=1 count=16 |
  char-to-code |
  cvtbase -m8 d b |
  sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
  tr \  \\n |
  ( exec 3>zz.list 4>zz.guess 5</dev/tty
    while read n
    do echo $n 1>&3
audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
skipcat 0 1 0<&5 1>&4
done
  )

char-to-code is the inverse of code-to-char: for each byte of input, it
produces one line of output containing the ASCII decimal for that
byte's value, 0..255.  cvtbase -m8 d b converts decimal to binary,
generating a minimum of 8 "digits" (bits) of output for each input
number.  skipcat, as used here, has the I/O behaviour of "dd bs=1
count=1" but without the blather on stderr: it skips no bytes and
copies one byte, then exits.  (The use of /dev/urandom is to ensure
that I have no a priori hint which file is being played which time.)

I then typed "s" when I thought it was a short-gap file and "l" when I
thought it was a long-gap file.  I got tired of it after 83 data
samples and killed it.  I then postprocessed zz.guess and compared it
to zz.list:

< zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -

I got exactly two wrong out of 83 (and the stats are about evenly
balanced, 39 short files played and 44 long).  So I think it's fair to
say that, in the right context (an important caveat!), a time
difference as short as (1602-802)/44.1=18.14+ milliseconds is clearly
discernible to me.

This is, of course, a situation designed to perceive a very small
difference.  I'm sure there are plenty of contexts in which I would
fail to notice even 200ms of delay.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Mouse
> Better than 100Hz is possible and still precise.  Something around
> 1000Hz is necessary for human interaction.

That doesn't sound right.  I've had good HCI experiences with HZ=100.
Why do you see a higher HZ as necessary for human interaction?

> Modern hardware could easily do 100kHz.

Not with curren^Wat least one moderately recent NetBSD version!

At work, I had occasion to run 9.1/amd64 with HZ=8000.  This was to get
8-bit data pushed out a parallel port at 8kHz; I added special-case
hooks between the relevant driver and the clock (I forget whether
softclock or hardclock).  It worked for its intended use fairly
nicely...but when I tried one of my SIGALRM testers on it, instead of
the 100Hz it asked for, I got signals at, IIRC, about 77Hz.

I never investigated.  I think I still have access to the work machine
in question if anyone wants me to try any other quick tests, but trying
to figure out an issue on a version I don't use except at work is
something I am unmotivated to do on my own time, and using work time to
dig after an issue that doesn't affect work's use case isn't an
appropriate use of work resources.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-23 Thread Mouse
> So even though we added one tick, you can still get two timer events
> in much closer proximity than a single tick as far as the process is
> concerned.

Certainly.  I think that's unavoidable without resetting the timer
inside the signal handler, or hard realtime guarantees (which are Not
Easy).

> And we probably do need to talk about the timer expiration and
> rearming as separate from signal deliver and process scheduling.

There are plenty of reasons user code running the signal handler may be
delayed from the time the timer is supposed to tick.

But without the timer ticking as requested, I don't think the rest
matters nearly as much.  When even an _unloaded_ machine can't get the
ticks it asks for, something is wrong.  A machine which gets that
overloaded just from delivering 100 signals to a mostly-trivial signal
handler per second, well, I doubt NetBSD runs on anything that weak.

> And from a program point of view, that is what really matters in the
> end.  If the program really wants a minimum amount of time before the
> next timeout, it needs to do the request for the next time event at
> the processing point, not something kernel internal which happend
> very disconnected from the process.

Agreed.  ITIMER_REAL in the form I've been arguing for is of little
help to a process that wants timer ticks separated by a hard minimum
interval as seen by the signal handler.  At least when using
it_interval to get repeated signals.

But then, so is every other ITIMER_REAL I've ever used.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-23 Thread Mouse
> The attached (untested) patch reverts to the old algorithm
> specifically for the case of rearming a periodic timer, leaving the
> new algorithm with +1 in place for all other uses.

> Now, it's unclear to me whether this is correct, because it can have
> the following effect.  Suppose ticks happen on intervals of time T.
> Then, owing to unpredictable and uncontrollable scheduling delays,
> the following sequence of events may happen:

> Time 0*T: timer_settime(.it_value = T, .it_interval = T), arms timer at 1*T
> Time 1*T + 0.9*T: timer expires, rearms timer at 2*T
> Time 2*T + 0.1*T: timer expires, rearms timer at 3*T

> The duration between these consecutive expirations of the timer is
> 0.2*T, even though we asked for an interval of T.

True.

In my opinion that is the correct behaviour; userland requested timer
ticks at multiples of T, so there is a conceptually infinite stream of
(conceptual) ticks generated at those times.  Those then get turned
into real events when the kernel can manage it.  But a delay for one of
them should not affect any other, except for the case where one is
delayed long enough to occur after another's ideal time, in which case
I would consider it acceptable (though not required) to drop one of the
two.

> [...POSIX...]

IMO if POSIX forbids the above, POSIX is broken and should, in this
respect, be ignored.  One reason for using facilities taking structs
itimerval is for ticks to _not_ be delayed by delay of previous ticks.
If POSIX cannot be ignored for whatever reason, I would argue that a
new facility that _does_ provide undelayed ticks should be provided.
(I'm partial to timer sockets, but I am hardly unbiased. :-)

> On the other hand, if it's not correct to do that, I'm not sure
> correct POSIX periodic timers can attain a target _average_ interval
> between expirations [...]

I would argue that it's misleading, to the point I would call it
incorrect, to call such a thing "periodic".

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-23 Thread Mouse
>> Specifically, under a kernel built with HZ=100, requesting signals
>> at 100Hz actually delivers them at 50Hz.  [...]
> This is the well-known problem that we don't have timers with
> sub-tick resolution, PR kern/43997: https://gnats.netbsd.org/43997

It doesn't need sub-tick resolution; one-tick resolution would fix the
problem.  The problem appears to be that an ITIMER_REAL timer can't
deliver signals more often than every second tick.

> In particular, there is no way to reliably sleep for a duration below
> 1/hz sec.

Nor does there need to be to fix this.  1.4T/sparc and /i386 get it
right, even when running with HZ=100 and requesting 100Hz SIGALRMs.
(My timer sockets get it right too, but their codepath is completely
different, depending on only timeout(...,0,1) (1.4T) or
callout_schedule(...,1) (4.0.1 and 5.2).)
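
That codepath boils down to the rearm-one-more-tick pattern, sketched
here against the 5.x-era callout API (initialization elsewhere
assumed):

#include <sys/callout.h>

static callout_t tick_ch;	/* callout_init()/callout_setfunc() at attach */

static void
tick_fn(void *arg)
{
	(void)arg;
	/* ...deliver this tick's event... */
	callout_schedule(&tick_ch, 1);	/* next tick; note: no extra +1 */
}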

> Fixing this requires adopting an MI API and MD clock driver support
> for wakeups with sub-tick resolution,

You must be talking about something different from what I'm talking
about.

What I want fixed does not involve sub-tick-resolution timers at any
level.  If using setitimer(ITIMER_REAL,...) to request SIGALRMs every
tick actually delivered a SIGALRM every tick, I'd be fine.  But,
instead, doing that delivers a SIGALRM every second tick.

> which nobody's done yet --

Nobody's done the sub-tick resolution you're talking about, maybe.  But
1.4T long ago did what I'm looking for.  Something between 1.4T and
4.0.1 broke it, and it's stayed broken until at least 9.1, probably 9.3
based on someone else's report on port-vax.  (Okay, strictly, I don't
know that it's stayed broken.  It could have been fixed and then
re-broken.)

>> } else if (sec <= (LONG_MAX / 1000000))
>> ticks = (((sec * 1000000) + (unsigned long)usec + (tick - 1))
>> / tick) + 1;

> Whether this is a bug depends on whether: [...]

I think that code is not a bug per se; for sleeps, that codepath
is...well, "reasonable" at the very least.  The bug is that it is
broken for timer reloads, but timer reloads are using it anyway;
whether you think of this as a bug in timer reloads or a bug in tvtohz
is a question of which way you prefer to squint your mind.

Always adding an extra tick may be fine for sleeps (though that's
arguable for short sleeps on a system with a high-res wallclock), but
not for timer reloads.
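
To make the distinction concrete: a reload-oriented conversion might
look something like this (an untested sketch, not actual NetBSD code;
I'm assuming tick holds the tick length in microseconds, as in
tvtohz(), and ignoring overflow):

/* Hypothetical helper: interval-to-ticks for a periodic reload.
 * Round up to a whole tick, but without the extra +1 that tvtohz()
 * adds for sleeps, so exact multiples of the tick length stay exact. */
static int
tvtohz_reload(const struct timeval *tv)
{
	long usecs;

	usecs = (tv->tv_sec * 1000000L) + tv->tv_usec;
	return((usecs + tick - 1) / tick);
}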

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-23 Thread Mouse
> [...], but we are in fact rounding it up to the double amount of time
> between alarms/interrupts.  Not what I think anyone would have
> expected.

Quite so.  Whatever the internals behind it, the overall effect is "ask
for 100Hz, get 50Hz", which - at least for me - violates POLA hard.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-23 Thread Mouse
>>>} else if (sec <= (LONG_MAX / 1000000))
>>>ticks = (((sec * 1000000) + (unsigned long)usec + (tick - 1))
>>>/ tick) + 1;
>> The delay is always rounded up to the resolution of the clock, so
>> waiting for 1 microsecond waits at least 10ms.

But it is increased by 1 tick when it is an exact multiple of the clock
resolution, too.  For sleeps, that makes some sense.  For timer
reloads, it doesn't.

I could of course be wrong about that code being responsible, but
reading realtimerexpire() makes me think not; it uses tshzto, which
calls tstohz, which calls tvtohz, which is where the code quoted above
comes from.  Maybe realtimerexpire should be using other code?

> Look at the wording sleep(3), nanosleep(2), etc.  They all use
> wording like "... the number of time units have elapsed ..."

True.

And, if the misbehaviour involved sleep, nanosleep, etc, that would be
relevant.  The symptom I'm seeing has nothing to do with them (except
that both are related to time); what I'm talking about is the timing of
SIGALRMs generated by setitimer(ITIMER_REAL,...) when it_interval is
set to 1/HZ (which in my test cases is exact).  setitimer(2) does say
that "[t]ime values smaller than the resolution of the system clock are
rounded up to this resolution (typically 10 milliseconds)", but it does
_not_ have language similar to what you quote for sleep() and
relatives.  Nor, IMO, should it.  The signals should be delivered on
schedule, though of course process scheduling means the target process
may not run the handler on schedule.  Under interrupt load sufficient
that softclock isn't running when it should, I'd consider this
excusable.  That does not describe my test systems.

1.4T does not have this bug.  As I mentioned, it works fine on sparc.
Even on i386, I see:

$ date; test-alrm > test-alrm.out; date
Sat Dec 23 07:57:45 EST 2023
Sat Dec 23 07:58:46 EST 2023
$ sed -n -e 1p -e \$p < test-alrm.out
1703336265.921251
1703336325.916413
$ 

Linux, at least on x86_64, gets this right too.  On a work machine:

$ date; ./test-alrm > test-alrm.out; date
Sat Dec 23 08:18:15 EST 2023
Sat Dec 23 08:19:15 EST 2023
$ sed -n -e 1p -e \$p < test-alrm.out
1703337495.219734
1703337555.209737
$ uname -a
Linux mouchine 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 
x86_64 x86_64 x86_64 GNU/Linux
$ 

> Two options are to increase HZ on the host as suggested, or halve HZ
> on the guest.

I suppose actually fixing the bug isn't an option?

I don't know whether that would mean using different code for timer
reloads and sleeps or what.  But 1.4T is demonstration-by-example that
it is entirely possible to get this right, even in a tickful system.
(I don't know whether 1.4T sleeps may be slightly too short; I haven't
tested that.  But, even if so, fixing that should not involve breaking
timer reloads.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PSA: Clock drift and pkgin

2023-12-22 Thread Mouse
In a discussion of timekeeping on emulated VAXen, over on port-vax@, I
mentioned that I've found that, on 4.0.1, 5.2, and 9.1, and, based on a
report on port-vax@, apparently 9.3 as well, there's a bug in
ITIMER_REAL signals (possibly on only some hardware - I've seen it on
amd64 and i386, and, if my guess below is correct, it should manifest
on various others as well).

Specifically, under a kernel built with HZ=100, requesting signals at
100Hz actually delivers them at 50Hz.  This is behind the clock running
at half speed on my VAX emulator, and quite likely behind similar
behaviour from simh (which emulates VAXen, among other things) on 9.3.
I suspect it will happen on any port when requesting signals one timer
tick apart (ie, at HZ Hz).

In case anyone wants it, I wrote a small test program.  It requests
100Hz signals, then, in the signal handler, takes a gettimeofday()
timestamp.  After taking 6000 timestamps (which ideally should take
60.00 seconds), it then prints out all the timestamps, thus indicating
the actual rate signals were delivered at.  It's on
ftp.rodents-montreal.org (which supports HTTP fetches as well as FTP)
in /mouse/misc/test-alrm.c for anyone interested.  On machines with the
half-speed bug, it takes two minutes rather than one, and the
timestamps average about 20ms apart, instead of 10ms.  ("About" because
in most of my tests there is usually at least one interval that is
slightly longer than it should be.)
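
For anyone who doesn't want to fetch it, the heart of it is something
like this (a from-memory sketch, not the actual test-alrm.c):

#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

#define NSTAMPS 6000

static volatile int n = 0;
static struct timeval stamps[NSTAMPS];

static void onalrm(int sig)
{
 (void)sig;
 if (n < NSTAMPS) gettimeofday(&stamps[n++],0);
}

int main(void)
{
 struct itimerval it;
 int i;

 signal(SIGALRM,onalrm);
 it.it_interval.tv_sec = 0;
 it.it_interval.tv_usec = 10000; /* 100Hz */
 it.it_value = it.it_interval;
 setitimer(ITIMER_REAL,&it,0);
 while (n < NSTAMPS) pause();
 for (i = 0; i < NSTAMPS; i ++)
  printf("%ld.%06ld\n",(long)stamps[i].tv_sec,(long)stamps[i].tv_usec);
 return(0);
}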

I don't _know_ what's behind it.  But today I went looking, and, in
5.2, there is code which looks suspicious.  I don't know where the
analogous code is in 9.x, but presumably plenty of people here do.
Speaking of 5.2, then: in kern/subr_time.c, there is tvtohz(), which
has code

} else if (sec <= (LONG_MAX / 1000000))
ticks = (((sec * 1000000) + (unsigned long)usec + (tick - 1))
/ tick) + 1;

which looks suspicious.  If sec is zero and usec is tick, that
expression will return 2 instead of the 1 I suspect it needs to return.
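
Substituting sec = 0 and usec = tick makes the arithmetic concrete:

ticks = ((0*1000000 + tick + (tick - 1)) / tick) + 1
      = ((2*tick - 1) / tick) + 1   /* integer division gives 1 */
      = 2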

I haven't yet actually tried changing that.  Holiday preparations and
observations are likely to occupy much of my time for the next week or
so, but I'll try to fit in the time to change that and see if it helps
any.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes

2023-10-19 Thread Mouse
>>> [...DV_DRMKMS...userconf...]
>> [...devices in multiple classes...maybe use a separate namespace,
>> used by only config(1) and userconf?...]
> This is precisely why I ask for comment ;-)

:-)

> I have two requirements:

> - that the solution is not ad hoc i.e. that it can provide, in
> userconf, facilities not limited to drmkms (I don't want to implement
> a special case to recognize "drmkms" and to expand to all the STARred
> driver names implied);

I agree with this; special-casing drmkms would be...suboptimal.

> - that it will not imply to have to maintain special data for
> userconf to recognize some "magic" strings.

You already need that, in that userconf has to have some way to
recognize the string "drmkms" as a device category (hinted by the
"class =" syntax, but it still needs error-checking) and map it into
the corresponding DV_ value.  I don't see it as significantly worse for
config(1) to generate some data structure mapping device class names
into whatever userconf would need to affect all devices of that class.
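
Concretely, I'm imagining config(1) emitting a table on the order of
this (names invented for illustration; DV_DRMKMS is the proposed
value):

struct devclassname {
	const char *name;	/* class name as written in the config */
	enum devclass class;	/* corresponding DV_ value */
};

static const struct devclassname devclassnames[] = {
	{ "drmkms", DV_DRMKMS },
	/* ...one entry per class config(1) knows about... */
	{ 0, DV_DULL },
};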

Though it occurs to me that there are too many things called "class"
here.  "Group"?  "Category"?  "Collection"?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes

2023-10-19 Thread Mouse
> I propose to add a DV_DRMKMS class to sys/device.h:enum_devclass; to
> augment cfdata with a devclass member [...]

> Comments?

This is not intended as criticism; I am just trying to examine all
sides of this question.

Why use the sys/sys/device.h kind of device class for userconf?  Is
there some reason to think it will be useful to userconf other device
classes, or do you expect other device-class machinery to have a use
for DV_DRMKMS, or is it a question of just reusing the existing device
class rather than creating a new kind of device class, or what?

I'm also thinking it could be useful for a device to fall into multiple
classes for userconf, but I _think_ DV_* classes don't support a device
being in multiple classes.  It also could be useful for custom kernels
to have custom modifications to device classification.  So I'm
wondering if it would be better for this to be a namespace specific to
config(1) and userconf rather than having anything to do with DV_*
values.

Or is that getting into "the best is the enemy of the good" territory?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-17 Thread Mouse
>> So to clarify: I'm proposing to convert the rst doc pages to man
>> pages [...]
> I wouldn't bother installing man pages for this, someone working on
> the kernel already has the source tree.

With vanishingly few exceptions (notable by their rarity), source code
is not documentation.  There's a reason NetBSD has section 9.

For that matter, most manpages are depressingly poor, but they are
usually better than telling people "UTSL".  (I don't know anything
about the quality of the manpages in question here, except that someone
invented yet another markup language for them, which is a mild negative
to me; the xsrc trees I have on hand don't have any *.rst files.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-17 Thread Mouse
> There is no man page for drmkms (the kernel part), but there are man
> pages in the X sources, in the rst format [...]

Would it maybe be worth creating a tiny manpage that just says "go look
over there for docs"?  I don't _like_ having manpages in yet _another_
markup language, but I prefer that to no doc - and, without any
manpage, I have no idea how someone looking for doc would know to look
for .rst (!) files buried in xsrc, for doc on something in the kernel.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: kern.boottime drift after boot?

2023-10-11 Thread Mouse
>> On embedded systems without an RTC, you _could_ get the same UUID in
>> rare cases.  But I think this would be a bug, because on Linux, any
>> kind of jitter-source (interrupt timing, for instance) would perturb
>> the generated UUID.

> Hopefully!  Even though this jitter might not be high entropy, it
> should (in theory) be enough to give a different UUID each boot.

Probably.  Depending on how the entropy and RNG are handled, it is
likely to be different only with some (relatively high) probability.

What probability of collision (ie, of a boot giving the same value as
the previous boot) is acceptable here?  One in 256?  One in 64k?  One
in 4G?  If you're doing "random" generation, it's hard to avoid some
chance of collision.

If the system can tolerate repeated writes to a piece of its mass
storage (disk etc - "disk" for brevity), you could set up something
with (say) a 32-bit value saved in a fixed place.  Each boot, you read
it, save the value somewhere for this boot, and write it back to disk
incremented before doing anything else.  The value saved is your boot
ID.  But this depends on having a small piece of disk you can afford to
write to once per boot.

It also demands custom kernel code, unless NetBSD wants to adopt
something of the sort or it's acceptable to have duplicate boot IDs if
a boot attempt crashes "too early".  If the latter is acceptable
(which, based on the fragments of the original post I saw quoted,
sounds likely), you could even do it entirely in userland, immediately
upon having a writable persistent filesystem available.
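
The userland version is nearly trivial - something like this untested
sketch (/var/db/bootid is a made-up path, and a real version should
check errors and fsync):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
 FILE *f;
 uint32_t id = 0;
 uint32_t next;

 if ((f = fopen("/var/db/bootid","r")) != 0)
  { fread(&id,sizeof(id),1,f);
    fclose(f);
  }
 printf("boot id %lu\n",(unsigned long)id); /* or stash it somewhere */
 next = id + 1;
 if ((f = fopen("/var/db/bootid","w")) != 0)
  { fwrite(&next,sizeof(next),1,f);
    fclose(f);
  }
 return(0);
}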

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: panic options

2023-09-12 Thread Mouse
>> There is a call to panic() if the kernel detects that there is no
>> console device found, I would like to make this call to it just
>> reboot without dropping into ddb.

> Well.  Not completely what you ask for, but...

> sysctl -w ddb.onpanic=0 (or even -1)

Well, notice that the original post writes of panicking if "there is no
console device found", which sounds to me like something that happens
during boot, much earlier than /etc/sysctl.conf can affect.

I'd say, rather, DDB_ONPANIC=0 in the kernel config and then
ddb.onpanic=1 during boot.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: kcmp(2)

2023-07-20 Thread Mouse
>>> Don't cloner instances differ in minor number?  [...]
>> Not that I'm aware of.  [...]
> Well, as noted in this thread, traditionally you can tell when two
> files are the same by examining stat results.

Maybe that should be a hint.  See below.

> And the cloner mechanism replaced an older scheme where you had to
> pick the number of instances you wanted, and unless I'm
> misremembering badly in that world each had to have its own minor
> number.

That's my understanding as well.

> That said, it almost certainly isn't important...

Well, if it means that with a minor tweak NetBSD could have kcmp(3)
instead of kcmp(2), it could be.

It occurs to me that, if you give each device_t a unique-per-boot
serial number and return that in structs stat, writing a kcmp(3) would
border on trivial.  (The _implementation_ would be NetBSD-specific, in
that it would depend on st_serial or whatever you call it, but the
_interface_ wouldn't.)

Except, hmm.  The above covers only DTYPE_VNODE.  I'm not sure what
could be done about other DTYPEs.  If you really want to support
Linuxisms - IMO that way lies madness, but it's been somewhere between a
long time and forever since NetBSD cared about MO - it might have to be
kcmp(2) in order to DTRT for all DTYPEs.  Or else each DTYPE would need
its own analog of st_serial.  Perhaps st_serial could be done in a way
that's common across all DTYPEs?  It'd probably need to be 64-bit to
avoid running out in the face of extreme use cases, but that's hardly
impossible.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: kcmp(2)

2023-07-18 Thread Mouse
>> What does Mesa use kcmp(2) for?
> Working out whether two device handles are to the same DRM device.

>> Is there another way to accomplish what Mesa is trying to use it for?
> I don't know of one.

Is fstat() plus checking st_rdev (and possibly other fields)
insufficient?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-12 Thread Mouse
>> [...] as I see it the divide you're sketching here ([...]) is the
>> divide between interface and implementation, and in some cases the
>> interface is more than just the typedefs.

> Sort of.

> <sys/vnode_types.h>  // contains the "vnode_t" opaque type definition
> <sys/vnode_impl.h>  // contains the guts of "struct vnode" and the other
> implementation details

> <sys/mount.h>  // Contains some of the file system interfaces, some of which
> use vnode_t
> <sys/vnode.h>  // Contains the vnode interfaces, which definitely use vnode_t

> The latter two would each include <sys/vnode_types.h>.

You're right, I hadn't fully understood you.

Hmm.  What value is provided by separating the vnode_t type from the
rest of the vnode interface (in <sys/vnode.h>)?  If taken to its
logical extreme (which of course ends up at a satirical position, like
most logical extremes), this leads to

<sys/vnode_t.h>  // vnode_t
<sys/vnode_vtype.h>  // enum vtype
<sys/vnode_vtagtype.h>  // enum vtagtype
<sys/vnode_tags.h>  // #define VNODE_TAGS
<sys/vnode_vnlock.h>  // struct vnlock
<sys/vnode_io_unit.h>  // #define IO_UNIT
<sys/vnode_io_append.h>  // #define IO_APPEND


which I hope you agree is madness.  What makes it worth singling out
vnode_t for special treatment here?

I would prefer to draw include-file boundaries more or less matching
conceptual-API boundaries, which I thought was more or less where we
started: <sys/vnode.h> defines the API to the vnode subsystem,
including types, #defines, externs, etc.  But I'm not sure how that
would differ from what we have now.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-12 Thread Mouse
> I'm tempted to say that an opaque struct declaration in a .c file
> ought be treated suspiciously -

Depending on what you mean by "an opaque struct declaration", I may
agree or disagree.

If you mean the "struct foo;" without a body, then I think I agree.

But the "struct foo { ... };" that completes the incomplete type
corresponding to the include file's "struct foo;", that I think is the
whole point of opaque structs: the completion is available only under
the implementation hood.  While that may be in a foo-impl.h file, if
only a single file needs it I see no harm in the completion being in
the .c (and indeed I've done it that way often enough).

> (and it would be kind of nice if C had a way to say "all functions
> defined beyond this point are static").

Personally, I'd prefer "all functions defined before this point are
static", since I prefer calls to move textually backward (or, to put it
another way, I prefer to avoid forward declarations when possible).

But I doubt either of those will appear anytime soon.

> And to return briefly, to the issue way up the top of simplifying the
> leader files, there is one change I've wanted to make for ages, but
> just haven't been brave enough to do,

> That is to rip the definition of __NetBSD_Version__ [...] into a new
> header file all of its own [...] with the rule that *no* other header
> file is permitted to include it.

I'm...not sure I agree with that.

I once built a kernel facility which I wanted to be completely drop-in
to multiple kernel versions (as widely disparate as 1.4T and 5.2).  The
design I came up with was (names probably changed; I'm not digging up
the code to see what names I actually used) a "version.h" file which
looked like

#include <sys/param.h>  // to get __NetBSD_Version__

#if __NetBSD_Version__ == whatever value 1.4T had
#include <...>  // where 1.4T keeps it
#define THINGY() do { ...1.4T code for THINGY() ... } while (0)
typedef int COMMON_TYPE; // just an int on 1.4T
#endif

#if __NetBSD_Version__ == whatever value 5.2 had
#include <...>  // on 5.2 we need two
#include <...>  // include files
#define THINGY() thingy() // 5.2 has THINGY() nicely encapsulated
typedef struct something COMMON_TYPE; // 5.2 uses a struct
#endif

etc.  It sounds as though you would prefer the #include that pulls in
__NetBSD_Version__ be in every C file that wants to include
"version.h".  This seems counterintuitive, even counterproductive, to
me (and see also below, about #include order).  Or perhaps you'd prefer
that I'd designed those interfaces some other way, rather than with a
version-switch include file?

My own pet include-file peeve is rather different: I strongly believe
that, with very few exceptions (<sys/param.h> is the only one that comes
to mind), you should be able to re-order your #include lines
arbitrarily without producing any semantic change, and that, if this is
not so, at least one of the include files involved is broken.  I've
been making small steps towards fixing this in my own trees, but it's
still a major mess.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-12 Thread Mouse
[paragraph-length-line damage repaired manually]

> What about something like <sys/foo_typedefs.h> and
> <sys/foo.h>?  The type definitions go into the former
> header file, [...]

Well, I don't like the "typedefs" name, because as I see it the divide
you're sketching here (which I support, in general) is the divide
between interface and implementation, and in some cases the interface
is more than just the typedefs.  Some structs have their struct
definition, or some of it (regex_t is an example), as part of their
advertised interface, and many have #defines as well.

But I'd rather see the division done with (what I see as) an inaccurate
name than see the division not done.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-11 Thread Mouse
> Sometimes I even wonder why typedef exists in C.  Feels like I could
> accomplish the same with a #define

For `typedef struct foo Something', yes, you could (well, except for
their different scopes, and possibly some other corner cases).

For `typedef void (*callback_t)(void *, status_t, int)', not so much.
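
With status_t standing in for whatever status type the interface uses:

typedef int status_t;	/* stand-in */

typedef void (*callback_t)(void *, status_t, int);
callback_t cb1, cb2;	/* two function pointers, plainly */

#define CALLBACK_T void (*)(void *, status_t, int)
/* CALLBACK_T cb1, cb2; expands to "void (*)(...) cb1, cb2;",
   which isn't even syntactically valid. */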

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-11 Thread Mouse
> I don't get it.  Why the "void *" stuff?  That is where I think the
> real badness lies, and I agree we should not have that.

> But defining something like

> typedef struct bus_dma_tag *bus_dma_tag_t;

> would mean we could easily change what bus_dma_tag_t actually is,
> keeping it opaque, while at the same time keeping the type checking.

Um, no, you get the type checking only as long as "what [it] actually
is" is a tagged type - a struct, union, or (I think; I'd have to check)
enum.  Make it (for example) a char *, or an unsigned int, and you lose
much of the typechecking.
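
A few lines illustrating what gets lost (names made up):

typedef struct bus_dma_tag *bus_dma_tag_t;	/* distinct type; mixups caught */
typedef char *weak_tag_t;			/* just a char * in disguise */

void take_tag(weak_tag_t);
void oops(char *s) { take_tag(s); }	/* compiles without a peep */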

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [PATCH] style(5): No struct typedefs

2023-07-11 Thread Mouse
[riastradh@]

> I propose the attached change to KNF style(5) to advise against
> typedefs for structs or unions, or pointers to them.

Pointers to them, I agree.  I don't like typedefs for pointers to
structs, except possibly for a few special cases.  I think it should be
pellucid from the declaration whether you're dealing with a pointer.

But most - all, I think - of the benefits you cite are still available
when using typedefs for the structs themselves.  Indeed, different
files do not have to agree on whether to use typedefs, and external
references, such as your

> struct vnode;
> int frobnitz(struct vnode *);

can do exactly that, even if other code does "typedef struct vnode
VNODE;" and then uses VNODE (or vnode_t, or whatever name you prefer;
personally, I like all-caps).
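
That is, something like this (frobnitz and grobnitz purely
illustrative):

struct vnode;			/* opaque forward declaration */
int frobnitz(struct vnode *);	/* one file's style */

typedef struct vnode VNODE;	/* another file's style */
int grobnitz(VNODE *vp)
{
 return(frobnitz(vp));		/* same type: no cast, no warning */
}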

> There isn't even any need to define `struct bus_dma_tag' anywhere;
> the pointer can be cast in sys/arch/x86/x86/bus_dma.c to `struct
> x86_bus_dma_tag', for instance (at some risk in type safety, but a
> much smaller risk than having void * as the public interface),

But at a risk in formal nonportability, unless the struct bus_dma_tag *
was obtained by casting _from_ a struct x86_bus_dma_tag * to begin with
(which in this case it probably would have been).  I'd have to look up
the details to tell whether it's possible for casting a pointer to a
completed struct type to a pointer to a never-completed struct type to
lose information, fall afoul of alignment requirements, or the like.

[uwe@]

> Typedefs make sense when the type is *really* opaque and can, behind
> the scenes, be an integer type, a pointer or a struct.

Agreed.

> [Ab]using typedefs to save 8 bytes of "struct " + "*" just adds
> cognitive load (and whatever logistical complications that you have
> enumerated in the elided part of the quote).

Personally, I find that *in*cluding the "struct" adds cognitive load.
Perhaps it's just a question of what I'm used to, but having a noise
word present - and the "struct" is close to that, from a conceptual
point of view - means more noise to ignore.  Especially when the type
is referred to multiple times; I haven't seen it often, but I have seen
statements that look as though half the alphanumerics are "struct" (I
doubt any actually make it to the point of half, since each "struct"
needs a tag to be useful, and at least a few other identifiers to make
a useful statement, but it sure feels like it occasionally).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Anonymous vnodes?

2023-06-27 Thread Mouse
>> Is it possible to create a vnode for a regular file in a file system
>> without linking the vnode to any directory, so that it disappears
>> when all open file descriptors to it are closed?  (As far as I can
>> tell, this isn't possible with any of the vn_* or VOP_* functions?)
> That's completely normal.

It's a normal state to be in.  But, as I read it, the post was asking
for a way to reach that state _without_ passing through a "has a name
in some directory" state; it's not clear to me whether that's possible
in general (ie, without doing something filesystem-specific).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: RFC: Native epoll syscalls

2023-06-22 Thread Mouse
> It is definitely a real problem that people write linuxy code that
> seems unaware of POSIX and portability.

I feel a bit uncomfortable appearing to defend the practice (and, to
be sure, it definitely can be a problem) - but it's also one of the
ways advancements happen: add an extension, use it, it turns out to be
useful, it gets popular...

I've done it myself (well, except for the "gets popular" part, which no
one person can do alone): labeled control structure, AF_TIMER sockets,
pidconn, validusershell, the list goes on.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-10 Thread Mouse
>>> To the extent that it's impossible, it's impossible only because
>>> the APIs provided by the kernel to userland don't have any way to
>>> represent such a thing.  This would border on trivial to fix,
>>> except that it would be difficult to get much userland code to use
>>> the resulting APIs because of their lack of portability to other
>>> UNIX variants.
> Since write(2) is one of the oldest interfaces in Unix the chances of
> any change taking hold are vanishingly small...

Oh, I'm not suggesting that write(2) change.  What I'm suggesting is
the creation of some alternative interface, write_extended(2) let's
call it for the sake of concreteness[%], which is just like write(2)
except that it _does_ provide some way to unambiguously return "wrote
N, then error E".  (Exactly how is pretty much irrelevant; I'm sure
practically everyone here can imagine plenty of possible alternatives.
If it comes to arguing choices for that, I'd paint the bikeshed a dark
forest green.)  But write(2) would continue to exist, with more or less
its traditional semantics.  Only if - and only when - write_extended
becomes so popular that nobody uses plain write(2) any longer would I
propose removing it.

[%] If anything like this happens I certainly hope someone will invent
a better name.

But userland uptake for write_extended will be minimal, especially
initially; that's the portability issue I was talking about.

> All of this is not _independent_ of fixing uiomove callers, [...],
> but it's largely orthogonal to the original problem of incorrectly
> rolling back partial uiomoves. :-(

Agreed.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-10 Thread Mouse
>> - uiopeek leaves uio itself untouched ([...]).
>> Hm... I'm having second thoughts about uiopeek(), as well.  It implies a d[...]

That is a good point.  But would it be a problem to have uiopeek
(works only to move from uio to buffer) and uiopoke (the other way)?

I've never liked the way uiomove can move data one direction or the
other depending on how the uio is set up.  (I'd rather have uioread and
uiowrite except that I'd be constantly trying to remember whether
uioread reads from the uio or moves data in the direction a read()
operation does.)  Maybe uioget and uioput?

Is there any significant amount of code that calls uiomove without
knowing which direction the bits are moving?

As for uiocopy versus uiomove, those are similar enough to memcpy and
memmove that the implication feels to me more like "buffers can
overlap" (for all that that is a highly unlikely use case) or some
such.

Finding good names is a mess.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-09 Thread Mouse
> (In general, erroring in I/O is a whole additional can of worms; it's
> wrong to not report back however much work happened before the error
> when it can't be undone, but also impossible to both report work and
> raise an error.  [...])

To the extent that it's impossible, it's impossible only because the
APIs provided by the kernel to userland don't have any way to represent
such a thing.  This would border on trivial to fix, except that it
would be difficult to get much userland code to use the resulting APIs
because of their lack of portability to other UNIX variants.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-09 Thread Mouse
>> I'm not a fan of uioskip() as a name - [...]
> I agree.  "skip" seems to have wrong connotations (cf. dd(1)).

I'm not sure I agree.  (The dd analogy is weak; dd has no peek
operation, and uioskip - under whatever name - would border on useless
without uiopeek.)

For uioskip, it _is_ skip semantics.  The bytes skipped are not copied
anywhere, not by uioskip.  (They may have been copied earlier with
uiopeek, but that doesn't affect what uioskip does.)  If you want
skip-*with*-copy, well, uiomove is still there.
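
The calling pattern, as I understand the proposal (a sketch;
do_risky_part stands in for whatever might fail partway):

error = uiopeek(buf,n,uio);	/* copy out; uio untouched */
if (error == 0)
 { error = do_risky_part(buf,n);
   if (error == 0) uioskip(n,uio);	/* commit: advance past the bytes */
 }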

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: PROPOSAL: Split uiomove into uiopeek, uioskip

2023-05-09 Thread Mouse
> I propose adding uiopeek(buf, n, uio) and uioskip(n, uio) which
> together are equivalent to successful uiomove(buf, n, uio).

For what it may be worth, I like this.  I don't _think_ I've ever run
into issues caused by this issue before, but I have trouble seeing it
as anything other than a bug waiting to happen.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: dkwedge: checksum before swapping?

2023-05-08 Thread Mouse
>> But that comment clearly indicates that _someone_ thought it
>> reasonable to checksum before swapping, so I can't help wondering
>> what use case that's appropriate for.
> It's a checksum over the 16bit words in native byte order.  So when
> you access the words in opposite byte order, you need to swap the
> checksum too.

Yes.  But, to me, that would mean byteswap, then checksum.

In the case at hand, the checksum is also bit-order-independent (in C
terms, it's ^, not +), which means that byteswapping is almost
irrelevant.  But there are 8-bit members involved too (p_fstype and
p_frag) and strings (d_typename and d_packname), which don't get
swapped, so swapping and checksumming do not commute.

As a toy example, consider:

struct toy {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d;
uint16_t checksum;
};

Let's set a=0xfeed, b=0xface, c=0xf0 d=0x0d.  On a big-endian machine,
the resulting octet stream is fe ed fa ce f0 0d, checksum f4 2e.  On
little-endian, ed fe ce fa f0 0d, checksum d3 09 - not 2e f4.
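
A little program that grinds through the toy example, XORing the
16-bit words as read in each byte order:

#include <stdio.h>
#include <stdint.h>

static unsigned int xor16(const uint8_t *p, int bigendian)
{
 unsigned int w0, w1, w2;

 if (bigendian)
  { w0 = (p[0] << 8) | p[1];
    w1 = (p[2] << 8) | p[3];
    w2 = (p[4] << 8) | p[5];
  }
 else
  { w0 = (p[1] << 8) | p[0];
    w1 = (p[3] << 8) | p[2];
    w2 = (p[5] << 8) | p[4];
  }
 return(w0 ^ w1 ^ w2);
}

int main(void)
{
 static const uint8_t be[6] = { 0xfe, 0xed, 0xfa, 0xce, 0xf0, 0x0d };
 static const uint8_t le[6] = { 0xed, 0xfe, 0xce, 0xfa, 0xf0, 0x0d };

 printf("BE %04x, LE %04x\n",xor16(be,1),xor16(le,0));
 /* prints "BE f42e, LE 09d3" - not byteswaps of each other */
 return(0);
}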

> Unlike the regular disklabel code (which ignores other-endian
> disklabels) the wedge autodiscover code will accept either.

Is that actually known to work, or is it more "is intended to accept
either"?  Because it looks to me as though it will not accept labels
checksummed and written natively by the other endianness, unless the
strings and fstype/frag values happen to be such that they checksum to
a multiple of 0x0101 (which is possible but unlikely).  And the comment
indicates that someone thought about the issue and came to what looks
to me like an incorrect conclusion.

> As for padding, the structure is nowadays defined with fixed size
> types and explicit padding fields, so we may still assume that the
> compiler won't add any further padding by itself.

Currently true, though still disturbingly fragile.  ("Nowadays"?  It
was so even as far back as 1.4T.  Well, as far as the fixed-size types
thing goes, at least; there are no explicit padding fields in any
version I have, but it is (presumably carefully) defined so there is no
need for them, either.  At least assuming a power-of-two
octet-addressed machine, a char * that's no bigger than 8 bytes, and a
non-malicious compiler.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


dkwedge: checksum before swapping?

2023-05-07 Thread Mouse
In sys/dev/dkwedge/dkwedge_bsdlabel.c, I find (and I see more or less
the same code in 5.2 and what cvsweb.n.o shows me)

static int
validate_label(mbr_args_t *a, daddr_t label_sector, size_t label_offset)
{
...
/*
 * We have validated the partition count.  Checksum it.
 * Note that we purposefully checksum before swapping
 * the byte order.
 */
...code that does indeed checksum before swapping...
}

Does anyone know what the intent of this is?  The only reason I would
expect to see a byteswapped label is when it's written by a machine of
the other endianness, and in that case the checksum will be wrong
unless the label is swapped before checksumming (and even then only if
the compiler doesn't insert shims in the struct, not that it's likely
to given the present layout).

But that comment clearly indicates that _someone_ thought it
reasonable to checksum before swapping, so I can't help wondering what
use case that's appropriate for.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Per-descriptor state

2023-05-05 Thread Mouse
>>> But I kind of think it'd be preferable to make a way to clone a
>>> second independent struct file for the same socket than to start
>>> mucking with per-descriptor state.
>> [...] it should be easy to extend dup3() to make that possible [...]
> If we were to add such a thing it should be called something else,
> not dup, because it's a fundamentally different operation from dup
> and we don't need people confusing them.

Different?  Some.  But not very.  It _is_ closely related to dup().  I
don't think dup3() would be a horrible way to do it - not nearly as
horrible as, say, the hack I once implemented where calling
wait4(0x456d756c,(int *)0x61746f72,0x4d616769,(struct rusage *)0x633a2d29)
would produce suitably magic effects.  (This was on a 32-bit machine.)

But, honestly, when I saw the idea my reaction was to make it a new
operation to fcntl.  F_REOPEN, say, since it's creating new per-open
state.  Or, if you want to be all overloady, how about
open(0,O_REOPEN,existingfd)?  It _is_ creating new per-open state, so
open is in _some_ sense right.

My choice, for what it's worth, would be fcntl, dup3 second, with
O_REOPEN a distant third.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Per-descriptor state

2023-05-04 Thread Mouse
>>> I should probably add [close-on-fork] here, then, though use cases
>>> will likely be rare.  I can think of only one program I wrote where
>>> it'd be useful; I created a "close these fds post-fork" data
>>> structure internally.
>> I can't think of any at all; to begin with it's limited to forks
>> that don't exec, and unless just using it for convenience as you're
>> probably suggesting,

Yes.  If the application does all the forking (ie, except for forks
inside libraries, for which see below), it is just convenience, freeing
the application from keeping track of which fds need closing.

Well, except for libraries that open fds internally, without exposing
them to the calling code.  Depending on the user case, they may want
them closed if the application forks.

>> it only applies when also using threads, and if one's using threads
>> why is one also using forks?

Because one wants to exec a child process, maybe?

>> So it seems like it's limited to badly designed libraries that want
>> to fork behind the caller's back instead of setting up their forks
>> at initialization time.  Or something.

What about libraries that fork _not_ behind the caller's back?
(system(3) being, perhaps, the poster child.)

> Or it is needed for a little used application called Firefox.

What part of "badly designed" does that not fit?  (Okay, admittedly, I
don't know what Firefox looks like internally.  But the UI design is
bad enough I would expect the internals to be little better.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Per-descriptor state

2023-04-30 Thread Mouse
>> Close-on-fork is apparently either coming or already here, [...]
> We don't have it, but it will be in Posix-8.  [...] not sure if
> anyone has thought of a way to add it to socket() -

It's looking to me as though more and more syscalls are having to grow
flags arguments to do things right.  Auto-close-on-fork for socket(),
accept(), and socketpair(); per-operation non-blocking for at least
some half-dozen calls...

I don't see how to avoid it for socket().  For accept(), it could be
shoehorned into a SOL_SOCKET sockopt (SO_AUTO_CLOFORK_ON_ACCEPT, say,
better name welcomed).

> that doesn't look to be trivial, though it might be possible to abuse
> one of the params [socket()] has - probably domain - and add flags in
> upper bits ...

Possible?  Probably.  Good?  No, IMO.

> while having it able to be set/reset via fcntl is useful, to work, it
> needs to be able to be set atomically with the operation that creates
> the fd,

Well, to work for one particularly important use case.  It can work
just fine for various other use cases without that.

> and having it default "on", which would work, would break almost all
> existing non-trivial code).

What about having it default to a per-process (or per-thread) settable
state?

Mouse


Re: Per-descriptor state

2023-04-30 Thread Mouse
> Close-on-fork is apparently either coming or already here, not sure
> which, but it's also per-descriptor.

I should probably add that here, then, though use cases will likely be
rare.  I can think of only one program I wrote where it'd be useful; I
created a "close these fds post-fork" data structure internally.

> The thing is, per-descriptor state is a mess and it shouldn't be
> allowed to proliferate.  The reason: the descriptor is an array
> index.  There's plenty of room to store stuff in the object it looks
> up (that is, struct file) but storing stuff on the key side of the
> array is a problem.

(References to -current here really mean "filedesc.h 1.70 according to
cvsweb.netbsd.org".)

Looking at the include files, it looks to me as though descriptors are
indices into an array of structs (struct fdfile) in -current or 5.2, or
an index into two parallel arrays of pointers and flags in 1.4T.  Those
then point to the structs file (the per-open state).

It's true the flags fields are chars (two bits used in 1.4T, two
separate chars storing one bit each in 5.2 or -current).  But it's a
far cry from being as bad as you outline.  There are multiple bits
free, and, even if they run out, growing them from chars is a low
(memory) cost on 1.4T and probably zero on 5.2 and -current on most
ports.

> For a couple bits you can mask them off from the pointer (though even
> that's abusive);

If that were what were being done, I would agree it's abusive.

> more than that and suddenly you need to double the size of the
> fdtable so it contains an extra machine word for random state bits as
> well as the pointer to the struct file.

That is quite possibly why 1.4T uses parallel arrays rather than an
array of structs.  In 5.2 and -current, there is enough additional
state that someone (rightly, IMO) decided it wasn't worth the code
complexity of keeping parallel arrays.  (See struct fdfile in
sys/filedesc.h for the additional state I'm talking about.)

> (Then there's also another issue, which is that in nearly all cases
> nonblocking I/O is a workaround for interface bugs, e.g. nonblocking
> accept or open of dialout modem devices, or for structural problems
> in software that also needs to use the same file handle, like your
> original problem with curses. In the long run it's probably better to
> kill off the reasons to want to use nonblocking I/O at all.)

And replace nbio with...what?  Multiple threads doing blocking calls?
Or do you think everything should be nonblocking by default, with
blocking behaviour implemented userland-side?  Or am I completely
misinterpreting somehow?

> (also, "mirabile visu")

What did I write?

*checks*

Oops.  Thanks.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Per-descriptor state

2023-04-29 Thread Mouse
Back in late March, I wrote here (under the Subject: Re: flock(2):
locking against itself?) about a locking issue, which drifted into
discussing per-descriptor state versus per-open state (thank you
dholland for that phrasing!) versus backing-object state.  In
particular, someone (I forget who) said that non-blocking really should
be per-operation rather than being state anywhere.

That's correct, but, thinking about it since then, that is not as easy
as one might wish, because there are quite a number of calls that can
be affected.  (Offhand: {,p}{read,write}{,v}, connect, accept,
{send,recv}{,msg}, sendto, recvfrom.  There are probably others, but
this is sixteen already - though I'm not sure p{read,write}{,v} need
the option; are there any seekable objects on which non-blocking is
significant?)  Some of these, such as send*, already have a flags
argument that could grow a new bit to indicate nonblocking, but the
rest - more than half - would need to have an alternative version
created with a flags field, or some such.  While hardly impossible,
this gets annoying, and indeed might be best addressed by (to use
read() as an example) making the real syscall backing read() always
take a flags argument, with the libc stub supplying a fixed value for
the flags when the flagless API is called.
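
In other words (syscall name and constant invented for the sketch):

/* The real syscall always takes flags... */
ssize_t real_read(int, void *, size_t, int);

/* ...and the libc stub keeps the traditional API working. */
ssize_t read(int fd, void *buf, size_t len)
{
 return(real_read(fd,buf,len,0));	/* 0 = traditional semantics */
}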

This is pushing towards making it per-descriptor state.  At present,
the versions I have don't have anything but close-on-exec as true
per-descriptor state.  A quick look at 9.1 and cvsweb.netbsd.org
(which, mirabilu visu, actually works for me, unlike www.netbsd.org and
mail-index.netbsd.org) sys/sys/filedesc.h makes me think that that's
true of them as well.

For backward compatibility, I would be inclined to leave the existing
mechanisms in place, theoretically to be removed eventually.  This also
means divorcing "non-blocking open" from "non-blocking I/O after open".

So: does anyone have any comments on the above analysis, or thoughts on
good, bad, or even just notable effects making it real per-descriptor
state might have?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
>> Another option is to use newterm() in curses with a funopen()ed
>> stream for output which queues the output to be written
>> (nonblockingly) to the real stdout.
> Would toggling O_NONBLOCK using fcntl() work for you?  A bit tedious,
> but it can be done "per operation".

...ish.  I hadn't thought of that.  But, the way the code is
structured, that actually isn't a crazy suggestion at all.

I'm not sure it would work, but it's a very good thought.
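
For concreteness, the toggle would presumably get wrapped up as
something like this (untested):

#include <fcntl.h>
#include <unistd.h>

ssize_t read_nonblocking(int fd, void *buf, size_t len)
{
 int fl;
 ssize_t n;

 fl = fcntl(fd,F_GETFL,0);
 fcntl(fd,F_SETFL,fl|O_NONBLOCK);
 n = read(fd,buf,len);
 fcntl(fd,F_SETFL,fl);	/* restore previous state */
 return(n);
}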

Thank you!

Mouse


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
> You probably already researched this, but it looks like newterm() is
> in the curses library in NetBSD-5,

It is.  (I wouldn't've even discovered it if it weren't.  I started
looking at libcurses with an eye to providing some way to output data
via a callback instead of a file descriptor and discovered newterm().)
It's in 4.0.1's libcurses, even - or, at least, it's in my 4.0.1
derivative, and diff -u -r between that and 5.2's libcurses shows
enough differences I doubt I ported it between them, so it probably is
in the base OS I started with.

But we have now definitely drifted off-topic for this list.  Moving
non-blocking I/O out of the object towards userland, that's on-topic
here.  Working around the issue in userland, not so much.

> so getting it to work in NetBSD-1.4T shouldn't be that difficult.

That's what I'm hoping.  I'll see what I can get working in my Copious
Spare Time

Mouse


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
>> [...non-blocking stdin vs stdout, with curses...]
> The only way I've thought of to work around this is to fork a helper
> process [...]

I just realized this is not quite true.

Another option is to use newterm() in curses with a funopen()ed stream
for output which queues the output to be written (nonblockingly) to the
real stdout.  That, however, would mean backporting libcurses, because
I'd like this to run on my 1.4T as well as 4.0.1 or 5.2, and 1.4T's
libcurses has no newterm().  I've started looking at that backport, but
it's going to take a while.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
> I may be missing something in your curses non-blocking case, but
> can't you work around the issue by setting up an independent file
> descriptor, and hence tty structure, by performing a dup2(2) on stdin
> and then closing the original stdin file descriptor?

No.  dup2, like dup, creates another file descriptor reference to the
original open file table entry.  (Something very much like that is how
stdin and stdout got set up to begin with.)

Furthermore, in the case of non-blocking I/O, it is the underlying
object (the tty, here) that is nonblocking, not the open file table
entry and definitely not the file descriptor.  As Taylor R Campbell
said, nonblocking really _should_ be a property of the operation, not
of the descriptor, not of the open file table entry, and _definitely_
not of the object.

The only way I've thought of to work around this is to fork a helper
process which reads - blockingly - from stdin and writes the data to a
pipe; that pipe is then set nonblocking in the main process and is
independent of everything else.  I might need a third process to make
the reader process die reliably...

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
>> I'm not sure whether I think it'd be better for O_NONBLOCK to apply
>> to the descriptor - [...]
> O_NONBLOCK should really be part of the _operation_, not the
> _object_.  [...]

Agreed - or, at least, I agree that it should be possible to make it
part of the operation rather than of the object.  Even aside from
installed-base arguments, I'm not sure whether I think non-blocking
mode on the object should go away.  I'd have to think about it more.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: flock(2): locking against itself?

2023-03-30 Thread Mouse
>> "They're per-open"
> That's not bad for this level of description.

Agreed!

>> ...which is not actually difficult to understand since it's the same
>> as the seek pointer behavior; that is, seek pointers are per-open
>> too.
> and almost all the other transient attributes, that is distinct from
> stable attributes like owner, group, permissions, which are inode
> attributes.  In our current system I think just close on exec is a
> per fd (transient) attribute though if we follow linux (I think) and
> soon to be POSIX, and add close on fork, that would be another.

I actually ran into a case where this distinction caused trouble.  I
have a program that uses curses for output but wants to do non-blocking
input.  So I set stdin nonblocking and threw fd 0 into the poll() loop.

But, in normal use, stdin and stdout come from the same open on the
session's tty, so setting stdin nonblocking also sets stdout
nonblocking, which curses is not prepared to handle, leading to large
screen updates getting partially lost.
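
It's easy to demonstrate that the state lives in the shared open
rather than in the descriptor:

#include <fcntl.h>
#include <stdio.h>

int main(void)
{
 /* Run from an interactive shell, so 0 and 1 come from one open. */
 fcntl(0,F_SETFL,fcntl(0,F_GETFL,0)|O_NONBLOCK);
 printf("stdout O_NONBLOCK: %s\n",
     (fcntl(1,F_GETFL,0) & O_NONBLOCK) ? "set" : "clear");
 return(0);
}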

I'm not sure whether I think it'd be better for O_NONBLOCK to apply to
the descriptor - if that could even be done; the way things are
currently designed, in a lot of cases it needs to get pushed all the
way to the underlying object, in my case the tty (which then is
responsible for making that state non-permanent).  I may need to find
some other approach.

I've also wished for a way to suppress SIGPIPE akin to MSG_NOSIGNAL to
send().  This is relevant here because it would be useful to have it as
a per-descriptor (or for my immediate use case even a per-open-file)
option, though it would also be nice to have it as a per-call option
(akin to Linux's writev2, though the Linux writev manpage I checked
doesn't have any such flag - it merely has room for it).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: flock(2): locking against itself?

2023-03-18 Thread Mouse
>  Locks are on files, not file descriptors.

Except they aren't.  They're on open file table entries, something
remarkably difficult to describe in a way that doesn't just refer to
the kernel-internal mechanism behind it (which for _this_ list isn't a
big deal, but...).  If they were truly on files, rather than open file
table entries, then it wouldn't matter whether my test program opened
the file once or twice, since it's the same file either way.

> Applying flock() to an already locked (of this kernel file*) file is
> an attempt to upgrade, or downgrade (including unlock) the file,

Hm, okay, I can see how the second flock call in my test was taken as
an attempt to equalgrade (neither upgrade nor downgrade) the exclusive
lock to another exclusive lock.

I'll have to think more about my locking paradigm.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


flock(2): locking against itself?

2023-03-18 Thread Mouse
I ran into an issue, which turned out to be, loosely put, that an open
file table entry cannot lock against itself.  Here's a small test
program (most error checking omitted for brevity):

#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int fd;

int main(void);
int main(void)
{
 fd = open(".lock",O_RDWR|O_CREAT,0666);
 if (fork() == 0)
  { if (flock(fd,LOCK_EX|LOCK_NB) < 0)
{ printf("lock child: %s\n",strerror(errno));
  exit(1);
}
sleep(5);
exit(0);
  }
 if (flock(fd,LOCK_EX|LOCK_NB) < 0)
  { printf("lock parent: %s\n",strerror(errno));
exit(1);
  }
 sleep(5);
 wait(0);
 return(0);
}

An earlier version skipped the fork, doing the two flock calls in
succession from the same process, without the sleeps.  Neither version
produces EWOULDBLOCK from either flock call on any of the systems I
tried it on (my mutant 1.4T, my mutant 5.2, and a guest account on a
stock (as far as I know) 9.0).

This is not what I was expecting.  On examination, the manpages
available to me (including the one at http://man.netbsd.org/flock.2)
turn out to say nothing to clarify this.  Moving the open after the
fork, so the parent and child open separately, I do, of course, get the
expected EWOULDBLOCK from one process.

Is this expected behaviour, or is it a bug?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: [GSoC] Emulating missing Linux syscalls project questions

2023-03-12 Thread Mouse
> (2) Is there a better binary-finding strategy than trying Linux
> binaries on NetBSD, and if they fail (have a script) compare the
> output of strace from a Linux run of the program with the table in
> sys/compat/linux/arch/*/linux_syscalls.c?

Better?  Maybe, maybe not.  But what I did in a similar case (an
emulator that emulated just userland, handling directly anything that
traps to the kernel on real hardware) was to `implement' any
unimplemented syscalls with code that just prints a message and
terminates.  Here, that would map into unimplemented syscalls
printing/logging something and killing the process.
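
The placeholder amounted to little more than this (reconstructed from
memory; the names are not the originals):

#include <stdio.h>
#include <stdlib.h>

static void unimplemented_syscall(int num)
{
 fprintf(stderr,"syscall %d not implemented\n",num);
 exit(1);
}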

Obviously, that code would not survive into the end result, but
something like it might be a useful intermediate step.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: building 9.1 kernel with /usr/src elsewhere?

2023-03-08 Thread Mouse
>> Does it make a difference if you set
>> NETBSDSRCDIR=/home/abcxyz/netbsd-9.1/usr/src when you run make?
> Yes, that appears to make the symptom go away.

Also, I can reproduce the problem by setting
NETBSDSRCDIR=/infibulated/gonkulator when running make depend even with
a source tree in /usr/src.

Is this enough of a bug that it's worth sending-pr?  Or is this a case
of me expecting something that's no longer supported to work?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: building 9.1 kernel with /usr/src elsewhere?

2023-03-08 Thread Mouse
Omnibus reply here.  Thank you, everyone; I have a better understanding
of the actual problem (admittedly that's a low bar, given how little I
understood it before) and two different workarounds.

[Brian Buhrow]

> hello.  I regularly build kernels outside of the /usr/src location.

But does /usr/src exist at the time?

> My technique is to install the source in some location:
> /usr/local/netbsd/src-91, for example, then put my configuration file
> in: /usr/local/netbsd/src-91/sys/arch/<arch>/conf/
> Then
> cd /usr/local/netbsd/src-91/sys/arch/<arch>/conf
> config <KERNCONF>
> cd ../compile/<KERNCONF>
> make -j 4 >& make.log

I'd prefer to avoid assuming the user who wants to build the kernel can
write into the source directory tree.  You may note the source tree was
(admittedly only by implication) owned by abcxyz but I was doing the
build as mouse.

That said, this does appear to work.  (Well, I didn't use -j, but then,
I practically never do, and the machine I'm doing this on is
single-core.)  I didn't wait for the whole build, but it doesn't fail
fast the way the run that prompted my mail did.

[Taylor R Campbell]

> Does it make a difference if you set
> NETBSDSRCDIR=/home/abcxyz/netbsd-9.1/usr/src when you run make?

Yes, that appears to make the symptom go away.  (I probably would not
have stumbled across that; /usr/src/BUILDING mentions NETBSDSRCDIR only
twice, neither time documenting it, only mentioning it in examples.  It
likely would have taken enough digging to locate the actual culprit for
me to discover it.  But still, it does seem to work.)

> I always build out of my home directory, never /usr/src, but I also
> always use build.sh and the make wrapper it creates [...]

Ugh, I hate using build.sh for small things like individual kernels.
It always (well, far too often, at least) insists on rebuilding make,
which takes significant time on some machines, like my shark, and
requires extra writable filesystem space.  If there's a reasonably easy
way to avoid it, I prefer to.

That said, if NetBSD wants to desupport building even kernels without
using build.sh, that's its choice; what I think of it doesn't matter.
(But I do think that, in that case, config(1) should be documented as
an internal tool, not intended for use other than by build.sh.)

[Johnny Billquist]

> You should build the kernel using build.sh, [...]

See above.

> Don't try to make things complicated by doing all that stuff by hand.
> :-)

build.sh _is_ the complicated way, to me.  It's a large, complex, and
slow tool I find even less understandable than config(1) and make(1).
It also has way too much "when we want your opinion we'll give it to
you" for my taste.

Which I suppose is just another way of saying that NetBSD, having lost
interest in users like me, is moving even farther in that same
direction.  Sic transit gloria mundi.

I don't squawk much about build.sh because it does bring benefits; the
biggest one I notice is probably painless cross-compiles.  But I'd
never run into this price before.  5.2 doesn't exhibit the misbehaviour
at all, so I couldn't've noticed it except at work, and I think I've
never tried to build a kernel without /usr/src in place before (at work
or not).

[matthew green]

>> make[1]: don't know how to make absvdi2.c. Stop
> what happens if you run "make USETOOLS=no"?

Fast failure, superficially looking like the same one.

[Valery Ushakov]

> Mail-Followup-To: matthew green ,
>   Mouse , tech-kern@netbsd.org

Um, why would you think I'd want people to mail followups to _me_?  I
would prefer - though admittedly it's a weak preference, weak enough I
practically never mention it - that people _not_ mail me when they're
already sending to the list.

> The problem is that NETBSDSRCDIR cannot be inferred for a randomly
> located kernel builddir and sys/lib/libkern/Makefile.compiler-rt uses
> it.

In that case, maybe config(1) should write a suitable setting of
NETBSDSRCDIR into the Makefile it generates?  At least when -s is given
with an absolute path?
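
(Illustratively, a line like

	NETBSDSRCDIR?=	/home/abcxyz/netbsd-9.1/usr/src

in the generated Makefile - with ?= so an environment or command-line
setting would still win.)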

> Our makefile spaghetti is a bit out of control.

I've felt so often enough myself.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


building 9.1 kernel with /usr/src elsewhere?

2023-03-07 Thread Mouse
Okay, I'm trying to help someone with a NetBSD 9.1 machine at work.
Today's issue is one I had trying to build a kernel.  We needed to do
this because that project has a few customizations in the system needed
to build the application layer, and more needed to run it.

He installed stock 9.1, but did not install any source; /usr/src and
/usr/xsrc did not exist.  We then set up the customized source trees in
his homedir, which I will here call /home/abcxyz, under a directory
9.1.  Thus, the path to kernel source, for example, was
/home/abcxyz/netbsd-9.1/usr/src/sys.

Then I copied in a kernel config (GEN91) into my ~/kconf/GEN91, from
back when I was working on that project.  I then ran

% config -b ~/kbuild/GEN91 -s /home/abcxyz/netbsd-9.1/usr/src/sys ~/kconf/GEN91

This completed apparently normally, reporting the build directory and
telling me to remember to make depend.  I then went to ~/kbuild/GEN91
and ran make depend && make.  It failed fast - no more than a second or
two - with

make[1]: don't know how to make absvdi2.c. Stop

(full log below).  I then moved /home/abcxyz/netbsd-9.1/usr/{src,xsrc}
to /usr/{src,xsrc}, chowned them -R to 0, destroyed ~/kbuild/GEN91, and
repeated, only this time I passed /usr/src/sys to config's -s flag.

This time the kernel built fine (at least apparently - we haven't tried
booting it yet, but I've built enough kernels to be confident there are
no obvious red flags in the log; it certainly did not fail a second or
two in with a cryptic message about absvdi2.c).  Note in particular
that the source tree content was identical; only the path and ownership
differed.

Is this a bug?  Or am I doing something wrong?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

The logfile includes one line that's over 3300 characters long,
guaranteeing that it'd get mangled by email (soft limit 78 characters,
hard limit 998).  So, I ran the logfile through bzip2 and then
base64-encoded the output.  Here's the result:

QlpoOTFBWSZTWfzLRGEAA6xfgGAQSZP/cj/v36qwUANjwoKAAGSNUbUn6mpv
U1NPU8hAZB6mQA9EMymmagOaMmJgAmIwI0wIMRgmTAIw5oyYmACYjAjTAgxGCZMA
jDmjJiYAJiMCNMCDEYJkwCMEUQkwmQJoBR6jR5QA02ppo0Gm2qaekaI6tlUVe/S1
HAv0T332gt7Q3uZ7SnDMNrMXxRm2qdT965avS81vYtWJqOVqFHMMG2lXjZAzLhEs
27kbS8RZOB1YXdRZGFfB+Vv64SXOnJbdALlWMEEIQABCjGnQhAOAkGYImXVAYFSV
gDpIIgIk4iJU661xD4IOouYQv7DytQCYMTqXUIYqMGgCfbG40vx0e6+cYsRMxWbG
Ry7dS5oRQAwI3XHxETDWFMrFa1+URPAsCmnoETwfSvQInSHI418MxAY/xIzvVCzA
h8iP4i6OcEEPxhJRZKmkZ6dtYVdUDx1F0h7z+ryOBD/FDb5RQsf2pwBlyggpIeVR
92uNxc3aZYSUPe2TV3aWC5CNOUBELwKW0ESTM5thvQmeUmlz7thjtk0ZKaUlm7gU
OASVllCpleWIQpx0y3morldhNMw1tTjtzvoRkc3rx23xrveRHQbDZTLTOChswIDD
Hd1nzGHP6A6yTEnMgL84Z9SCSxgYH37YdFFZdV2mdEHoHkd4UEEKgyDwOZDxQfyI
LY3SF3TJCfWg+wRPXIPqESyDXzK+OW6lP1KEJkIlSAsAbcBcD4Z3iJiVKGIiUIVP
2N8kaG0HbAXu37Pbdt2AG2OyBH02gG/nzEShoSSVebJPP+Tvi5LgcVHE0ETWo+og
lPYdofIamwKb8sqHQKEUOcsc6XWAN4TwoSAdclLBN1QIjmJqi9pYEKRoAYHCIhB2
JbpbzqBS+ki6xaG47NWFzrJmUkSRYUZQK3CJKcZTX/htKF+MY7OgFIFyd+gvIvDY
9NSBvxpfE85ThkVMyOwRJdkiJYLaw0Bb1Qzg77rchEyOLkoB1ApmIlADnLcxULoE
TYnnCmYiQ6/ao6pPnJ+4sPC3GA6D4mvuANDYUESE1XGG9JKlu+DVtUdYidGAibhE
4iJ/4u5IpwoSH5lojCA=


Re: crash in timerfd building pandoc / ghc94 related

2023-02-06 Thread Mouse
>> It seems so far, from not really paying attention, that there is
>> nothing wrong with ghc but that there is a bug in the kernel.
> Yes of course no userland code should be able to crash the kernel :D

I used to think so.  Then it occurred to me that there are various ways
for userland to crash the kernel which are perfectly reasonable, where
of course "reasonable" is a vague term, meaning maybe something like "I
don't think they indicate anything in need of fixing".  Perhaps the
simplest is

dd if=/dev/urandom bs=65536 of=/dev/mem

but there are others.

Yet I can't help feeling that there is some sense in which it *is* fair
to say that userland should never be able to crash the kernel.  I have
been mulling over this paradox for some time but have not come up with
an alternative phrasing that avoids the reasonable crashes while still
capturing a significant fraction of the useful meaning.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Finding the slot in the ioconf table a module attaches to?

2023-02-01 Thread Mouse
> Usually your user program would attempt to open xyz0 which would find
> the major/minor in th devsw tables.  You're relying on a hard coded
> major.  That's the difference.

Okay, I'm probably exposing my ignorance of something here, but, what's
the difference?  You still have to get the major number, whether chosen
at module load time or chosen at kernel build time, into the /dev/xyz0
special file in the filesystem.  That requires exfiltrating it from the
kernel *somehow*
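
However the number gets out, turning it into the special file is then
trivial; a sketch (path and mode illustrative):

#include <sys/types.h>
#include <sys/stat.h>
#include <err.h>

static void
make_xyz0(unsigned int maj)
{
	/* character special file, minor 0 */
	if (mknod("/dev/xyz0", S_IFCHR | 0600, makedev(maj, 0)) == -1)
		err(1, "mknod /dev/xyz0");
}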

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Add five new escape sequences to wscons

2023-01-18 Thread Mouse
>> Technically the wscons terminal type is wsvt25, an extended ANSI
>> compatible terminal, already supporting more sequences than vt100.

Well, it calls itself "vt100"

>> Having it also support a useful subset of xterm [...] seems like a
>> useful addition,

> 100% agree.

Oh, certainly.  It just seems to me that the name "vt100" for that
emulation type is becoming more and more misleading.  I have nothing at
all against adding the sequences in question, except the mismatch
between the implications of the name and the actual emulation (and even
that is relatively weak, given how many "vt100" emulations are at least
as far from VT-100s as wscons is).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Add five new escape sequences to wscons

2023-01-16 Thread Mouse
>> [wscons's "vt100" is] not a very good VT-100 emulation, handling a
>> bunch of sequences differently from VT-100s (mostly things wscons
>> implements but VT-100s don't) and having a handful of other
>> mismatches (such as supporting sizes other than 80x24 and 132x24).
> A lot of the sequences that it supports are from later VT-series
> terminals.

Sure, and other things from X3.64.  When I did the X3.64 (and, if
turned on, ISO 6429 colour SGR values) mode for my terminal emulator, I
called the emulation types "ansi" and "decansi" for basically this
reason.  (The difference between ansi and decansi is support for
various DEC extensions, such as scrolling regions or ?-flagged
arguments to CSI h and CSI l.)
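
To make the ?-flag distinction concrete, with standard sequences:

	CSI 4 h         ANSI insert mode (IRM)
	CSI ? 7 h       DEC private autowrap mode (DECAWM)
	CSI Pt ; Pb r   DEC scrolling-region control (DECSTBM)

ansi gets only the first of those; decansi gets all three.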

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Add five new escape sequences to wscons

2023-01-16 Thread Mouse
> Here is a patch to implement five additional escape sequences in the
> wscons vt100 emulation.

Not to pick on this particular addition...but is it really still
appropriate to call it "vt100"?  It's not a very good VT-100 emulation,
handling a bunch of sequences differently from VT-100s (mostly things
wscons implements but VT-100s don't) and having a handful of other
mismatches (such as supporting sizes other than 80x24 and 132x24).

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-25 Thread Mouse
>> I find it far more plausible that I'm doing something wrong.
> Or maybe the drive just doesn't obey the spec?

That's possible, I suppose.  But it's a brand new Kingston SSD, which I
would expect would support TRIM.  And it self-identifies as supporting
TRIM.

The packaging promises free technical support.  I suppose I should try
to chase down a contact (the packaging gives no hint whom to contact
for that promised support) and ask.  At worst I'll be told nothing
useful.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-25 Thread Mouse
>> According to that PDF, dholland is wrong.
> I fail to see a behaviour that would be allowed due to dholland@'s
> definition, but not according to the one you cited, nor the other way
> round.

A read returning the pre-TRIM contents.  Two of the options
specifically state "independent of the previously written value"; the
third is simply zero, which is also independent of the previously
written value.  dholland wrote

> The state of the data after TRIM is unspecified; you might read the
> old data, you might read zeros or ones, you might (I think) even read
> something else.

and, as I read that PDF, "you might read the old data" is specifically
disallowed.  You may read zeros or ones, or something else, but the
only way you'll read the old data is if the old data matches what the
drive's algorithm happens to return for those sectors (for example, if
the drive returns zeros but zeros were what you had written).

It is theoretically possible that the data I wrote happens to match
what the drive returns for trimmed sectors.  Given the data, I find
that extremely unlikely.  (I may try again with different data, just in
case, but I still don't like the way the command is timing out.  I find
it far more plausible that I'm doing something wrong.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-23 Thread Mouse
[dholland]
> The state of the data after TRIM is unspecified; you might read the
> old data, you might read zeros or ones, you might (I think) even read
> something else.

[RVP]
> OK, I've now actually looked at what the spec[1] says instead of
> relying on my faulty recall of stuff I read on lwn.net years ago.

> [1] [...]
> https://web.archive.org/web/20200616054353if_/http://t13.org/Documents/UploadedDocuments/docs2017/di529r18-ATAATAPI_Command_Set_-_4.pdf

According to that PDF, dholland is wrong.  PDF page 150, page-number
page 113, includes examples of "properties associated with trimmed
logical sectors" including

a)  no storage resources; and
b)  read commands return:
    A)  a nondeterministic value that is independent of the previously
        written value;
    B)  a deterministic value that is independent of the previously
        written value; or
    C)  zero.

though it seems to me (b)(C) is actually a special case of (b)(B).  See
table 33, later on that page, for more.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-12 Thread Mouse
> are you trying to trim a really large section at once?  i think
> that's what i see:

>> [ - root] 3> date; ./trim /dev/rwd1d 4 2; date

That means "first six bytes contain 4, LE; second two bytes contain 2,
LE".  I thought that in turn meant "2 sectors at offset 4".  Apparently
it actually means "2 * max_dsm_blocks at offset 4", but max_dsm_blocks
is 8 for this device, so that's still only 8K.
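
For reference, each DSM/TRIM range entry is eight bytes - a 48-bit
starting LBA followed by a 16-bit count, both little-endian - and a
512-byte payload block holds 64 of them.  A packing sketch (helper name
mine):

#include <stdint.h>

/* Pack one 8-byte TRIM range entry: 48-bit LBA, then 16-bit count,
   both little-endian.  64 of these fit in one 512-byte DSM block. */
static void
trim_pack_range(uint8_t buf[8], uint64_t lba, uint16_t count)
{
	int i;

	for (i = 0; i < 6; i++)
		buf[i] = (uint8_t)((lba >> (8 * i)) & 0xff);
	buf[6] = (uint8_t)(count & 0xff);
	buf[7] = (uint8_t)(count >> 8);
}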

> at least in my experience, the problem is that most devices take a
> while to handle a TRIM request, longer than the 30s timeout typically
> used.

That's...odd.  How can it be useful if it takes that long?  Is the
intent that it be useful only for very occasional "erase this whole
filesystem" use, or what?  I thought it was intended for routine
filesystem use upon deleting files.

> this is why blkdiscard(8) defaults to 32MiB chunks.

I once did what I thought was trying to trim 16M, but my current
understanding says that attempt would have been 128M.  That didn't work
any better.

I just tried increasing the timeout to 300000 (ie, five minutes) and
trimming offset 0 size 8, which I now think for this device (with
max_dsm_blocks 8) should mean 64 (interface) sectors, ie, 32k.

It still timed out, with the same followup timeouts.  Note the date
output here; it took five minutes for the TRIM to time out, then thirty
seconds for wd_flushcache.

[ - root] 4> date; trim /dev/rwd1d 0 8; date
Mon Dec 12 08:22:29 EST 2022
TRIM wd1: arg 00 00 00 00 00 00 08 00
Version 2040.283, max DSM blocks 8
TRIM wd1: calling exec
piixide1:0:1: lost interrupt
type: ata tc_bcount: 512 tc_skip: 0
TRIM wd1: returned 1
ATAIOCTRIM workd
wd1: wd_flushcache: status=128
Mon Dec 12 08:27:59 EST 2022
[ - root] 5> dd if=/dev/rwd1d of=/dev/null count=8
piixide1:0:1: wait timed out
wd1d: device timeout reading fsbn 0 (wd1 bn 0; cn 0 tn 0 sn 0),
retrying
wd1: soft error (corrected)
8+0 records in
8+0 records out
4096 bytes transferred in 0.005 secs (819200 bytes/sec)
[ - root] 6> 

> maybe port that tool back,

I'll try to have a look at it.  I haven't been trying to match the -9
userland API, though, so I'm not sure how useful it will actually be.
It may point me in a useful direction, though.

> it's also supposed to match the linux command of the same name.  it's
> not in netbsd-9, but last i tried, the interfaces the -current tool
> uses are available in -9 kernels.

I did bring over the 9.2 syssrc set, so I should be able to figure
_something_ out.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-09 Thread Mouse
>> OK, so any requests >4K will have to be packaged into further range
>> requests [...]

> This isn't right.  Bytes 7 & 8 of a TRIM range request form a
> counter.  So, a counter of 1 = (1 x max_dsm_blocks); 2 = (2 x
> max_dsm_blocks) up to 0xffff counts.

So is max_dsm_blocks misnamed, or is it just being abused as a
dsm_granularity value by TRIM, whereas other DSM commands do use it as
a maximum?  If the former, I'd like to rename it in my tree

> And you can have 64 range requests (contiguous or disjoint) in a 512
> byte DSM payload.

You clearly know a lot more about the relevant commands than I do,
though admittedly at the moment that's a very very low bar.

> Start with a `count' of 1 after you set the LBA48 flag.

Once I figure out how to get some analog to LBA48, at least. :)  Yes,
my code sets r_count to 1 because the code I started with does
analogously.  Until I saw your email, I had no idea there was even any
way to _represent_ multiple ranges in a single request.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-09 Thread Mouse
>> I tried trimming 8 at 0.  Still the same syndrome: TRIM timeout,
>> flush timeout, device timeout reading [...]
> You may have to set the AT_LBA48 flag (not sure if this is present on
> 5.2)

It is not.  5.2 has an ATA_LBA48 flag, going with the flags field of
struct ata_bio, but no LBA48 flag for ata_command.flags.

My evolution of 5.2 has AT_READREG48, which I added as part of my
attempt to support HPAs.  But that's the closest thing I see, and
that's not really close enough to be useful here.

> so that `wdccommandext' gets called rather than `wdccommand' for the
> ATA_DATA_SET_MANAGEMENT command.  All this from [FreeBSD]

And presumably the NetBSD wdccommand/wdccommandext difference matches
the FreeBSD one closely enough for that to be relevant?  I shall have
to read wdccommand* over in more detail.

    Mouse


Re: ATA TRIM?

2022-12-09 Thread Mouse
>> Okay, that now seems unlikely.  I tried to TRIM 32M at zero.

(Actually, 16M: 32K blocks of 512 bytes is 16M.)

> What is the value of `max_dsm_blocks' that your drive reports?
> Unfortunately, atactl(8) doesn't show this currently.

I added that - and the version numbers - to my printf.

atap_ata_major is 2040, 0x7f8.
atap_ata_minor is 283, 0x11b.
max_dsm_blocks is 8.

I tried trimming 8 at 0.  Still the same syndrome: TRIM timeout, cache
flush timeout, device timeout reading - and I just now noticed that the
last timeout is a timeout reading wd*0*.  This leads me to suspect that
it's the host hardware, not the drive, that's falling over here
(presumably trying to load dd to read wd1 with).  Is that plausible?

I did another test.  I tried to trim 8 at 0, but, first, I started a
loop that reads successive blocks of wd0, the OS's disk, one per
second, printing timestamps as it goes.

wd0 access locks up during the TRIM attempt.  One read got through
between that and the cache flush; it locked up again during that.  It
then came back.  But when I tried to read wd1 it locked up again during
that.

Dunno what all this means

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-08 Thread Mouse
[Replying to two messages at once here, both from the same person]

[First message]
>> printf("TRIM %s: calling exec\n",device_xname(wd->sc_dev));
>>   rv = wd->atabus->ata_exec_command(wd->drvp,&cmd);
>> printf("TRIM %s: returned %d\n",device_xname(wd->sc_dev),rv);
>>   return(0);

> ata_exec_command() will start the command, but, the completion of it
> is usually signalled by an interrupt.  Presumably, the 9.2
> ATA-related code takes care of this as ata_exec_command() takes a
> `xfer' parameter rather than a bare command struct.  How does 5.2
> wait for ATA command completion?

I will have to dig into that more.  It does seem to be waiting, in that
the call does not return until the thirty seconds specified in the
timeout field have elapsed.  (It then takes about another 30s before
printing the cache-flush timeout message and returning to userland.)

Since the data on the device is still there afterwards, I don't think
it's just a question of not correctly handling completion.  If it were,
I'd expect the operation to work in the sense of dropping the blocks
described by the argument values.

[Other message]
>>case ATAIOCTRIM:
>> { unsigned char rq[512];
>>   struct ata_command cmd;
...
>>   rv = wd->atabus->ata_exec_command(wd->drvp,&cmd);
>> printf("TRIM %s: returned %d\n",device_xname(wd->sc_dev),rv);
>>   return(0);
>> }

> Ah, shouldn't `cmd' be allocated memory rather than being a
> locally-scoped variable?

Why?  cmd.flags specifies AT_WAIT, and as I remarked above it is indeed
waiting, so cmd, on the kernel stack, should outlive the I/O attempt.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-08 Thread Mouse
>> [...TRIM...]
> It could perhaps be that the area you're trying to trim is too small,
> or badly aligned?

Okay, that now seems unlikely.  I tried to TRIM 32M at zero.  (Much
more than that seems implausible, since the request has only 16 bits of
size, so the maximum representable size is 65535 blocks, or a smidgen
under 32M.  And zero certainly ought to be aligned.)

The behaviour is basically the same.  Except for the details, like the
argument area, it looks the same:

[ - root] 3> trim /dev/rwd1d 0 32768
TRIM wd1: arg 00 00 00 00 00 00 00 80
TRIM wd1: calling exec
piixide1:0:1: lost interrupt
type: ata tc_bcount: 512 tc_skip: 0
TRIM wd1: returned 1
ATAIOCTRIM workd
wd1: wd_flushcache: status=128
[ - root] 4> 
[ - root] 4> dd if=/dev/rwd1d of=/dev/null count=64
piixide1:0:1: wait timed out
wd1d: device timeout reading fsbn 0 (wd1 bn 0; cn 0 tn 0 sn 0), retrying
wd1: soft error (corrected)
64+0 records in
64+0 records out
32768 bytes transferred in 0.008 secs (4096000 bytes/sec)
[ - root] 5> 

That is, the request starts and nothing happens until the 30-second
timeout expires, at which point it reports "lost interrupt" and says it
worked.  It then reports another timeout on cache flush.  Attempting to
read gives _another_ timeout, from which it recovers and then works.

And, as before, reading the beginning of the drive indicates that the
first hundred sectors, at least, still retain the test data I wrote to
them before I started all this.

Hm, the device packaging promises free technical support.  As cynical
as I may be about vendor support, I suppose I really ought to call them
up and see if they can put me in touch with someone who actually knows
how TRIM works.  I don't really have anything to lose except some time.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: ATA TRIM?

2022-12-08 Thread Mouse
>>> I'm trying to understand TRIM, such as is used on SSDs.  [...]
>> [...]
> It could perhaps be that the area you're trying to trim is too small,
> or badly aligned?

Entirely possible.  What are the restrictions?  Are they
device-specific, or generic?  (While wedging seems like a rather broken
response to such issues, I've seen brokener.)

    Mouse


Re: ATA TRIM?

2022-12-08 Thread Mouse
I wrote

> I'm trying to understand TRIM, such as is used on SSDs.  [...]

I forgot to ask: does anyone know whether TRIM is known to work?  It
occurs to me that I don't actually know whether the code I'm trying to
backport works.  The code looks more or less identical in current,
according to cvsweb, but that still doesn't tell me whether anyone is
_using_ it.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


ATA TRIM?

2022-12-07 Thread Mouse
I'm trying to understand TRIM, such as is used on SSDs.

As a first step towards this, I'm trying to do a rudimentary backport
to a 5.2 derivative I'm using - nothing teaches a thing like
implementing it.  I found wd_trim() in 9.2's wd.c and had a stab at
integrating a form of it into my kernel.

It doesn't work, as you can probably infer from my writing this mail.
Userland issues the ioctl I'm using as an early-stage API, printfs
indicate that my kernel code is running, and...it times out.

I'm writing to ask if there's anyone who knows TRIM well enough to have
a stab at telling what's wrong and is willing to try.

Anyone not interested in details can stop reading now without loss; the
rest of this mail details what I've done and what I got.  I've
presumably made some mistake somewhere, but it's not clear to me.

Here's the code I dropped into wdioctl(), adapted from 9.2's wd_trim().
I also lifted ataparams fields 129 through 208 from 9.2, including
things such as ATA_SUPPORT_DSM_TRIM.  (5.2 has all those fields as a
reserved [80] array.)

case ATAIOCTRIM:
 { unsigned char rq[512];
   struct ata_command cmd;
   int rv;
   if (! (flag & FWRITE)) return(EBADF);
   if (! (wd->sc_params.atap_ata_major & WDC_VER_ATA7))
    { printf("ATAIOCTRIM: %s: not ATA-7\n",device_xname(wd->sc_dev));
      return(EINVAL);
    }
   if (! (wd->sc_params.support_dsm & ATA_SUPPORT_DSM_TRIM))
    { printf("ATAIOCTRIM: %s: has no TRIM support\n",device_xname(wd->sc_dev));
      return(EINVAL);
    }
   bcopy(addr,&rq[0],8);
printf("TRIM %s: arg %02x %02x %02x %02x %02x %02x %02x %02x\n",
	device_xname(wd->sc_dev),
	rq[0], rq[1], rq[2], rq[3], rq[4], rq[5], rq[6], rq[7]);
   bzero(&rq[8],512-8);
   bzero(&cmd,sizeof(cmd)); // XXX API botch
   cmd.r_command = ATA_DATA_SET_MANAGEMENT;
   cmd.r_count = 1;
   cmd.r_features = ATA_SUPPORT_DSM_TRIM;
   cmd.r_st_bmask = WDCS_DRDY;
   cmd.r_st_pmask = WDCS_DRDY;
   cmd.timeout = 30000;
   cmd.data = &rq[0];
   cmd.bcount = 512;
   cmd.flags |= AT_WRITE | AT_WAIT;
printf("TRIM %s: calling exec\n",device_xname(wd->sc_dev));
   rv = wd->atabus->ata_exec_command(wd->drvp,&cmd);
printf("TRIM %s: returned %d\n",device_xname(wd->sc_dev),rv);
   return(0);
 }
break;

When I run my userland program, I get

[ - root] 3> date; ./trim /dev/rwd1d 4 2; date
Wed Dec  7 11:46:43 EST 2022
TRIM wd1: arg 04 00 00 00 00 00 02 00
TRIM wd1: calling exec
piixide1:0:1: lost interrupt
type: ata tc_bcount: 512 tc_skip: 0
TRIM wd1: returned 1
ATAIOCTRIM workd
wd1: wd_flushcache: status=128
Wed Dec  7 11:47:43 EST 2022
[ - root] 4> 

1 is ATACMD_COMPLETE.  (The "ATAIOCTRIM workd" message is coming from
the userland program.)

Then attempting to read the drive times out but recovers:

[ - root] 4> dd if=/dev/rwd1d of=/dev/null bs=512 count=64
piixide1:0:1: wait timed out
wd1d: device timeout reading fsbn 0 (wd1 bn 0; cn 0 tn 0 sn 0), retrying
wd1: soft error (corrected)
64+0 records in
64+0 records out
32768 bytes transferred in 0.008 secs (4096000 bytes/sec)
[ - root] 5> 

Reading the device after that, I find the original contents are still
accessible up through (at least) sector 17, so the TRIM did not
actually work.

wd1 is a Kingston SATA SSD:

wd1 at atabus1 drive 1: 
wd1: drive supports 1-sector PIO transfers, LBA48 addressing
wd1: HPA enabled, no protected area
wd1: 111 GB, 232581 cyl, 16 head, 63 sec, 512 bytes/sect x 234441648 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd1: non-rotational device
wd1(piixide1:0:1): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: #pragma once

2022-10-15 Thread Mouse
> Traditionally to avoid problems with repeated inclusion of a header
> file, you put #include guards around it, say in sys/dev/foo.h:
> [...]

> With newer compilers this can be replaced by a single line in the
> header file:

> #pragma once

Some newer compilers, perhaps.  Unless and until it is standardized,
there's no telling what #pragma once might mean to the next compiler to
come along - except that, for Eliza reasons, it presumably will be
related to doing something only once, but there are a lot of such
possibilities.

Furthermore, even when implementors agree on the basic meaning, unless
and until it is precisely specified and standardized, implementations
will differ in corner cases.

foo.h:
	#define FOO(x) _Pragma(x)
bar.h:
	#define BAR() FOO("once")
hdr.h:
	#include "bar.h"
	#include "foo.h"
	BAR()

Which file gets the include-once semantic?  Why or why not?  I could
make an argument for each of the three (some of the arguments will be
stronger than others...but which ones are which will vary by person).
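
Plain #ifndef guards, by contrast, are built entirely out of fully
specified preprocessor behaviour (guard name hypothetical):

#ifndef SYS_DEV_FOO_H_
#define SYS_DEV_FOO_H_
/* ...declarations... */
#endif /* SYS_DEV_FOO_H_ */

There are no comparable corner cases to argue about; the cost is the
macro name.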

> It's nonstandard, but using  #pragma once  is maybe a bit less
> error-prone -- don't have to have to pollute the namespace with
> have-I-been-included macros, and I've made mistakes with copying &
> pasting the per-file have-I-been-included macro into the wrong file.

I'm not sure.  I see arguments each way.

The biggest problems I see with using it in NetBSD-provided include
files:

(1) Developers may see it and think it's more portable than it is as a
result.  Developers are already way too ready to assume that anything
that works on their development machines is suitable for release.

(2) Unless and until the functionality is standardized, it makes the
system gratuitously nonportable.  ("Portable between what I think are
the currently most popular two compilers" is awfully weak, even if
"what I think" is correct.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: MP-safe /dev/console and /dev/constty

2022-10-02 Thread Mouse
>> Your suggestion of pushing it into a separate function (which
>> presumably would just mean using return instead of break to
>> terminate the code block) strikes me as worth considering in general
>> but a bad idea in this case; there are too many things that would
>> have to be passed down to the function in question.
> Of course, GCC offers nested functions for exactly this, but...

Yes.  I would not expect gcc-specific code to be accepted.

In this case, I see no benefit to using a nested function over one of
the constructs that supports a break-out to a well-defined point:
do{...}while(0), switch(0){case 0:...}, while(1){...;break;}, or the
like.  (I would say do-while(0) is the closest to a canonical version
of those.)  In some cases there may be a benefit, if you want to break
out of multiple nested constructs.  (In that case I'd actually use
labeled control structure, but that's even less well supported than
gccisms like nested functions.)

However, this is all armchair quarterbacking when we don't know what
mrg disliked about the code as given.  I still think all it really
needs is to be reformatted so the do and the while(0) don't visually
disappear into the containing if-elseif construct.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: MP-safe /dev/console and /dev/constty

2022-10-01 Thread Mouse
> i really like this except for the if () do { ... } while (0); else
> abuse portion.  please rework that part.  it looks easiest to push
> into a separate function, perhaps.

You don't say what you don't like about it.

There are only two things I don't like about it, and one of them
(indentation) is shared with almost all of /usr/src.  (The other I'll
get to below.)  Given the lack of information about what you don't like
about it, I'm going to guess that you don't like using an un-braced
do-while as the consequent of an if.  Or, perhaps, you don't like that
use of do-while at all?

Using do { ... } while (0); to provide a context in which break can be
used to skip the rest of a well-defined block of code is, IMO, far
preferable to using a goto, which latter seems to be the historically
usual approach to such things.  Your suggestion of pushing it into a
separate function (which presumably would just mean using return
instead of break to terminate the code block) strikes me as worth
considering in general but a bad idea in this case; there are too many
things that would have to be passed down to the function in question.
And the only benefit I see is avoiding the do-while, which I have
trouble seeing anything wrong with, except the second of the two things
I mentioned above.

Would you feel better if it were wrapped in switch (0) { case 0: ... }
instead?  Worse?  Why or why not?

I would prefer to see braces around the do-while, with a corresponding
indentation level, but that's the only change I would say needs making
there.  With the current formatting, the do and while(0) tend to
visually disappear into the if control structure, making the contained
breaks too easy to misread.
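
That is, I'd want it shaped roughly like this (sketch only, placeholder
functions):

	if (cond) {
		do {
			if (a())
				break;	/* skips the rest of this block */
			b();
		} while (0);
	} else if (othercond) {
		c();
	}

so that the break's target is visually obvious.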

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: fallocate for FFS

2022-09-26 Thread Mouse
>>> I will try to figure it out, since it's not yet implemented the
>>> syscall is a good way to start dev in BSD kernel.
>> I'm not sure about that; many people have started looking at it and
>> not got anywhere.
> It is true that adding a system call is an easy entry point to learn
> about the kernel.  But here the syscall is the easy part, the real
> work is modifying FFS code to support it, and that is a steep learning
> curve.

Not just a steep learning curve.  Some of the fallocate operations
(mode zero in particular) would be fairly easy.  But, as someone who
knows somewhat of FFS, I would say some of the operations, in
particular anything that involves allocating space after EOF, would be
rather difficult to implement even for someone who's past the learning
curve.  (Some others, while easy, would be tedious and expensive;
FALLOC_FL_COLLAPSE_RANGE is an example.  While it would technically be
possible to make it fast and simple by taking advantage of the
granularity requirement leeway, such an implementation would be too
restrictive to be worth doing.)
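
For reference, the Linux API under discussion looks like this - sketch
only, Linux-specific; NetBSD has no fallocate(2) today:

#define _GNU_SOURCE
#include <fcntl.h>
#include <err.h>

static void
demo(int fd)
{
	/* mode 0: plain preallocation - the comparatively easy case */
	if (fallocate(fd, 0, (off_t)0, (off_t)1 << 20) == -1)
		err(1, "fallocate mode 0");

	/* remove a block-aligned range and shift the tail down - the
	   sort of operation that is hard or tedious for FFS */
	if (fallocate(fd, FALLOC_FL_COLLAPSE_RANGE, (off_t)0,
	    (off_t)1 << 20) == -1)
		err(1, "FALLOC_FL_COLLAPSE_RANGE");
}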

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Lock of NetBSD-current with ifconfig down / up

2022-09-17 Thread Mouse
> I've ordered some PS/2 keyboards, because I take it that's the only
> way to reliably get in to the kernel debugger on amd64, unless
> someone knows a trick to make USB keyboards usable.

This is true - to the extent it _is_ true - only if you insist on video
console.  I find a break condition on serial console works well too.

But I think that used to, and may still, depend on having a real serial
port, which I gather recent machines may not, even if they have the
connector for it.  (I've heard it said they tend to have a
USB-to-serial chip on an internal USB hub, though I have very limited
experience with machines that recent.)

As for USB keyboards, if you tell the BIOS to fake a real keyboard and
then arrange for the OS to ignore the keyboard's USB existence, you may
find yourself with a USB-hardware keyboard that looks like a PS/2
keyboard to the OS.  But this may require disabling all OS knowledge of
USB or something comparably drastic, which may or may not be an option
for your use case.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


  1   2   3   4   5   6   7   8   9   10   >