Re: How does PCIe appear to the host?

2024-10-03 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>It's supposed to negotiate down to x1?

Yes.

> Then either Vantec or ASRock
>has done something odd or my particular Q1900M has a duff "x16" slot,
>because it doesn't work.

I once had a PCIe network card in a x16 slot that didn't work reliably
and wasn't recognized now and then. Reason was that the edge connector
wasn't correctly aligned and I had to shape it with a file. Some things
are just too cheap.


Greetings,


Re: How does PCIe appear to the host?

2024-10-03 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>ahcisata0 at pci2 dev 0 function 0: vendor 0x197b product 0x0585

That's a JMicron JMB585 which has a PCIe Gen3 x2 interface and
provides five 6Gbps SATA ports. If your board has eight SATA
ports, then one of the SATA ports probably has an additional
1-to-4 port multiplier (so these share the bandwidth).

A JMB585 should have no problem working in a x1 slot.
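
To put rough numbers on the bandwidth sharing, here is a back-of-the-envelope sketch in C. It assumes the usual nominal figures (8 GT/s with 128b/130b coding per PCIe Gen3 lane, 6 Gbit/s with 8b/10b coding per SATA port) and ignores packet and protocol overhead. The function names are made up for illustration.

```c
#include <assert.h>

/*
 * Nominal payload rate of a PCIe Gen3 link in MB/s:
 * 8000 Mbit/s raw per lane, 128b/130b line coding, 8 bits per byte.
 * Protocol overhead (TLP headers, flow control) is ignored.
 */
unsigned pcie_gen3_mbytes_per_sec(unsigned lanes)
{
	assert(lanes > 0);
	return lanes * 8000u * 128u / 130u / 8u;
}

/*
 * Nominal payload rate of SATA 6 Gbit/s ports in MB/s:
 * 6000 Mbit/s raw per port, 8b/10b line coding, 8 bits per byte.
 */
unsigned sata3_mbytes_per_sec(unsigned ports)
{
	return ports * 6000u * 8u / 10u / 8u;
}
```

Under these assumptions a x1 link tops out near 984 MB/s and a x2 link near 1969 MB/s, while five SATA ports could nominally deliver 3000 MB/s. The card can work on one lane, but the ports then share that one lane's bandwidth.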



Re: How does PCIe appear to the host?

2024-10-03 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>I note a possible conflict between the "x1" and the presence of a x16
>slot; that 1 is coming from the PCIE_LCAP_MAX_WIDTH bits in PCIE_LCAP,
>which makes me wonder whether something needs configuring to run the
>x16 slot at more than x1.  The card does say it needs a x4 or higher
>slot to work, so if the x16 slot is running x1 (is that even possible?)
>that might be responsible.

If the card requires x4 and you only have x1 it will not work.

PCIe cards generally are supposed to also run with fewer lanes, but
some older or more esoteric cards will not negotiate this.
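
Whether a link has trained with fewer lanes than the device supports can be read from the PCIe capability. The sketch below assumes the register layout from the PCIe specification, where both the Link Capabilities and the Link Status register carry a link width in bits 9:4. The macro and function names are illustrative, not the ones NetBSD uses.

```c
#include <stdint.h>

/* Link width field, bits 9:4 in both Link Capabilities and Link Status. */
#define LINK_WIDTH_SHIFT 4
#define LINK_WIDTH_MASK  0x3fu

/* Extract the lane count (1, 2, 4, 8, 16, ...) from either register. */
unsigned pcie_link_width(uint32_t reg)
{
	return (reg >> LINK_WIDTH_SHIFT) & LINK_WIDTH_MASK;
}

/* True if the negotiated width is below the maximum the port advertises. */
int pcie_link_is_narrower(uint32_t lcap, uint16_t lsta)
{
	return pcie_link_width(lsta) < pcie_link_width(lcap);
}
```

A port that advertises x1 in its Link Capabilities cannot train wider, whatever the mechanical size of the slot.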


>One of these machines is an ASRock Q1900M.  It has only two SATA ports
>onboard; it has two PCIe x1 slots and a PCIe x16 slot.  I just today
>picked up a 5-port PCIe SATA card and tried it.

The J1900 SoC on that board only provides 4 PCIe lanes.

While the SoC does also support x4 and x2 configurations, your
board is limited to the three slots with one lane each, and
the fourth lane is used for the RTL8111 ethernet chip.

I use a

ahcisata2 at pci2 dev 0 function 0: Marvell 88SE9215 SATA Controller (rev. 0x11)

card in a x16 slot that also has only one lane. It provides 4 SATA ports.
The 88SE9215 is an ahcisata-compatible controller (unlike older Marvell
chips that require their own driver and have some problems).

I also have a

jmide0 at pci1 dev 0 function 0: JMicron Technology JMB363 SATA/PATA Controller (rev. 0x03)

similarly in a x4 slot with only one lane. It provides two SATA ports
and a PATA port that I use for an old DVD writer.



Re: Can't get SPI to work

2024-09-30 Thread Michael van Elst
nikitka.dons...@yandex.ru ("Nikita Donskov") writes:

>from RPI's firmware repo:
>https://github.com/raspberrypi/firmware/blob/master/boot/overlays/spi1-1cs.dtbo

Maybe spi0-1cs.dtbo works better.


>   dtoverlay=spi1-1cs,cs0_spidev=off

The *spidev* attributes are for Linux and the Linux driver.

You can use http://cdn.netbsd.org/pub/NetBSD/misc/mlelstv/spi.dtbo,
the source is http://cdn.netbsd.org/pub/NetBSD/misc/mlelstv/spi.dts:

/dts-v1/;
/plugin/;

/ {
    compatible = "brcm,bcm2835";

    fragment@0 {
        target = <&spi>;

        __overlay__ {
            status = "okay";
            pinctrl-names = "default";
            pinctrl-0 = <&spi0_gpio7>;
        };
    };
};


Greetings,


Re: assertion "lp_max >= core_max" failed

2024-09-22 Thread Michael van Elst
cme...@cmeerw.org (Christof Meerwald) writes:

>Turns out in my case the CPUID_HTT flag is not set (so lp_max = 1),
>but the maximum number of cores per package is set to 16 - 1, so
>core_max = 16.

That's a confusing number, the chip has 14 cores and 28 threads.
If HTT is disabled, that would still be only 14 threads. So
I guess, the VM software fakes these values.


>but we don't check for that condition in the Intel case - should we?
>Adding that case for Intel seems to fix it for me, e.g.

>   /* Check for leaf 4 support. */
>   if (ci->ci_max_cpuid >= 4 &&
>   (ci->ci_feat_val[0] & CPUID_HTT) != 0) {
>   /* Maximum number of Cores per package (eax[31:26]). */
>   x86_cpuid2(4, 0, descs);
>   core_max = __SHIFTOUT(descs[0], CPUID_DCP_CORE_P_PKG)
>   + 1;
>   } else {
>   core_max = 1;
>   }

Wouldn't that tell us that we have only 1 core / 1 thread?

Maybe instead of failing when lp_max < core_max, we should just
gracefully handle this case as lp_max == core_max.
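
The graceful handling suggested above could look roughly like the following. This is a standalone sketch and not the actual x86 identification code: the struct and function are hypothetical, and only the names lp_max and core_max are taken from the discussion.

```c
#include <assert.h>

struct cpu_limits {
	unsigned lp_max;       /* logical processors per package */
	unsigned core_max;     /* cores per package */
	unsigned smt_per_core; /* derived: threads per core */
};

/*
 * Instead of asserting lp_max >= core_max, treat an inconsistent
 * pair (e.g. values faked by VM software) as lp_max == core_max.
 */
struct cpu_limits reconcile_limits(unsigned lp_max, unsigned core_max)
{
	struct cpu_limits l;

	assert(core_max > 0);
	if (lp_max < core_max)
		lp_max = core_max;
	l.lp_max = lp_max;
	l.core_max = core_max;
	l.smt_per_core = lp_max / core_max;
	return l;
}
```

With the reported values (lp_max = 1, core_max = 16) this yields 16 logical processors with one thread per core, rather than a failed assertion.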



Re: Out of memory debug

2024-08-26 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>Running bt/a c6f1f980 at various times shows pageadaemon is alive. Is
>it just that the system is out of memory, and nothing can be reclaimed? 
>There is no swap configured.


The system is waiting for kernel memory. On 32bit systems
that can be more limited than physical memory.

pgdaemon is spinning because no memory gets freed (there are already
free pages) and no KVA space is released. By spinning, it also slows
down the process of freeing memory.

Usually there is some unused memory held by the vnode cache and
associated data. Reducing desiredvnodes (in DDB) which is the
same as the sysctl item kern.maxvnodes can revive the system,
but of course that doesn't solve the problem.



Re: config: conditional at clause (was: vio9p vs. GENERIC.local vs. XEN3_DOM[0U])

2024-08-12 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>> The right level of abstraction is to do something that says
>> 
>>   if there is a virtio bus, add viop* at virtio*
>I have no idea about config's internal workings, but what about
>   viop* at? virtio*


What you can do is e.g.

ifndef xpci
vio9p* at virtio?
endif

The conditional configuration only works on attributes and the various
virtio attributes are currently defined unconditionally.

But the xpci attribute only exists for x86/xen builds.



Re: could fstat(1) show files in use by vnd(4)?

2024-08-11 Thread Michael van Elst
wo...@planix.ca ("Greg A. Woods") writes:

>After a reboot we can see the vnd(4) uses:

>   # vndconfig -l
>   vnd0: /build (/dev/mapper/vg0-build) inode 861956
>   vnd1: /build (/dev/mapper/vg0-build) inode 861966
>   vnd2: /build (/dev/mapper/vg0-build) inode 861953
>   vnd3: not in use

>So, might it be possible to have fstat show these somehow?  (perhaps
>with the/a kernel thread identified as having them open)

fstat reports file descriptors, not the underlying file handles.

pstat -f reports the file handles, but it dumps only a pointer to a vnode
and nothing that identifies a file directly.

pstat -v does dump the vnodes.

# ls -li testimage 
23471029 -rw-r--r--  1 mlelstv  staff  10485760 Aug 12 07:11 testimage
...

# vnconfig -l
vnd0: /home (/dev/dk5) inode 23471029
...

# pstat -v
...
*** MOUNT ffs /dev/dk5 on /home (log,nodev,nosuid,local)
ADDR TYP VFLAG  USE HOLD TAG NPAGE FILEID IFLAG RDEV|SZ
...
19fe4c00 reg M20   1 0 23471029   - 10485760
...

N.B.

# pstat -f | grep 19fe4c00
#

There is no file handle, vnd uses the vnode directly.



>Also, is this a crash that should be fixed, or is "umount -f" always a
>Buyer-Beware operation with expected "undefined behaviour"?

umount -f should either do nothing or do the umount, while mogrifying
each vnode to reference deadfs where all operations are supposed to
fail except for releasing the vnode. Still buyer-beware, but otherwise
a clearly defined behaviour.

The crash seems to happen at:

error = VOP_BMAP(vnd->sc_vp, bn / bsize, &vp, &nbn, &nra);

where

bsize = vnd->sc_vp->v_mount->mnt_stat.f_iosize;

deadfs hasn't initialized f_iosize. Maybe:

Index: vnd.c
===
RCS file: /cvsroot/src/sys/dev/vnd.c,v
retrieving revision 1.289
diff -p -u -r1.289 vnd.c
--- vnd.c   19 May 2023 15:42:43 -  1.289
+++ vnd.c   12 Aug 2024 05:31:36 -
@@ -875,6 +875,9 @@ handle_with_strategy(struct vnd_softc *v
bn = obp->b_rawblkno * vnd->sc_dkdev.dk_label->d_secsize;
 
bsize = vnd->sc_vp->v_mount->mnt_stat.f_iosize;
+   /* use default if the filesystem didn't specify a block size */
+   if (bsize <= 0)
+   bsize = BLKDEV_IOSIZE;
skipped = 0;
 
/*



Re: SCSI changes - PR58452 for review

2024-08-09 Thread Michael van Elst
nathanialsl...@yahoo.com.au (Nat Sloss) writes:

>Not sure I don't know if other scsi controller drivers have this issue (cjep@ 
>reported to me that there was kernel message "Should have flushed the queue" 
>reported with bluescsi-v2 and esp(4) when used along with the dse(4) driver).

I'm using a "ZuluSCSI" device that is almost identical to the BlueSCSI v2
and so far no problems. It's connected to an Amiga using the ahsc driver
(WD33C93 controller).

I doubt that the scsipi layer requires any changes to accommodate an NCR5380.



Re: MNT Reform2 USB LCP flash

2024-02-04 Thread Michael van Elst
staf...@shangtai.net (Staffan Thomén) writes:

>While I was fiddling around with it, I booted a FreeBSD-14 thumbdrive 
>and there it does work as well, and their driver helpfully tells you 
>what quirks it uses. This is what I found:

>umass quirks: 0xc104
>0x0004 - NO_START_STOP, "The drive does not support START STOP"
>0x0100 - NO_GETMAXLUN, "No GetMaxLun call"
>0x4000 - NO_SYNCHRONIZE_CACHE, "Deice cannot handle a SCSI synchronize 
>cache command."
>0x8000 - NO_PREVENT_ALLOW, "Device does not support PREVENT/ALLOW MEDIUM 
>REMOVAL"

>da: quirks: 0x2
>0x2 NO_6_BYTE - use SBC (10-byte) commands instead of RBC (6-byte) commands


There is sys/dev/usb/umass_quirks.c.

These quirks exist:

PQUIRK_NOSYNCCACHE (like NO_SYNCHRONIZE_CACHE)
PQUIRK_NODOORLOCK (like NO_PREVENT_ALLOW)
PQUIRK_ONLYBIG (like NO_6_BYTE)

We don't do GetMaxLun.

There seems to be nothing yet for NO_START_STOP. There is

PQUIRK_START

that forces a start at attach time. But at open time when
the unit still does not report ready, we issue the command
again (and fail if it doesn't succeed). We probably need
another quirk PQUIRK_NOSTART and check it in scsipi_start()
similar to the PQUIRK_NODOORLOCK in scsipi_prevent().
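
As a quick cross-check of the mapping above, the following sketch takes the FreeBSD quirk mask from the quoted output and reports which bits have no counterpart on the NetBSD side yet. The bit values are the ones FreeBSD printed. The macro and function names are made up for this example, and the NetBSD quirks appear only in comments because their numeric values are not given here.

```c
#include <stdint.h>

/* FreeBSD umass quirk bits as listed in the quoted output. */
#define FBSD_NO_START_STOP        0x0004u
#define FBSD_NO_GETMAXLUN         0x0100u
#define FBSD_NO_SYNCHRONIZE_CACHE 0x4000u
#define FBSD_NO_PREVENT_ALLOW     0x8000u

/*
 * Bits that are already covered on the NetBSD side:
 *   NO_SYNCHRONIZE_CACHE -> PQUIRK_NOSYNCCACHE
 *   NO_PREVENT_ALLOW     -> PQUIRK_NODOORLOCK
 *   NO_GETMAXLUN         -> not needed, GetMaxLun is not issued
 */
#define COVERED_BITS \
	(FBSD_NO_GETMAXLUN | FBSD_NO_SYNCHRONIZE_CACHE | FBSD_NO_PREVENT_ALLOW)

/* Return the quirk bits that still lack an equivalent. */
uint32_t uncovered_quirks(uint32_t fbsd_quirks)
{
	return fbsd_quirks & ~COVERED_BITS;
}
```

For the mask 0xc104 only NO_START_STOP (0x0004) remains, which is the case a new PQUIRK_NOSTART would have to cover.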



Re: MNT Reform2 USB LCP flash

2024-02-04 Thread Michael van Elst
On Sun, Feb 04, 2024 at 10:37:59AM +0200, Staffan Thomen wrote:

> [ 214.0188739] umass0: NXP (0x1fc9) LPC1XXX IFLASH (0x000b), rev 2.00/7.04,

> [ 214.0288745] sd0(umass0:0:0:0):  sense debug information:
> [ 214.0288745] code 0x70 valid 0
> [ 214.0288745] seg 0x0 key 0x2 ili 0x0 eom 0x0 fmark 0x0
> 
> [ 214.0288745] info: 0x0 0x0 0x0 0x0 followed by 10 extra bytes
> [ 214.0288745] extra (up to 10 bytes): 0x0 0x0 0x0 0x0 0x30 0x1 0x0 0x0
> 0x0 0x0

That's what the device answers, but I cannot tell why. Maybe
the device is not (yet) in the correct mode to accept USB
access.
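
For reading such a dump, here is a small sketch that pulls the interesting fields out of fixed-format sense data. It assumes the standard layout (response code 0x70/0x71 in byte 0, sense key in the low nibble of byte 2, additional length in byte 7, ASC and ASCQ in bytes 12 and 13), so the "extra" bytes printed above would start at byte 8. The struct and function are hypothetical and not part of scsipi.

```c
#include <stddef.h>
#include <stdint.h>

struct sense_summary {
	unsigned key;  /* sense key, e.g. 0x2 = NOT READY */
	unsigned asc;  /* additional sense code */
	unsigned ascq; /* additional sense code qualifier */
};

/* Returns 0 on success, -1 if the data is too short or not fixed format. */
int summarize_fixed_sense(const uint8_t *buf, size_t len,
    struct sense_summary *out)
{
	unsigned code;

	if (len < 14)
		return -1;
	code = buf[0] & 0x7f;
	if (code != 0x70 && code != 0x71)
		return -1;
	if (buf[7] < 6)		/* ASC/ASCQ not included */
		return -1;
	out->key = buf[2] & 0x0f;
	out->asc = buf[12];
	out->ascq = buf[13];
	return 0;
}
```

Applied to the dump above this gives key 0x2 (NOT READY) with ASC/ASCQ 0x30/0x01. If I read the SPC tables correctly, that pair means "cannot read medium - unknown format", which would fit a device that is not ready to present its storage.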

The product code 0x1fc9:0x000b seems to be a LPC11U24,
there is an application note AN11305 from NXP for
"USB In-System Programming with the LPC11U3X/LPC11U2X",
but I didn't find any hints in that document.


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: MNT Reform2 USB LCP flash

2024-02-04 Thread Michael van Elst
On Sun, Feb 04, 2024 at 09:58:39AM +0200, Staffan Thomén wrote:
> 
> The man page for scsictl(8) says that SCSIPI_DEBUG is the required option...

SCSIPI_DEBUG is it.

It should also set the default debug flags (that the ioctl may
change).

Greetings,


Re: MNT Reform2 USB LCP flash

2024-02-03 Thread Michael van Elst
On Sat, Feb 03, 2024 at 09:55:47PM +0200, Staffan Thomén wrote:
> Staffan Thomen wrote:
> > [ 188.679957] sd0: 34816, 1 cyl, 64 head, 32 sec, 512 bytes/sect x 68
> > sectors
> > [ 188.689958] autoconfiguration error: sd0: unable to open device,
> > error = 5
> 
> Any thoughts of how to continue debugging this?


A kernel compiled with SCSI_DEBUG can show which commands to the
device actually fail and how. This should be the initial TEST_UNIT_READY
and possibly the START command. No idea why, it's possible that
these are not implemented and errors need to be ignored.


Greetings,


Re: MNT Reform2 USB LCP flash

2024-01-26 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>I have been meaning to suggest for ages that we remove all the
>geometry nonsense from everywhere in the kernel, except those
>drivers that actually need it.

We use that nonsense without actually knowing.

The "cylinder" value is used to sort disk accesses.
The "sector" value was used to optimize filesystem allocation.

Neither takes the values as is (the values are mostly fake
anyway), but as a hint.

Newer technologies may not use C/H/S coordinates, but every HDD
still uses cylinders and every SSD has a topology based on erase
blocks where "cylinder" could be a hint to optimize accesses.

So, such information should not be removed but needs to be exposed.
Better if it were exposed 1:1 from the underlying technology, but
it still needs to be compatible with the abstractions that use
it.

Doesn't mean that you could not find a better abstraction for a
storage medium in the future. For now, pretending everything is
a rotational disk from 50 years ago doesn't hurt but helps.



Re: MNT Reform2 USB LCP flash

2024-01-26 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>If you are able, try building a kernel with the patch below.

>I suspect this should probably apply without too many problems
>to any reasonably modern NetBSD kernel version, patch is to
>src/sys/dev/scsipi/sd.c

>+  if (dp->cyls == 0)  /* very small devices */
>+  dp->cyls = 1;   /* round up # cyls */


People using the cylinder count assume that a disk is made of cylinders,
heads (surfaces) and sectors and that cyls * heads * sectors is the capacity.

For modern disks that's not true.

The values are intentionally truncated so that such people cannot access
blocks beyond the end of the device and software that (still) uses C/H/S
coordinates has a chance to use modern devices.

An alternative handling would be to round up the values so that you can reach
all blocks using C/H/S coordinates and non-existent blocks return errors.
But what purpose would such fictitious C/H/S coordinates serve, and would
software relying on C/H/S be able to handle the errors?

Rounding up only for disks with less than one full cylinder only helps
people that suffer from oudenophobia.



Re: MNT Reform2 USB LCP flash

2024-01-26 Thread Michael van Elst
staf...@shangtai.net (Staffan Thomén) writes:

>[21.611880] scsibus1 at umass1: 2 targets, 1 lun per target
>[21.611880] sd1 at scsibus1 target 0 lun 0: 1.0> disk removable
>[21.611880] sd1: fabricating a geometry
>[21.611880] sd1: 34816, 0 cyl, 64 head, 32 sec, 512 bytes/sect x 68 
>sectors
>[21.611880] autoconfiguration error: sd1: unable to open device, 
>error = 5

>It seems a bit interesting that it reports 2 targets, but only creates 
>an sd for one,

The '2 targets' is a parameter of 'scsibus1', it tells the SCSI layer
that it may look for up to 2 targets. USB mass storage usually only
has a single 'sd' target, but some also provide an extra 'ses'
enclosure target.


>and 0 cylinders seems a bit suspicous but I don't know if 
>that's ok or not.

When the drive doesn't return a valid geometry, the driver uses
a fake one, based on 64 heads and 32 sectors per head. In your
case the drive is smaller than a single cylinder (64*32), so
you get zero (full) cylinders.
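
The arithmetic behind the "0 cyl" line can be sketched as follows. This is a minimal illustration of the truncating division and not the actual sd(4) code; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of full cylinders for a fabricated heads/sectors geometry. */
uint64_t fabricated_cyls(uint64_t total_sectors, unsigned heads, unsigned secs)
{
	assert(heads > 0 && secs > 0);
	return total_sectors / ((uint64_t)heads * secs);
}

/* Sectors reachable through C/H/S coordinates with that geometry. */
uint64_t chs_capacity(uint64_t total_sectors, unsigned heads, unsigned secs)
{
	return fabricated_cyls(total_sectors, heads, secs) * heads * secs;
}
```

With 68 sectors and a 64 x 32 geometry one cylinder would be 2048 sectors, so the division truncates to zero cylinders and no block is reachable through C/H/S at all, while logical block addressing is unaffected.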

Fortunately the drive geometry isn't really used anywhere. All
accesses just use the logical block addresses.


The EIO (5) error probably occurs because the drive is reported
as 'offline'. This is like a drive with a removable medium but
no medium has been loaded.

It is possible that there needs to be some action to 'load'
the 'medium', or it might just take some time to appear online.
You may use

   scsictl sd1 start

to attempt another access.

The LPC1xxx manual didn't reveal anything obvious about
this problem. It just claims that you can copy the firmware
to the storage. It also doesn't say how. With just 68 sectors
that's not a fake filesystem; you probably need to write
the firmware image to the raw device.



Re: *oldlenp comes back with wrong value in helper sysctl_createv() function

2024-01-20 Thread Michael van Elst
On Sat, Jan 20, 2024 at 10:48:12AM +0100, Emile 'iMil' Heitor wrote:


Hi,

that's from sysctl.c:

case CTLTYPE_STRING: {
    unsigned char buf[1024], *tbuf;
    tbuf = buf;
    sz = sizeof(buf);
    rc = prog_sysctl(&name[0], namelen, tbuf, &sz, NULL, 0);

The sysctl command first tries with a buffer of 1024 bytes
and retries with the right size when that was too small.

Compared to "probing" with a NULL buffer this saves a round trip
to the kernel for most sysctls.

A simple helper function would always return the needed size and only
copy out when oldp was set. sysctl will check the returned *oldlenp 
against the value passed by the caller and return ENOMEM as appropriate.
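
The division of labour described above can be simulated in userland like this. It is only a sketch of the behaviour and deliberately does not use the kernel's sysctl(9) interfaces; both functions and their signatures are invented for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Helper part: always report the size needed and copy out
 * (at most avail bytes) only when the caller supplied a buffer.
 */
static size_t helper_copyout(const char *value, void *oldp, size_t avail)
{
	size_t needed = strlen(value) + 1;

	if (oldp != NULL)
		memcpy(oldp, value, needed < avail ? needed : avail);
	return needed;
}

/*
 * Framework part: compare the size reported by the helper with the
 * size the caller passed in and return ENOMEM if the buffer was too small.
 */
int fake_sysctl_string(const char *value, void *oldp, size_t *oldlenp)
{
	size_t passed = (oldp != NULL) ? *oldlenp : 0;
	size_t needed = helper_copyout(value, oldp, passed);

	*oldlenp = needed;
	if (oldp != NULL && needed > passed)
		return ENOMEM;
	return 0;
}
```

A caller that starts with a fixed buffer, as sysctl(8) does with 1024 bytes, gets the required size back in *oldlenp on ENOMEM and can retry with a buffer of exactly that size.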


Greetings,


Re: *oldlenp comes back with wrong value in helper sysctl_createv() function

2024-01-20 Thread Michael van Elst
i...@home.imil.net (Emile 'iMil' Heitor) writes:

>Except it does not, the first time it calls back the helper function,
>*oldlenp value is 1024 no matter what I set it to before.
>But if I return once again (either with ENOMEM or 0, doesn't matter),
>the helper function will now be called with the right *oldlenp value.

The helper function produces the value that is returned in *oldlenp.

If you happen to use CTL_DESCRIBE (e.g. running sysctl -d), it's
not your helper function being called but the sysctl_describe helper
that returns a value of 1024.

Maybe you can show your helper routine and how you call sysctl ?





Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>> Modern hardware could easily do 100kHz.

>Not with curren^Wat least one moderately recent NetBSD version!

>At work, I had occasion to run 9.1/amd64 with HZ=8000.  This was to get
>8-bit data pushed out a parallel port at 8kHz; I added special-case
>hooks between the relevant driver and the clock (I forget whether
>softclock or hardclock).  It worked for its intended use fairly
>nicely...but when I tried one of my SIGALRM testers on it, instead of
>the 100Hz it asked for, I got signals at, IIRC, about 77Hz.


Scheduling and switching userland processes is heavy. For a test
try to schedule kernel callouts with high HZ values. That still
generates lots of overhead with the current design but you should
be able to go faster than 8kHz.



PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
On Sun, Dec 31, 2023 at 12:42:29AM +0100, Johnny Billquist wrote:
> > Better than 100Hz is possible and still precise. Something around 1000Hz
> > is necessary for human interaction. Modern hardware could easily do 100kHz.
> 
> ? If I remember right, anything less than 200ms is immediate response for a
> human brain. Which means you can get away with much coarser than even 100Hz.
> And there are certainly lots of examples of older computers with clocks
> running in the 10s of ms, where human interaction feels perfect.

You may not be able to react faster than 200ms, but you can notice
shorter time periods.


> I think that is a separate question/problem/issue. That we fail when guest
> and host run at the same rate is something I consider a flaw in the system.

With a fixed tick, they cannot run at the same speed. This becomes
obvious when you try to run at different speeds that aren't just
integer multiples.

N.B. my m68k emulator runs a HZ=100 guest without a problem. But that's
a fake; in reality it only runs 100 ticks per second on average, in
particular when the guest becomes idle.


Greetings,


Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
> 
> Right. But if you expect high precision on delays and scheduling, then you
> start also having issues with just random unpredictable delays because of
> other interrupts, paging, and whatnot. So in the end, your high precision
> delays and scheduling becomes very imprecise again. So, is there really that
> much value in that higher resolution?

Better than 100Hz is possible and still precise. Something around 1000Hz
is necessary for human interaction. Modern hardware could easily do 100kHz.

Another advantage is that you can use independent timing (that's what
bites in the emulator case where guest and host clocks run at the same
rate).



Re: PSA: Clock drift and pkgin

2023-12-30 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>Being able to measure time with high precision is desierable, but we can 
>already do that without being tickless.

We cannot delay with high precision. You can increase HZ to some degree,
but that comes at a price.



Re: PSA: Clock drift and pkgin

2023-12-24 Thread Michael van Elst
sim...@netbsd.org (Simon Burge) writes:

>qemu uses ppoll() which is implemented with pollts() to do emulated
>timers, so that doesn't help here.  I don't know what simh uses, nor
>any of the other emulators.

simh uses pthread_cond_timedwait().

This actually waits using TIMER_ABSTIME for a deadline, but which is
converted to a timeout with ts2timo() and passed to sleepq_block()
as a number of ticks to wait for.

ts2timo() uses tvtohz() which rounds up...



Re: PSA: Clock drift and pkgin

2023-12-23 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>} else if (sec <= (LONG_MAX / 100))
>ticks = (((sec * 100) + (unsigned long)usec + (tick - 1))
>/ tick) + 1;

>which looks suspicious.  If sec is zero and usec is tick, that
>expression will return 2 instead of the 1 I suspect it needs to return.


The delay is always rounded up to the resolution of the clock,
so waiting for 1 microsecond waits at least 10ms.

The interval to the next tick can be arbitrarily short. Waiting
for at least 10ms therefore means to wait for the second next
tick.

In a tickless system, such a problem doesn't exist.
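
The rounding described above can be reduced to the following sketch. It mirrors the shape of the quoted expression for delays below one second and is not the kernel's conversion routine itself; the function name is invented and tick_us stands for the tick length in microseconds (10000 for HZ=100).

```c
#include <assert.h>

/*
 * Convert a relative delay in microseconds to a number of ticks.
 * The delay is rounded up to whole ticks, and one more tick is added
 * because the current tick may be almost over already.
 */
unsigned long usec_to_ticks(unsigned long usec, unsigned long tick_us)
{
	assert(tick_us > 0);
	return (usec + tick_us - 1) / tick_us + 1;
}
```

With HZ=100 a 1 microsecond delay and a 10000 microsecond delay both become 2 ticks. That guarantees at least the requested time, but it can take up to nearly 20ms.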



Re: Change max ttys from 8 to 12?

2023-12-19 Thread Michael van Elst
u...@stderr.spb.ru (Valery Ushakov) writes:

>Switching from a fixed size array to a dynamic one is probably not too
>much work either.  But then, overall, I think that trying to make the
>kernel substitute for screen, tmux (in base), etc is kinda dead end,
>so I'd rather we don't encourage it.


We currently already have an inconsistent configuration. wscons
is limited to 8 screens but there are keyboard symbols to switch
to 10 screens where the last two cannot be used.

So bumping the screen limit to 10 shouldn't really be the question.

Bumping both limits to 12 (and augmenting keysyms and the keymaps
for this) would align this with the other *BSDs. The con side here
is that some keyboards either only have 10 function keys or already
use F11 and F12 for other purposes (e.g. the DEC keyboard has F11=ESC,
so ctrl-alt-f11 invokes DDB).



Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread Michael van Elst
tlaro...@kergis.com writes:

>disable {drmkms}   # NEW: disable devices belonging to group "drmkms"

Almost no one would need to turn off all drmkms drivers. What you may
want to control is that a GPU isn't used as a console. Disabling a driver
is just our crude workaround to achieve this.

I don't think that autoconf is the right place for such a control,
it should be a boot parameter, maybe even something that can be
changed at runtime later.

The current system of boot parameters is limited and differs a lot
between platforms. We need a common way to set boot parameters and
these should be mostly defined in a platform-agnostic way.


>Hint: Linuces distributions "work" as proposed images on servers,
>where NetBSD fails.

Servers usually do not have drmkms capable hardware, and if they have,
you probably want to use that hardware.



Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread Michael van Elst
r...@sdf.org (RVP) writes:

>On Sat, 4 Nov 2023, RVP wrote:

>> 1) Allowing shell-like patterns (not hard to implement):
>>
>> uc>  disable *drm* *usb$ # all with `drm' anywhere and those ending in 
>>

>Ah, since these are shell-like patterns there's not need for a `$' to
>denote EOL. So:


userconf already supports "patterns", just not in the way you think.



Re: Unexpected out of memory kills when running parallel find instances over millions of files

2023-10-19 Thread Michael van Elst
mjgu...@gmail.com (Mateusz Guzik) writes:

>> While vnodes would be recyclable, they hardly get recycled unless
>> an filesystem object is deleted or the filesystem is unmounted.

>They get recycled all the time by vdrain thread if numvnodes goes above
>desiredvnodes, like it does in this test.

They should also be recycled when memory gets tight but they aren't.

As a consequence, not only the vnodes stay in memory, but also all
cached pages (the UVM objects).


>> Without swap, the kernel also has no chance to evict process pages
>> to grow the vnode cache further.

>It should not be trying to grow the vnode cache. If anything it should
>stop it from blowing out of proportion and definitely should not kill
>processes in presence of swaths of immediately freeable vnodes.

As long as you don't exceed maxvnodes (a value that got larger in
netbsd-10), almost nothing is freed.



Re: Unexpected out of memory kills when running parallel find instances over millions of files

2023-10-19 Thread Michael van Elst
mjgu...@gmail.com (Mateusz Guzik) writes:

>Running 20 find(1) instances, where each has a "private" tree with
>million of files runs into trouble with the kernel killing them (and
>others):
>[   785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap


>This should not be happening -- there is tons of reusable RAM as
>virtually all of the vnodes getting here are immediately recyclable.

While vnodes would be recyclable, they hardly get recycled unless
a filesystem object is deleted or the filesystem is unmounted.

>Specs are 24 cores, 24G of RAM and ufs2 with noatime. swap is *not* configured.

Without swap, the kernel also has no chance to evict process pages
to grow the vnode cache further.

You can try to avoid that situation by reducing the amount of cached
vnodes by setting kern.maxvnodes with sysctl. That value would need
to be more dynamic to actually exercise pressure when memory runs short.

N.B. it's possible for the system to lock up too. You can change maxvnodes
through ddb by setting the kernel variable desiredvnodes. There is a good
chance that the system recovers.



Re: dumping on RAIDframe

2023-09-25 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>> you dump a memory block that isn't a multiple of a disk sector 
>> (according to disklabel)
>You mean this one (from disklabel raid0):
>   bytes/sector: 512
>?

Yes. Which makes it unlikely.

amd64/machdep.c:

this dumps at least DEV_BSIZE:

to_write = roundup(dump_headerbuf_ptr - dump_headerbuf, dbtob(1));
error = bdev->d_dump(dumpdev, dump_header_blkno,
    dump_headerbuf, to_write);

and this is called with bytes being a multiple of the page size:

for (i = 0; i < bytes; i += n, dump_totalbytesleft -= n) {
    n = bytes - i;
    if (n > BYTES_PER_DUMP)
        n = BYTES_PER_DUMP;
    ...
    error = (*dump)(dumpdev, blkno, (void *)dumpspace, n);
    ...
}
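
To check the argument, the two call patterns can be replayed in a standalone sketch. The sector size is the 512 from the disklabel above. CHUNK_MAX is only a stand-in for BYTES_PER_DUMP, whose real value is not shown here, and is assumed to be 4096. The function names are hypothetical.

```c
#include <stddef.h>

#define SECTOR_SIZE 512u  /* "bytes/sector: 512" from the disklabel */
#define CHUNK_MAX   4096u /* assumed stand-in for BYTES_PER_DUMP */

/* Round a header length up to a whole number of sectors. */
size_t round_up_to_sector(size_t n)
{
	return (n + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/*
 * Walk a region in chunks like the dump loop does and count the
 * chunks whose length is not a multiple of the sector size.
 */
size_t count_unaligned_chunks(size_t bytes)
{
	size_t i, n, bad = 0;

	for (i = 0; i < bytes; i += n) {
		n = bytes - i;
		if (n > CHUNK_MAX)
			n = CHUNK_MAX;
		if (n % SECTOR_SIZE != 0)
			bad++;
	}
	return bad;
}
```

As long as the region length is a multiple of the page size and the chunk limit is itself a multiple of 512, no chunk can be misaligned, which is why case a) looks unlikely.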




Re: dumping on RAIDframe

2023-09-25 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>GO>Dumping to a RAID 1 set is supported in -8.  But yes, none of those 
>GO>values seem to align with each other.  18,1 is 'raid0b' thouugh, so that 
>GO>part seems correct.

>MvE> offset and size relate to the dump data (dumplo and dumpsize), not
>MvE> the partition.


The "device not ready" comes from the driver dump routine
returning EFAULT. The error code is abused, it is reported
when a) you dump a memory block that isn't a multiple of
a disk sector (according to disklabel) or b) you start a dump
while a dump is already running.

Maybe dump errors shouldn't be printed with DPRINTF.



Re: GPT attributes in dkwedge [PATCH]

2023-09-25 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>Date:Mon, 25 Sep 2023 05:57:49 +
>From:Emmanuel Dreyfus 
>Message-ID:  

>  | bootme.cfg is searched in EFI paririon /EFI/NetBSD/boot.cfg

>Which EFI partition?   I think I have about 5 or 6, sprinkled around
>various bootable devices (more than one on some).


There can be multiple EFI system partitions on a drive, but it sometimes
confuses software, some boot procedures will only handle the first ESP.

But you "should" be able to select an ESP in UEFI just like you would
select a boot device in BIOS.



Re: dumping on RAIDframe

2023-09-25 Thread Michael van Elst
os...@netbsd.org (Greg Oster) writes:

>> dumping to dev 18,1 (offset=1090767, size=8252262):
>> 
>Dumping to a RAID 1 set is supported in -8.  But yes, none of those 
>values seem to align with each other.  18,1 is 'raid0b' thouugh, so that 
>part seems correct.

offset and size relate to the dump data (dumplo and dumpsize), not
the partition.



Re: GPT attributes in dkwedge [PATCH]

2023-09-24 Thread Michael van Elst
On Mon, Sep 25, 2023 at 05:57:49AM +, Emmanuel Dreyfus wrote:

> On Mon, Sep 25, 2023 at 12:20:00PM +0700, Robert Elz wrote:
> [bootme flag]
> > I'd always assumed it to be where efiboot should locate boot.cfg.
> > Where the kernel and root filesystems are located are in boot.cfg.
> 
> Bootme tels bootstrap where to look root partition. bootme.cfg is 
> searched in EFI paririon /EFI/NetBSD/boot.cfg  and root partition /boot.cfg. 

Boot partition and Root partition are something separate, even when the
default root partition is the same as the boot partition.

For EFI boot, the boot partition is where boot.cfg is searched
and where the kernel is loaded from, unless you specify something else
to the bootloader (e.g. in boot.cfg). Where the kernel is loaded
from is passed to the kernel.

The kernel will use the information to find the root partition,
unless you specify something else (or magic like the raidframe
hack comes into play).

So, the bootme flag effectively specifies the root partition, but only
by virtue of defaults being passed down the chain. The kernel should
not outguess things and interpret the flag itself.




Re: GPT attributes in dkwedge [PATCH]

2023-09-18 Thread Michael van Elst
a...@absd.org (David Brownlee) writes:

>Our gpt(8) states "bootme flag is used to indicate which partition
>should be booted by UEFI boot code", which could be read either way.

The flag is used to find the partition to load /boot, /boot.cfg or
the kernel from. The boot disk information is also passed to the
kernel and may be used to find the root filesystem.

So in many cases bootme marks the root partition. It is rarely 
used on the EFI partition, but that's also a possibility.




Re: GPT attributes in dkwedge [PATCH]

2023-09-16 Thread Michael van Elst
On Sat, Sep 16, 2023 at 08:57:28PM +1000, Simon Burge wrote:

> The only corner case that an older kernel won't understand a longer root
> device name passed in by a newer /boot as the old kernel will still have
> the 16 char length limit.

Yes, old kernels will truncate the name. Since the truncated name is not
supposed to be found, the kernel will just come up without a root and ask.

A new record type on the other hand won't be understood and the kernel
will fall back to other root deducing methods and might succeed in
finding a wrong partition. You could prevent this by always providing
an old dummy record with a non-empty string.



Re: GPT attributes in dkwedge [PATCH]

2023-09-15 Thread Michael van Elst
mar...@duskware.de (Martin Husemann) writes:

>But the more general solution (which would be just as easy for the end
>user, but more flexible) is to add support for a rootdev statement
>in boot.cfg and then put the label name or the guid there. Similar
>to evbarm taking a root=dev argument passed from the bootloader.

You can already specify the root in boot.cfg.


>It would also be cool if boot.cfg could specify the partition to load
>the kernel from via something similar to NAME=.

You can already specify the disk to load the kernel from with NAME=.



>You insist on this not working for multiboot (which I understand), but I

Multiboot specifies a BIOS drive number and 2 (or 3) partition numbers,
the interpretation of the partition numbers is not really defined
but commonly it's as MBR partition number (1..4 for primary 5+ for extended)
and optionally a BSD disklabel partition number (assuming the primary
partition happens to be identified as *BSD*).

That is neither flexible nor abstract enough, nor does it match the
raidframe scenario, and trying to make it fit (by fixing an
interpretation) is the wrong way to go.
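
For reference, a minimal sketch of how little that field carries. It
assumes the Multiboot 1 boot_device layout as I read the specification
(drive in the most significant byte, then part1, part2, part3, with 0xff
meaning "unused"); the struct and function names are made up:

```c
#include <stdint.h>

/* Hypothetical container for the four bytes of boot_device. */
struct mb_bootdev {
	uint8_t drive;	/* BIOS drive number, e.g. 0x80 */
	uint8_t part1;	/* top-level partition, 0xff if unused */
	uint8_t part2;	/* sub-partition, 0xff if unused */
	uint8_t part3;	/* third level, 0xff if unused */
};

/* Split the 32-bit boot_device value into its four byte fields. */
struct mb_bootdev
mb_decode_bootdev(uint32_t boot_device)
{
	struct mb_bootdev d;

	d.drive = (boot_device >> 24) & 0xff;
	d.part1 = (boot_device >> 16) & 0xff;
	d.part2 = (boot_device >> 8) & 0xff;
	d.part3 = boot_device & 0xff;
	return d;
}
```

Nothing in these four bytes can name a wedge, a GPT label or a raid set;
what the partition numbers mean is left to whoever decodes them.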



>don't understand how you get this far at all with multiboot. And I think
>multiboot can pass a command line, so root=dev support and make it equivalent
>to the boot.cfg rootdev statement would solve it too.

The multiboot support code in the kernel already interprets the passed
command line, including a "root" option, and this also allows passing
a wedge name.


There is one caveat. Since all x86 bootloader data is funneled through
the bootinfo structure we have:

struct btinfo_rootdevice {
	struct btinfo_common common;
	char devname[16];
};

So we have 16 chars to store the identifier (including a NAME= or
wedge: prefix), that's not enough for a UUID.
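
A small sketch of the arithmetic, assuming the field has to hold a
NUL-terminated string (the macro and the helper are made up for
illustration):

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_DEVNAME_LEN 16	/* sizeof(devname) in struct btinfo_rootdevice */

/* Returns 1 if spec plus its terminating NUL fits into devname[16]. */
int
rootspec_fits(const char *spec)
{
	return strlen(spec) + 1 <= SKETCH_DEVNAME_LEN;
}
```

"NAME=" already uses 5 of the 15 usable characters, which leaves 10 for
the label. A UUID in its usual text form is 36 characters, so it does
not fit even without a prefix.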




Re: GPT attributes in dkwedge [PATCH]

2023-09-15 Thread Michael van Elst
On Fri, Sep 15, 2023 at 03:15:10PM +, Emmanuel Dreyfus wrote:
> On Fri, Sep 15, 2023 at 03:06:46PM -0000, Michael van Elst wrote:
> > What about just telling the kernel what to use in /boot.cfg ?
> > No need to add more magic to the kernel.
> 
> The user took care of setting bootme so that bootstrap finds
> the kernel, and we should disregard this explicit setting
> when mounting root? 

That setting tells where /boot is loaded from and where /boot.cfg
is loaded from. The bootloader tells the kernel where the boot
partition is, traditionally in terms of offset and size, but optionally
as wedge identifier. The kernel will try to mount that partition.




Re: GPT attributes in dkwedge [PATCH]

2023-09-15 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>multiboot lets the bootloader pass boot device information as BIOS
>driver, partition number, subpartition number. This is intended
>for MBR extended partitions or MBR/disklabel.

What about just telling the kernel what to use in /boot.cfg ?
No need to add more magic to the kernel.



Re: panic options

2023-09-12 Thread Michael van Elst
r...@fdy2.co.uk (Robert Swindells) writes:

>There is a call to panic() if the kernel detects that there is no
>console device found, I would like to make this call to it just reboot
>without dropping into ddb.

Not without modifications.

You could include the nullcons (i.e. boot without console) and
detect the situation and reboot later in an rc script.



Re: raidctl -A softroot and a failed component

2023-09-12 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>I had a RAIDframe level 1 RAID with the first component marked as failed, e.g.
>   component0: failed
>   /dev/dkN: optimal
>and although the set was configured -A softroot, the kernel didn't configure 
>raid0a as the root file system, presumably because the dk numbers didn't match.

The kernel will collect disks into raid sets based on the raidframe label
and use the found partitions or wedges on the raid sets, it then checks
each raid set if the booted_device matches a component device and chooses
partition 0 from it (wedges don't have partitions, but "partition 0"
is the wedge itself).
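
Roughly, the matching step looks like the following sketch. This is not
the RAIDframe code; the struct, the fixed component count and the
function name are simplifications made up for illustration:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_COMPONENTS 4

/* Hypothetical, simplified view of an auto-configured raid set. */
struct raid_set {
	const char *name;	/* e.g. "raid0" */
	/* device names of the components, NULL for a failed or absent one */
	const char *component[SKETCH_MAX_COMPONENTS];
};

/* Return the first set that has booted_device as a component, or NULL. */
const struct raid_set *
find_root_set(const struct raid_set *sets, size_t nsets,
    const char *booted_device)
{
	for (size_t i = 0; i < nsets; i++)
		for (size_t c = 0; c < SKETCH_MAX_COMPONENTS; c++)
			if (sets[i].component[c] != NULL &&
			    strcmp(sets[i].component[c], booted_device) == 0)
				return &sets[i];
	return NULL;
}
```

If the disk you booted from is the failed component, it is not part of
any set, nothing matches and no raid set is chosen as root.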

IMHO the buildroothack should just go away, same for lots of "guesswork"
done by the machdep code.

If you use unique wedge names, you can specify the root volume by name
at least on some archs, e.g. x86 and arm are fine.




Re: GPT attributes in dkwedgeq

2023-09-12 Thread Michael van Elst
mar...@duskware.de (Martin Husemann) writes:

>   if (flags & DKW_FLAGS_BOOTME)
>   rf_boot_from_filesystem_starting_at(dkw.offset)


A flag in GPT that is supposed to be used by a bootloader now causes
changes in the kernel disk infrastructure to be used for a magic solution
limited to the raidframe driver ?




Re: Maxphys on -current?

2023-08-04 Thread Michael van Elst
On Thu, Aug 03, 2023 at 11:04:18PM -0700, Brian Buhrow wrote:

> speed of the transfers on either system.  Interestingly enough, however, the
> FreeBSD performance is markedly worse on this test.


162MB/s or 179MB/s is just the speed of the disk, so I would guess the
disks are different.

There might also be some difference in command queuing parameters,
some disks get slower for this kind of test.




Re: Maxphys on -current?

2023-08-03 Thread Michael van Elst
g...@lexort.com (Greg Troxel) writes:

>When you run dd with bs=64k and then bs=1m, how different are the
>results?  (I believe raw requests happen accordingly, vs MAXPHYS for fs
>etc. access.)

'raw requests' are split into MAXPHYS size chunks. While using bs=1m
reduces the syscall overhead somewhat, the major effect is that the
system will issue requests for all 16 chunks (1M / MAXPHYS) concurrently.
16 chunks is also the maximum, so between bs=1m and bs=2m the difference
is only the reduced syscall overhead.
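
The arithmetic as a small sketch, assuming MAXPHYS is 64k (which is what
1M / MAXPHYS = 16 implies); the macro and function names are made up:

```c
#include <stddef.h>

#define SKETCH_MAXPHYS   (64 * 1024)	/* assumed value of MAXPHYS */
#define SKETCH_MAXCHUNKS 16		/* limit on chunks in flight */

/* Number of MAXPHYS-sized chunks a raw request of bs bytes is split into. */
size_t
chunks_total(size_t bs)
{
	return (bs + SKETCH_MAXPHYS - 1) / SKETCH_MAXPHYS;
}

/* How many of those chunks can be issued concurrently. */
size_t
chunks_concurrent(size_t bs)
{
	size_t n = chunks_total(bs);

	return n > SKETCH_MAXCHUNKS ? SKETCH_MAXCHUNKS : n;
}
```

bs=64k gives one chunk in flight, bs=1m gives 16, and bs=2m is split
into 32 chunks of which still only 16 are in flight at once.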

The filesystem can do something similar: asynchronous writes are also
issued in parallel, and for reading it may choose to read ahead blocks to
optimize I/O requests, also for up to 16 chunks. In reality, large
contiguous I/O rarely happens, and the current UVM overhead (e.g. mapping
buffers) becomes more significant the faster your drive is.

A larger MAXPHYS also reduces SATA command overhead; that's up to 10%
for SATA3 (6Gbps) that you might gain, assuming that you manage to
do large contiguous I/O.

NVME is a different thing. While the hardware command overhead is
negligible, you can mitigate software overhead by using larger chunks
for I/O and the gain can be much higher, at least for raw I/O.



Re: LINEAR24 userland format in audio(4) - do we really want it?

2023-05-08 Thread Michael van Elst
On Mon, May 08, 2023 at 06:17:41PM -0400, Greg Troxel wrote:

Hi Greg,

> I'm not following.  Are you saying
> 
>   we should remove support from the kernel API for 24-bit linear?

24-bit support was disabled for a long time for fear of "confusing software"
and I have enabled it to support 24-bit hardware. Without "userland
support", the code for handling 24-bit audio sources and sinks was
disabled (and broken). I also added support for 24-bit data to
audioplay/audiorecord, mostly to exercise the driver.

For other reasons, the whole audio system was limited to 16-bit audio,
or some parts would silently assume 16-bit and even panic if
something told them to use anything else.

It still is limited to some degree as the audio mixer still runs in
16-bit, so all data gets truncated and later expanded as necessary
unless you rebuild the system with AUDIO_INTERNAL_BITS=32. (There
are a few uncommon drivers that may need to be fixed).

Support for 24-bit audio (or 32-bit audio) is rarely needed. For
all practical purposes 16-bit is enough for recording or playing audio.
There is studio equipment that goes beyond that, not so much for audio
fidelity, but for comfort. I'm still not sure if 32-bit internal audio
processing is worthwhile as it adds overhead for low-end systems
(on the other hand the in-kernel mixer is much heavier, so these
already lost).


As for "confusing software", we basically hide capabilities of the
audio system. Applications only talk to virtual audio hardware
and pass an audio format that supports linear audio data only
as consecutive bytes (1,2,4 and now 3 bytes per sample) but not
creative bit allocations (like the feared 24 bits inside a 32-bit
word) or odd bit counts. You simply cannot specify this (or specify it
"wrongly"). So the only difference is now that an application does not
fail if it specifies 3 bytes per sample.

N.B. most software just hardcodes 16-bit little-endian signed samples and
one fixed sample rate of 48kHz (or 44.1kHz if the programmer remembered
audio CDs) and our audio driver will crudely resample this to what it
and the hardware supports.


Greetings,


Re: LINEAR24 userland format in audio(4) - do we really want it?

2023-05-08 Thread Michael van Elst
n...@netbsd.org (nia) writes:


Hi nia,

>I believe this should not be enabled, and that applications
>should be trained to write 32-bit linear samples instead.

Two things.

The "userland" 24bit format flag is required for internal 24bit
processing because someone thought that without 24bit userland
there wouldn't be 24bit hardware to support.

I can easily find "userland" files with 24bit audio that can be
processed unless there is the artificial rule to forbid their use.
These are standard files used on other platforms and audioplay
can now handle a standard 24bit WAV file.



>The reason being that it's very confusing how exactly
>24-bit samples should be encoded, and different applications
>and implementations have different ideas.

That's why we have container formats that exactly describe
this for compatibility and applications that follow standards.

Please also don't forget that audio(4) isn't that flexible
when it comes to handling of sample encoding. For userland
there is still only one precision value that doubles as
stride. There is only one bit assignment (modulo endianness)
that an application could specify. Everything else still
requires reformatting by the application, so there is
little to confuse. We only support linear samples that
use 1,2,4 and now also 3 consecutive bytes in memory.

If an application needs to handle samples that aren't
stored in consecutive bytes, it has to parse and
reencode them, just like it always had to do.
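
For the 3-byte case this is all an application has to do. A minimal
sketch for signed little-endian samples (the function names are made up;
this is not code from audio(4) or audioplay):

```c
#include <stdint.h>

/* Decode one signed 24-bit little-endian sample from 3 consecutive bytes. */
int32_t
s24le_decode(const uint8_t p[3])
{
	uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16);

	/* Flip the 24-bit sign bit, then subtract it to sign-extend. */
	return (int32_t)(v ^ 0x800000) - 0x800000;
}

/* Store the low 24 bits of v as 3 consecutive little-endian bytes. */
void
s24le_encode(int32_t v, uint8_t p[3])
{
	uint32_t u = (uint32_t)v;

	p[0] = u & 0xff;
	p[1] = (u >> 8) & 0xff;
	p[2] = (u >> 16) & 0xff;
}
```

Samples that arrive as 24 bits inside a 32-bit word need a shift or mask
first, and which one depends on the container format describing them.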


>I think the confusing situation means that a lot of
>applications will be broken.

It's now no longer broken to handle 24bit WAV files.



Re: dkwedge: checksum before swapping?

2023-05-07 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>But that comment clearly indicates that _someone_ thought it
>reasonable to checksum before swapping, so I can't help wondering what
>use case that's appropriate for.

It's a checksum over the 16bit words in native byte order. So when
you access the words in opposite byte order, you need to swap the
checksum too.
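
A sketch of why that works, assuming the checksum is a plain XOR over
16-bit words (which is what dkcksum() does for disklabels, as far as I
remember); the helper names are made up:

```c
#include <stddef.h>
#include <stdint.h>

/* XOR of all 16-bit words, in whatever byte order they were read. */
uint16_t
xor16(const uint16_t *w, size_t n)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum ^= w[i];
	return sum;
}

/* Swap the two bytes of a 16-bit word. */
uint16_t
swap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}
```

XOR works on each bit position independently, so the checksum of the
swapped words is the swapped checksum of the original words. A label
whose words (including the checksum field) XOR to zero therefore does so
in either byte order, and it can be verified before swapping anything.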

Unlike the regular disklabel code (which ignores other-endian disklabels)
the wedge autodiscover code will accept either.

As for padding, the structure is nowadays defined with fixed size types
and explicit padding fields, so we may still assume that the compiler
won't add any further padding by itself.



Re: NFS issue with 10.0_BETA

2023-02-27 Thread Michael van Elst
m...@ecs.vuw.ac.nz (Mark Davies) writes:

linux->netbsd10
>18 ->  V3 SETATTR Call (Reply in 19), FH: 0xf9f94117
>19 <-  V3 SETATTR Reply (Call In 18) Error: NFS3ERR_ACCESS

linux->netbsd9
>16 ->  V3 SETATTR Call (Reply in 17), FH: 0xfe8a620f
>17 <-  V3 SETATTR Reply (Call In 16)

netbsd10->netbsd10
>11 ->  V3 SETATTR Call (Reply in 12), FH: 0xf9f94117
>12 <-  V3 SETATTR Reply (Call In 11)


That's the truncate operation where things fail. Somewhere
the NFS call must be different. Please dump the full RPC
call and reply.



Re: kernel goes dark on boot

2023-01-11 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>I suspect a mode problem. In the boot prompt, gop displays:
>*0: 1920x1080 BGRR pitch 1920 bpp 32
> 1: 640x480 BGRR pitch 640 bpp 32
> 2: 800x600 BGRR pitch 800 bpp 32
> 3: 1024x768 BGRR pitch 1024 bpp 32
> 4: 1280x1024 BGRR pitch 1280 bpp 32

>Trying gop [1-3] causes the screen to go dark. But the machine is not
>crashed, I can still type blindly gop 0 and get the display back.

That's not part of the kernel. The bootloader is an EFI application
and just talks to the firmware.

The list presented usually matches the VESA BIOS modes. In my experience
the manufacturers often do not care about this list as modern OSes just
take over the display. So you may find modes that do not work or you
miss modes that match the native display resolution.

When the NetBSD kernel boots and the display driver (usually DRM nowadays)
supports your graphics device, it will also take over the display. This
happens in about the middle of the autoconfig process.

Without DRM you are usually stuck with the configuration left from UEFI.



Re: -10.0_BETA panics when system is rebooting

2023-01-06 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>Do we do crash dumps onto raidsets?

We can dump on raidsets, the code selects a working component or a spare
disk to dump on. That doesn't mean that it works: the more complicated the
device access gets, the less likely it will succeed.



Re: uhidev1 BMC Virtual Keyboard via HP iLO

2022-12-12 Thread Michael van Elst
c...@sdf.org ("Stephen M. Jones") writes:

>While it partially works, shift/shift lock key and sometimes space bar does
>not seem to work properly.

Can you be specific in how it does not work properly?


>[ 11434.0227330] ukbd0 at uhidev0
>[ 11434.4428808] ums0 at uhidev1: 3 buttons

>login: abcdefghijklmnopqrstuvxyzabcdefghijklmnopqrstuvwxyz

This looks fine.


>for constty I've tried a couple of different settings in /etc/ttys other
>than wsvt25 thinking that may help (such as pc or vt100).

The value in /etc/ttys sets the TERM value which controls how programs
should interact with the terminal. This is already above anything
related to keyboard layout and key codes.

If you want to debug this, it is necessary to look at the lower levels,
the wskbd keymapping and even the USB stack. This is difficult if
this is the console, but you could configure the system to use
a serial console instead.




Re: Autoconfigure timeout on MMC drive

2022-11-23 Thread Michael van Elst
bsd...@proton.me (Salil) writes:

>sdhc0 at pci0 dev 28 function 0: vendor 8086 product 31cc (rev. 0x06)

This is the Intel GLK SDHC controller.

>sdmmc0: sdmmc_mem_enable failed with error 60
>sdmmc0: autoconfiguration error: couldn't enable card: 60

>Does anyone know the resolution?

The sdhc driver needs to learn about peculiarities of the Intel controller,
the Linux driver has specific code for power control (which is probably
the main problem), special resume handling and a workaround for a UHS bug.



Re: Deadlock (maybe related to PR kern/56925)

2022-11-19 Thread Michael van Elst
On Sat, Nov 19, 2022 at 03:39:55PM +0100, BERTRAND Joël wrote:
> > you need to build the module and install it.
> > 
> > http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/iscsi.kmod is built
> > for a GENERIC netbsd-9 amd64 kernel and might be sufficient.
> 
>   Patched and system is rebooted.
> 
>   I don't understand how modules really work. For example :
> legendre# modstat | grep iscsi
> iscsi  driver   builtin  -0   - -
> 
>   If I understand, this module is loaded from kernel, not from
> filesystem. Thus, i have rebuilt whole kernel. How can I force kernel to
> switch to a module in /stand tree ?

That depends on how you built the kernel. If you include the module
in the kernel build it's "builtin" and cannot be loaded.

In this case you need to build the kernel (not modules) and install it.


Greetings,


Re: Deadlock (maybe related to PR kern/56925)

2022-11-19 Thread Michael van Elst
On Sat, Nov 19, 2022 at 11:00:56AM +0100, BERTRAND Joël wrote:
> 
>   All workstations behind this server are diskless (root filesystem and
> swap). Linux boxes can survive (not everytime), but FreeBSD boxes always
> crash when main server is rebooted. I don't understand as they answer to
> ping but consoles don't respond and I can not log to them by ssh anymore.

If root/swap are iSCSI, they probably get an I/O error and cannot handle it.
NFS is much more graceful.

N.B. our kernel initiator will try to detach the device when all
connections are lost (similar to a USB drive being unplugged).


Greetings,


Re: Deadlock (maybe related to PR kern/56925)

2022-11-19 Thread Michael van Elst
On Sat, Nov 19, 2022 at 08:41:49AM +0100, BERTRAND Joël wrote:
> 
>   Just with a -current kernel and not all userland ? I can try, but it's
> a critical server and I cannot reboot this server without rebooting all
> workstation on network...

Current kernel with -9 userland will work mostly, so it's maybe better to
use a patched -9 kernel.

http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/iscsi-9-current.diff

will bring the netbsd-9 iscsi driver to -current. For a regular kernel
you need to build the module and install it.

http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/iscsi.kmod is built
for a GENERIC netbsd-9 amd64 kernel and might be sufficient.


Why would the workstations require rebooting ?




Re: Deadlock (maybe related to PR kern/56925)

2022-11-18 Thread Michael van Elst
joel.bertr...@systella.fr (BERTRAND Joël) writes:

>>> For a very long time (I don't remember if it was from 9.0 or 9.1), my
>>> main server randomly panics or enters in a deadlock when it tries to
>>> access to an iSCSI NAS.
>> 
>> Can you provide information about the panic? kernel backtraces?

>   All informations are in PR/56925


Can you test with a -current kernel? The code could also be moved
with only little changes to netbsd-9.



Re: Deadlock (maybe related to PR kern/56925)

2022-11-18 Thread Michael van Elst
joel.bertr...@systella.fr (BERTRAND Joël) writes:

>   For a very long time (I don't remember if it was from 9.0 or 9.1), my
>main server randomly panics or enters in a deadlock when it tries to
>access to an iSCSI NAS.

Can you provide information about the panic? kernel backtraces?
When the system deadlocks, can you then enter DDB ?



Re: boothowto(9) options with Raspberry Pi

2022-10-05 Thread Michael van Elst
stephan...@googlemail.com (Stephan) writes:

>Hello,

>I am re-asking this question here because I have not received a reply
>on port-arm@, surprisingly.

>I am doing some experimentation with the Raspberry Pi for which I have
>set up a serial connection to another computer. Now I´d like to
>prevent certain drivers from being loaded at boot using the
>userconf(4) prompt. However, I wasn´t able to find out how to pass the
>corresponding boothowto(9) parameter (-c) to the kernel (I tried
>several variants in cmdline.txt).

Nothing there

This patch adds the missing boot options.

Index: sys/arch/arm/arm32/arm32_machdep.c
===
RCS file: /cvsroot/src/sys/arch/arm/arm32/arm32_machdep.c,v
retrieving revision 1.144
diff -p -u -r1.144 arm32_machdep.c
--- sys/arch/arm/arm32/arm32_machdep.c  28 Jul 2022 09:14:23 -  1.144
+++ sys/arch/arm/arm32/arm32_machdep.c  5 Oct 2022 12:25:52 -
@@ -575,6 +575,26 @@ parse_mi_bootargs(char *args)
|| get_bootconf_option(args, "-a", BOOTOPT_TYPE_BOOLEAN, &integer))
if (integer)
boothowto |= RB_ASKNAME;
+   if (get_bootconf_option(args, "userconf", BOOTOPT_TYPE_BOOLEAN, &integer)
+   || get_bootconf_option(args, "-c", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_USERCONF;
+   if (get_bootconf_option(args, "halt", BOOTOPT_TYPE_BOOLEAN, &integer)
+   || get_bootconf_option(args, "-b", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_HALT;
+   if (get_bootconf_option(args, "-1", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_MD1;
+   if (get_bootconf_option(args, "-2", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_MD2;
+   if (get_bootconf_option(args, "-3", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_MD3;
+   if (get_bootconf_option(args, "-4", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= RB_MD4;
 
 /* if (get_bootconf_option(args, "nbuf", BOOTOPT_TYPE_INT, &integer))
bufpages = integer;*/
@@ -603,6 +623,10 @@ parse_mi_bootargs(char *args)
|| get_bootconf_option(args, "-x", BOOTOPT_TYPE_BOOLEAN, &integer))
if (integer)
boothowto |= AB_DEBUG;
+   if (get_bootconf_option(args, "silent", BOOTOPT_TYPE_BOOLEAN, &integer)
+   || get_bootconf_option(args, "-z", BOOTOPT_TYPE_BOOLEAN, &integer))
+   if (integer)
+   boothowto |= AB_SILENT;
 }
 
 #ifdef __HAVE_FAST_SOFTINTS




Re: 9.99.100 fallout: file(1)

2022-09-21 Thread Michael van Elst
k...@munnari.oz.au (Robert Elz) writes:

>The way you have it coded, I suspect that 9.1 binaries will appear to
>be 9.1.0 instead (the ver_patch data is always appended for ver_maj >= 9).

True. Here is a patch that ignores a zero patch level.

Index: external/bsd/file/dist/src/readelf.c
===
RCS file: /cvsroot/src/external/bsd/file/dist/src/readelf.c,v
retrieving revision 1.25
diff -p -u -r1.25 readelf.c
--- external/bsd/file/dist/src/readelf.c  9 Apr 2021 19:11:42 -   1.25
+++ external/bsd/file/dist/src/readelf.c  21 Sep 2022 23:17:49 -
@@ -456,7 +456,13 @@ do_note_netbsd_version(struct magic_set 
 
if (file_printf(ms, " %u.%u", ver_maj, ver_min) == -1)
return -1;
-   if (ver_rel == 0 && ver_patch != 0) {
+   if (ver_maj >= 9) {
+   ver_patch += 100 * ver_rel;
+   if (ver_patch != 0) {
+   if (file_printf(ms, ".%u", ver_patch) == -1)
+   return -1;
+   }
+   } else if (ver_rel == 0 && ver_patch != 0) {
if (file_printf(ms, ".%u", ver_patch) == -1)
return -1;
} else if (ver_rel != 0) {
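
To see what the new branch prints, here is a standalone sketch. It
assumes the note value uses the __NetBSD_Version__ layout MMmmrrpp00 and
that the fields are extracted by plain division; the function name is
made up, and only the ver_maj >= 9 case is modelled:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a version note value (layout MMmmrrpp00)
 * the way the patched branch for ver_maj >= 9 would print it.
 */
int
fmt_netbsd_version(uint32_t desc, char *buf, size_t len)
{
	unsigned ver_patch = (desc / 100) % 100;
	unsigned ver_rel = (desc / 10000) % 100;
	unsigned ver_min = (desc / 1000000) % 100;
	unsigned ver_maj = desc / 100000000;

	/* The former "release" digits now extend the patch level. */
	ver_patch += 100 * ver_rel;
	if (ver_patch != 0)
		return snprintf(buf, len, "%u.%u.%u", ver_maj, ver_min,
		    ver_patch);
	return snprintf(buf, len, "%u.%u", ver_maj, ver_min);
}
```

With these assumptions 999010000 comes out as 9.99.100 and 901000000
stays 9.1 instead of becoming 9.1.0.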


>However, I wonder why this kind of info is embedded in ELF files, what
>point does that have?   Maybe it would be better to have them just say
>x.99 (and forget the kernel ABI bump number) ?

The note has little value, best it can do is identify from which
release the binary came, in case you have a mixed installation. And
then the full version number is required.



Re: 9.99.100 fallout: file(1)

2022-09-21 Thread Michael van Elst
campbell+netbsd-tech-k...@mumble.net (Taylor R Campbell) writes:

>We appear to have revived the old alphanumeric versioning scheme,
>according to file(1)!  Someone needs to teach file(1) that this is
>9.99.100, not 9.99A(.0).

Index: external/bsd/file/dist/src/readelf.c
===
RCS file: /cvsroot/src/external/bsd/file/dist/src/readelf.c,v
retrieving revision 1.25
diff -p -u -r1.25 readelf.c
--- external/bsd/file/dist/src/readelf.c  9 Apr 2021 19:11:42 -   1.25
+++ external/bsd/file/dist/src/readelf.c  21 Sep 2022 19:32:32 -
@@ -456,7 +456,11 @@ do_note_netbsd_version(struct magic_set 
 
if (file_printf(ms, " %u.%u", ver_maj, ver_min) == -1)
return -1;
-   if (ver_rel == 0 && ver_patch != 0) {
+   if (ver_maj >= 9) {
+   ver_patch += 100 * ver_rel;
+   if (file_printf(ms, ".%u", ver_patch) == -1)
+   return -1;
+   } else if (ver_rel == 0 && ver_patch != 0) {
if (file_printf(ms, ".%u", ver_patch) == -1)
return -1;
} else if (ver_rel != 0) {

% file /bin/ls
/bin/ls: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically 
linked, interpreter /libexec/ld.elf_so, for NetBSD 9.99.100, not stripped



Re: Dell PERC H330: no disks, no volumes

2022-09-15 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>What /does/ work is setting the controller to RAID mode and create two 
>volumes with a one-element RAID-0. But that feels like crazy.

That's a common setup. In particular it allows presenting a "right-sized" disk
so that you can replace a faulty disk with a similar model of slightly different
size without changing the visible disk layout.

It also means that the RAID controller could do media checks (which it will
not do if you pass through a raw disk).

N.B. the controller should be able to pass through raw disks (JBOD), but
maybe your controller setup tool doesn't support this.



Re: Dell PERC H330: no disks, no volumes

2022-09-14 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>> I don't remember the details (and it depends on the controller version),
>> but you need to have physical disks assigned to one (or more) RAID volume,
>> and then the RAID volume has to be exported as one (or more) virtual disks.
>But what if I want to pass the bare discs to NetBSD for a RAIDframe use?

If possible configure the disk as JBOD or as a 1 disk RAID-0 volume.



Re: Dell PERC H330: no disks, no volumes

2022-09-14 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>> -> This is attaching a H330 (RAID version) and it gets the mfii driver.
>> mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001
>OK, remains the question why I don't see any discs in bioctl.

>   0 Non-RAID Disk(s) found on the host adapter.
>   0 Non-RAID Disk(s) handled by BIOS
>   0 Virtual Disk(s) found on the host adapter.
>   0 Virtual Disk(s) handled by BIOS

This means you do have a RAID controller (so much for marketing) but that
you haven't configured any RAID volumes.

>Is this normal? The only place I see discs being recognized is in the BIOS 
>setup's controller setup.

Yes, in the controller setup you can create "Non-RAID Disks" (aka
JBOD) or "Virtual Disks" (aka RAID volumes) and at least the latter
could then be visible to bootloader (BIOS) and kernel. Disks that
aren't configured as either aren't visible.

In theory you could use bioctl to create and manage volumes, but the
driver doesn't implement it.



Re: panic in sysmon_envsys_unregister

2022-09-14 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>The other question is why the register call fails.
>According to the BIOS setup, the controller has no sensors. Could that be 
>the problem?

The bio framework uses sensors so that you can watch for failed volumes.
No volumes means no sensors and the register code fails.



Re: panic in sysmon_envsys_unregister

2022-09-13 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>This is -current from around yesterday.
>I guess the problem is related to
>   mfii0: autoconfiguration error: unable to register with sysmon (rv = 86)
>   mfii0: autoconfiguration error: unable to create sensors
>So probably someone is trying to un-register something not registered.

Indeed:

Index: mfii.c
===
RCS file: /cvsroot/src/sys/dev/pci/mfii.c,v
retrieving revision 1.26
diff -p -u -r1.26 mfii.c
--- mfii.c  16 Jul 2022 07:23:51 -  1.26
+++ mfii.c  14 Sep 2022 04:05:23 -
@@ -3980,6 +3980,8 @@ mfii_create_sensors(struct mfii_softc *s
sc->sc_sme->sme_refresh = mfii_refresh_sensor;
rv = sysmon_envsys_register(sc->sc_sme);
if (rv) {
+   sysmon_envsys_destroy(sc->sc_sme);
+   sc->sc_sme = NULL;
aprint_error_dev(sc->sc_dev,
"unable to register with sysmon (rv = %d)\n", rv);
}
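
The pattern behind the fix, as a standalone sketch. The types and
functions are stand-ins made up for illustration, not the sysmon or mfii
API, and it assumes the teardown path checks the pointer before touching
it:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for struct sysmon_envsys and the driver softc. */
struct fake_sme { int registered; };
struct fake_softc { struct fake_sme *sc_sme; };

/* fail_register simulates sysmon_envsys_register() returning an error. */
int
create_sensors(struct fake_softc *sc, int fail_register)
{
	sc->sc_sme = calloc(1, sizeof(*sc->sc_sme));
	if (sc->sc_sme == NULL)
		return -1;
	if (fail_register) {
		free(sc->sc_sme);	/* destroy the unregistered handle */
		sc->sc_sme = NULL;	/* and make sure nobody uses it later */
		return -1;
	}
	sc->sc_sme->registered = 1;
	return 0;
}

/* Teardown: only undo what was actually set up. */
void
destroy_sensors(struct fake_softc *sc)
{
	if (sc->sc_sme == NULL)
		return;
	free(sc->sc_sme);
	sc->sc_sme = NULL;
}
```

Without clearing the pointer, the teardown path would try to unregister
a handle that was never registered.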



Re: Dell PERC H330: no disks, no volumes

2022-09-13 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes:

>> These controller chips can run two different kinds of firmware.
>> The mfii driver is for talking to the RAID firmware ("IR mode")
>> while the mpii driver is for talking to the vanilla SAS firmware
>> ("IT mode").
>Ah, and how do I know which mode my card runs?
>mpii(4) explicitly mentions the Dell PERC HBA330, but the "R" in PERC 
>is for RAID.
>The controller can be switched to RAID or HBA mode in the BIOS setup, 
>so does it run both firmware versions?

The different firmware versions return different PCI-IDs, so that
the right driver attaches, e.g.:

mpii.c: { PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_SAS3008 },
mfii.c: { PCI_VENDOR_SYMBIOS,   PCI_PRODUCT_SYMBIOS_MEGARAID_3008,

PCI_PRODUCT_SYMBIOS_SAS3008 = 0x0097
PCI_PRODUCT_SYMBIOS_MEGARAID_3008 = 0x005f
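
In other words, the product ID alone decides which driver matches. A
standalone sketch using the two IDs above (the table and the lookup
function are made up; the real drivers each have their own match table
and also compare the vendor ID):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical combined table; product IDs as quoted above. */
struct sketch_match {
	uint16_t product;
	const char *driver;
};

static const struct sketch_match sas3008_matches[] = {
	{ 0x0097, "mpii" },	/* PCI_PRODUCT_SYMBIOS_SAS3008, SAS firmware */
	{ 0x005f, "mfii" },	/* PCI_PRODUCT_SYMBIOS_MEGARAID_3008, RAID firmware */
};

/* Return the driver name for a product ID, or NULL if neither matches. */
const char *
driver_for_product(uint16_t product)
{
	size_t n = sizeof(sas3008_matches) / sizeof(sas3008_matches[0]);

	for (size_t i = 0; i < n; i++)
		if (sas3008_matches[i].product == product)
			return sas3008_matches[i].driver;
	return NULL;
}
```

So the same chip shows up as a different PCI product depending on the
firmware it runs, and the other driver never sees it.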

There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
(includes the H330) says you can boot it in HBA mode. Not sure if
that means that you can choose the firmware.

-> This is attaching a H330 (RAID version) and it gets the mfii driver.
mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001



Re: Dell PERC H330: no disks, no volumes

2022-09-13 Thread Michael van Elst
p...@whooppee.com (Paul Goyette) writes:

>On Tue, 13 Sep 2022, Edgar Fuß wrote:
>> It appears to me we have two drivers for the SAS3008: mfii(4) and mpii(4).
>> Why?

>I know nothing about these drivers, but the man pages show that mfii
>works for MegaRAID devices, while mpii deals with LSI devices.  I
>have no idea if this makes a difference.


These controller chips can run two different kinds of firmware.
The mfii driver is for talking to the RAID firmware ("IR mode")
while the mpii driver is for talking to the vanilla SAS firmware
("IT mode").

Some of the cards can even be re-flashed and used with either firmware.



Re: Devices without power management support

2022-08-18 Thread Michael van Elst
p...@whooppee.com (Paul Goyette) writes:

>Don't forget to deregister the device if the xxx_attach() later exits...

I think the point was to not do this, so that a failed attach doesn't
prevent the system from entering sleep mode.

Here, calling pmf_register first on attachment is then required,
but then you have to think about how to handle failures of pmf_register.
You wouldn't want the device to fail attachment in that case,
but rather accept that you cannot suspend in that situation.

You also have to think about drivers that are left only partially
configured after a failed attachment. That is already unsafe with
drivers that only do a dummy pmf registration, but what about
hardware that actually needs support routines ? Registering these
for a partially configured driver may not work either. Unregister
again and register dummy routines ?

An easy solution would be to make attach actually fail, autoconf
could then detach the driver immediately and it won't interfere
with pmf.

N.B. if autoconf API is changed, you could also integrate pmf
into cfattach.



Re: fan control

2022-08-10 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>On Wed, Aug 10, 2022 at 03:09:34PM -0000, Michael van Elst wrote:
>> That's the fan "alarm status" being FALSE, meaning, there is no
>> alarm.  It's all well.

>I know one of the five fans is dead, hence I wonder what can trigger
>an alarm.

It's probably only one specific fan (CPU fan?) that is reported. Many
fans also can only be controlled, but not monitored.



Re: fan control

2022-08-10 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>and not even get their status. That explains envstat(8) output
>about the fans:

>  Current  CritMax  WarnMax  WarnMin  CritMin  Unit
>[acpifan0]
> state: FALSE

That's the fan "alarm status" being FALSE, meaning, there is no
alarm.  It's all well.




Re: fan control

2022-08-10 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>Is there any way to control fans through ACPI? acpi(4) says "The acpifan 
>driver does not support controlling the fan", but is it something that
>needs to be implemented, or is it not possible?

Depends. Some ACPI implementations allow you to set fan parameters
and thresholds, but often they do not. Sometimes you can set such
parameters in the BIOS.

https://www.kernel.org/doc/html/latest/admin-guide/acpi/fan_performance_states.html

So, we could add support for "Fan Fine Grain Control", and maybe
your ACPI code offers it.

Linux also has drivers for specific hardware (mostly laptops) that
can control the fans directly or can set temperature thresholds
that ACPI will then honor.



Re: Scanning floppy devices with assumed density

2022-07-03 Thread Michael van Elst
On Sun, Jul 03, 2022 at 01:20:15PM +0200, Martin Husemann wrote:
> On Sun, Jul 03, 2022 at 05:21:27AM -0000, Michael van Elst wrote:
> > Another question of course is why the isa fd driver reads a disklabel at all
> > when it (ab-)uses the partition number to select densities.
> 
> Yeah, can we commit that fix with a log like:
> 
>  fd(4): only support GPT partitioning on floppies, welcome to this millennium
> 
> please?

Would be a confusing comment, there is no partitioning at all.


-- 
        Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: Scanning floppy devices with assumed density

2022-07-02 Thread Michael van Elst
j...@ziaspace.com (John Klos) writes:

>boot device: fd0 [ 5.121888] fd0d: hard error reading fsbn 0 of 0-2 (st0 
>0x40 st1 0x1 st2 0x0 cyl 0 head 0 sec 1)

>She wondered why fd0d is being used here. I can't imagine this is due to 
>scanning for a disklabel, since they've been around forever, so is this 
>perhaps due to dkwedge_discover?

fd doesn't try to discover wedges, but for identifying a possible wedge
rootconf() calls opendisk() and VOP_IOCTL(, DIOCGDINFO, ) to check the
disklabel. And opendisk() only opens the RAW_PART.

It's difficult to skip floppy devices; these are disks with a disklabel
like anything else. But the MD autoconf code could do that and
provide partition data (offset, length) instead of a partition number,
then the MI code doesn't have to find this.

An alternative would be to change opendisk() to accept a partition number.
But that's way more intrusive.

Another question of course is why the isa fd driver reads a disklabel at all
when it (ab-)uses the partition number to select densities.

The amiga fd driver also handles disk formats that way, but it also just
fakes a disklabel; the isa fd driver should do the same.
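
To make "faking a disklabel" a bit more concrete, here is a minimal
sketch in Python. The format table, the selector numbering and the
dictionary keys are illustrative assumptions, not what the amiga or isa
fd drivers actually contain. The idea is only that a fixed floppy format
already determines the geometry, so a label with one partition covering
the whole medium can be computed without reading anything from disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloppyFormat:
    cylinders: int
    heads: int
    sectors_per_track: int
    sector_size: int = 512


# Hypothetical table: the key plays the role of the "partition number"
# that is (ab-)used to select a density.
FORMATS = {
    0: FloppyFormat(80, 2, 18),  # 3.5" 1.44MB
    1: FloppyFormat(80, 2, 9),   # 3.5" 720KB
}


def fake_label(selector):
    """Build a label from the format alone, without touching the medium."""
    fmt = FORMATS[selector]
    total = fmt.cylinders * fmt.heads * fmt.sectors_per_track
    return {
        "secsize": fmt.sector_size,
        "secperunit": total,
        # A single partition that spans the whole disk.
        "partitions": [{"offset": 0, "size": total}],
    }
```

With such a label nothing has to be read from sector 0 just to open the
device.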



Re: read(2) failed: Read-only file system

2022-06-27 Thread Michael van Elst
s...@stix.id.au (Paul Ripke) writes:

>I guess I'm a little surprised by this error?
>Nor does read(2) list EROFS as a possible return, which seems sensible.

The errors listed with system calls are rarely complete, in particular for
errors you rarely observe or, as here, for operations on something other than
regular files.


>Looking at the code, I'm guessing the drive is returning
>SKEY_DATA_PROTECT for some reason, so this is likely not a bug, but
>interesting behaviour.

>Jun 27 16:19:40 slave /netbsd: [ 3710565.0174648] sd0(umass0:0:0:0):  Check 
>Condition on CDB: 0x28 00 2a 44 2f 68 00 00 08 00
>Jun 27 16:19:40 slave /netbsd: [ 3710565.0174648] SENSE KEY:  Write 
>Protected
>Jun 27 16:19:40 slave /netbsd: [ 3710565.0174648]  ASC/ASCQ:  Logical Unit 
>Access Not Authorized
>Jun 27 16:19:40 slave /netbsd: [ 3710565.0174648] sd0d: error reading fsbn 
>709111656 of 709111656-709111663 (sd0 bn 709111656; cn 346245 tn 59 sn 8)

It's still a READ_10 command. The translation from SKEY_DATA_PROTECT to "Write 
Protected" isn't always correct, but the choices are limited.

The more detailed SCSI status "Logical Unit Access Not Authorized" usually 
refers to the drive being password locked. Now find out if
that reflects reality or if the USB enclosure just tries to interpret some 
other condition and reports this code for lack of something
more specific.
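
For reference, the CDB from the log can be decoded by hand. The sketch
below (Python, function name made up for this example) assumes the
standard READ(10) layout: opcode 0x28 in byte 0, a big-endian logical
block address in bytes 2-5 and a big-endian transfer length in bytes 7-8.

```python
def decode_read10(cdb):
    """Return (lba, nblocks) for a 10-byte READ(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5
    nblocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8
    return lba, nblocks


# The CDB bytes as printed in the kernel message above.
cdb = bytes.fromhex("28 00 2a 44 2f 68 00 00 08 00")
lba, nblocks = decode_read10(cdb)
print(lba, lba + nblocks - 1)
```

This gives block 709111656 and a length of 8 blocks, which matches the
"fsbn 709111656 of 709111656-709111663" line, so the failing command
really is a plain read.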



Re: New iwn firmware & upgrade procedure

2022-06-19 Thread Michael van Elst
h...@netbsd.org (Havard Eidnes) writes:

>1) Could the if_iwn driver fall back to using the 6000g2a-5 microcode
>   without any code changes?  (My gut feeling says "yes", but I have
>   no existence proof of that.)

The only change that happened was the update of the firmware blob.
But there is no code that would search for it.


>2) Should I have to extract parts of user-land in order to make the
>   wireless driver in the kernel work as intended?

That's how it is. It's true for kernel modules, it's true for most
firmware blobs.

At some point we probably need to make the kernel a kind of archive
or package that embeds all necessary parts for a change.


>What about GPU firmware?  I see it also has a new set all of its own,
>and some of the related compatibility questions can be raised there.

Often you need to change driver and firmware together. If the firmware
filenames aren't versioned (and currently they aren't), this may require
exchanging the files together with the kernel or at least providing
alternate search paths.



Re: killed: out of swap

2022-06-15 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>> They might be the reason for the memory shortage. You can prefer large
>> processes as victims or protect system services to keep the system
>> manageable.

>So when one process tries to grow, you'd kill a process that currently 
>have no issues in running?


All processes have issues on that system and the goal is to keep things
alive so that you can recover; a system hang, crash or reboot is the
worst outcome.

Obviously there is no heuristic that can predict what action will have
the best outcome and which causes the least damage. Guessing on the
cost of various kinds of damage is an impossible task by itself as
that is fairly subjective.

But there can be a heuristic that helps in many cases, and for the rest
you can hint the system.




Re: killed: out of swap

2022-06-14 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>I don't see any realistic way of doing anything with that.
>It's basically the first process that tries to allocate another page 
>when there are no more. There are no other processes at that moment in 
>time that have the problem, so why should any of them be considered?

They might be the reason for the memory shortage. You can prefer large
processes as victims or protect system services to keep the system
manageable.
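
As a rough illustration of such a policy (a Python sketch only; the
Proc fields and the "protected" hint are invented for this example and
do not correspond to any existing kernel interface):

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    name: str
    resident_pages: int
    protected: bool = False  # hint: never pick this one (system service)


def pick_victim(procs):
    """Prefer the largest process that is not marked as protected."""
    candidates = [p for p in procs if not p.protected]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.resident_pages)


procs = [
    Proc(1, "init", 200, protected=True),
    Proc(310, "sshd", 900, protected=True),
    Proc(4711, "bigjob", 500000),
    Proc(4712, "shell", 300),
]
```

Here the victim would be "bigjob", regardless of which process happened
to ask for the page that could not be allocated.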



Re: File system corruption due to UFS2 extended attributes

2022-05-23 Thread Michael van Elst
c...@chuq.com (Chuck Silvers) writes:

> - fsck will take a new option "-c ea" to specify that an existing UFS2
>   file system should be converted to support extended attributes
>   (ie. converted to UFS2ea).  This conversion first clears all of the on-disk
>   pointers to extended attribute blocks (the inode "di_extb" field),
>   since in NetBSD releases prior to NetBSD 10, those pointers could only
>   have been set to non-zero values by corruption in the file system.

There should be a way back so that the filesystem becomes usable
by netbsd-9 again (basically: clear di_extb and set magic to UFS2).
Would also be nice to pull up that feature to netbsd-9.



Re: two keys with same keycode on ADB

2022-05-12 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>On Thu, May 12, 2022 at 07:26:58AM +0200, Michael van Elst wrote:
>> This seems to suggest that we have special codes on the JIS keyboard
>> for:
>> 
>> Yen   0x5d  (KC_JPY ?)
>> Ro    0x5e  (KC_RO ?)
>> Eisu  0x66  (Switch to Roman)
>> Kana  0x68  (Switch to Hiragana)
>> ,     0x5f  (numeric keypad)

>   switch (key) {
>   case 0x5F:  // numpad ',' using raw ADB scan code
>       charCode = ',';
>       break;

>Does that mean USB scancode for ADB KS_comma (43), that is 54?
>Or ASCII code for comma, that is 44?


I think that code translates from keycode (0x5f) to ASCII ',' (44).
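
In other words, something like the following sketch (Python; the
keycode values are the ones listed earlier in this thread, while the
dictionary and function names are made up for illustration):

```python
# JIS-specific ADB keycodes as listed earlier in the thread.
JIS_KEYCODES = {
    0x5D: "Yen",
    0x5E: "Ro",
    0x5F: "keypad comma",
    0x66: "Eisu",
    0x68: "Kana",
}


def keypad_char(keycode):
    """Mimic the quoted switch: raw ADB keycode in, character out."""
    if keycode == 0x5F:
        return ","
    return None  # everything else is handled elsewhere


print(ord(keypad_char(0x5F)))
```

The input is the raw keycode and the output is the plain character, so
the result is 44, not a USB usage or a keysym number.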



Re: two keys with same keycode on ADB

2022-05-11 Thread Michael van Elst
On Thu, May 12, 2022 at 01:19:05AM +, Emmanuel Dreyfus wrote:
> 
> I miss KC_JPY and KC_RO. Internet seems to know nothing about them.

https://opensource.apple.com/source/IOHIDFamily/IOHIDFamily-247/IOHIDSystem/IOHIKeyboardMapper.cpp.auto.html

This seems to suggest that we have special codes on the JIS keyboard
for:

Yen   0x5d  (KC_JPY ?)
Ro    0x5e  (KC_RO ?)
Eisu  0x66  (Switch to Roman)
Kana  0x68  (Switch to Hiragana)
,     0x5f  (numeric keypad)


Greetings,
-- 
    Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: two keys with same keycode on ADB

2022-05-09 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>It seems the layout can be detected using sc->sc_adbdev->handler_id 
>I borrowed the values from tmk_keyboard and it works fine. There
>are just a few numerical values for which I have no name. Where is the
>ADB keyboard handler id name list?

I found

https://developer.apple.com/documentation/coreservices/1471510-keyboard_selectors

but no idea how these relate to different layouts.



Re: two keys with same keycode on ADB

2022-05-08 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>The french ADB keyboard features a < key at the right of the left shift
>key. On the console it works fine, however with X, it displays a @. The
>@ key also displays a @.

Apparently there are two different encodings for ANSI and ISO keyboard
layouts that conflict for exactly these keys. The driver apparently assumes
the ANSI variant.

Here is some discussion:

https://github.com/tmk/tmk_keyboard/issues/35

You need a different translation for either case and a way to select
which to use.



Re: libsa and 4K devices

2022-04-23 Thread Michael van Elst
mlel...@serpens.de (Michael van Elst) writes:


>- The copied filesystem code in libsa uses absolute disk offsets
>  in bytes to locate superblocks. A backend that uses physical blocks
>  cannot easily address such offsets if it doesn't know the block
>  size.
>  The filesystem code deduces this as fsize >> fsbtodb, but both
>  values are only available after reading the superblock.

Here is a patch for stand/efiboot and libsa that lets it handle
large sectors for ffs,lfs,ext2fs,minix3fs.

http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/efiboot.patch




Re: libsa and 4K devices

2022-04-23 Thread Michael van Elst
nick.hud...@gmx.co.uk (Nick Hudson) writes:

>https://github.com/skrll/src/commit/a5432c0ce71ea2fd1b7ad22ff6c26d01f4dca71a


When looking at this, I found a few more issues:

- The copied filesystem code in libsa uses absolute disk offsets
  in bytes to locate superblocks. A backend that uses physical blocks
  cannot easily address such offsets if it doesn't know the block
  size.
  The filesystem code deduces this as fsize >> fsbtodb, but both
  values are only available after reading the superblock.

- fsck_msdos (used by arm64.img as the EFI partition is mounted)
  fails for large sectors. The bootblock (with disk info) is
  always read as 512 bytes.

- When testing in qemu, you can emulate a specific disk geometry
  including large sectors, but you cannot use a physical disk
  with large sectors as backend. Qemu will use I/O transfers
  as small as 512 bytes and it won't accept a block device.
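
To illustrate the first point, a small Python sketch of the arithmetic.
The function names are made up; the byte offset 65536 is used as an
example of a superblock location given in bytes (as far as I know that
is the usual UFS2 location, treat it as an assumption here).

```python
def sector_size(fs_fsize, fs_fsbtodb):
    """Sector size as deduced by the filesystem code: fsize >> fsbtodb."""
    return fs_fsize >> fs_fsbtodb


def byte_offset_to_block(offset, secsize):
    """What a block-addressed backend needs: a physical block number."""
    if offset % secsize != 0:
        raise ValueError("offset is not sector aligned")
    return offset // secsize


SB_OFFSET = 65536  # example superblock location in bytes (assumption)

# 4096-byte frags on 512-byte sectors vs. on 4096-byte sectors.
for fsize, fsbtodb in ((4096, 3), (4096, 0)):
    secsize = sector_size(fsize, fsbtodb)
    print(secsize, byte_offset_to_block(SB_OFFSET, secsize))
```

The catch is visible in the code: both inputs to sector_size() come from
the superblock, which has to be located and read first.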
  


Re: libsa and 4K devices

2022-04-23 Thread Michael van Elst
nick.hud...@gmx.co.uk (Nick Hudson) writes:

>To enable efiboot to work from Apple M1 nvme I had to apply this diff so
>that libsa picks up the fs_fshift based FFS_FSBTODB.

>Is this correct or does it mean the FS has an incorrect fs_fsbtodb? (and
>there's a bug in mkfs somewhere)


The bug is probably in the arm efi code.

The NetBSD kernel once changed how to use disk addresses by making
device drivers work with DEV_BSIZE units.

Everything else, userland and standalone code, accesses disks in
sector size units and needs to use the traditional definition in
the #else part.


The filesystem image stores the factor between filesystem blocks (frags)
and physical disk blocks in the superblock. Traditional code needs this
information, the NetBSD kernel ignores it.

You can use 'dumpfs -s' to look at the numbers.
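
A sketch of the two conversions, written in Python for brevity. It is
modeled on my understanding of the two variants discussed here (one
based on fs_fshift and DEV_BSIZE units, one based on fs_fsbtodb); the
function names are made up and the example values are assumptions.

```python
DEV_BSHIFT = 9  # DEV_BSIZE is 512 bytes


def fsbtodb_kernel(frag, fs_fshift):
    """Kernel view: the result counts DEV_BSIZE units."""
    return frag << (fs_fshift - DEV_BSHIFT)


def fsbtodb_traditional(frag, fs_fsbtodb):
    """Userland/standalone view: the result counts physical sectors."""
    return frag << fs_fsbtodb


# Example: 4096-byte frags on a disk with 4096-byte sectors,
# i.e. fs_fshift = 12 and fs_fsbtodb = 0 in the superblock.
frag = 10
print(fsbtodb_kernel(frag, 12) * 512)         # byte offset, kernel units
print(fsbtodb_traditional(frag, 0) * 4096)    # byte offset, sector units
```

Both describe the same byte offset on disk, they just count in different
units. Using the kernel variant in code that addresses the disk in
sectors (or the other way around) gives wrong block numbers on anything
that is not a 512-byte-sector disk.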



Re: framebuffer refresh rate and geometry

2022-04-11 Thread Michael van Elst
On Mon, Apr 11, 2022 at 04:23:04PM -0700, Paul Goyette wrote:
> On Mon, 11 Apr 2022, RVP wrote:
> 
> > On Mon, 11 Apr 2022, Michael van Elst wrote:
> > 
> > > N.B. if the display driver provides EDID data to wscons it can be
> > > queried with
> > > 
> > > wsconsctl -d edid
> > > 
> > 
> > newdrm seems to have lost this ability (since, at least, Oct '21). The old
> > DRM in 9.2_STABLE still fetches an EDID on the same HW (IvyBridge mobile
> > IGPU).
> 
> Works for me on -current as of 9.99.93 with nouveau and reredrm

wsconsctl learned about the edid attribute in -current.

Either a pullup is necessary, or you can try with the -current binary.


-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: framebuffer refresh rate and geometry

2022-04-11 Thread Michael van Elst
On Tue, Apr 12, 2022 at 12:52:36AM +, Emmanuel Dreyfus wrote:
> On Mon, Apr 11, 2022 at 11:37:29AM -0000, Michael van Elst wrote:
> > I rather doubt that a black display comes from the refresh rate.
> 
> In X11, I get it working at 50 Hz, but I get a black display at 60 Hz.
> 
> > N.B. if the display driver provides EDID data to wscons it can be
> > queried with
> > 
> > wsconsctl -d edid
> 
> # wsconsctl -d edid
> wsconsctl: edid: not found

-current only :-/



-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: framebuffer refresh rate and geometry

2022-04-11 Thread Michael van Elst
m...@netbsd.org (Emmanuel Dreyfus) writes:

>When the kernel initializes a framebuffer, the signal output changes a bit 
>from what is inherited from the BIOS. I face the situation where the 
>display loses the signal, and I suspect this is related to the refresh
>rate.

BIOS usually sets fixed VESA modes.
The DRM code usually sets a mode derived from EDID data sent by the monitor.

With X11 you can use xrandr to select modes provided by the driver
(usually derived from EDID data) and also add custom modes if the
driver supports these.

I rather doubt that a black display comes from the refresh rate.

N.B. if the display driver provides EDID data to wscons it can be
queried with

wsconsctl -d edid




Re: question about enumeration of buses on arm systems with dtb files

2022-04-03 Thread Michael van Elst
dty...@anduin.org.uk (Dave Tyson) writes:

>The contents of the dtb shoved in-core can be modified by having code snippets 
>in the overlay directory and enabling them in config.txt. This is how bullseye 
>linux seems to do things and I guess this should work with NetBSD.

Yes, works the same.

The firmware will also modify the in-core DTB, depending on the RPI
model and there is also the dtparam instruction in config.txt
that makes the firmware (start*.elf) generate some overlays on
the fly. However, some of that is Linux specific and no use for
NetBSD.



Re: question about enumeration of buses on arm systems with dtb files

2022-04-03 Thread Michael van Elst
campbell+netbsd-tech-k...@mumble.net (Taylor R Campbell) writes:

>I think usually it's first-come-first-serve according to the ordering
>in the device tree (that is, in a pre-order traversal, which is the
>same as the sequential order in the .dts file) -- not necessarily the
numbers in the device tree node names.  So if the device tree lists
>nodes i2c2, i2c0, i2c1, in that order, then it'll be

>/dev/iic0 -> i2c2
>/dev/iic1 -> i2c0
>/dev/iic2 -> i2c1


If you enable all interfaces (status = 'okay' in DTS) you get:

% ofctl -p | grep /i2c@
2474: /i2c@7e205000
2fa4: /i2c@7e804000
30a0: /i2c@7e805000

This is also the order in the DTB:

i2c@7e205000 {
        phandle = <0x46>;
        ...
i2c@7e804000 {
        phandle = <0x4d>;
        ...
i2c@7e805000 {
        phandle = <0x15>;
        ...


In dmesg you will just see:

bsciic0 at simplebus1: Broadcom Serial Controller
iic0 at bsciic0: I2C bus
bsciic1 at simplebus1: Broadcom Serial Controller
iic1 at bsciic1: I2C bus
bsciic2 at simplebus1: Broadcom Serial Controller
iic2 at bsciic2: I2C bus

But with drvctl you can see that the order is different.

% drvctl -p bsciic0 fdt-path
/soc/i2c@7e805000
% drvctl -p bsciic1 fdt-path
/soc/i2c@7e205000
% drvctl -p bsciic2 fdt-path
/soc/i2c@7e804000

The FDT code sorts nodes by an 'order' value, which currently
is the phandle number, see the fdt_get_order() function.
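
The effect can be reproduced with a few lines of Python. This is only a
model of the ordering, not the FDT code itself; the node list is taken
from the DTB excerpt above and the function name is made up.

```python
# (path, phandle) in the order the nodes appear in the DTB.
nodes = [
    ("/soc/i2c@7e205000", 0x46),
    ("/soc/i2c@7e804000", 0x4D),
    ("/soc/i2c@7e805000", 0x15),
]


def attach_order(nodes, driver="bsciic"):
    """Assign unit numbers after sorting by the 'order' value (phandle)."""
    ordered = sorted(nodes, key=lambda node: node[1])
    return {f"{driver}{unit}": path
            for unit, (path, _phandle) in enumerate(ordered)}


for name, path in attach_order(nodes).items():
    print(name, path)
```

Sorting by phandle puts 0x15 first, so bsciic0 ends up at
/soc/i2c@7e805000, which is what drvctl reports.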


Another issue is that depending on the RPI model, different
I2C interfaces are used by the VideoCore and attaching the
driver may break functionality, so you normally do not
attach all 3 interfaces but just the one that is routed to
the GPIO pins.




Re: Hinted mmap(2) without MAP_FIXED can fail prematurely

2022-02-18 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>Which then basically means that without MAP_FIXED, the hint doesn't really 
>mean anything? It will take whatever address it can come up with, no 
>matter what you put into the hint.

It still might reuse the address (or just a close address) for efficiency.


>With MAP_FIXED, the hint needs to be exactly on a page boundary, which 
>makes sense. Without MAP_FIXED, and with a hint, I would expect that 
>things like rounding the address to the proper alignment, and so on, 
>would be allowed, but not that it would just take any address. If I'm ok 
>with it taking any random address, then I shouldn't provide a hint.

Unfortunately it doesn't matter. Linux will try the hint (aligned to a page)
and fall back to an arbitrary address. Software relies on this.
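
A small experiment to see what a given system does with a hint. This is
a Python sketch that calls the C library's mmap() through ctypes; the
hint address is an arbitrary page-aligned value chosen for the example,
and the off_t argument is declared as a C long, which is an assumption
that holds on common 64-bit systems.

```python
import ctypes
import mmap

# Load the symbols of the running process, which include libc.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value


def map_with_hint(hint, length=mmap.PAGESIZE):
    """Anonymous private mapping with a hint but without MAP_FIXED."""
    addr = libc.mmap(hint, length, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    return addr


hint = 0x40000000  # arbitrary page-aligned example address
addr = map_with_hint(hint)
print(hex(hint), hex(addr), "honored" if addr == hint else "moved")
libc.munmap(addr, mmap.PAGESIZE)
```

Software that relies on the Linux behaviour expects this call to succeed
whether the hint is honored or not; only with MAP_FIXED is a failure at
the given address an acceptable result.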



Re: Hinted mmap(2) without MAP_FIXED can fail prematurely

2022-02-17 Thread Michael van Elst
p...@cielonegro.org (PHO) writes:

>I expected mmap(2) to search for an available region from the entire 
>address space starting from the hint, not only half of it.

It's not even half.



Re: Hinted mmap(2) without MAP_FIXED can fail prematurely

2022-02-17 Thread Michael van Elst
b...@softjar.se (Johnny Billquist) writes:

>If it would ignore the hint, what's the point of the hint then?

With MAP_FIXED it must use the hint; without it, the hint is just a
best-effort attempt.



Re: reboot panic: "error == EOPNOTSUPP" in vfs_vnode.c line 1120

2022-02-06 Thread Michael van Elst
m...@eterna.com.au (matthew green) writes:

>while rebooting a quartz64 with a usb attached disk that just
>had about 3.5GB of data written to it, the umass gave some
>errors and then i got a panic shortly later:

>[ 6179.6038961] Skipping crash dump on recursive panic
>[ 6179.6038961] panic: kernel diagnostic assertion "error == EOPNOTSUPP" failed: file "/usr/src/sys/kern/vfs_vnode.c", line 1120

genfs_suspendctl() may return ENOENT if IMNT_GONE is set in mnt_iflag.
I'm not sure if mnt_iflag is even protected against concurrent access
but vrevoke() racing with dounmount() might be enough.




Re: Some guidance/suggestion please

2022-01-14 Thread Michael van Elst
m...@eterna.com.au (matthew green) writes:

>it seems to me that if some driver depends upon altq, then
>altq should simply always refuse to unload if a driver is
>loaded that depends upon it.  this should be an explicit
>dependency, and probably implicit via symbols.

>if, say there's a fully modular system with two NICs, and
>only one of them supports altq.  only one of the NIC drivers
>will declare a dep on altq, blocking the unload of altq
>while the driver remains loaded.

>if this isn't already the case, can we arrange it to be?


Isn't altq going to be redesigned anyway for NET_MPSAFE ?
Efforts to move it into a module might be a bit premature then.

N.B. I'd just make the different queuing mechanisms loadable
but keep hooks for the network drivers in the kernel itself,
similar to bufq for disk drivers.



Re: block/dk devices and lseek()

2021-11-30 Thread Michael van Elst
r...@sdf.org (RVP) writes:

>Are block and dk* (wedge) devices supposed to support lseek()?

When a disk device is opened, the DIOCGPARTINFO ioctl is used to
query the size of the disk and cache it in the vnode.

The dk driver doesn't represent a disk and doesn't support DIOCGPARTINFO.
It does support DIOCGWEDGEINFO, DIOCGDISKINFO and DIOCGSECTORSIZE +
DIOCGMEDIASIZE that would reveal the information.

The same is true for the dm driver.


But that's only half of the story.

>$ sudo stat -f '%N: %z' /dev/rsd0
>/dev/rsd0: 7849115648  # works
>$ sudo stat -f '%N: %z' /dev/sd0
>/dev/sd0: 0  # is this correct?

The stat command uses the lstat() system call; it does not
open a device, but only returns the cached size information
in the vnode.

A device that is open or where the vnode is still cached after an open
will reveal the size, otherwise you get a zero result. E.g.

# stat -f "%N: %z" /dev/raid0b
/dev/raid0b: 968884224
# stat -f "%N: %z" /dev/rraid0b
/dev/rraid0b: 0
# dd if=/dev/rraid0b of=/dev/null count=10
10+0 records in
10+0 records out
5120 bytes transferred in 0.014 secs (365714 bytes/sec)
# stat -f "%N: %z" /dev/rraid0b
/dev/rraid0b: 968884224

The root block device can also be special as it usually isn't opened
(with spec_open) and cannot be opened later as it is always busy, so
the size is never cached.
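
The difference between the two ways of asking for a size, as a Python
sketch (function names made up). The first never opens the node and
reports whatever the stat information says; the second opens it and
seeks to the end, and whether that works for a particular device is up
to the driver.

```python
import os


def size_from_lstat(path):
    """What stat(1) shows: no open, just the size recorded for the node."""
    return os.lstat(path).st_size


def size_from_open(path):
    """Open the node and seek to the end to find its size."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)
```

For a regular file both return the same number. For a device node the
first one can return 0 as shown above, simply because nothing has opened
the device yet.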




Re: wsvt25 backspace key should match terminfo definition

2021-11-24 Thread Michael van Elst
On Wed, Nov 24, 2021 at 12:05:28AM +, RVP wrote:
> So, if I had a USB keyboard (don't have one to check right now), the
> terminfo entry would be correct? How do we make this consistent then?
> Have 2 terminfo entries: wsvt25-ps2 and wsvt25-usb (and fix-up getty
> to set the correct one)?

Most programs will use information from the terminal driver and only
fall back to terminfo when the driver doesn't return anything.

Some people also helped themselves with $TERMCAP. But I wouldn't
create new entries for ps2 or usb (if anything use '-bs'/'-del').


Greetings,
-- 
    Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: wsvt25 backspace key should match terminfo definition

2021-11-23 Thread Michael van Elst
r...@sdf.org (RVP) writes:

>The kernel currently defines the backspace key as:

>$ fgrep CERASE /usr/include/sys/ttydefaults.h
>#define CERASE  0177

There is no 'defined as', in particular with emulated terminals
that aren't even the same on all platforms.

If you restrict yourself to PC hardware (i386/amd64 arch) then
you probably have either

a PS/2 keyboard -> the backspace key generates a DEL.
a USB keyboard  -> the backspace key generates a BS.

That's something you cannot shoehorn into a single terminfo
attribute and that's why many programs simply ignore terminfo
here, in particular when you think about remote access.

The more appropriate data source is the tty setting that
is also most often inherited from the remote computer.

Terminfo becomes more real when you actually talk to
a specific hardware terminal, e.g. connected to a
serial console. Terminal emulations again add a level
of ambiguity to such 'definitions'.
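
A sketch of what "use the tty setting" means for a program, in Python.
The fallback value stands in for whatever terminfo or a built-in default
would say and is an assumption of this example, as is the function name.

```python
import os
import termios


def erase_char(fd, fallback=b"\x7f"):
    """Return the erase character configured on the tty behind fd."""
    if not os.isatty(fd):
        return fallback  # no tty, nothing to query
    cc = termios.tcgetattr(fd)[6]  # the control character array
    return cc[termios.VERASE]
```

On a terminal, erase_char(0) returns whatever "stty erase" was set to,
be it DEL or BS, independent of which keyboard or emulator generated it.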



Re: Representing a rotary encoder input device

2021-09-22 Thread Michael van Elst
thor...@me.com (Jason Thorpe) writes:

>Well, ultimately, we translate “HID report” -> “ws* input event”.  Or are
>you suggesting that we should have a new interface to user-space that just
>sends HID reports?

We already have a user-space interface that sends HID reports, we could
make it a bit more universal.



Re: Representing a rotary encoder input device

2021-09-21 Thread Michael van Elst
thor...@me.com (Jason Thorpe) writes:

>Trying to think about the best way to represent such a device, I guess
>within wscons (they almost seem sort of like a 1-axis mouse, but I could
>be convinced otherwise).

You can make it a HID, because that's what it is.

Currently we only expose USB hid devices and sys/dev/hid is not much
more than a framework to support these. This would need to be extended
to a real abstraction (and uhidev might then be just a subclass).

If someone then wants to use this input device for console, you can
easily attach a wsmouse to it.


