Date: Thu, 3 Oct 2024 01:13:52 -0400 (EDT)
From: Mouse <mo...@rodents-montreal.org>
Message-ID: <202410030513.baa08...@stone.rodents-montreal.org>
| One of these machines is an ASRock Q1900M.  It has only two SATA ports
| onboard; it has two PCIe x1 slots and a PCIe x16 slot.  I just today
| picked up a 5-port PCIe SATA card and tried it.

I have one of those (though 6-port, I think) - it also requires x4; I
have it in an x8 slot (the motherboard has no x4s).  (I have a lot of
drives connected; this isn't because the motherboard has limited SATA.)

The first one I had obviously required BIOS (firmware) support to be
enabled - it used to periodically vanish, as in even PCI config cycles
would not find the thing; just as you reported, it simply appeared not
to be there at all.  Playing around with the firmware settings, turning
on various legacy modes (often including getting it into modes that
would prevent NetBSD booting), would make it appear, though I was never
confident I had a real handle on which change, or combination of
changes, did it.  It took a reset after the config changes (so the
firmware could rescan the busses) - but that's the normal default
action on most systems, I think.  After that all would remain OK, even
when I returned the settings to their "proper" values so NetBSD could
boot successfully, until there was a complete power loss (no power at
all to the card), which would generally put it back into its
"invisible" state again.

After that I managed to break the card (as in physically break, by
pulling too hard on a SATA cable that was still plugged in, breaking
the connector).  It still worked like that, but the SATA cable
connection was not very stable, so I eventually replaced it.  The newer
one (same model, same brand, but about a year younger) has had no issue
making itself visible, without any weird firmware config changes - even
to the point that I now almost trust it enough to disable my "detect
when it has failed" check.

I run raidframe mirrors with one drive on the motherboard controller
and one on the add-in, so that if one of them should fail, the other
can keep things running.  But when the controller vanished, if the
system booted, it would merely see failed raid components and continue
(once, early on, I didn't even notice that the add-in controller had
vanished for a few days!).  That should be harmless, but my raid
mirrors are 12TB or beyond, and every time this happened - which wasn't
all that rare; power glitches aren't uncommon here, were worse back
then than now, and I didn't yet have a UPS - the raids would all need a
reconstruct, which for the biggest would take more than 24 hours (with
no guarantee it would complete without another interruption).

So, I disabled raidframe autoconf (I never needed it for booting; I
boot from manually mirrored filesystems, so if I mess up an update to
one, I can boot from the other and fix things, rather than needing to
get involved in CD/USB booting).  Then I installed an rc.d script, set
to run very, very early, before raidframe would run, which would just
look for the presence of the drives on the add-in controller, and abort
the boot if none were present.  That worked well, and told me when I
needed to play with the firmware to make the controller come back
again.  None of that has been needed since the controller was replaced.
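For anyone wanting the same setup: turning autoconfiguration off is
just one raidctl command per set (raid0 here being whichever set you
care about):

	raidctl -A no raid0

and "raidctl -s raid0" is what shows the failed component after one of
these vanishing events.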
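The checking script itself is trivial - something along these lines (a
sketch from memory, not the exact script I used, and "wd4" is just a
stand-in for whichever drive lives on the add-in controller):

	#!/bin/sh
	#
	# PROVIDE: addincheck
	# BEFORE:  raidframe

	. /etc/rc.subr

	name="addincheck"
	start_cmd="addincheck_start"
	stop_cmd=":"

	addincheck_start()
	{
		# hw.disknames lists every disk that attached at boot;
		# if the controller vanished, its drives won't be there.
		case " $(/sbin/sysctl -n hw.disknames) " in
		*" wd4 "*)
			;;		# drive present, carry on
		*)
			echo "add-in SATA controller missing"
			stop_boot	# rc.subr: abort the multi-user boot
			;;
		esac
	}

	load_rc_config $name
	run_rc_command "$1"

stop_boot() comes from rc.subr and aborts the boot (you end up in
single user, in effect), which is what you want here: a chance to fix
the firmware before raidframe ever sees half-missing mirrors.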
All this boils down to: the controller is probably not dead (though of
course it might be), and playing with firmware settings, as Brian
suggested, might work - just hope that your BIOS has the right kind of
setting to put the card into a state where it does complete hardware
initialisation (which used to be needed on ancient WinTrash, I believe,
so most older firmware should be able to do that, one way or another).
Once visible it should show up in NetBSD autoconfig, with a very high
probability that it will be recognised as a SATA controller and "just
work" - though if you're doing this with a very old NetBSD kernel, it
is possible that the chipset has no support.

But note that on mine, even when the card is visible to NetBSD, the
firmware shows no signs of recognising it at all, at least not in its
visible interface.  I recently discovered that it does apparently know
there are drives out there, as it can locate them, and boot from them.
I recently swapped two drives between the controllers (largely because
I thought I needed to, as I wanted to be able to boot from the one that
had been on the add-in) - only to find that the one now moved to the
add-in controller (which had an ESP partition, and could be booted from
previously) still appears in the boot menu (EFI boot), but with no
indication there of which drive it boots from, whereas the entries for
the drives on the motherboard controller do show one.  That's how I
tell the boot images apart: NetBSD doesn't yet (hopefully soon,
perhaps) have the ability to name the firmware boot choices the way
FreeBSD, Linux, and WinTrash can, so my firmware just shows the ones it
finds by itself as "UEFI Boot" with the drive name (in firmware
notation) appended - except for the one from that add-in controller.
(If I had two bootable drives out there, I'd be in trouble!)

| The reason I'm asking about PCIe is that, as far as I can tell from the
| host, it isn't there at all.

Yes, that's exactly what I would see when mine was in its "hidden"
(probably not enabled by the firmware) mode.

| I note a possible conflict between the "x1" and the presence of a x16
| slot; that 1 is coming from the PCIE_LCAP_MAX_WIDTH bits in PCIE_LCAP,
| which makes me wonder whether something needs configuring to run the
| x16 slot at more than x1.

That's likely to depend upon what (if anything) is plugged in, and
enabled.  The firmware should set that up; it generally depends upon
what cards are connected, and how many lanes each claims to require.
But again, "broken" is also a possibility.

| so if the x16 slot is running x1 (is that even possible?)

It is - you can always run any lower number of lanes than the slot
supports (at least in powers of two).  Some (usually high end)
motherboards (and firmware) even support dividing the slot
(bifurcation), so it appears as several lower-capacity ones.

kre
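ps: if you want to see what a link has actually negotiated, as opposed
to what the slot is capable of, pcictl(8) can decode the PCIe
capability for you - something like (the device/function numbers here
are invented; use whatever "pcictl pci0 list" shows for the card, or
the root port above it):

	pcictl pci0 list
	pcictl pci0 dump -d 28 -f 0

The dump should show both the maximum link width (the
PCIE_LCAP_MAX_WIDTH field you're reading) and the negotiated width from
the link status register, which is the number of lanes actually in use.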