When this occurs, have you tried "dmesg -T" or checked /var/log/kern.log or /var/log/syslog for suspicious log entries around the time of the error? Might be worth taking a peek as well.
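
In case it helps, here is a rough sketch of how that log check could be narrowed down: a small Python filter that keeps only the lines mentioning the Thunderbolt, NVMe, PCIe, or md RAID parts of the setup. The function name and the keyword list are my own assumptions, not anything standard, so adjust them to whatever your kernel actually logs.

```python
import re

# Assumed keywords for this setup (Thunderbolt dock, NVMe drives, md RAID).
# This list is a guess at what is worth looking for; adjust as needed.
SUSPECT = re.compile(r"thunderbolt|nvme|pcie|raid|\bmd\d+\b", re.IGNORECASE)


def filter_kernel_lines(lines, pattern=SUSPECT):
    """Return the log lines that mention any of the suspect subsystems."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]
```

Feeding it `open('/var/log/kern.log', errors='replace')`, or the saved output of "dmesg -T", should leave only the entries worth comparing against the time of the failure.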

Best Regards,
Paul Goins

On Sun, Nov 27, 2022 at 9:24 PM John Jason Jordan <[email protected]> wrote:

> On Sun, 27 Nov 2022 19:06:35 -0600
> Bill Barry <[email protected]> dijo:
>
> >On Sun, Nov 27, 2022 at 7:02 PM Bill Barry <[email protected]> wrote:
> >>
> >> On Sun, Nov 27, 2022 at 6:55 PM John Jason Jordan <[email protected]>
> >> wrote:
> >> >
> >> > On Sun, 27 Nov 2022 16:07:09 -0600
> >> > Bill Barry <[email protected]> dijo:
> >> >
> >> > >So reassembling is what was needed, but you don't need to force
> >> > >it. Now if you just had some way to automate that. Have you
> >> > >connected your external enclosure to the UPS so that it is
> >> > >protected from temporary power disruptions?
> >> >
> >> > Yes, everything is on my three massive APCs (connected together),
> >> > even the stereo.
> >> >
> >> > I keep thinking that the problem lurks somewhere in the crazy
> >> > setup:
> >> >
> >> > 4 Intel 8TB NVMe drives, mounted two each on two long PCIe cards,
> >> > inside a 4-bay PCI enclosure, in turn connected to a Thunderbolt 3
> >> > port on a Lenovo dock, in turn connected to my Lenovo P73 laptop
> >> > via one of its TB3 ports. The whole setup is close to two years
> >> > old, during which time this is the sixth case where the RAID has
> >> > failed.
> >> >
> >> > When I got up this morning everything was running, so it couldn't
> >> > have been caused by a power glitch. There are lots of other
> >> > electrical devices around the house that tell me when PGE hasn't
> >> > been playing nice
> >> > - e.g., the clocks on the range and microwave will be flashing, the
> >> > bedroom television that runs all night will be off, among
> >> > others.
> >>
> >> If it's not a power problem then maybe cabling. How long are your
> >> Thunderbolt cables?
> >>
> >> Bill
> >
> >I see lots of posts out there claiming that their Thunderbolt
> >connection randomly disconnects. That could certainly cause your
> >problem.
> >
> >Bill
>
> Ah! Now that is news I was not aware of! I will do some searching and
> see what additional information I can come up with. And (crossing
> fingers) a solution.
>
> Regarding the cables, I have several TB3 cables, and after the first
> couple of failures I swapped them, with no change in the problem. Of
> course, there are lots more TB3 connection suspects - the three ports,
> for example. But I should have mentioned before that the two PCIe cards
> and their connections in the enclosure, and the connections of the four
> drives to the cards are the least likely suspects, because after each
> of the failures all four drives appear correctly and 'mdadm --query' and
> 'mdadm --detail' both list them as available without a hint of a
> problem.
>
