On 2/2/21 12:31 PM, John Jason Jordan wrote:
On Tue, 2 Feb 2021 12:09:06 -0800
John Jason Jordan <joh...@gmx.com> wrote:

On Mon, 1 Feb 2021 23:48:03 -0800
Ben Koenig <techkoe...@gmail.com> wrote:

A simple test to help everyone here understand what your machine is
doing would be to run through a few reboots and grab the list of
devices, like so:

1) unplug your TB-3 drives and reboot.

2) record the output of 'ls -l /dev/nvme*' here

3) turn the computer off

4) plug in the TB-3 drives

5) turn the computer on and run 'ls -l /dev/nvme*' again.

This will clearly isolate the device nodes for your enclosure
independently of everything else on your computer. Once we have the
drives isolated, it's trivial to watch them for irregular behavior.
Until we have more confidence in the existence of your /dev/nvme nodes
we can ignore the other symptoms.
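If you save the two listings, the comparison can be automated. Here is a rough Python sketch of that idea. The function names and the sample strings are made up for illustration, and it only assumes the device paths appear somewhere in the text you feed it (plain 'ls' or 'ls -l' output both work).

```python
import re


def device_nodes(listing):
    """Pull every /dev/nvme* path out of `ls` or `ls -l` output."""
    return set(re.findall(r"/dev/nvme\w+", listing))


def enclosure_nodes(before, after):
    """Nodes that exist only in the second listing (enclosure plugged in)."""
    return sorted(device_nodes(after) - device_nodes(before))


if __name__ == "__main__":
    # Illustrative sample text only, not output from a real machine.
    before = "/dev/nvme0 /dev/nvme0n1"
    after = "/dev/nvme0 /dev/nvme0n1 /dev/nvme1 /dev/nvme1n1"
    print(enclosure_nodes(before, after))
```

Anything the second listing has that the first one lacks should belong to the enclosure, as long as the internal drive keeps the same number across both boots.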
Here are the results:

1: (after unplugging TB3 device and rebooting)
crw------- 1 root root 239, 0 Feb  2 12:01 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Feb  2 12:01 /dev/nvme0n1
brw-rw---- 1 root disk 259, 1 Feb  2 12:01 /dev/nvme0n1p1
brw-rw---- 1 root disk 259, 2 Feb  2 12:01 /dev/nvme0n1p2
Note that nvme0 is a 1TB M.2 drive inside the ThinkPad that holds / and
/home.

2: (after turning off computer, plugging in TB3 device, and booting)
crw------- 1 root root 239, 0 Feb  2 11:47 /dev/nvme0
brw-rw---- 1 root disk 259, 0 Feb  2 11:47 /dev/nvme0n1
crw------- 1 root root 239, 1 Feb  2 11:47 /dev/nvme1
brw-rw---- 1 root disk 259, 2 Feb  2 11:47 /dev/nvme1n1
crw------- 1 root root 239, 2 Feb  2 11:47 /dev/nvme2
brw-rw---- 1 root disk 259, 1 Feb  2 11:47 /dev/nvme2n1
crw------- 1 root root 239, 3 Feb  2 11:47 /dev/nvme3
brw-rw---- 1 root disk 259, 3 Feb  2 11:47 /dev/nvme3n1
crw------- 1 root root 239, 4 Feb  2 11:47 /dev/nvme4
brw-rw---- 1 root disk 259, 4 Feb  2 11:47 /dev/nvme4n1
brw-rw---- 1 root disk 259, 5 Feb  2 11:47 /dev/nvme4n1p1
brw-rw---- 1 root disk 259, 6 Feb  2 11:47 /dev/nvme4n1p2
After the above I opened Ktorrent. It presented me with a couple of error
messages about missing files, so I pointed it to a folder that I had
renamed, which it happily accepted, and it is now seeding all its
torrents. The renamed folders were my fault. Then I opened a file
manager to /dev and scrolled down to the nvme entries, which gave me:

nvme0
nvme0n1
nvme1
nvme1n1
nvme2
nvme2n1
nvme3
nvme3n1
nvme4
nvme4n1
nvme4n1p1
nvme4n1p2

And scrolling up a bit I see md127 and md127p1.

Everything is back to normal. My only problem is what happens when the
md127 and md127p1 suddenly become read-only again. It happened during
the night of February 1, so I can assume that eventually it's going to
happen again.


There's probably a bunch of debug information dumped into a log somewhere. If it's working now and you didn't actually change anything, then it was probably a brief drop in the connection to one or more of the drives. Depending on how paranoid you want to be, there are steps you can take to track down the root cause.
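If you want to dig through the logs, one option is to save the kernel messages to a file (for example with 'dmesg' or 'journalctl -k') and filter them. A minimal Python sketch follows. The keyword list is only my guess at what might be relevant here, and the script and file names are hypothetical.

```python
import re
import sys

# Assumed keywords; adjust to whatever your logs actually contain.
PATTERN = re.compile(r"nvme|md127|thunderbolt|read-only", re.IGNORECASE)


def relevant_lines(log_text):
    """Keep only the log lines that mention one of the keywords."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 filter_log.py kernel.log
    with open(sys.argv[1], errors="replace") as handle:
        for line in relevant_lines(handle.read()):
            print(line)
```

Running it on a log saved right after the array goes read-only should narrow things down to the messages worth reading.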


The one thing to keep an eye on is the number of nvme devices in /dev. If you ever see drives numbered above nvme4, that means the kernel is losing track of your connected drives. That's a very specific class of problem and usually easy to deal with once isolated.
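That check is easy to script if you'd rather not eyeball /dev. A small Python sketch, assuming nvme4 is the highest controller number you expect (as in the listing above). The names EXPECTED_MAX, controller_indices, and unexpected are just illustrative.

```python
import glob
import re

EXPECTED_MAX = 4  # assumption: nvme0 through nvme4 is the normal set


def controller_indices(paths):
    """Controller numbers from names like /dev/nvme3, /dev/nvme3n1, /dev/nvme3n1p2."""
    found = set()
    for path in paths:
        match = re.fullmatch(r"/dev/nvme(\d+)(?:n\d+(?:p\d+)?)?", path)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


def unexpected(paths, expected_max=EXPECTED_MAX):
    """Controller numbers higher than the expected maximum."""
    return [i for i in controller_indices(paths) if i > expected_max]


if __name__ == "__main__":
    extra = unexpected(glob.glob("/dev/nvme*"))
    if extra:
        print("controllers above nvme%d: %s" % (EXPECTED_MAX, extra))
    else:
        print("no controllers above nvme%d" % EXPECTED_MAX)
```

Run it whenever the array misbehaves. If it reports anything, the drives have been re-enumerated since boot.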


For now it's probably best to continue as normal, and if it happens again, just take a look at your nvme devices in /dev and see whether they have changed.

-Ben

_______________________________________________
PLUG: https://pdxlinux.org
PLUG mailing list
PLUG@pdxlinux.org
http://lists.pdxlinux.org/mailman/listinfo/plug
