On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloa...@gmail.com> wrote:

> On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1...@gmail.com> wrote:
> >
> > I recently had the Fedora install on my laptop go sideways (Ryzen 5
> > 4500U w/ nvme disk).
> >
> > The filesystem was going readonly so I installed System Rescue CD to a
> > thumb drive to investigate. Sure enough I had 4 unrecoverable errors.
> >
> > I don't keep anything critical on it so I decided to just reinstall with
> > Fedora 40. Installation went fine. I did notice weird dnf output on my
> > first update, but everything SEEMED fine...
> >
> > I rebooted after the update and tried to log in, but after a minute or
> > two the system froze. I rebooted and, sure enough, `dmesg | grep BTRFS`
> > showed an error.
> >
> > Back in System Rescue CD, neither `btrfs check --check-data-csum` nor,
> > after mounting, `btrfs scrub` showed any errors.
> >
> > So who's right? And if there is an error, what's causing it? I've
> > checked the drive with smartctl and even let the factory HP firmware diag
> > tools run in a loop overnight checking everything without error.
>
> The (1) irrecoverable disk errors from the original install, and (2)
> the errors from the current install, and (3) the errors from dnf
> indicate (to me) you have a failed NVMe drive. I used to see the
> symptoms all the time when using SDcards in ARM dev boards. I would
> put a swap file on the dev board (due to lack of resources), and the
> drives would fail within about 6 months with the symptoms you
> describe.
>
> Now the interesting part (to me) is, (4) lack of errors reported by
> some tools. That indicates to me a Chinese drive that misreports drive
> size and statistics. They usually show up on thumb drives, but I
> experienced one on an SSD years ago. Also see
> <https://www.google.com/search?q=counterfeit+drive+misreport+size>.
>
> All in all, I would replace the NVMe drive with a new one from a
> trusted source. Not Amazon or eBay.
>

It's the drive that came with the laptop, so it's unlikely to be a
cheap/phony drive, but the mystery does get deeper...

1. I see the same results even when I boot from an F40 Live USB. I'm
thinking that the system caught the problem quickly enough that the error
never actually got written to the disk.

2. I consistently see the problem at about 30 seconds (from dmesg) if I
boot the 6.9.9 or 6.9.10 kernels that were installed via updates. If I
boot 6.8.5, the kernel that shipped with F40, I can't reproduce the problem.

Of course that's strange, because if this were a widespread issue there
would be tons of people complaining.
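
To make the per-kernel comparison in point 2 repeatable, something like the
sketch below could pull the time of the first Btrfs complaint out of a saved
`dmesg` log. It is only a rough illustration: the function names are made up
for this example, and it assumes the default `[seconds.micros]` timestamp
prefix and a "BTRFS error/critical/warning" message prefix, which may not
cover every message format a given kernel emits.

```python
import re

# Assumption: default dmesg format, e.g. "[   30.123456] message text".
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Assumption: problem lines start with "BTRFS" followed by a severity word.
BTRFS_PROBLEM = re.compile(r"BTRFS:? (error|critical|warning)")


def btrfs_problems(dmesg_text):
    """Return a list of (seconds_since_boot, message) for Btrfs problem lines."""
    hits = []
    for line in dmesg_text.splitlines():
        match = DMESG_LINE.match(line)
        if match and BTRFS_PROBLEM.search(match.group(2)):
            hits.append((float(match.group(1)), match.group(2)))
    return hits


def first_problem_time(dmesg_text):
    """Return seconds since boot of the first Btrfs problem line, or None."""
    hits = btrfs_problems(dmesg_text)
    return hits[0][0] if hits else None
```

Saving `dmesg` to a file after booting each kernel (6.8.5, 6.9.9, 6.9.10)
and passing the file contents to `first_problem_time` would give one number
per kernel, with `None` meaning no matching line was found in that log.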

Thanks,
Richard
