On Fri, Jul 26, 2024 at 8:59 AM John Mellor <john.mel...@gmail.com> wrote:

> On 2024-07-26 8:25 a.m., Richard Shaw wrote:
>
> On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloa...@gmail.com> wrote:
>
>> On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1...@gmail.com>
>> wrote:
>> >
>> > I recently had the Fedora install on my laptop go sideways (Ryzen 5
>> 4500U w/ nvme disk).
>> >
>> > The filesystem was going read-only so I installed System Rescue CD to a
>> thumb drive to investigate. Sure enough I had 4 unrecoverable errors.
>> >
>> > I don't keep anything critical on it so I decided to just reinstall
>> with Fedora 40. Installation went fine, and I did notice weird dnf output on
>> my first update, but everything SEEMED fine...
>> >
>> > I rebooted after the update and tried to log in, but after a minute or
>> two the system froze. I rebooted again and, sure enough, a `dmesg | grep BTRFS`
>> showed an error.
>> >
>> > Back to booting with System Rescue CD: neither a `btrfs check
>> --check-data-csum` nor, after mounting, a `btrfs scrub` shows any errors.
>> >
>> > So who's right? And if there is an error, what's causing it? I've
>> checked the drive with smartctl and even let the factory HP firmware diag
>> tools run in a loop overnight checking everything without error.
>>
>> The (1) irrecoverable disk errors from the original install, and (2)
>> the errors from the current install, and (3) the errors from dnf
>> indicate (to me) you have a failed NVMe drive. I used to see the
>> symptoms all the time when using SDcards in ARM dev boards. I would
>> put a swap file on the dev board (due to lack of resources), and the
>> drives would fail within about 6 months with the symptoms you
>> describe.
>>
>> Now the interesting part (to me) is, (4) lack of errors reported by
>> some tools. That indicates to me a Chinese drive that misreports drive
>> size and statistics. They usually show up on thumb drives, but I
>> experienced one on an SSD years ago. Also see
>> <https://www.google.com/search?q=counterfeit+drive+misreport+size>.
>>
>> All in all, I would replace the NVMe drive with a new one from a
>> trusted source. Not Amazon or eBay.
>>
>
> It's the drive that came with the laptop so unlikely to be a cheap/phony
> drive but the mystery does get deeper...
>
> 1. I was able to see the same results even when I booted an F40 Live USB.
> I'm thinking that the system caught the problem quickly enough that the error
> didn't actually get written to the disk.
>
> 2. I consistently see the problem at about 30 seconds (from dmesg) if I
> boot the 6.9.9 or 6.9.10 kernels that have been installed via updates. If I
> boot 6.8.5, the kernel that shipped with F40, I can't reproduce the problem.
>
> Of course that's strange because if this was a widespread issue there
> would be tons of people complaining.
>
> Odds are that you have bad RAM or are running the processor clock higher
> than what it can handle.  I also had this kind of issue when I had a bad
> video card, but the system generally froze or crashed and left the drive in
> an unrecoverable state.  The tools for fixing a btrfs partition are
> generally lacking in Fedora, and the tools that come with btrfs are also
> useless when the failing partition is your active root partition.  I don't
> know if Suse has better tools, but it's a huge problem with Fedora
> recoverability.
>

It's an HP Envy laptop, with no ability to overclock. I did upgrade the memory
from 8GB to 16GB when I first got it over 3 years ago, but it's plain
DDR4-3200. As I previously mentioned, I let the HP diag tools run overnight;
they completed 14 cycles without any errors. I have now also finished letting
Memtest86+ run for 5 complete cycles without any errors.

The only common denominator I have found so far is the two 6.9 kernels I
have installed.
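
For anyone who wants to compare kernels the same way, here is a rough sketch of
pulling the BTRFS error and warning lines, with their seconds-since-boot
timestamps, out of saved `dmesg` output. It assumes the default `dmesg`
timestamp format. The function name `btrfs_problems` and the sample lines are
made up for illustration and are not real output from this laptop.

```python
import re

# Matches default dmesg lines such as "[   30.123456] BTRFS error (device ...): ..."
DMESG_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(BTRFS (?:error|warning|critical).*)$"
)


def btrfs_problems(dmesg_text):
    """Return (seconds_since_boot, message) for each BTRFS error/warning line."""
    hits = []
    for line in dmesg_text.splitlines():
        match = DMESG_LINE.match(line)
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# Hypothetical sample text, NOT captured from the machine discussed here.
SAMPLE = """\
[    2.104233] BTRFS info (device nvme0n1p3): first mount of filesystem
[   30.518822] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[   30.519010] BTRFS warning (device nvme0n1p3): csum failed root 257 ino 1234 off 0
"""

if __name__ == "__main__":
    for seconds, message in btrfs_problems(SAMPLE):
        print(f"{seconds:8.3f}s  {message}")
```

Saving `dmesg` to a file under each kernel (6.8.5, 6.9.9, 6.9.10) and running
the text through something like this would make it easy to see whether the
first error really lands around the 30 second mark each time.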

Thanks,
Richard
-- 
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue