Re: rm: fts_read: No such file or directory
Hi Otto,

Thanks for your reply.

On Thu, Jan 14, 2021 at 08:22:33AM +0100, Otto Moerbeek wrote:
| > Could there be some TOCTOU issue here somewhere?  Or some cache
| > misbehaviour?  Or is it really dying hardware?
|
| My first bet would be some form of corruption. Flipped bits in e.g.
| directories while operating normally cannot be seen by the
| clean/unclean flag in the superblock. That one only records if the
| filesystem was unmounted before reboot, shutdown or crash.

I understand that - but then why would the error clear on subsequent
runs of rm?

| The forced fsck might reveal more.

It did find some issues, and then was waiting for my input over night
(when the backup run mounted the filesystem and changed things).

** /dev/sd2a (ebb54a869d056df3.a)
** File system is already clean
** Last Mounted on /backup
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
ZERO LENGTH DIR I=57604332 OWNER=root MODE=40755
SIZE=0 MTIME=Jan 13 13:56 2021
CLEAR? [Fyn?] y

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [Fyn?] y

SUMMARY INFORMATION BAD
SALVAGE? [Fyn?] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [Fyn?] y

27766624 files, 396630326 used, 267754002 free (2016066 frags, 33217242 blocks, 0.3% fragmentation)

* FILE SYSTEM WAS MODIFIED *

I ran it once more after that, more issues were found:

** /dev/sd2a (ebb54a869d056df3.a)
** File system is already clean
** Last Mounted on /backup
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [Fyn?] y

SUMMARY INFORMATION BAD
SALVAGE? [Fyn?] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [Fyn?] y

27884252 files, 397169471 used, 267214857 free (1944825 frags, 33158754 blocks, 0.3% fragmentation)

* FILE SYSTEM WAS MODIFIED *

Until the third fsck came back clean:

** /dev/sd2a (ebb54a869d056df3.a)
** File system is already clean
** Last Mounted on /backup
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
27884252 files, 397169471 used, 267214857 free (1944825 frags, 33158754 blocks, 0.3% fragmentation)
  136m19.01s real     4m00.56s user    20m33.85s system

I'll write it off to those errors, but I still don't understand why
re-trying would fix these kinds of issues.

Thanks again, Otto!

Paul

-- 
>[<++>-]<+++.>+++[<-->-]<.>+++[<+
+++>-]<.>++[<>-]<+.--.[-]
http://www.weirdnet.nl/
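
The fsck summary lines quoted above can be compared mechanically. What follows is a minimal Python sketch and not part of the original thread: the helper name `parse_summary` and the regular expression are illustrative assumptions, and the only inputs are the two distinct summary lines copied from the fsck runs above. It shows that the file and used counts grew between the first and second run, which would be consistent with the backup writing to the filesystem in between, while used + free stayed the same.

```python
import re

# Hypothetical pattern for the fsck summary line as it appears in this thread.
SUMMARY_RE = re.compile(
    r"(?P<files>\d+) files, (?P<used>\d+) used, (?P<free>\d+) free "
    r"\((?P<frags>\d+) frags, (?P<blocks>\d+) blocks"
)


def parse_summary(line):
    """Return the integer counters from one fsck summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an fsck summary line: %r" % line)
    return {name: int(value) for name, value in match.groupdict().items()}


# Summary lines copied from the first and second fsck runs.
first = parse_summary(
    "27766624 files, 396630326 used, 267754002 free "
    "(2016066 frags, 33217242 blocks, 0.3% fragmentation)"
)
second = parse_summary(
    "27884252 files, 397169471 used, 267214857 free "
    "(1944825 frags, 33158754 blocks, 0.3% fragmentation)"
)

# Per-counter difference between the two runs.
delta = {name: second[name] - first[name] for name in first}

for name in ("files", "used", "free"):
    print("%-5s %+d" % (name, delta[name]))

# The total (used + free) is the same in both runs.
print("total", first["used"] + first["free"], second["used"] + second["free"])
```

In both summaries the free count equals frags + 8 * blocks, which matches a layout with 8 fragments per block. That ratio is inferred from the numbers only; the thread does not state the filesystem parameters.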
Re: rm: fts_read: No such file or directory
On Wed, Jan 13, 2021 at 09:46:27PM +0100, Paul de Weerd wrote:

> Hi all,
>
> While doing some clean-up on my backup filesystem (which extensively
> uses hardlinks), I came across the error in Subject:
>
>     rm: fts_read: No such file or directory
>
> Traversing the hierarchy I was trying to remove, I get similar
> fts_read errors when I `ls` in certain places, but a repeated rm runs
> to completion fine (the tree is gone afterwards).
>
> There's nothing in dmesg suggesting filesystem corruption, the
> filesystem unmounts and remounts cleanly, I'm running a forced fsck
> now which says "** File system is already clean".  It's a rather large
> filesystem with many inodes in use, so it'll take some time to
> complete.  Also, it's on a softraid crypto device, if that matters:
>
>     sd2: 5231654MB, 512 bytes/sector, 10714427745 sectors
>
> Reading fts_read(3) wasn't really enlightening as to why a directory
> that's supposedly there, wouldn't be there anymore.  (note that I
> wasn't running another rm in the same tree in parallel when I got
> these errors - I did try to force the error by doing just that, but
> that went through without a single error).
>
> Could there be some TOCTOU issue here somewhere?  Or some cache
> misbehaviour?  Or is it really dying hardware?

My first bet would be some form of corruption. Flipped bits in e.g.
directories while operating normally cannot be seen by the
clean/unclean flag in the superblock. That one only records if the
filesystem was unmounted before reboot, shutdown or crash.

The forced fsck might reveal more.

	-Otto

>
> Paul 'WEiRD' de Weerd
>
> OpenBSD 6.8-current (GENERIC.MP) #267: Sat Jan 9 19:23:55 MST 2021
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> [...]
rm: fts_read: No such file or directory
Hi all,

While doing some clean-up on my backup filesystem (which extensively
uses hardlinks), I came across the error in Subject:

    rm: fts_read: No such file or directory

Traversing the hierarchy I was trying to remove, I get similar
fts_read errors when I `ls` in certain places, but a repeated rm runs
to completion fine (the tree is gone afterwards).

There's nothing in dmesg suggesting filesystem corruption, the
filesystem unmounts and remounts cleanly, I'm running a forced fsck
now which says "** File system is already clean".  It's a rather large
filesystem with many inodes in use, so it'll take some time to
complete.  Also, it's on a softraid crypto device, if that matters:

    sd2: 5231654MB, 512 bytes/sector, 10714427745 sectors

Reading fts_read(3) wasn't really enlightening as to why a directory
that's supposedly there, wouldn't be there anymore.  (note that I
wasn't running another rm in the same tree in parallel when I got
these errors - I did try to force the error by doing just that, but
that went through without a single error).

Could there be some TOCTOU issue here somewhere?  Or some cache
misbehaviour?  Or is it really dying hardware?

Paul 'WEiRD' de Weerd

OpenBSD 6.8-current (GENERIC.MP) #267: Sat Jan 9 19:23:55 MST 2021
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34311208960 (32721MB)
avail mem = 33256046592 (31715MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xe6690 (57 entries)
bios0: vendor Dell Inc. version "2.10.0" date 05/24/2018
bios0: Dell Inc. PowerEdge R210 II
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP SPMI DMAR ASF! HPET APIC MCFG BOOT SSDT ASPT SSDT SSDT SPCR HEST ERST BERT EINJ
acpi0: wakeup devices P0P1(S4) GLAN(S0) EHC1(S4) EHC2(S4) XHC_(S4) RP01(S5) PXSX(S4) RP02(S5) PXSX(S4) RP03(S5) PXSX(S4) RP04(S5) PXSX(S4) RP05(S5) PXSX(S4) RP06(S5) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.91 MHz, 06-2a-07
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
cpu4 at mainbus0: apid 4 (application processor)
cpu4: Intel(R) Xeon(R) CPU E31260L @ 2.40GHz, 2394.58 MHz, 06-2a-07
cpu4: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 0, core 2, package 0
cpu5 at