Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Sat, Jun 20, 2015 at 1:37 PM, Theodore Ts'o wrote:
>
> Or the disk could just be 6+ years old, and it's just too old.  If I
> were you I would just replace the hard drives and be done with it.

That's probably going to happen in the near future, yes.

>> In this case, that would have been cfdisk as of roughly 9 months ago,
>> and I *think* the problem was it didn't know what to do with an MD
>> device.  Notice how the outer partitions start at offset 2048 but the
>> inner partitions start at offset 63?
>
> Or this was just the case where cfdisk didn't want to mess with a
> preexisting partition table, and the original partition table as
> shipped from the manufacturer was Windows XP compatible.

It's the partition table *inside* the MD container that's misaligned,
so that can't be it.

I'd like to make certain that there isn't an fsck or kernel bug here.
Is it possible for you to construct a similarly-misaligned partition
within an MD-RAID0 array, unpack the skeleton image I sent you into
that partition, and then try to reproduce my original fsck report on
that?  Do you need more information from me first?

zw
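For reference, an untested sketch of one way such a misaligned MD-RAID0
test case could be stood up from loop devices.  The file names, sizes,
and the /dev/loop* and /dev/md* numbers below are placeholders, not
taken from this report, and the final copy step assumes a raw skeleton
image:

  # Create two small backing files and attach them as loop devices.
  truncate -s 2G raid-member-0.img
  truncate -s 2G raid-member-1.img
  losetup /dev/loop0 raid-member-0.img
  losetup /dev/loop1 raid-member-1.img

  # Assemble a RAID0 array with the same 512k chunk size as in the report.
  mdadm --create /dev/md50 --level=0 --raid-devices=2 --chunk=512 \
        /dev/loop0 /dev/loop1

  # Give the array a DOS label whose first partition starts at sector 63,
  # i.e. deliberately misaligned with the 512k chunk / 1M stripe.
  echo '63,,L' | sfdisk /dev/md50
  partprobe /dev/md50

  # Restore the skeleton image into the new partition (shown here as a
  # raw copy; adjust if it is an e2image-format dump) and run e2fsck twice.
  dd if=skeleton.img of=/dev/md50p1 bs=1M
  e2fsck -yf /dev/md50p1
  e2fsck -yf /dev/md50p1   # second run should come back clean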
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Sat, Jun 20, 2015 at 12:38:56PM -0400, Zack Weinberg wrote:
> Either is possible.  These are an identical pair of Western Digital
> drives and they're about five years old.  They *claim* to have
> 512-byte physical sectors (per hdparm -I -- full dump at the bottom)
> but I would totally believe they are faking that.

I pulled the spec sheet for the drives; the copyright date is
2008-2009, so I suspect they are a bit older than five years.  It does
claim to be a 512-byte physical sector drive, so it's possible the
comment about not being aligned is just in error.

> so, the
> computer's power supply failed catastrophically in the middle of a
> system upgrade, which is how the root filesystem got so very
> corrupted.  That could certainly have caused physical damage.  (The
> drives are currently attached to a different computer for data
> recovery.)

Or the disk could just be 6+ years old, and it's just too old.  If I
were you I would just replace the hard drives and be done with it.

> In this case, that would have been cfdisk as of roughly 9 months ago,
> and I *think* the problem was it didn't know what to do with an MD
> device.  Notice how the outer partitions start at offset 2048 but the
> inner partitions start at offset 63?

Or this was just the case where cfdisk didn't want to mess with a
preexisting partition table, and the original partition table as
shipped from the manufacturer was Windows XP compatible.

						- Ted
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Sat, Jun 20, 2015 at 11:35 AM, Theodore Ts'o wrote:
> On Sat, Jun 20, 2015 at 11:05:31AM -0400, Zack Weinberg wrote:
>>
>> e2fsck successfully repairs both the skeleton image and the complete
>> partition image when they are on a known-good disk.
>
> OK, so this is a storage device issue.  I'd be taking a very jaundiced
> look at the reliability/correctness of your drives.
>
> It could be that they have a firmware bug in how they handle 512e
> emulation.  (See below.)  Or maybe one or more is starting to go bad.
> (Not all drive failures are predicted by S.M.A.R.T.  In fact, only
> about 50-66% of drive failures are predicted by SMART.  Think about
> that the next time you are tempted to skimp on backups. :-)

Either is possible.  These are an identical pair of Western Digital
drives and they're about five years old.  They *claim* to have
512-byte physical sectors (per hdparm -I -- full dump at the bottom)
but I would totally believe they are faking that.  Also, the
computer's power supply failed catastrophically in the middle of a
system upgrade, which is how the root filesystem got so very
corrupted.  That could certainly have caused physical damage.  (The
drives are currently attached to a different computer for data
recovery.)

The fsck behavior I originally reported continues to be 100%
reproducible on the physical partition.  There are no hard errors in
the SMART logs for either drive.  (After I'm done copying data off the
/home partition, which was not corrupted, I will run extended
self-tests.)  Before the catastrophic power supply failure, there were
no problems writing data to either filesystem inside the RAID array.
And the outer partitions are properly aligned.  Putting all of those
things together, I wonder whether this might be a bug in direct (not
filesystem) access to the block devices for misaligned partitions
within MD-RAID0.

Is it possible for you to construct a similarly-misaligned partition
within an MD-RAID0 array, unpack the skeleton image I sent you into
that partition, and then try to reproduce my original fsck report on
that?  Do you need more information from me first?

...

> Yeah, that's not good.  Congratulations, whatever software set up your
> RAID configuration is as intelligent (or as obsolete) as Windows XP.
> Which explains why hard drive vendors are still selling 512e drives,
> although they devoutly wish they could stop.

In this case, that would have been cfdisk as of roughly 9 months ago,
and I *think* the problem was it didn't know what to do with an MD
device.  Notice how the outer partitions start at offset 2048 but the
inner partitions start at offset 63?  (The disks are much older than
the installation because the computer is secondhand, and had been
completely wiped.)
---

# hdparm -I /dev/sdd

/dev/sdd:

ATA device, with non-removable media
        Model Number:       ST3320418AS
        Serial Number:      9VM5KB8B
        Firmware Revision:  CC44
        Transport:          Serial
Standards:
        Used: unknown (minor revision code 0x0029)
        Supported: 8 7 6 5
        Likely used: 8
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  625142448
        Logical/Physical Sector size:         512 bytes
        device size with M = 1024*1024:    305245 MBytes
        device size with M = 1000*1000:    320072 MBytes (320 GB)
        cache/buffer size  = 16384 KBytes
        Nominal Media Rotation Rate: 7200
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Recommended acoustic management value: 208, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
                Power-Up In Standby feature set
                SET_FEATURES required to spinup after power up
                SET_MAX security extension
           *    Automatic Acoustic Management feature set
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
                Write-Read-Verif
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Sat, Jun 20, 2015 at 11:05:31AM -0400, Zack Weinberg wrote:
>
> e2fsck successfully repairs both the skeleton image and the complete
> partition image when they are on a known-good disk.

OK, so this is a storage device issue.  I'd be taking a very jaundiced
look at the reliability/correctness of your drives.

It could be that they have a firmware bug in how they handle 512e
emulation.  (See below.)  Or maybe one or more is starting to go bad.
(Not all drive failures are predicted by S.M.A.R.T.  In fact, only
about 50-66% of drive failures are predicted by SMART.  Think about
that the next time you are tempted to skimp on backups. :-)

> Here's some more detail about the partition.  The "Partition does not
> start on physical sector boundary" thing might be relevant.  (I can't
> say I understand how that can even happen, though.)

Modern hard drives have either 512e or 4k sectors.  512-byte emulation
(512e) is provided for backwards compatibility with Windows XP.  It
means the drive has a logical sector size of 512 bytes and a physical
sector size of 4096 bytes.  You are therefore *allowed* to send writes
which are a multiple of 512 bytes but which are not aligned on a 4096
byte boundary, or are not a multiple of 4096 bytes; however, the drive
will then do a read-modify-write cycle, which is not the most
efficient thing in the world.  If the partition is not aligned on a 4k
boundary, then *all* writes will be subject to a read-modify-write
cycle, which will of course trash your write performance.

For drives with a physical and logical sector size of 4k, the LBA
numbers sent to the hard drive are in units of 4k.  So a drive LBA of
2 represents the physical sector which is 8192 bytes from the
beginning of the disk.  However, Linux internally always uses sector
numbers in units of 512 bytes, so when you see the term LBA thrown
around, you need to be careful about whether you are talking about
LBAs from the point of view of the Linux kernel, or LBAs from the
SATA/SCSI specification's point of view.  In Linux, the device driver
will take a request to read LBA #24 with a sector count of 16 and turn
that into a SATA command requesting a read of 2 drive sectors starting
at drive LBA #3.  Hence, on a drive with 4k logical/physical sectors,
it is *impossible* to send misaligned reads or writes; we talk to the
drive in units of 4k.

> Device        Boot     Start        End   Sectors   Size Id Type
> /dev/md127p1              63  128005919  128005857    61G 83 Linux
> /dev/md127p2       128005920 1113433019  985427100 469.9G 83 Linux
>
> Partition 1 does not start on physical sector boundary.
> Partition 2 does not start on physical sector boundary.
> Remaining 7236 unallocated 512-byte sectors.

Yeah, that's not good.  Congratulations, whatever software set up your
RAID configuration is as intelligent (or as obsolete) as Windows XP.
Which explains why hard drive vendors are still selling 512e drives,
although they devoutly wish they could stop.  It took them a decade
longer to introduce native 4k sector drives than they had originally
wished, and most of this can be blamed on the failure of Windows Vista
and the fact that enterprises stuck with Windows XP for much longer
than anyone (including Microsoft) would have wanted.

						- Ted
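To put numbers on the alignment point for the partitions in this
report, here is a small untested shell sketch.  It assumes the drives
really are 512e (512-byte logical / 4096-byte physical sectors); the
variable names are just placeholders:

  log=512            # logical sector size (bytes)
  phys=4096          # assumed physical sector size for a 512e drive
  start=63           # inner partition start, in 512-byte sectors (per sfdisk)
  spp=$((phys / log))                       # 512-byte sectors per physical: 8
  echo "start % $spp = $((start % spp))"    # 63 % 8 = 7 -> not on a 4k boundary
  echo "offset into physical sector: $(( (start % spp) * log )) bytes"  # 3584
  # Compare with the outer partitions, which start at sector 2048:
  echo "2048 % $spp = $((2048 % spp))"      # 0 -> 4k-aligned (1 MiB, in fact)

So with a start sector of 63, every "aligned-looking" 4k write inside
the partition actually straddles two physical sectors, and a 512e
drive has to read-modify-write each of them.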
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
> It could be caused by a hardware problem, or if it's a RAID array, if
> the RAID array is out of sync, it's possible for two subsequent reads
> to return something else.

It's RAID0, which I *believe* can't get out of sync, but there is much
I do not understand about RAID.

> Can you take the two .gz files and reconstruct a file system on some
> other system with a known-good disk, and then try running e2fsck on
> the image?

e2fsck successfully repairs both the skeleton image and the complete
partition image when they are on a known-good disk.

Here's some more detail about the partition.  The "Partition does not
start on physical sector boundary" thing might be relevant.  (I can't
say I understand how that can even happen, though.)

md127 : active raid0 sde3[1] sdd3[0]
      556720128 blocks super 1.2 512k chunks

# sfdisk -Vl /dev/md127
Disk /dev/md127: 531 GiB, 570081411072 bytes, 1113440256 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes
Disklabel type: dos
Disk identifier: 0x2c9d8483

Device        Boot     Start        End   Sectors   Size Id Type
/dev/md127p1              63  128005919  128005857    61G 83 Linux
/dev/md127p2       128005920 1113433019  985427100 469.9G 83 Linux

Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Remaining 7236 unallocated 512-byte sectors.

# sfdisk -Vl /dev/sdd
Disk /dev/sdd: 298.1 GiB, 320072933376 bytes, 625142448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x3ce15391

Device     Boot    Start       End   Sectors   Size Id Type
/dev/sdd1  *        2048   1050623   1048576   512M 83 Linux
/dev/sdd2        1050624  68159487  67108864    32G 82 Linux swap / Solaris
/dev/sdd3       68159488 625142447 556982960 265.6G fd Linux raid autodetect

# sfdisk -Vl /dev/sde
Disk /dev/sde: 298.1 GiB, 320072933376 bytes, 625142448 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x75d309b0

Device     Boot    Start       End   Sectors   Size Id Type
/dev/sde1           2048   1050623   1048576   512M 83 Linux
/dev/sde2        1050624  68159487  67108864    32G 82 Linux swap / Solaris
/dev/sde3       68159488 625142447 556982960 265.6G fd Linux raid autodetect
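One untested way to probe the "two subsequent reads return something
else" theory directly against the inner partition device, bypassing
the page cache; the read size and count are arbitrary placeholders:

  # Read the same region twice with O_DIRECT and compare checksums.
  dd if=/dev/md127p1 bs=1M count=256 iflag=direct 2>/dev/null | md5sum
  dd if=/dev/md127p1 bs=1M count=256 iflag=direct 2>/dev/null | md5sum
  # Differing checksums would point at the storage stack (drive, MD, or
  # partition remapping) rather than at e2fsck itself.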
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Fri, Jun 19, 2015 at 12:09:11PM -0400, Zack Weinberg wrote:
> On Fri, Jun 19, 2015 at 11:53 AM, Theodore Ts'o wrote:
> >
> > I can't reproduce the problem on my end (see attached)
>
> Still happens for me on the real filesystem (see attached).  We appear
> to be using the same version of e2fsprogs.  What could cause the
> divergence?

It could be caused by a hardware problem, or, if it's a RAID array
that is out of sync, it's possible for two subsequent reads to return
something else.

Can you take the two .gz files and reconstruct a file system on some
other system with a known-good disk, and then try running e2fsck on
the image?

						- Ted
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
I'm going to have to wipe out and recreate this filesystem in order to
continue repairing this computer, but I have saved a complete image of
the partition.  It's a bit too big to just send you (11GB after xz
compression) and also it contains /etc/shadow and similar.  But I'm
happy to do further tests on it.
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Fri, Jun 19, 2015 at 11:22:21AM -0400, Zack Weinberg wrote:
> Package: e2fsprogs
> Version: 1.42.13-1
> Severity: normal
>
> When e2fsck -yf is run on the filesystem that produced the attached image
> (qcow2 format, xz-compressed, split in half for attachment)
> it reports a big long list of errors and claims to have fixed them.
> If you run it again, it reports the *same* big long list of errors and
> claims to have fixed them.

I can't reproduce the problem on my end (see attached)

						- Ted

[Attachment: typescript.gz (application/gzip)]
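For anyone repeating the test from the attachments, a rough, untested
sketch follows.  The file names, the nbd device number, the assumption
that the two halves are one xz stream split in two, and the assumption
that the qcow2 holds a bare filesystem rather than a partition table
are all guesses, not taken from this report:

  # Reassemble and decompress the attached image.
  cat image-part1.xz image-part2.xz | xz -d > fs.qcow2

  # Expose the qcow2 contents as a block device and run e2fsck twice.
  modprobe nbd max_part=8
  qemu-nbd --connect=/dev/nbd0 fs.qcow2
  e2fsck -yf /dev/nbd0
  e2fsck -yf /dev/nbd0      # with the reported bug, the same errors reappear
  qemu-nbd --disconnect /dev/nbd0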
Bug#789290: e2fsprogs: e2fsck claims to have fixed fs, but a second run finds all the same problems
On Fri, Jun 19, 2015 at 11:53 AM, Theodore Ts'o wrote:
>
> I can't reproduce the problem on my end (see attached)

Still happens for me on the real filesystem (see attached).  We appear
to be using the same version of e2fsprogs.  What could cause the
divergence?

zw

[Attachment: typescript.gz (GNU Zip compressed data)]