Replicable file-system corruption due to fsck/ufs

2019-04-10 Thread Jamie Landeg-Jones
I've noticed a replicable disk corruption by fsck_ufs/ffs on sparse files.

This is on amd64 12-stable-20190409, but I first noticed it on
12-stable-20190326.

I didn't notice it on my previous build of 12-stable-20190107, but I
may not have had any relevant sparse files at the time, so I don't know
if that version was affected. 12-release worked OK.

Here is a simplified replicable example. Thinking about it just now, I
suspect it's triggered by files which end in sparseness.

Can anyone else replicate this, or has my machine gone nuts?

Cheers, Jamie

 | root@thompson# l
 | total 12
 | 4 drwxr-x---   2 root  wheel  -   512 11 Apr 04:08 ./
 | 4 drwxr-xr-x  16 root  wheel  - 1,024 11 Apr 04:08 ../
 | 4 -rw-r-   1 root  wheel  -43 11 Apr 04:08 typescript
 |
 | root@thompson# dd if=/dev/zero bs=1m count=2048 of=test.img
 | 2048+0 records in
 | 2048+0 records out
 | 2147483648 bytes transferred in 4.127411 secs (520298036 bytes/sec)
 |
 | root@thompson# l
 | total 2097708
 |   4 drwxr-x---   2 root  wheel  -   512 11 Apr 04:08 ./
 |   4 drwxr-xr-x  16 root  wheel  - 1,024 11 Apr 04:08 ../
 | 2097696 -rw-r-   1 root  wheel  - 2,147,483,648 11 Apr 04:08 test.img
 |   4 -rw-r-   1 root  wheel  -43 11 Apr 04:08 typescript
 |
 | root@thompson# mdconfig test.img
 | md1
 |
 | root@thompson# newfs /dev/md1
 | /dev/md1: 2048.0MB (4194304 sectors) block size 32768, fragment size 4096
 | using 4 cylinder groups of 512.03MB, 16385 blks, 65664 inodes.
 | super-block backups (for fsck_ffs -b #) at:
 |  192, 1048832, 2097472, 3146112
 |
 | root@thompson# md mnt
 | mnt
 |
 | root@thompson# mount /dev/md1 mnt
 |
 | root@thompson# cd mnt/
 | ~/x/mnt ~/x
 |
 | root@thompson# df .
 | Filesystem 1K-blocks Used     Avail Capacity  Mounted on
 | /dev/md1   2,031,132    8 1,868,636       0%  /root/x/mnt
 |
 | root@thompson# l
 | total 12
 | 4 drwxr-xr-x  3 root  wheel - 512 11 Apr 04:09 ./
 | 4 drwxr-x---  3 root  wheel - 512 11 Apr 04:09 ../
 | 4 drwxrwxr-x  2 root  operator  - 512 11 Apr 04:09 .snap/
 |
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 | root@thompson# echo "testing 1...2...3..." >> test ; truncate -s +1g test
 |
 | root@thompson# l
 | total 652
 |   4 drwxr-xr-x  3 root  wheel -   512 11 Apr 04:14 ./
 |   4 drwxr-x---  3 root  wheel -   512 11 Apr 04:09 ../
 |   4 drwxrwxr-x  2 root  operator  -   512 11 Apr 04:09 .snap/
 | 640 -rw-r-  1 root  wheel - 9,663,676,605 11 Apr 04:14 test
 |
 | root@thompson# sha256 -r test > sha256.out
 |
 | root@thompson# cd ..
 | ~/x ~/x/mnt
 |
 | root@thompson# umount mnt
 |
 | root@thompson# fsck /dev/md1
 | ** /dev/md1
 | ** Last Mounted on /root/x/mnt
 | ** Phase 1 - Check Blocks and Sizes
 | INODE 4: FILE SIZE 9663676605 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 1342210048
 | ADJUST? [yn] y
 |
 | ** Phase 2 - Check Pathnames
 | ** Phase 3 - Check Connectivity
 | ** Phase 4 - Check Reference Counts
 | ** Phase 5 - Check Cyl groups
 | 4 files, 163 used, 507620 free (20 frags, 63450 blocks, 0.0% fragmentation)
 |
 | * FILE SYSTEM IS CLEAN *
 |
 | * FILE SYSTEM WAS MODIFIED *
 |
 | root@thompson# fsck /dev/md1
 | ** /dev/md1
 | ** Last Mounted on /root/x/mnt
 | ** Phase 1 - Check Blocks and Sizes
 | PARTIALLY TRUNCATED INODE I=4
 | SALVAGE? [yn] y
 |
 | INCORRECT BLOCK COUNT I=4 (1280 should be 256)
 | CORRECT? [yn] y
 |
 | INODE 4: FILE SIZE 1342210048 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 268468224
 | ADJUST? [yn] y
 |
 | ** Phase 2 - Check Pathnames
 | ** Phase 3 - Check Connectivity
 | ** Phase 4 - Check Reference Counts
 | ** Phase 5 - Check Cyl groups
 | FREE BLK COUNT(S) WRONG IN SUPERBLK
 | SALVAGE? [yn] y
 |
 | SUMMARY INFORMATION BAD
 | SALVAGE? [yn] y
 |
 | BLK(S) MISSING IN BIT MAPS
 | SALVAGE? [yn] y
 |
 | 4 files, 35 used, 507748 free (20 frags, 63466 blocks, 0.0% fragmentation)
 |
 | * FILE SYSTEM IS CLEAN *
 |
 | * FILE SYSTEM WAS MODIFIED *
 |
 | root@thompson# fsck /dev/md1
 | ** /dev/md1
 | ** Last Mounted on /root/x/mnt
 | ** Phase 1 - Check Blocks and Sizes
 | PARTIALLY TRUNCATED INODE I=4
 | SALVAGE? [yn] y
 |
 | INCORRECT BLOCK COUNT I=4 (256 should be 128)
 | CORRECT? [yn] y
 |
 | INODE 4: FILE SIZE 268468224 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 134610944
 | ADJUST? [yn] y
 |
 | ** Phase 2 - Check Pathnames
 | ** Phase 3 - Check Connectivi
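
As a sanity check on the transcript, the size that ls reports for test (9,663,676,605 bytes) matches nine rounds of a 21-byte echo followed by a 1 GiB extension. Below is a minimal sketch of that arithmetic in plain sh. It assumes the shell's $(( )) arithmetic is 64-bit (the case on amd64), and the function names are illustrative only:

```sh
#!/bin/sh
# Each round appends the echo line (20 characters plus a newline = 21 bytes)
# and then extends the file by 1 GiB with "truncate -s +1g".
LINE_BYTES=21
HOLE_BYTES=$((1024 * 1024 * 1024))

# expected_size ROUNDS: file size in bytes after ROUNDS echo+truncate rounds.
expected_size() {
    echo $(( $1 * (LINE_BYTES + HOLE_BYTES) ))
}

# write_offset N: byte offset at which the Nth echo (1-based) lands.
write_offset() {
    echo $(( ($1 - 1) * (LINE_BYTES + HOLE_BYTES) ))
}

expected_size 9     # 9663676605, the size shown by ls
write_offset 9      # 8589934760, where the last written bytes start
```

The file therefore ends in a 1 GiB hole, and its last written bytes sit more than 8 GiB into the file. Every size that fsck proposes in the transcript is far smaller than that offset, so accepting the adjustment discards real data.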

Re: Replicable file-system corruption due to fsck/ufs

2019-04-10 Thread Peter Holm
On Thu, Apr 11, 2019 at 04:47:43AM +0100, Jamie Landeg-Jones wrote:
> I've noticed a replicable disk corruption by fsck_ufs/ffs on sparse files.
> 
> This is on amd64 12-stable-20190409, but I first noticed it on
> 12-stable-20190326.
> 
> I didn't notice it on my previous build of 12-stable-20190107, but I
> may not have had any relevant sparse files at the time, so I don't know
> if that version was affected. 12-release worked OK.
> 
> Here is a simplified replicable example. Thinking about it just now, I
> suspect it's triggered by files which end in sparseness.
> 
> Can anyone else replicate this, or has my machine gone nuts?
> 
> Cheers, Jamie

Re: Replicable file-system corruption due to fsck/ufs

2019-04-10 Thread jamie
Peter Holm wrote:

> I see this even with a single truncate on HEAD.
>
> $ ./truncate10.sh
> 96 -rw-r--r--  1 root  wheel  1073741824 11 apr. 06:33 test
> ** /dev/md10a
> ** Last Mounted on /mnt
> ** Phase 1 - Check Blocks and Sizes
> INODE 3: FILE SIZE 1073741824 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 268435456
> ADJUST? yes

Thanks.. I should have tested that myself.. doh! I was trying to more closely
replicate my real file that triggered the problem, which contained a number
of sparse areas.

And thanks for adding Kirk to the discussion. I wanted to first be sure it
wasn't just me :-)
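
For anyone who wants to try the single-truncate case on a scratch machine, here is a hypothetical sketch. It is not Peter's truncate10.sh, which is not shown in this thread, and whether this exact sequence triggers the message is an assumption. By default the script only prints the commands; the IMG, MNT and RUN variables are names made up for this example:

```sh
#!/bin/sh
# Hypothetical single-truncate reproduction for FreeBSD (NOT truncate10.sh).
# Dry run by default; set RUN=1 as root on a scratch machine to execute.
IMG=${IMG:-/tmp/fsck-test.img}
MNT=${MNT:-/mnt}

# run CMD...: execute the command when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

repro() {
    run dd if=/dev/zero bs=1m count=2048 of="$IMG"
    if [ "${RUN:-0}" = 1 ]; then
        md=$(mdconfig -a -t vnode -f "$IMG")    # prints the unit, e.g. md1
    else
        echo "+ mdconfig -a -t vnode -f $IMG"
        md=mdN                                  # placeholder in dry-run mode
    fi
    run newfs "/dev/$md"
    run mount "/dev/$md" "$MNT"
    run sh -c "echo 'testing 1...2...3...' > '$MNT/test'"
    run truncate -s 1g "$MNT/test"              # file now ends in a hole
    run umount "$MNT"
    run fsck_ffs -n "/dev/$md"                  # -n: report only, answer "no"
    run mdconfig -d -u "$md"                    # detach the memory disk
}

repro
```

Running fsck_ffs with -n keeps it from modifying the file system, so the test image is not truncated further while checking whether the message appears.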

Cheers, Jamie
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Replicable file-system corruption due to fsck/ufs

2019-04-10 Thread Warner Losh
On Wed, Apr 10, 2019 at 10:46 PM  wrote:

> Peter Holm  wrote:
>
> > I see this even with a single truncate on HEAD.
> >
> > $ ./truncate10.sh
> > 96 -rw-r--r--  1 root  wheel  1073741824 11 apr. 06:33 test
> > ** /dev/md10a
> > ** Last Mounted on /mnt
> > ** Phase 1 - Check Blocks and Sizes
> > INODE 3: FILE SIZE 1073741824 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 268435456
> > ADJUST? yes
>
> Thanks.. I should have tested that myself.. doh! I was trying to closer
> replicate my real file that triggered the problem which contained a number
> of sparse areas.
>
> And thanks for adding Kirk to the discussion. I wanted to first be sure it
> wasn't just me :-)
>

I believe that this check was added to fsck recently, to detect the
corruption that can happen when the system crashes while a file is being
appended to.
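
To illustrate why a size past the last allocated block is not by itself a sign of corruption, here is a small sketch that builds a file ending in a hole and prints its logical size next to its allocated space. It only shows the distinction from userland and is not the logic fsck uses; the function name is made up for the example:

```sh
#!/bin/sh
# make_sparse_tail FILE: write 21 bytes of data, then extend by 1 MiB
# without writing anything, so the file ends in a hole.
make_sparse_tail() {
    target=$1
    printf 'testing 1...2...3...\n' > "$target"
    truncate -s +1048576 "$target"
}

f=$(mktemp)
make_sparse_tail "$f"
wc -c < "$f"    # logical size: 21 + 1048576 bytes
du -k "$f"      # allocated space; usually much smaller where holes are supported
rm -f "$f"
```

A check that treats every such file as damaged would also flag files like Jamie's test file, which were extended deliberately rather than cut short by a crash.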

Warner


Re: Replicable file-system corruption due to fsck/ufs

2019-04-12 Thread Kirk McKusick
> Peter Holm  wrote:
> 
>> I see this even with a single truncate on HEAD.
>>
>> $ ./truncate10.sh
>> 96 -rw-r--r--  1 root  wheel  1073741824 11 apr. 06:33 test
>> ** /dev/md10a
>> ** Last Mounted on /mnt
>> ** Phase 1 - Check Blocks and Sizes
>> INODE 3: FILE SIZE 1073741824 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 268435456
>> ADJUST? yes
> 
> Thanks.. I should have tested that myself.. doh! I was trying to
> closer replicate my real file that triggered the problem which
> contained a number of sparse areas.
> 
> And thanks for adding Kirk to the discussion. I wanted to first be
> sure it wasn't just me :-)
> 
> Cheers, Jamie

This is indeed a bug in the calculation of the location of the last
block of a file. I believe that the following patch to head will
fix it.

Peter, can you please test and let me know.

If Peter confirms that it fixes the bug, I will check it into head
and MFC it to 12-stable and 11-stable after a 2-week settle-in time.
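
The following is not the patch, only an illustrative sketch of the block arithmetic involved. It uses the block size from the newfs output in the original report and assumes UFS2 (the newfs default), with 12 direct block pointers per inode and 8-byte block pointers. The function names are made up for the example:

```sh
#!/bin/sh
BSIZE=32768                 # "block size 32768" from the newfs output
NDADDR=12                   # direct block pointers in a UFS inode
NINDIR=$((BSIZE / 8))       # 8-byte block pointers per indirect block (UFS2)

# last_lbn SIZE: logical block number that holds the final byte of the file.
last_lbn() {
    echo $(( ($1 - 1) / BSIZE ))
}

# single_indirect_limit: bytes addressable through the direct blocks plus
# one single-indirect block.
single_indirect_limit() {
    echo $(( (NDADDR + NINDIR) * BSIZE ))
}

last_lbn 9663676605         # 294912
single_indirect_limit       # 134610944
```

The second value equals the size proposed by the third fsck run in the original report (134610944). That is at least consistent with a miscalculation around indirect-block boundaries, though this reading is only a guess.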

Kirk McKusick



Re: Replicable file-system corruption due to fsck/ufs

2019-04-13 Thread Peter Holm
On Fri, Apr 12, 2019 at 04:13:00PM -0700, Kirk McKusick wrote:
> > Peter Holm  wrote:
> > 
> >> I see this even with a single truncate on HEAD.
> >>
> >> $ ./truncate10.sh
> >> 96 -rw-r--r--  1 root  wheel  1073741824 11 apr. 06:33 test
> >> ** /dev/md10a
> >> ** Last Mounted on /mnt
> >> ** Phase 1 - Check Blocks and Sizes
> >> INODE 3: FILE SIZE 1073741824 BEYOND END OF ALLOCATED FILE, SIZE SHOULD BE 268435456
> >> ADJUST? yes
> > 
> > Thanks.. I should have tested that myself.. doh! I was trying to
> > closer replicate my real file that triggered the problem which
> > contained a number of sparse areas.
> > 
> > And thanks for adding Kirk to the discussion. I wanted to first be
> > sure it wasn't just me :-)
> > 
> > Cheers, Jamie
> 
> This is indeed a bug in the calculation of the location of the last
> block of a file. I believe that the following patch to head will
> fix it.
> 
> Peter, can you please test and let me know.
> 
> If Peter confirms that it fixes the bug, I will check it into head
> and MFC it to 12-stable and 11-stable after a 2-week settle-in time.
> 
>   Kirk McKusick
> 

Yes, this patch works for me.

-- 
Peter


Re: Replicable file-system corruption due to fsck/ufs

2019-04-13 Thread Kirk McKusick
> Date: Sat, 13 Apr 2019 14:32:45 +0200
> From: Peter Holm 
> To: Kirk McKusick 
> Cc: Jamie Landeg-Jones , ja...@catflap.dyslexicfish.net,
> Warner Losh , freebsd-stable@freebsd.org
> Subject: Re: Replicable file-system corruption due to fsck/ufs
> 
> On Fri, Apr 12, 2019 at 04:13:00PM -0700, Kirk McKusick wrote:
> 
>> This is indeed a bug in the calculation of the location of the last
>> block of a file. I believe that the following patch to head will
>> fix it.
>> 
>> Peter, can you please test and let me know.
>> 
>> If Peter confirms that it fixes the bug, I will check it into head
>> and MFC it to 12-stable and 11-stable after a 2-week settle-in time.
>> 
>>  Kirk McKusick
> 
> Yes, this patch works for me.
> 
> -- 
> Peter

Great, thanks for the quick test. Now committed to head as -r346185.

Kirk


Re: Replicable file-system corruption due to fsck/ufs

2019-04-18 Thread Jamie Landeg-Jones
Kirk McKusick wrote:

> > From: Peter Holm 
> > 
> > On Fri, Apr 12, 2019 at 04:13:00PM -0700, Kirk McKusick wrote:
> > 
> >> This is indeed a bug in the calculation of the location of the last
> >> block of a file. I believe that the following patch to head will
> >> fix it.
> >> 
> >> Peter, can you please test and let me know.
> >> 
> >> If Peter confirms that it fixes the bug, I will check it into head
> >> and MFC it to 12-stable and 11-stable after a 2-week settle-in time.
> >> 
> >>  Kirk McKusick
> > 
> > Yes, this patch works for me.
> > 
> > -- 
> > Peter
>
> Great, thanks for the quick test. Now committed to head as -r346185.

Apologies for the delay in replying. Thanks Kirk for fixing this, and
thanks Peter for testing it.

The patch also applied cleanly to 12-stable, and is working there too.

Cheers, Jamie
