Re: BTRFS use-after-free bug at free_extent_buffer_internal
On Monday 22 April 2024 5:15:50 PM AEST Qu Wenruo wrote: > > 在 2024/4/22 16:07, Sachi King 写道: > > Hi, > > > > I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain > > it's a bug and not a corruption issue. > > > > A bit of history on the filesystem. It is a fairly new filesystem as it was > > being used to give me access to test a wayland application on a > > Raspberry Pi. The filesystem was about 3 days old when I hit the bug, and > > I'm fairly certain it never had an unclean shutdown. I have checked the > > filesystem with "btrfs check" which has found no errors. The filesystem > > mounts on Linux and is functional. > > > >> # btrfs check --check-data-csum /dev/sda2 > >> Opening filesystem to check... > >> Checking filesystem on /dev/sda2 > >> UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d > >> [1/7] checking root items > >> [2/7] checking extents > >> [3/7] checking free space tree > >> [4/7] checking fs roots > >> [5/7] checking csums against data > >> [6/7] checking root refs > >> [7/7] checking quota groups skipped (not enabled on this FS) > >> found 5070573568 bytes used, no error found > >> total csum bytes: 4451620 > >> total tree bytes: 370458624 > >> total fs tree bytes: 353124352 > >> total extent tree bytes: 10010624 > >> btree space waste bytes: 62303284 > >> file data blocks allocated: 6786519040 > >> referenced 6328619008 > > Since btrfs check reports no error, the fs must be valid. > > But considering how new it is, it may be related to some new features > not properly implemented in Uboot. > > Is it possible to provide the whole binary dump of the fs? Sure, here's a copy of the disk image. The BTRFS is at partition 2. http://maribel.nakato.io/u-boot-newtest.img.zstd > > > > > > I've made an image of the filesystem so I could reproduce the bug in an > > environment that doesn't require the physical SBC, and have reproduced > > the issue using the head of the master branch with "qemu-x86_64_defconfig". > > > > My testing qemu was produced with the following: > >> # make qemu-x86_64_defconfig > >> # cat << EOF >> .config > >> CONFIG_AUTOBOOT=y > >> CONFIG_BOOTDELAY=1 > >> CONFIG_USE_BOOTCOMMAND=y > >> CONFIG_BOOTSTD_DEFAULTS=y > >> CONFIG_BOOTSTD_FULL=y > >> CONFIG_CMD_BOOTFLOW_FULL=y > >> CONFIG_BOOTCOMMAND="bootflow scan -lb" > >> CONFIG_ENV_IS_NOWHERE=y > >> CONFIG_LZ4=y > >> CONFIG_BZIP2=y > >> CONFIG_ZSTD=y > >> CONFIG_FS_BTRFS=y > >> CONFIG_CMD_BTRFS=y > >> CONFIG_GZIP=y > >> CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi" > >> EOF > >> # make -j24 > > > > bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains: > >> / { > >> bootstd { > >> compatible = "u-boot,boot-std"; > >> filename-prefixes = "/@boot/", "/boot/", "/"; > >> bootdev-order = "scsi"; > >> extlinux { > >> compatible = "u-boot,extlinux"; > >> }; > >> }; > >> }; > > > > > > The VM was run with > >> qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action > >> reboot=shutdown -drive file=/mnt/dbg/u-boot-debug.img > > > > The error message I recive on boot is > >> BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()! > >> BUG! > >> resetting ... > > > > > > I added a print statement to free_extent_buffer_internal that prints the > > start address of the extent_buffer as I'm not sure what to be looking for > > here. This print statement is before the decrement. > >> printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", > >> eb->start, eb->refs); > > > > The last message before the crash reported eb->start to be "0", with 0 refs. > >> free_extent_buffer_internal: eb->start[0] eb->refs[0] > > > > The extent at 0 struck me as odd, so I tried commenting out the freeing, by > > removing the call to free_extent_buffer_final, and this resulted in bootflow > > succeeding and showing me the boot menu, which suprised me. > > I expected to see the bug reproduce itself, with refs being zero, but > > eb->start > > pointing somewhere valid, but I instead got a valid address with refs at 2. > > > > I'm assuming that the order free_extent_buffer_internal is called is > > deterministic, so by counting the print outputs the line that prior held > > the extent_buffer with a 0 start was replaced with: > >> free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > > > > Interestingly, as can be seen in the full logs with my included print > > messages, 249c000 is being used just before this, with a ref count of > > 2. 249c000 does appear to reach a point where it should have been freed > > in the past, before it gets used again as seen in both logs. > > > > The failing boot log: > >> U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > >> Trying to boot from SPI > >> Jumping to 64-bit U-Boot: Note many features are missing > >> > >> > >> U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > >> > >> CPU:
BTRFS use-after-free bug at free_extent_buffer_internal
Hi, I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain it's a bug and not a corruption issue. A bit of history on the filesystem. It is a fairly new filesystem as it was being used to give me access to test a wayland application on a Raspberry Pi. The filesystem was about 3 days old when I hit the bug, and I'm fairly certain it never had an unclean shutdown. I have checked the filesystem with "btrfs check" which has found no errors. The filesystem mounts on Linux and is functional. > # btrfs check --check-data-csum /dev/sda2 > Opening filesystem to check... > Checking filesystem on /dev/sda2 > UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d > [1/7] checking root items > [2/7] checking extents > [3/7] checking free space tree > [4/7] checking fs roots > [5/7] checking csums against data > [6/7] checking root refs > [7/7] checking quota groups skipped (not enabled on this FS) > found 5070573568 bytes used, no error found > total csum bytes: 4451620 > total tree bytes: 370458624 > total fs tree bytes: 353124352 > total extent tree bytes: 10010624 > btree space waste bytes: 62303284 > file data blocks allocated: 6786519040 > referenced 6328619008 I've made an image of the filesystem so I could reproduce the bug in an environment that doesn't require the physical SBC, and have reproduced the issue using the head of the master branch with "qemu-x86_64_defconfig". My testing qemu was produced with the following: > # make qemu-x86_64_defconfig > # cat << EOF >> .config > CONFIG_AUTOBOOT=y > CONFIG_BOOTDELAY=1 > CONFIG_USE_BOOTCOMMAND=y > CONFIG_BOOTSTD_DEFAULTS=y > CONFIG_BOOTSTD_FULL=y > CONFIG_CMD_BOOTFLOW_FULL=y > CONFIG_BOOTCOMMAND="bootflow scan -lb" > CONFIG_ENV_IS_NOWHERE=y > CONFIG_LZ4=y > CONFIG_BZIP2=y > CONFIG_ZSTD=y > CONFIG_FS_BTRFS=y > CONFIG_CMD_BTRFS=y > CONFIG_GZIP=y > CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi" > EOF > # make -j24 bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains: > / { > bootstd { > compatible = "u-boot,boot-std"; > filename-prefixes = "/@boot/", "/boot/", "/"; > bootdev-order = "scsi"; > extlinux { > compatible = "u-boot,extlinux"; > }; > }; > }; The VM was run with > qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action reboot=shutdown > -drive file=/mnt/dbg/u-boot-debug.img The error message I recive on boot is > BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()! > BUG! > resetting ... I added a print statement to free_extent_buffer_internal that prints the start address of the extent_buffer as I'm not sure what to be looking for here. This print statement is before the decrement. > printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", > eb->start, eb->refs); The last message before the crash reported eb->start to be "0", with 0 refs. > free_extent_buffer_internal: eb->start[0] eb->refs[0] The extent at 0 struck me as odd, so I tried commenting out the freeing, by removing the call to free_extent_buffer_final, and this resulted in bootflow succeeding and showing me the boot menu, which suprised me. I expected to see the bug reproduce itself, with refs being zero, but eb->start pointing somewhere valid, but I instead got a valid address with refs at 2. I'm assuming that the order free_extent_buffer_internal is called is deterministic, so by counting the print outputs the line that prior held the extent_buffer with a 0 start was replaced with: > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] Interestingly, as can be seen in the full logs with my included print messages, 249c000 is being used just before this, with a ref count of 2. 249c000 does appear to reach a point where it should have been freed in the past, before it gets used again as seen in both logs. The failing boot log: > U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > Trying to boot from SPI > Jumping to 64-bit U-Boot: Note many features are missing > > > U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > > CPU: QEMU Virtual CPU version 2.5+ > DRAM: 128 MiB > Core: 13 devices, 13 uclasses, devicetree: separate > Loading Environment from nowhere... OK > Model: QEMU x86 (I440FX) > Net: e1000: 00:00:00:00:00:00 > > Error: e1000#0 No valid MAC address found. > eth_initialize() No ethernet found. > > > Hit any key to stop autoboot: 0 > Scanning for bootflows in all bootdevs > Seq Method State UclassPart Name Filename > --- --- -- > > scanning bus for devices... > Target spinup took 0 ms. > SATA link 1 timeout. > Target spinup took 0 ms. > SATA link 3 timeout. > SATA link 4 timeout. > SATA link 5 timeout. > AHCI 0001. 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode > flags: 64bit ncq only >
Re: BTRFS use-after-free bug at free_extent_buffer_internal
在 2024/4/22 16:45, Qu Wenruo 写道: [...] I added a print statement to free_extent_buffer_internal that prints the start address of the extent_buffer as I'm not sure what to be looking for here. This print statement is before the decrement. printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", eb->start, eb->refs); Just a small advice, in fact you can go with sandbox mode, running U-boot in userspace, and bind a host file as a device to test the filesystem code. At least that's what I did for most U-boot bugs. Thanks, Qu
Re: BTRFS use-after-free bug at free_extent_buffer_internal
在 2024/4/22 16:07, Sachi King 写道: Hi, I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain it's a bug and not a corruption issue. A bit of history on the filesystem. It is a fairly new filesystem as it was being used to give me access to test a wayland application on a Raspberry Pi. The filesystem was about 3 days old when I hit the bug, and I'm fairly certain it never had an unclean shutdown. I have checked the filesystem with "btrfs check" which has found no errors. The filesystem mounts on Linux and is functional. # btrfs check --check-data-csum /dev/sda2 Opening filesystem to check... Checking filesystem on /dev/sda2 UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d [1/7] checking root items [2/7] checking extents [3/7] checking free space tree [4/7] checking fs roots [5/7] checking csums against data [6/7] checking root refs [7/7] checking quota groups skipped (not enabled on this FS) found 5070573568 bytes used, no error found total csum bytes: 4451620 total tree bytes: 370458624 total fs tree bytes: 353124352 total extent tree bytes: 10010624 btree space waste bytes: 62303284 file data blocks allocated: 6786519040 referenced 6328619008 Since btrfs check reports no error, the fs must be valid. But considering how new it is, it may be related to some new features not properly implemented in Uboot. Is it possible to provide the whole binary dump of the fs? I've made an image of the filesystem so I could reproduce the bug in an environment that doesn't require the physical SBC, and have reproduced the issue using the head of the master branch with "qemu-x86_64_defconfig". My testing qemu was produced with the following: # make qemu-x86_64_defconfig # cat << EOF >> .config CONFIG_AUTOBOOT=y CONFIG_BOOTDELAY=1 CONFIG_USE_BOOTCOMMAND=y CONFIG_BOOTSTD_DEFAULTS=y CONFIG_BOOTSTD_FULL=y CONFIG_CMD_BOOTFLOW_FULL=y CONFIG_BOOTCOMMAND="bootflow scan -lb" CONFIG_ENV_IS_NOWHERE=y CONFIG_LZ4=y CONFIG_BZIP2=y CONFIG_ZSTD=y CONFIG_FS_BTRFS=y CONFIG_CMD_BTRFS=y CONFIG_GZIP=y CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi" EOF # make -j24 bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains: / { bootstd { compatible = "u-boot,boot-std"; filename-prefixes = "/@boot/", "/boot/", "/"; bootdev-order = "scsi"; extlinux { compatible = "u-boot,extlinux"; }; }; }; The VM was run with qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action reboot=shutdown -drive file=/mnt/dbg/u-boot-debug.img The error message I recive on boot is BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()! BUG! resetting ... I added a print statement to free_extent_buffer_internal that prints the start address of the extent_buffer as I'm not sure what to be looking for here. This print statement is before the decrement. printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", eb->start, eb->refs); The last message before the crash reported eb->start to be "0", with 0 refs. free_extent_buffer_internal: eb->start[0] eb->refs[0] The extent at 0 struck me as odd, so I tried commenting out the freeing, by removing the call to free_extent_buffer_final, and this resulted in bootflow succeeding and showing me the boot menu, which suprised me. I expected to see the bug reproduce itself, with refs being zero, but eb->start pointing somewhere valid, but I instead got a valid address with refs at 2. I'm assuming that the order free_extent_buffer_internal is called is deterministic, so by counting the print outputs the line that prior held the extent_buffer with a 0 start was replaced with: free_extent_buffer_internal: eb->start[249c000] eb->refs[2] Interestingly, as can be seen in the full logs with my included print messages, 249c000 is being used just before this, with a ref count of 2. 249c000 does appear to reach a point where it should have been freed in the past, before it gets used again as seen in both logs. The failing boot log: U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) Trying to boot from SPI Jumping to 64-bit U-Boot: Note many features are missing U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) CPU: QEMU Virtual CPU version 2.5+ DRAM: 128 MiB Core: 13 devices, 13 uclasses, devicetree: separate Loading Environment from nowhere... OK Model: QEMU x86 (I440FX) Net: e1000: 00:00:00:00:00:00 Error: e1000#0 No valid MAC address found. eth_initialize() No ethernet found. Hit any key to stop autoboot: 0 Scanning for bootflows in all bootdevs Seq Method State UclassPart Name Filename --- --- -- scanning bus for devices... Target spinup took 0 ms. SATA link 1 timeout. Target spinup took 0 ms. SATA link 3 timeout. SATA link 4 timeout. SATA link 5 timeout.