The following code reliably throws a SIGBUS in the memset, and cat testfile > /dev/null returns an IO error.
I've sometimes gotten as high as iteration 900 before a SIGBUS, so don't
assume a single clear is OK.

linux 3.17.0, SATA -> MD(raid5) -> bcache (ssd) -> btrfs

Working on eliminating more variables.

#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MB (1024ull * 1024)
#define GB (1024ull * MB)
#define TEST_SIZE (4096)    /* size of testfile, in MB */

int main()
{
    int fd;

    srandom(1024);    /* fixed seed: same offset sequence on every run */
    fd = open("testfile", O_RDWR|O_CREAT, 0600);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    posix_fallocate(fd, 0, TEST_SIZE * MB);

    uint8_t *map = 0;
    int i;
    for (i = 0; i < 1000; i++) {
        /* pick a random MB-aligned offset inside the preallocated file */
        size_t location = (random() % (TEST_SIZE-1)) * MB;
        map = (uint8_t *) mmap(map, MB, PROT_READ|PROT_WRITE, MAP_SHARED, fd, location);
        if (map == MAP_FAILED) {
            /* bail out so a failed mmap is not mistaken for the SIGBUS */
            perror("mmap");
            return 1;
        }
        printf("%d: writing at %04zu mb\n", i, (size_t)(location / MB));
        memset(map, 0x5a, 1 * MB);    /* the SIGBUS is raised here */
        msync(map, 1*MB, MS_ASYNC);
        munmap(map, MB);
    }
    close(fd);
    return 0;
}

On Wed, Oct 29, 2014 at 5:50 PM, Dan Merillat <dan.meril...@gmail.com> wrote:
> I'm in the middle of debugging the exact same thing. 3.17.0 -
> rtorrent dies with SIGBUS.
>
> I've done some debugging, the sequence is something like this:
> open a new file
> fallocate() to the final size
> mmap() all (or a portion) of the file
> write to the region
> run SHA1 on that mmap'd region to validate the chunk
> crash, eventually. Generally not at the same point.
>
> Reading that file (cat > /dev/null) returns -EIO.
>
> Looking up the process maps, the SIGBUS appears to be happening in the
> middle of a mapped region of a pre-allocated file - i.e. it shouldn't
> be. I'm not completely ruling out an rtorrent bug but it appears sane
> to me.
>
> Weirder: "old" files, that have been around a while, work just fine for
> seeding.
> I've re-hashed my entire collection without an error.
>
> Seeing this on both inherit-COW and no-inherit-COW files, and the
> filesystem is not using compression.
>
> The interesting part is that when going back and attempting to read the
> files later, they sometimes don't throw an IO error.
>
> Absolutely nothing in dmesg.
>
> Working on a testcase that triggers it reliably but no luck so far. I
> thought I had bad RAM but two people upgrading to 3.17 and seeing the
> same bug at around the same time can't be a coincidence. I rebooted
> to 3.17 on the 25th, the first new download was on the 28th and that
> failed.
>
> Working on a testcase for it that's more reproducible than "go grab
> torrent files with rtorrent".
>
> On Tue, Oct 28, 2014 at 12:49 PM, Alec Blayne <a...@tevsa.net> wrote:
>> Hi, it seems that when using rtorrent to download into a btrfs system,
>> it leads to the creation of files that fail to read properly.
>> For instance, I get rtorrent to crash, but if I try to rsync the file it
>> was writing into someplace else, rsync also fails with the message
>> "can't map file "$file": Input/Output error (5)".
>> If I give it time, eventually the file gets into a good state and I can
>> rsync it somewhere else (as long as rtorrent doesn't keep writing into
>> it). This doesn't happen using ext4 on the same system.
>>
>> No btrfs errors, or any other errors, show up in any log. Scrubbing or
>> balancing doesn't turn up any issues. I've tried using a subvolume mounted
>> with nodatacow and/or flushoncommit, which didn't help. I'm not using
>> quotas and at some point had a single snapshot that I deleted. The
>> filesystem was originally created recently (on a 3.16.4+ kernel).
>>
>> Here's what the array looks like:
>>
>> Label: 'data'  uuid: ffe83a3d-f4ba-46b7-8424-4ec3380cb811
>>     Total devices 4 FS bytes used 3.14TiB
>>     devid    4 size 2.73TiB used 2.36TiB path /dev/sdd1
>>     devid    5 size 1.82TiB used 1.45TiB path /dev/sdc1
>>     devid    6 size 1.82TiB used 1.45TiB path /dev/sdb1
>>     devid    7 size 1.82TiB used 1.45TiB path /dev/sda1
>>
>> Btrfs v3.17
>>
>> Data, RAID1: total=3.34TiB, used=3.13TiB
>> System, RAID1: total=32.00MiB, used=512.00KiB
>> Metadata, RAID1: total=10.00GiB, used=7.31GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>>
>> On linux 3.17.1: Linux 3.17.1-gentoo-r1 #3 SMP PREEMPT Tue Oct 28
>> 02:43:11 WET 2014 x86_64 AMD Athlon(tm) 5350 APU with Radeon(tm) R3
>> AuthenticAMD GNU/Linux
>>
>> I'm utterly puzzled and clueless at how to dig into this issue.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html