Reiterate: btrfs stuck with lots of files
Hi, guys, again. Looking at this issue, I suspect this is a bug in btrfs. We'll have to clean up this installation soon, so if there is any request to do some debugging, please ask. I'll try to reiterate what was said in this thread.

Short story: a btrfs filesystem made of 22 1TB disks with lots of files (~3024). The write load is 25 MByte/second. After some time the file system became unable to cope with this load. At the same time `sync` takes ages to finish and shutdown -r hangs (I guess related to sync). I also see one kernel kworker that is the main suspect for this behavior: it takes 100% of a CPU core all the time, jumping from core to core, while according to iostat the read/write speed is close to zero and everything is stuck.

Citing some details from previous messages:

> > top - 13:10:58 up 1 day, 9:26, 5 users, load average: 157.76, 156.61, 149.29
> > Tasks: 235 total, 2 running, 233 sleeping, 0 stopped, 0 zombie
> > %Cpu(s): 19.8 us, 15.0 sy, 0.0 ni, 60.7 id, 3.9 wa, 0.0 hi, 0.6 si, 0.0 st
> > KiB Mem:  65922104 total, 65414856 used, 507248 free, 1844 buffers
> > KiB Swap: 0 total, 0 used, 0 free. 62570804 cached Mem
> >
> >   PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
> >  8644 root 20  0       0      0    0 R 96.5  0.0 127:21.95 kworker/u16:16
> >  5047 dvr  20  0 6884292 122668 4132 S  6.4  0.2 258:59.49 dvrserver
> > 30223 root 20  0   20140   2600 2132 R  6.4  0.0   0:00.01 top
> >     1 root 20  0    4276   1628 1524 S  0.0  0.0   0:40.19 init
> >
> > There are about 300 threads on the server, some of which are writing to disk.
> > A bit of information about this btrfs filesystem: this is a 22 disk file
> > system with raid1 for metadata and raid0 for data:
> >
> > # btrfs filesystem df /store/
> > Data, single: total=11.92TiB, used=10.86TiB
> > System, RAID1: total=8.00MiB, used=1.27MiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, RAID1: total=46.00GiB, used=33.49GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=128.00KiB
> > # btrfs property get /store/
> > ro=false
> > label=store
> > # btrfs device stats /store/
> > (shows all zeros)
> > # btrfs balance status /store/
> > No balance found on '/store/'
> > # btrfs filesystem show
> > Label: 'store'  uuid: 296404d1-bd3f-417d-8501-02f8d7906bcf
> >         Total devices 22 FS bytes used 6.50TiB
> >         devid  1 size 931.51GiB used 558.02GiB path /dev/sdb
> >         devid  2 size 931.51GiB used 559.00GiB path /dev/sdc
> >         devid  3 size 931.51GiB used 559.00GiB path /dev/sdd
> >         devid  4 size 931.51GiB used 559.00GiB path /dev/sde
> >         devid  5 size 931.51GiB used 559.00GiB path /dev/sdf
> >         devid  6 size 931.51GiB used 559.00GiB path /dev/sdg
> >         devid  7 size 931.51GiB used 559.00GiB path /dev/sdh
> >         devid  8 size 931.51GiB used 559.00GiB path /dev/sdi
> >         devid  9 size 931.51GiB used 559.00GiB path /dev/sdj
> >         devid 10 size 931.51GiB used 559.00GiB path /dev/sdk
> >         devid 11 size 931.51GiB used 559.00GiB path /dev/sdl
> >         devid 12 size 931.51GiB used 559.00GiB path /dev/sdm
> >         devid 13 size 931.51GiB used 559.00GiB path /dev/sdn
> >         devid 14 size 931.51GiB used 559.00GiB path /dev/sdo
> >         devid 15 size 931.51GiB used 559.00GiB path /dev/sdp
> >         devid 16 size 931.51GiB used 559.00GiB path /dev/sdq
> >         devid 17 size 931.51GiB used 559.00GiB path /dev/sdr
> >         devid 18 size 931.51GiB used 559.00GiB path /dev/sds
> >         devid 19 size 931.51GiB used 559.00GiB path /dev/sdt
> >         devid 20 size 931.51GiB used 559.00GiB path /dev/sdu
> >         devid 21 size 931.51GiB used 559.01GiB path /dev/sdv
> >         devid 22 size 931.51GiB used 560.01GiB path /dev/sdw
> > Btrfs v3.17.1
> >
> > iostat 1 exposes the following problem:
> >
> > avg-cpu:  %user  %nice %system %iowait %steal  %idle
> >           16.96   0.00   17.09   65.95   0.00   0.00
> >
> > Device:    tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> > sda       0.00         0.00         0.00          0          0
> > sdc       0.00         0.00         0.00          0          0
> > sdb       0.00         0.00         0.00          0          0
> > sde       0.00         0.00         0.00          0          0
> > sdd       0.00         0.00         0.00          0          0
> > sdf       0.00         0.00         0.00          0          0
> > sdg       0.00         0.00         0.00          0          0
> > sdj       0.00         0.00         0.00          0          0
> > sdh       0.00         0.00         0.00          0          0
> > sdk       0.00         0.00         0.00          0          0
> > sdi       1.00         0.00       200.00          0        200
> > sdl       0.00         0.00         0.00          0          0
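The top capture quoted above can be mined for the spinning kworker automatically. A minimal sketch (the `hottest_kworker` helper is hypothetical; the field positions assume a `top -b` batch capture laid out like the output above):

```shell
# Print pid, %CPU and name of the busiest kworker in a saved `top -b -n1`
# capture. In the layout shown above, field 9 is %CPU and the last field
# is the command name; adjust the field numbers if your top differs.
hottest_kworker() {
    awk '$NF ~ /^kworker/ { print $1, $9, $NF }' "$1" | sort -rn -k2,2 | head -1
}
```

Once the pid is known, `cat /proc/<pid>/stack` (as root) usually shows where such a worker is spinning.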
btrfs stuck with lots of files
Hi, guys.

We have a problem with a btrfs file system: sometimes it becomes stuck without leaving me any way to interrupt it (shutdown -r now is unable to restart the server). By stuck I mean that processes which previously were able to write to disk become unable to cope with the load, and the load average goes up:

top - 13:10:58 up 1 day, 9:26, 5 users, load average: 157.76, 156.61, 149.29
Tasks: 235 total, 2 running, 233 sleeping, 0 stopped, 0 zombie
%Cpu(s): 19.8 us, 15.0 sy, 0.0 ni, 60.7 id, 3.9 wa, 0.0 hi, 0.6 si, 0.0 st
KiB Mem:  65922104 total, 65414856 used, 507248 free, 1844 buffers
KiB Swap: 0 total, 0 used, 0 free. 62570804 cached Mem

  PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
 8644 root 20  0       0      0    0 R 96.5  0.0 127:21.95 kworker/u16:16
 5047 dvr  20  0 6884292 122668 4132 S  6.4  0.2 258:59.49 dvrserver
30223 root 20  0   20140   2600 2132 R  6.4  0.0   0:00.01 top
    1 root 20  0    4276   1628 1524 S  0.0  0.0   0:40.19 init

There are about 300 threads on the server, some of which are writing to disk.

A bit of information about this btrfs filesystem: this is a 22 disk file system with raid1 for metadata and raid0 for data:

# btrfs filesystem df /store/
Data, single: total=11.92TiB, used=10.86TiB
System, RAID1: total=8.00MiB, used=1.27MiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=46.00GiB, used=33.49GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=128.00KiB
# btrfs property get /store/
ro=false
label=store
# btrfs device stats /store/
(shows all zeros)
# btrfs balance status /store/
No balance found on '/store/'
# btrfs filesystem show /store/
Btrfs v3.17.1
(btw, is it supposed to show only the version here?)

As for the load: we write quite small files (some of 313K, some of 800K), that's why metadata takes that much. So back to the problem.
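As a sanity check on the "metadata takes that much" remark, the `btrfs fi df` figures above give the overhead directly. This is plain arithmetic over numbers taken from this thread, not a btrfs command:

```shell
# Metadata used (33.49 GiB, RAID1) relative to data used (10.86 TiB),
# both taken from the `btrfs filesystem df /store/` output above.
meta_gib=33.49
data_gib=$(awk 'BEGIN { print 10.86 * 1024 }')
ratio=$(awk -v m="$meta_gib" -v d="$data_gib" 'BEGIN { printf "%.2f", 100 * m / d }')
echo "metadata/data = ${ratio}%"
```

That comes to roughly 0.3% metadata per byte of data, which is noticeably more than large-file workloads usually produce and consistent with the many small files described.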
iostat 1 exposes the following problem:

avg-cpu:  %user  %nice %system %iowait %steal  %idle
          16.96   0.00   17.09   65.95   0.00   0.00

Device:    tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda       0.00         0.00         0.00          0          0
sdc       0.00         0.00         0.00          0          0
sdb       0.00         0.00         0.00          0          0
sde       0.00         0.00         0.00          0          0
sdd       0.00         0.00         0.00          0          0
sdf       0.00         0.00         0.00          0          0
sdg       0.00         0.00         0.00          0          0
sdj       0.00         0.00         0.00          0          0
sdh       0.00         0.00         0.00          0          0
sdk       0.00         0.00         0.00          0          0
sdi       1.00         0.00       200.00          0        200
sdl       0.00         0.00         0.00          0          0
sdn      48.00         0.00     17260.00          0      17260
sdm       0.00         0.00         0.00          0          0
sdp       0.00         0.00         0.00          0          0
sdo       0.00         0.00         0.00          0          0
sdq       0.00         0.00         0.00          0          0
sdr       0.00         0.00         0.00          0          0
sds       0.00         0.00         0.00          0          0
sdt       0.00         0.00         0.00          0          0
sdv       0.00         0.00         0.00          0          0
sdw       0.00         0.00         0.00          0          0
sdu       0.00         0.00         0.00          0          0

All writes go to a single disk. I've tried to debug what's going on in the kworker and did:

$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > trace_pipe.out2

trace_pipe2.out.xz is in the attachment. Could you comment on what goes wrong here? The server has 64GB of RAM. Is it possible that it is unable to keep all metadata in memory, and can we increase this memory limit, if one exists?

Thanks in advance for any pointers,
--
Peter.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
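A quick way to see what dominates a capture like the attached one without reading it by hand (a sketch; `top_work_functions` is a hypothetical helper, and the `function=` field is what the workqueue_queue_work tracepoint printed on kernels of this era -- adjust the pattern if yours formats the event differently):

```shell
# Count which work functions are queued most often in a saved
# trace_pipe capture of workqueue:workqueue_queue_work events.
top_work_functions() {
    grep -o 'function=[^ ]*' "$1" | sort | uniq -c | sort -rn | head
}
```

On a capture matching the symptoms above, the top entry would be expected to be the btrfs worker that keeps getting requeued.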
Re: btrfs stuck with lots of files
On Mon, 01/12/2014 at 10:47 -0800, Robert White wrote:
> On 12/01/2014 03:46 AM, Peter Volkov wrote:
> > (stuff about getting hung up trying to write to one drive)
>
> That drive (/dev/sdn) is probably starting to fail.
> (about failed drive)

Thank you, Robert, for the answer. It is not likely that a drive is failing here. A similar condition (writes going to a single drive) happens with other drives, i.e. such a write pattern may occur with any drive.

After watching what happens for longer, I see the following. During the stuck period a single processor core is 100% busy in kernel space (some kworker is taking 100% CPU). Ftrace reveals that btrfs_async_reclaim_metadata_space is the most frequently called function. So it looks like btrfs is doing some operation with metadata, and until it finishes everything is stuck (practically no writes happen on disk). So I'm looking for suggestions on how to cope with this process.

> > # btrfs filesystem df /store/
> > Data, single: total=11.92TiB, used=10.86TiB
>
> Regardless of the above...
>
> You have a terabyte of unused but allocated data storage. You probably
> need to balance your system to un-jamb that. That's a lot of space that
> is unavailable to the metadata (etc).

Well, I'm afraid that balance will put the fs into an even longer "stuck" state.

> ASIDE: Having your metadata set to RAID1 (as opposed to the default of
> DUP) seems a little iffy since your data is still set to DUP.

That's true. But why is data duplicated? During btrfs volume creation I explicitly set -d single.

> FURTHER ASIDE: raid1 metadata and raid5 data might be good for you given
> 22 volumes and 10% empty space; it would only cost you half of your
> existing empty space. If you don't RAID your data, there is no real
> point to putting your metadata in RAID.

Is raid5 ready for use? From the post [1] mentioned on [2], it seems there is still some way to go before it is stable.
[1] http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html
[2] https://btrfs.wiki.kernel.org/index.php/RAID56
--
Peter.
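Robert's "terabyte of unused but allocated" figure can be read straight off the df output quoted earlier (Data total minus Data used); the numbers below are from the thread, the arithmetic is just awk:

```shell
# Data chunk space that is allocated but holds no data, from the
# `btrfs filesystem df /store/` figures quoted in this thread.
unused=$(awk 'BEGIN { printf "%.2f", 11.92 - 10.86 }')
echo "allocated but unused: ${unused} TiB"
```

If a full balance is too disruptive, a filtered balance such as `btrfs balance start -dusage=50 /store` only rewrites data chunks that are at most half full, which is usually far less I/O than a full balance; whether it avoids the stall described in this thread is untested.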
Re: btrfs stuck with lots of files
On Tue, 02/12/2014 at 09:33 +0800, Qu Wenruo wrote:
> -------- Original Message --------
> Subject: btrfs stuck with lots of files
> From: Peter Volkov
> To: linux-btrfs@vger.kernel.org
> Date: 2014-12-01
> > Hi, guys.
> >
> > We have a problem with a btrfs file system: sometimes it becomes stuck
> > without leaving me any way to interrupt it (shutdown -r now is unable to
> > restart the server). By stuck I mean that processes which previously were
> > able to write to disk become unable to cope with the load, and the load
> > average goes up:
> >
> > top - 13:10:58 up 1 day, 9:26, 5 users, load average: 157.76, 156.61, 149.29
> > Tasks: 235 total, 2 running, 233 sleeping, 0 stopped, 0 zombie
> > %Cpu(s): 19.8 us, 15.0 sy, 0.0 ni, 60.7 id, 3.9 wa, 0.0 hi, 0.6 si, 0.0 st
> > KiB Mem:  65922104 total, 65414856 used, 507248 free, 1844 buffers
> > KiB Swap: 0 total, 0 used, 0 free. 62570804 cached Mem
> >
> >   PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
> >  8644 root 20  0       0      0    0 R 96.5  0.0 127:21.95 kworker/u16:16
> >  5047 dvr  20  0 6884292 122668 4132 S  6.4  0.2 258:59.49 dvrserver
> > 30223 root 20  0   20140   2600 2132 R  6.4  0.0   0:00.01 top
> >     1 root 20  0    4276   1628 1524 S  0.0  0.0   0:40.19 init
> >
> > There are about 300 threads on the server, some of which are writing to disk.
> > A bit of information about this btrfs filesystem: this is a 22 disk file
> > system with raid1 for metadata and raid0 for data:
> >
> > # btrfs filesystem df /store/
> > Data, single: total=11.92TiB, used=10.86TiB
> > System, RAID1: total=8.00MiB, used=1.27MiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, RAID1: total=46.00GiB, used=33.49GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=128.00KiB
> > # btrfs property get /store/
> > ro=false
> > label=store
> > # btrfs device stats /store/
> > (shows all zeros)
> > # btrfs balance status /store/
> > No balance found on '/store/'
> > # btrfs filesystem show /store/
> > Btrfs v3.17.1
> > (btw, is it supposed to show only the version here?)
> This is a small bug: if there is a trailing '/' in the path given to
> 'btrfs fi show', it can't recognize it.
> A patch has already been sent and may be included in the next version.
> >
> > As for the load: we write quite small files (some of 313K, some of
> > 800K), that's why metadata takes that much. So back to the problem.
> >
> > iostat 1 exposes the following problem:
> >
> > avg-cpu:  %user  %nice %system %iowait %steal  %idle
> >           16.96   0.00   17.09   65.95   0.00   0.00
> >
> > Device:    tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> > sda       0.00         0.00         0.00          0          0
> > sdc       0.00         0.00         0.00          0          0
> > sdb       0.00         0.00         0.00          0          0
> > sde       0.00         0.00         0.00          0          0
> > sdd       0.00         0.00         0.00          0          0
> > sdf       0.00         0.00         0.00          0          0
> > sdg       0.00         0.00         0.00          0          0
> > sdj       0.00         0.00         0.00          0          0
> > sdh       0.00         0.00         0.00          0          0
> > sdk       0.00         0.00         0.00          0          0
> > sdi       1.00         0.00       200.00          0        200
> > sdl       0.00         0.00         0.00          0          0
> > sdn      48.00         0.00     17260.00          0      17260
> > sdm       0.00         0.00         0.00          0          0
> > sdp       0.00         0.00         0.00          0          0
> > sdo       0.00         0.00         0.00          0          0
> > sdq       0.00         0.00         0.00          0          0
> > sdr       0.00         0.00         0.00          0          0
> > sds       0.00         0.00         0.00          0          0
> > sdt       0.00         0.00         0.00          0          0
> > sdv       0.00         0.00         0.00          0          0
> > sdw 0.00 0.00 0.00