On Tue, 2 Dec 2014 12:48:21 +0000 (UTC) Duncan <1i5t5.dun...@cox.net> wrote:
> Peter Volkov posted on Tue, 02 Dec 2014 04:50:29 +0300 as excerpted:
>
> > On Mon, 01/12/2014 at 10:47 -0800, Robert White wrote:
> >> On 12/01/2014 03:46 AM, Peter Volkov wrote:
> >> > (stuff about getting hung up trying to write to one drive)
> >>
> >> That drive (/dev/sdn) is probably starting to fail.
> >> (about failed drive)
> >
> > Thank you Robert for the answer. It is not likely that drive fails
> > here. Similar condition (write to a single drive) happens with
> > other drives i.e. such write pattern may happen with any drive.
> >
> > After looking at what happens longer I see the following. During
> > stuck single processor core is busy 100% of CPU in kernel space
> > (some kworker is taking 100% CPU).
>
> FWIW, agreed that it's unlikely to be the drive, especially if you're
> not seeing bus resets or drive errors in dmesg and smart says the
> drive is fine, as I expect it does/will. It may be a btrfs bug or
> scaling issue, of which btrfs still has some, or it could simply be
> the single mode vs raid0 mode issue I explain below.

I encountered a similar problem here a few days ago on a btrfs raid1
partition while using rsync to clone a (~30GB) directory.

Everything started fine, but I came back an hour later to find rsync
had apparently stalled at about 20%, with CPU usage at 100% on a single
kworker thread. I was able to kill rsync eventually, and after a while
(I don't know how long, but more than 10 minutes) CPU usage returned to
normal. Restarting rsync resulted in kworker at 100% CPU in less than a
minute.

Once stalled, there was little drive access happening. Another raid1
partition (mdadm/ext4) on the same drive pair was having no problems.
Nothing showed in the system logs.

In this instance I'd forgotten to delete a temporary 500GB file before
starting rsync, so although the filesystem had recently been balanced
(musage=80/dusage=80), it was running at near capacity.

After a reboot, deleting the 500GB file & running balance, everything
returned to normal.
Ran rsync again & it completed fine.

Running slackware current, with Kernel 3.16.4

# btrfs filesystem df /mnt/general
Data, RAID1: total=1.38TiB, used=1.38TiB
System, RAID1: total=32.00MiB, used=256.00KiB
Metadata, RAID1: total=6.00GiB, used=4.67GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs filesystem show /mnt/general
Label: none  uuid: 592376ea-769f-4abb-915e-aa5e49162d90
        Total devices 2 FS bytes used 1.38TiB
        devid    1 size 1.79TiB used 1.39TiB path /dev/sda4
        devid    2 size 1.79TiB used 1.39TiB path /dev/sdd4

Btrfs v3.17.2

-- 
Ian