Ian Armstrong posted on Tue, 02 Dec 2014 18:56:13 +0000 as excerpted:

> On Tue, 2 Dec 2014 12:48:21 +0000 (UTC)
> Duncan <1i5t5.dun...@cox.net> wrote:
> 
>> FWIW, agreed that it's unlikely to be the drive, especially if you're
>> not seeing bus resets or drive errors in dmesg and smart says the drive
>> is fine, as I expect it does/will.  It may be a btrfs bug or scaling
>> issue, of which btrfs still has some, or it could simply be the single
>> mode vs raid0 mode issue I explain below.
> 
> I encountered a similar problem here a few days ago on a btrfs raid1
> partition while using rsync to clone a (~30GB) directory.
> 
> Everything started fine, but I came back an hour later to find rsync had
> apparently stalled at about 20% with cpu usage at 100% on a single
> kworker thread. I was able to kill rsync eventually, and after a while
> (don't know how long, but >10 minutes) cpu usage returned to normal.
> Restarting rsync resulted in kworker at 100% cpu in less than a minute.
> Once stalled there was little drive access happening. Another raid1
> partition (mdadm/ext4) on the same drive pair was having no problems.
> Nothing showed in the system logs.
> 
> In this instance I'd forgotten to delete a temporary 500GB file before
> starting rsync, so although recently balanced (musage=80/dusage=80) it
> was running at near capacity.
> 
> After a reboot, deleting the 500GB file & running balance, everything
> returned to normal. Ran rsync again & it completed fine.
> 
> Running slackware current, with Kernel 3.16.4

FWIW that was my point -- there are still such bugs out there, often 
corner-case so they don't affect most folks most of the time, but out 
there.

I had a similar stall recently, a kworker stuck at 100% that went away
after I killed whatever app had triggered the problem (pan, the news
program I'm writing this with, as it happens).  In my case I chalked it
up to a known corner-case bug in my slightly old 3.17.0 kernel.  That's
not the read-only-snapshot bug that effectively blacklists 3.17.0 for
some users; my use-case doesn't do read-only snapshots, so that one
doesn't affect me, and this would have been a different one.  I don't
/know/ the stall was the corner-case bug, but it most likely was, as
it's a known but rare corner-case that AFAIK is already fixed in the
late 3.18-rcs.
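
For anyone chasing the same symptom, here's a rough sketch of how one
might spot a pegged kworker and prepare a usage-filtered balance like the
musage=80/dusage=80 one Ian mentions.  The function names and the
/mnt/data mount point are made up for illustration, they're not anything
shipped with btrfs-progs, and the balance helper only prints the command
rather than running it.

```shell
#!/bin/sh
# Hypothetical helpers; names and thresholds are illustrative assumptions.

# Print "pid comm pcpu" for kworker threads at or above a CPU threshold.
# Reads "pcpu pid comm" lines on stdin, e.g. from:
#   ps -eo pcpu,pid,comm --no-headers
# Note: ps averages CPU over the thread's lifetime, so top may show a
# fresh spike sooner than this does.
busy_kworkers() {
    awk -v min="${1:-90}" \
        '$3 ~ /^kworker/ && $1 + 0 >= min + 0 { print $2, $3, $1 }'
}

# Build (but do not run) a usage-filtered balance command.
#   $1 = mount point, $2 = usage percentage for both data and metadata
balance_cmd() {
    printf 'btrfs balance start -dusage=%s -musage=%s %s\n' "$2" "$2" "$1"
}
```

Possible usage, assuming the filesystem is mounted at /mnt/data:

```shell
ps -eo pcpu,pid,comm --no-headers | busy_kworkers 90
btrfs filesystem df /mnt/data   # allocated vs used, per chunk type
balance_cmd /mnt/data 80        # review the printed command, then run it as root
```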

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
