relocation.c:242!

Tomasz Chmielewski Sat, 13 Dec 2014 05:54:03 -0800

On 2014-12-13 10:39, Robert White wrote:

Might I ask why you are running balance? After a persistent error I'd
understand going straight to scrub, but balance is usually for
transformation or to redistribute things after atypical use.


There were several reasons for running balance on this system:

1) I was getting "no space left", even though there were hundreds of GBs left. Not sure if this still applies to current kernels (3.18 and later), but it was certainly a problem in the past.

2) The system was regularly freezing; I'd say once a week was the norm. Sometimes I was getting btrfs traces logged in syslog. After a few freezes the fs was getting corrupted to varying degrees. At some point, it was so bad that it was only possible to use it read-only. So I had to get the data off, reformat, copy back... It would start crashing again after a few weeks of usage.


My usage case is quite simple:

- skinny extents, extended inode refs
- mount compress-force=zlib
- rsync many remote data sources (-a -H --inplace --partial) + snapshot
- around 500 snapshots in total, from 20 or so subvolumes

Especially rsync's --inplace option combined with many snapshots and heavy fragmentation was deadly for btrfs - I was seeing system freezes right when rsyncing a highly fragmented, large file.
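
For illustration, one run of the rsync + snapshot workflow described above could be sketched roughly as follows. This is a minimal sketch, not the actual script used on these systems: the host name, paths, the date-based snapshot naming and the helper name backup_commands are assumptions; only the rsync options come from the list above. It only builds and prints the command lines, it does not execute them.

```python
import shlex
from datetime import date


def backup_commands(source, subvolume, snapshot_dir, day):
    """Build (but do not run) the two commands for one backup run.

    source       -- remote rsync source (hypothetical, e.g. "host1:/srv/data/")
    subvolume    -- btrfs subvolume receiving the data (hypothetical path)
    snapshot_dir -- directory holding the snapshots (hypothetical path)
    day          -- date used to name the snapshot (naming scheme is an assumption)
    """
    target = subvolume.rstrip("/")
    # rsync options as listed in the usage case: -a -H --inplace --partial
    rsync = ["rsync", "-a", "-H", "--inplace", "--partial", source, target + "/"]
    # Snapshot the subvolume once the sync has finished.
    snapshot = [
        "btrfs", "subvolume", "snapshot",
        target,
        snapshot_dir.rstrip("/") + "/" + day.isoformat(),
    ]
    return [rsync, snapshot]


if __name__ == "__main__":
    commands = backup_commands(
        "host1:/srv/data/",             # assumed remote source
        "/mnt/backup/host1",            # assumed subvolume
        "/mnt/backup/snapshots/host1",  # assumed snapshot directory
        date(2014, 12, 13),
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Because --inplace rewrites blocks inside the existing file instead of creating a new copy, every snapshot taken between runs keeps the old extents alive, which is presumably how large files end up so fragmented in this setup.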

Then, running balance on the "corrupted" filesystem was more of an exercise (if scrub passes fine, I would expect balance to pass as well). Some of the BUGs it was causing were fixed in newer kernels, some were not (btrfsck was not really usable a few months back).

3) I had mixed luck with recovering btrfs after a failed drive (in RAID-1). Sometimes it worked as expected; sometimes the fs was getting broken so badly that I had to rsync the data off it and format from scratch (where mdraid would kick the drive after getting write errors - that's not the case with btrfs, and weird things can happen). Sometimes, running "btrfs device delete missing" (it's balance in principle, I think) would take weeks, during which a second drive could easily die. Again, running balance would be more of an exercise there, to see if the newer kernel still crashes.

An entire generation of folks have grown used to defragging Windows
boxes and all, but if you've already got an array that is going to
take "many days" to balance what benefit do you actually expect to
receive?

For me - it's a good test to see if btrfs is finally getting stable (some cases explained above).

Defrag -- used for "I think I'm getting a lot of unnecessary head seek
in this application, these files need to be brought into closer
order".

Fragmentation was an issue for btrfs, at least a few kernels back (as explained above, with rsync's --inplace). However, I'm not running autodefrag anywhere - not sure how it affects snapshots.

Scrub -- used for defensive checking a-la checkdisk. "I suspect that
after that unexpected power outage something may be a little off", or
alternately "I think my disks are giving me bitrot, I better check".


For me, it was passing fine, where balance was crashing the kernel.

Again, my main rationale for running balance is to see if btrfs is behaving stably. While I have systems with btrfs which have been running fine for months, I also have ones which will crash after 1-2 weeks (once the system grows in size / complexity).

So hopefully, btrfsck has fixed that fs - once it has been running stable for a week or two, I might be brave enough to re-enable btrfs quotas (another system freezer, at least a few kernels back).



--
Tomasz Chmielewski
http://www.sslrack.com


Re: 3.18.0: kernel BUG at fs/btrfs/relocation.c:242!