On 2015-09-17 12:41, Qu Wenruo wrote:
In the meantime, I've reactivated quotas, unmounted the filesystem and
run a btrfsck on it: as you would expect, no qgroup problem has been
reported so far.

At least the rescan code is working without problems.

I'll clear all my snapshots, run a quota rescan, then
re-create them one by one by rsyncing from the ext4 system I still have.
Maybe I'll run into the issue again.


Would you mind doing the following check for each subvolume rsync?

1) Do 'sync; btrfs qgroup show -prce --raw' and save the output
2) Create the needed snapshot
3) Do 'sync; btrfs qgroup show -prce --raw' and save the output
4) Avoid doing IO if possible until step 6)
5) Do 'btrfs quota rescan -w' and save it
6) Do 'sync; btrfs qgroup show -prce --raw' and save the output
7) Rsync data from ext4 to the newly created snapshot

The point is, as you mentioned, that rescan is working fine, so we can
compare the output from 3), 6) and 1) to see which qgroup accounting
numbers change.

And if they differ, which means the qgroup update at write time OR at
snapshot creation has gone wrong, we can at least narrow the problem
down to the qgroup update routine or to snapshot creation.
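
As a rough sketch, the check above could be scripted like this (the
mountpoint, source subvolume and snapshot name below are placeholders,
not the real layout):

  #!/bin/bash
  # Sketch of the per-snapshot check; MNT, SRC and DST are placeholders.
  MNT=/mnt/btrfs
  SRC=$MNT/subvol                 # subvolume to snapshot
  DST=$MNT/snapshots/test         # snapshot to create (parent dir must exist)

  capture() {                     # flush dirty data, then record the numbers
      sync
      btrfs qgroup show -prce --raw "$MNT" > "qgroup-$1.txt"
  }

  capture 1-before                            # step 1
  btrfs subvolume snapshot "$SRC" "$DST"      # step 2
  capture 3-after-snapshot                    # step 3
  # step 4: no other IO from here on
  btrfs quota rescan -w "$MNT"                # step 5
  capture 6-after-rescan                      # step 6

  # Step 6 (post-rescan) is the reference; if step 3 disagrees with it,
  # the write-time / snapshot-time accounting is what went wrong.
  diff -u qgroup-3-after-snapshot.txt qgroup-6-after-rescan.txt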

I was about to do that, but first there's something that seems strange: I began by trashing all my snapshots, then ran a quota rescan and waited for it to complete, so as to start from a sane base.
However, this is the output of qgroup show now:

qgroupid rfer          excl                 max_rfer max_excl parent child
-------- ----          ----                 -------- -------- ------ -----
0/5      16384         16384                none     none     ---    ---
0/1906   1657848029184 1657848029184        none     none     ---    ---
0/1909   124950921216  124950921216         none     none     ---    ---
0/1911   1054587293696 1054587293696        none     none     ---    ---
0/3270   23727300608   23727300608          none     none     ---    ---
0/3314   23206055936   23206055936          none     none     ---    ---
0/3317   18472996864   0                    none     none     ---    ---
0/3318   22235709440   18446744073708421120 none     none     ---    ---
0/3319   22240333824   0                    none     none     ---    ---
0/3320   22289608704   0                    none     none     ---    ---
0/3321   22289608704   0                    none     none     ---    ---
0/3322   18461151232   0                    none     none     ---    ---
0/3323   18423902208   0                    none     none     ---    ---
0/3324   18423902208   0                    none     none     ---    ---
0/3325   18463506432   0                    none     none     ---    ---
0/3326   18463506432   0                    none     none     ---    ---
0/3327   18463506432   0                    none     none     ---    ---
0/3328   18463506432   0                    none     none     ---    ---
0/3329   18585427968   0                    none     none     ---    ---
0/3330   18621472768   18446744073251348480 none     none     ---    ---
0/3331   18621472768   0                    none     none     ---    ---
0/3332   18621472768   0                    none     none     ---    ---
0/3333   18783076352   0                    none     none     ---    ---
0/3334   18799804416   0                    none     none     ---    ---
0/3335   18799804416   0                    none     none     ---    ---
0/3336   18816217088   0                    none     none     ---    ---
0/3337   18816266240   0                    none     none     ---    ---
0/3338   18816266240   0                    none     none     ---    ---
0/3339   18816266240   0                    none     none     ---    ---
0/3340   18816364544   0                    none     none     ---    ---
0/3341   7530119168    7530119168           none     none     ---    ---
0/3342   4919283712    0                    none     none     ---    ---
0/3343   4921724928    0                    none     none     ---    ---
0/3344   4921724928    0                    none     none     ---    ---
0/3345   6503317504    18446744073690902528 none     none     ---    ---
0/3346   6503452672    0                    none     none     ---    ---
0/3347   6509514752    0                    none     none     ---    ---
0/3348   6515793920    0                    none     none     ---    ---
0/3349   6515793920    0                    none     none     ---    ---
0/3350   6518685696    0                    none     none     ---    ---
0/3351   6521511936    0                    none     none     ---    ---
0/3352   6521511936    0                    none     none     ---    ---
0/3353   6521544704    0                    none     none     ---    ---
0/3354   6597963776    0                    none     none     ---    ---
0/3355   6598275072    0                    none     none     ---    ---
0/3356   6635880448    0                    none     none     ---    ---
0/3357   6635880448    0                    none     none     ---    ---
0/3358   6635880448    0                    none     none     ---    ---
0/3359   6635880448    0                    none     none     ---    ---
0/3360   6635880448    0                    none     none     ---    ---
0/3361   6635880448    0                    none     none     ---    ---
0/3362   6635880448    0                    none     none     ---    ---
0/3363   6635880448    0                    none     none     ---    ---
0/3364   6635880448    0                    none     none     ---    ---
0/3365   6635880448    0                    none     none     ---    ---
0/3366   6635896832    0                    none     none     ---    ---
0/3367   24185790464   24185790464          none     none     ---    ---
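
Side note: the huge excl values like 18446744073708421120 sit just below
2^64, so assuming the counters are plain unsigned 64-bit values, they look
like small negative numbers that wrapped around, e.g.:

  echo '2^64 - 18446744073708421120' | bc
  # 1130496  -> excl appears to have gone about 1.1 MB negative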

I would have expected all these qgroup IDs to be removed along with the snapshots, but apparently not. It reminded me of the bug you mentioned, where deleted snapshots don't always clear their qgroups correctly, but since these don't disappear even after a rescan... I'm a bit surprised.

I've just tried quota disable / quota enable, and now it seems OK. Just wanted to let you know, in case this isn't known behaviour...
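
For reference, the reset was roughly the following (the mountpoint
/mnt/btrfs is a placeholder):

  MNT=/mnt/btrfs
  btrfs quota disable "$MNT"
  btrfs quota enable "$MNT"
  btrfs quota rescan -w "$MNT"          # rebuild the accounting from scratch
  btrfs qgroup show -prce --raw "$MNT"  # stale 0/<id> entries are now gone

Presumably the stale 0/<id> entries could also be removed individually with
'btrfs qgroup destroy 0/<id> <path>', but the full disable/enable cycle is a
single shot.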

The procedure I'll use will be slightly different from what you proposed, but to my understanding it won't change the result (a rough sketch of the loop follows the list):

0) Rsync data from the next ext4 "snapshot" to the subvolume
1) Do 'sync; btrfs qgroup show -prce --raw' and save the output
2) Create the needed readonly snapshot on btrfs
3) Do 'sync; btrfs qgroup show -prce --raw' and save the output
4) Avoid doing IO if possible until step 6)
5) Do 'btrfs quota rescan -w' and save it
6) Do 'sync; btrfs qgroup show -prce --raw' and save the output
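
Roughly like this; every path below is a placeholder, not the real layout:

  #!/bin/bash
  # Sketch of the per-"snapshot" loop; adjust paths as needed.
  MNT=/mnt/btrfs
  SUBVOL=$MNT/data                   # subvolume that receives the rsync
  EXT4_SNAPS=/mnt/ext4/snapshots     # the old ext4 "snapshots"
  mkdir -p "$MNT/snapshots"

  show() {                           # steps 1/3/6: flush, then save the numbers
      sync
      btrfs qgroup show -prce --raw "$MNT" > "qgroup-$1.txt"
  }

  for src in "$EXT4_SNAPS"/*; do
      name=$(basename "$src")
      rsync -a --delete "$src/" "$SUBVOL/"                          # step 0
      show "$name-1"                                                # step 1
      btrfs subvolume snapshot -r "$SUBVOL" "$MNT/snapshots/$name"  # step 2
      show "$name-3"                                                # step 3
      # step 4: no other IO until after the rescan
      btrfs quota rescan -w "$MNT"                                  # step 5
      show "$name-6"                                                # step 6
  done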

I'll post the results once this is done.

--
Stéphane.

