At 08/24/2016 02:38 AM, Rakesh Sankeshi wrote:
Sorry, I was out of town.
Not much load on the system at all.
Since we are hitting many issues in production, I am using this system
only for testing. I built a few different filesystems: one with LZO
compression, a second with ZLIB, and a third without any compression.
All of them have quota-related issues.
Whenever there is an issue, I get a quota exceeded error (EDQUOT).
Please let me know if you still need more info.
Would you please try this patch and see if it has any improvement?
https://patchwork.kernel.org/patch/9201685/
BTW, is balance/relocate involved in your workload?
Also, for the non-compressed case, what's the threshold to trigger the bug?
Is it always about 100GB and 90GB?
Or is it related to the sum of the 2 subvolumes?
(In your initial report, the limit is 200GB for each subvolume while the
sum of the rfer of these 2 subvolumes seems to be 200GB; maybe just a
coincidence?)
Thanks,
Qu
On Tue, Aug 16, 2016 at 5:56 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
At 08/17/2016 12:05 AM, Rakesh Sankeshi wrote:
2) after EDQUOT, can't write anymore.
I can delete the data, but still can't write further
So it's an underflow now.
3) I tested it without compression and also with LZO and ZLIB; all
behave the same way with qgroup. There is no consistency in when it hits
the quota limit, and I don't understand how it's calculating the numbers.
Even without compression?!
That's a really big problem then.
Please describe your workload; this is an urgent bug now.
It would be best to provide scripts to reproduce it.
As for the meaning of the numbers: rfer (referenced) means the size of
all extents the subvolume refers to, including both data and metadata.
excl (exclusive) means the size of all extents that belong only to that
subvolume.
And since it's all about the on-disk size of extents, in the compression
case it's the size after compression.
Also, if a subvolume refers to only part of an extent, the whole extent
size is accounted.
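The rfer/excl rules above can be sketched with a small toy model (this is
illustrative Python, not btrfs code; the extent sizes and subvolume IDs
below are made-up example values): rfer counts every extent a subvolume
refers to, always at full extent size, while excl counts only extents that
no other subvolume shares.

```python
def qgroup_numbers(extents):
    """Toy rfer/excl accounting.

    extents: list of (size_in_bytes, set_of_subvolume_ids_referring_to_it).
    Returns (rfer, excl) dicts keyed by subvolume id.
    """
    rfer, excl = {}, {}
    for size, owners in extents:
        for sv in owners:
            # The whole extent is counted, even if only partially referenced.
            rfer[sv] = rfer.get(sv, 0) + size
            if len(owners) == 1:
                # Referenced by exactly one subvolume -> exclusive.
                excl[sv] = excl.get(sv, 0) + size
    return rfer, excl

# A 128 KiB extent shared between two subvolumes (e.g. after a snapshot),
# plus a 64 KiB extent private to one of them.
extents = [
    (128 * 1024, {"0/258", "0/259"}),  # shared: rfer for both, excl for neither
    (64 * 1024, {"0/258"}),            # exclusive to 0/258
]
rfer, excl = qgroup_numbers(extents)
print(rfer["0/258"], excl["0/258"])       # 196608 65536
print(rfer["0/259"], excl.get("0/259", 0))  # 131072 0
```

This also shows why deleting a snapshot can suddenly grow another
subvolume's excl: extents that were shared become exclusive.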
Last but not least: considering there are quite a lot of reports about
hitting ENOSPC while there is still a lot of unallocated space, is it
reporting an error message like "No space left on device" (ENOSPC) or
"Quota exceeded" (EDQUOT)?
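The two errors are easy to tell apart programmatically; a quick sketch of
the two errno values involved (standard errno module, nothing
btrfs-specific; the exact strerror wording can vary slightly by platform):

```python
import errno
import os

# ENOSPC: the filesystem itself ran out of allocatable space.
# EDQUOT: a quota (here, a btrfs qgroup limit) was exceeded.
# On Linux these render as "No space left on device" and
# "Disk quota exceeded" respectively.
for code in (errno.ENOSPC, errno.EDQUOT):
    print(f"{errno.errorcode[code]}: {os.strerror(code)}")
```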
Thanks,
Qu
With ext4 and XFS, I can clearly see when it hits the quota limit.
On Mon, Aug 15, 2016 at 6:01 PM, Qu Wenruo <quwen...@cn.fujitsu.com>
wrote:
At 08/16/2016 03:11 AM, Rakesh Sankeshi wrote:
Yes, at the subvolume level.
qgroupid       rfer       excl   max_rfer  max_excl  parent  child
--------       ----       ----   --------  --------  ------  -----
0/5        16.00KiB   16.00KiB       none      none     ---    ---
0/258     119.48GiB  119.48GiB  200.00GiB      none     ---    ---
0/259      92.57GiB   92.57GiB  200.00GiB      none     ---    ---
Although I have a 200GB limit on each of the 2 subvolumes, I am running
into the issue at about 120GB and 92GB already.
1) About workload
Would you mind describing the write pattern of your workload?
Just dd'ing data with LZO compression?
The compression part is a little complicated, as the reserved data size
and the on-disk extent size are different.
It's possible that somewhere in the code we leak some reserved data space.
2) Behavior after EDQUOT
After EDQUOT happens, can you write data into the subvolume?
If you can still write a lot of data (at least several gigabytes), it
seems to be something related to temporarily reserved space.
If not, and you can't even remove any file due to EDQUOT, then it's
almost certain we have underflowed the reserved data.
In that case, unmounting and mounting again will be the only workaround.
(In fact, not a workaround at all.)
3) Behavior without compression
If it's OK for you, would you mind testing without compression?
Currently we mostly rely on the assumption that the on-disk extent size
is the same as the in-memory extent size (no compression).
So qgroup + compression was not the main concern before, and it is buggy.
If qgroup works sanely without compression, at least we can be sure the
cause is qgroup + compression.
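The two failure modes in 1) and 2) can be sketched with a toy model of a
reserved-space counter (illustrative Python, not the kernel's actual data
structures; all names and byte counts are invented). Reserving the
uncompressed size but releasing only the compressed on-disk size leaks
reservation, so EDQUOT triggers below the configured limit; releasing more
than was ever reserved underflows an unsigned counter, so every later write
fails until a remount resets it.

```python
U64 = 1 << 64

class ToyQgroup:
    """Toy model of a qgroup reserved-space counter (illustrative only).

    Mimics unsigned kernel counters: subtracting more than was reserved
    wraps around to a huge value instead of going negative.
    """
    def __init__(self, limit):
        self.limit = limit
        self.used = 0       # bytes already accounted on disk
        self.reserved = 0   # bytes reserved for in-flight writes

    def reserve(self, nbytes):
        if self.used + self.reserved + nbytes > self.limit:
            raise OSError("EDQUOT")  # what the writer sees
        self.reserved += nbytes

    def release(self, nbytes):
        self.reserved = (self.reserved - nbytes) % U64  # u64 wrap-around

# Failure mode 1: leak. Reserve the uncompressed size (40) but release
# only the compressed on-disk size (10): 30 bytes stay "reserved" forever,
# so the quota fills up well below the configured limit.
qg = ToyQgroup(limit=100)
qg.reserve(40)
qg.release(10)
print(qg.reserved)  # 30 -> leaked reservation

# Failure mode 2: underflow. Release more than was ever reserved: the
# unsigned counter wraps to nearly 2**64, and every later reserve() fails
# with EDQUOT -- even after deleting data -- until a remount resets it.
qg2 = ToyQgroup(limit=100)
qg2.reserve(10)
qg2.release(20)
print(qg2.reserved > qg2.limit)  # True: counter wrapped, writes now fail
```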
Thanks,
Qu
On Sun, Aug 14, 2016 at 7:11 PM, Qu Wenruo <quwen...@cn.fujitsu.com>
wrote:
At 08/12/2016 01:32 AM, Rakesh Sankeshi wrote:
I set a 200GB limit for one user and a 100GB limit for another.
As soon as I reached 139GB and 53GB respectively, I hit the quota errors.
Is there any way to work around the quota functionality on a btrfs
LZO-compressed filesystem?
Please paste "btrfs qgroup show -prce <mnt>" output if you are using the
btrfs qgroup/quota function.
And, AFAIK, btrfs qgroups are applied to subvolumes, not users.
So did you mean you limited one subvolume belonging to each user?
Thanks,
Qu
4.7.0-040700-generic #201608021801 SMP
btrfs-progs v4.7
Label: none  uuid: 66a78faf-2052-4864-8a52-c5aec7a56ab8
        Total devices 2 FS bytes used 150.62GiB
        devid    1 size 1.00TiB used 78.01GiB path /dev/xvdc
        devid    2 size 1.00TiB used 78.01GiB path /dev/xvde

Data, RAID0: total=150.00GiB, used=149.12GiB
System, RAID1: total=8.00MiB, used=16.00KiB
Metadata, RAID1: total=3.00GiB, used=1.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvdc       2.0T  153G  1.9T   8% /test_lzo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html