Re: kworker threads may be working saner now instead of using 100% of a CPU core for minutes (Re: Still not production ready)

2016-09-07 Thread Martin Steigerwald
On Wednesday, 7 September 2016, 11:53:04 CEST, Christian Rohmann wrote:
> On 03/20/2016 12:24 PM, Martin Steigerwald wrote:
> >> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> >> random write into big file
> >> https://bugzilla.kernel.org/show_bug.cgi?id=90401
> > 
> > I think I saw this up to kernel 4.3. I think I didn't see this with 4.4
> > anymore and definitely not with 4.5.
> > 
> > So it may be fixed.
> > 
> > Did anyone else see kworker threads using 100% of a core for minutes with
> > 4.4 / 4.5?
> 
> I run 4.8-rc5 and currently see this issue. A kworker has been running at
> 100% for hours now; it seems stuck there.
> 
> Anything I should look at in order to narrow this down to a root cause?

I haven't seen any issues since my last post; I am currently running 4.8-rc5 myself.

I suggest you look at the kernel log, and also review this thread and my bug
report for the other information I came up with. Particularly, in my case the
issue only happened once BTRFS had allocated all device space into chunks while
the space within the chunks was not fully used up yet, i.e. when BTRFS had to
search for free space inside existing chunks and could not simply allocate a new
chunk anymore. In addition to that, include your BTRFS configuration, storage
configuration and so on. Just review what I reported to get an idea.

If, from looking at the kernel log, you are sufficiently sure that your issue is
the same one (that is, if the backtraces look sufficiently similar), then I'd add
your findings to my bug report. Otherwise I'd open a new one.

Good luck.
-- 
Martin


Re: kworker threads may be working saner now instead of using 100% of a CPU core for minutes (Re: Still not production ready)

2016-09-07 Thread Christian Rohmann


On 03/20/2016 12:24 PM, Martin Steigerwald wrote:
>> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
>> random write into big file
>> https://bugzilla.kernel.org/show_bug.cgi?id=90401
> I think I saw this up to kernel 4.3. I think I didn't see this with 4.4
> anymore and definitely not with 4.5.
> 
> So it may be fixed.
> 
> Did anyone else see kworker threads using 100% of a core for minutes with 4.4 
> / 4.5?

I run 4.8-rc5 and currently see this issue. A kworker has been running at
100% for hours now; it seems stuck there.

Anything I should look at in order to narrow this down to a root cause?


Regards

Christian


kworker threads may be working saner now instead of using 100% of a CPU core for minutes (Re: Still not production ready)

2016-03-20 Thread Martin Steigerwald
On Sunday, 13 December 2015, 23:35:08 CET, Martin Steigerwald wrote:
> Hi!
> 
> For me it is still not production ready. Again I ran into:
> 
> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> random write into big file
> https://bugzilla.kernel.org/show_bug.cgi?id=90401

I think I saw this up to kernel 4.3. I think I didn't see this with 4.4
anymore and definitely not with 4.5.

So it may be fixed.

Did anyone else see kworker threads using 100% of a core for minutes with 4.4 
/ 4.5?


For me this would be a big step forward. And yes, I am aware that some people
have new and other issues, but for me a non-working balance – it may also be
broken here with "no space left on device", it errored out often enough – is
still something different from having to switch the machine off hard, unless you
want to give it a ton of time to eventually shut down, which is not an option if
you just want to work with your system.


In any case, many thanks to all the developers working on improving BTRFS, and
especially to those who bring in bug fixes. Reading through the recent mailing
list threads, I do think BTRFS still needs more stability work.

Thanks,
Martin

> No matter whether SLES 12 uses it as default for root, no matter whether
> Fujitsu and Facebook use it: I will not let this onto any customer machine
> without lots and lots of underprovisioning and rigorous free space
> monitoring. Actually I will renew my recommendations in my trainings to be
> careful with BTRFS.
> 
> From my experience the monitoring would check for:
> 
> merkaba:~> btrfs fi show /home
> Label: 'home'  uuid: […]
> Total devices 2 FS bytes used 156.31GiB
> devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
> devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
> 
> If "used" is same as "size" then make big fat alarm. It is not sufficient
> for it to happen. It can run for quite some time just fine without any
> issues, but I never have seen a kworker thread using 100% of one core for
> extended period of time blocking everything else on the fs without this
> condition being met.
> 
> 
> In addition to that, the last time I tried, scrub aborted on any of my BTRFS
> filesystems. I reported this in another thread here, which got completely ignored so
> far. I think I would have to go back to the 4.2 kernel to make this work.
> 
> 
> I am not going to bother to go into more detail on any of this, as I get the
> impression that my bug reports and feedback get ignored. So I spare myself
> the time to do this work for now.
> 
> 
> The only thing I wonder now is whether this all could be because my /home is
> already more than one and a half years old. Maybe newly created filesystems are
> created in a way that prevents these issues? But it already has a nice
> global reserve:
> 
> merkaba:~> btrfs fi df /
> Data, RAID1: total=27.98GiB, used=24.07GiB
> System, RAID1: total=19.00MiB, used=16.00KiB
> Metadata, RAID1: total=2.00GiB, used=536.80MiB
> GlobalReserve, single: total=192.00MiB, used=0.00B
> 
> 
> Actually when I see that this free space thing is still not fixed for good I
> wonder whether it is fixable at all. Is this an inherent issue of BTRFS or
> more generally COW filesystem design?
> 
> I think it got somewhat better. It took much longer to come into that state
> again than last time, but still, blocking like this is *no* option for a
> *production ready* filesystem.
> 
> 
> 
> I am seriously considering switching to XFS for my production laptop again,
> because I never saw any of these free space issues with any of the XFS or
> Ext4 filesystems I used in the last 10 years.
> 
> Thanks,


-- 
Martin


Re: Still not production ready

2015-12-16 Thread Chris Mason
On Tue, Dec 15, 2015 at 06:30:58PM -0800, Liu Bo wrote:
> On Wed, Dec 16, 2015 at 10:19:00AM +0800, Qu Wenruo wrote:
> > >max_stripe_size is fixed at 1GB and the chunk size is stripe_size * 
> > >data_stripes,
> > >may I know how your partition gets a 10GB chunk?
> > 
> > Oh, it seems that I remembered the wrong size.
> > After checking the code, yes you're right.
> > A stripe won't be larger than 1G, so my assumption above is totally wrong.
> > 
> > And the problem is not in the 10% limit.
> > 
> > Please forget it.
> 
> No problem, glad to see people talking about the space issue again.

You can still end up with larger block groups if you have a lot of
drives.  We've had different problems with that in the past, but it is
limited now to 10G.
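
As a rough illustration of that relation (not the kernel's exact allocator
logic; it assumes a striped data profile where every device contributes one
stripe of up to 1G, with the data block group capped at 10G):

for ndev in 2 4 8 12 20; do
    echo "$ndev drives -> data block group of up to $(( ndev < 10 ? ndev : 10 ))G"
done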

At any rate if things are still getting badly out of balance we need to
tweak the allocator some more.

It's hard to reproduce because you need a burst of allocations for
whatever type is full.  I'll give it another shot.

-chris



Re: Still not production ready

2015-12-15 Thread Martin Steigerwald
On Tuesday, 15 December 2015, 16:59:58 CET, Chris Mason wrote:
> On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> > Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> > >Hi!
> > >
> > >For me it is still not production ready.
> > 
> > Yes, this is the *FACT* and not everyone has a good reason to deny it.
> > 
> > >Again I ran into:
> > >
> > >btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> > >random write into big file
> > >https://bugzilla.kernel.org/show_bug.cgi?id=90401
> > 
> > Not sure about the guidelines for other filesystems, but it will attract more devs'
> > attention if it is posted to the mailing list.
> > 
> > >No matter whether SLES 12 uses it as default for root, no matter whether
> > >Fujitsu and Facebook use it: I will not let this onto any customer
> > >machine
> > >without lots and lots of underprovisioning and rigorous free space
> > >monitoring. Actually I will renew my recommendations in my trainings to
> > >be careful with BTRFS.
> > >
> > > From my experience the monitoring would check for:
> > >merkaba:~> btrfs fi show /home
> > >Label: 'home'  uuid: […]
> > >
> > > Total devices 2 FS bytes used 156.31GiB
> > > devid1 size 170.00GiB used 164.13GiB path
> > > /dev/mapper/msata-home
> > > devid2 size 170.00GiB used 164.13GiB path
> > > /dev/mapper/sata-home
> > >
> > >If "used" is same as "size" then make big fat alarm. It is not sufficient
> > >for it to happen. It can run for quite some time just fine without any
> > >issues, but I never have seen a kworker thread using 100% of one core
> > >for extended period of time blocking everything else on the fs without
> > >this condition being met.
> > And some special advice on the device size from myself:
> > Don't use devices over 100G but less than 500G.
> > Over 100G leads btrfs to use big chunks, where data chunks can be at
> > most 10G and metadata 1G.
> > 
> > I have seen a lot of users with about 100~200G devices hit unbalanced
> > chunk allocation (a 10G data chunk easily takes the last available space and
> > leaves later metadata nowhere to be stored)
> 
> Maybe we should tune things so the size of the chunk is based on the
> space remaining instead of the total space?

Still, on my filesystem there was over 1 GiB free in the metadata chunks, so…

… my theory still is: at some point BTRFS has trouble finding free space within
the chunks it has already allocated.

> > And unfortunately, your fs is already in the dangerous zone.
> > (And you are using RAID1, which means it's the same as one 170G btrfs with
> > SINGLE data/meta)
> > 
> > >In addition to that, the last time I tried, scrub aborted on any of my BTRFS
> > >filesystems. I reported this in another thread here, which got completely ignored
> > >so far. I think I would have to go back to the 4.2 kernel to make this work.
> 
> We'll pick this thread up again, the ones that get fixed the fastest are
> the ones that we can easily reproduce.  The rest need a lot of think
> time.

I understand. Maybe I just wanted to see at least some sort of a reaction.

I now have 4.4-rc5 running; the boot crash I had appears to be fixed. Oh, and
I see that scrubbing / at least worked now:

merkaba:~> btrfs scrub status -d /
scrub status for […]
scrub device /dev/dm-5 (id 1) history
scrub started at Wed Dec 16 00:13:20 2015 and finished after 00:01:42
total bytes scrubbed: 23.94GiB with 0 errors
scrub device /dev/mapper/msata-debian (id 2) history
scrub started at Wed Dec 16 00:13:20 2015 and finished after 00:01:34
total bytes scrubbed: 23.94GiB with 0 errors

Okay, I will test the other ones tomorrow, so maybe this one is fixed meanwhile.

Yay!

Thanks,
-- 
Martin


Re: Still not production ready

2015-12-15 Thread Liu Bo
On Wed, Dec 16, 2015 at 10:19:00AM +0800, Qu Wenruo wrote:
> 
> 
> Liu Bo wrote on 2015/12/15 17:53 -0800:
> >On Wed, Dec 16, 2015 at 09:20:45AM +0800, Qu Wenruo wrote:
> >>
> >>
> >>Chris Mason wrote on 2015/12/15 16:59 -0500:
> >>>On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> >>>>
> >>>>
> >>>>Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> >>>>>Hi!
> >>>>>
> >>>>>For me it is still not production ready.
> >>>>
> >>>>Yes, this is the *FACT* and not everyone has a good reason to deny it.
> >>>>
> >>>>>Again I ran into:
> >>>>>
> >>>>>btrfs kworker thread uses up 100% of a Sandybridge core for minutes on 
> >>>>>random
> >>>>>write into big file
> >>>>>https://bugzilla.kernel.org/show_bug.cgi?id=90401
> >>>>
> >>>>Not sure about the guidelines for other filesystems, but it will attract more devs'
> >>>>attention if it is posted to the mailing list.
> >>>>
> >>>>>
> >>>>>
> >>>>>No matter whether SLES 12 uses it as default for root, no matter whether
> >>>>>Fujitsu and Facebook use it: I will not let this onto any customer 
> >>>>>machine
> >>>>>without lots and lots of underprovisioning and rigorous free space 
> >>>>>monitoring.
> >>>>>Actually I will renew my recommendations in my trainings to be careful 
> >>>>>with
> >>>>>BTRFS.
> >>>>>
> >>>>> From my experience the monitoring would check for:
> >>>>>
> >>>>>merkaba:~> btrfs fi show /home
> >>>>>Label: 'home'  uuid: […]
> >>>>> Total devices 2 FS bytes used 156.31GiB
> >>>>> devid1 size 170.00GiB used 164.13GiB path 
> >>>>> /dev/mapper/msata-home
> >>>>> devid2 size 170.00GiB used 164.13GiB path 
> >>>>> /dev/mapper/sata-home
> >>>>>
> >>>>>If "used" is same as "size" then make big fat alarm. It is not 
> >>>>>sufficient for
> >>>>>it to happen. It can run for quite some time just fine without any 
> >>>>>issues, but
> >>>>>I never have seen a kworker thread using 100% of one core for extended 
> >>>>>period
> >>>>>of time blocking everything else on the fs without this condition being 
> >>>>>met.
> >>>>>
> >>>>
> >>>>And some special advice on the device size from myself:
> >>>>Don't use devices over 100G but less than 500G.
> >>>>Over 100G leads btrfs to use big chunks, where data chunks can be at
> >>>>most 10G and metadata 1G.
> >>>>
> >>>>I have seen a lot of users with about 100~200G devices hit unbalanced
> >>>>chunk allocation (a 10G data chunk easily takes the last available space and
> >>>>leaves later metadata nowhere to be stored)
> >>>
> >>>Maybe we should tune things so the size of the chunk is based on the
> >>>space remaining instead of the total space?
> >>
> >>I submitted such a patch before.
> >>David pointed out that such behavior would cause a lot of small fragmented
> >>chunks in the last several GB,
> >>which may make balance behavior less predictable than before.
> >>
> >>
> >>At least, we could just change the current 10% chunk size limit to 5% to make
> >>the problem harder to trigger.
> >>It's a simple and easy solution.
> >>
> >>Another cause of the problem is that we underestimated the chunk size change for
> >>filesystems at the borderline of big chunks.
> >>
> >>For 99G, the chunk size limit is 1G, and it needs 99 data chunks to fully
> >>cover the fs.
> >>But for 100G, it only needs 10 chunks to cover the fs,
> >>and it needs to be 990G to match that number again.
> >
> >max_stripe_size is fixed at 1GB and the chunk size is stripe_size * 
> >data_stripes,
> >may I know how your partition gets a 10GB chunk?
> 
> Oh, it seems that I remembered the wrong size.
> After checking the code, yes you're right.
> A stripe won't be larger than 1G, so my assumption above is totally wrong.
> 
> And the problem is not in the 10% limit.

Re: Still not production ready

2015-12-15 Thread Chris Mason
On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> 
> 
> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> >Hi!
> >
> >For me it is still not production ready.
> 
> Yes, this is the *FACT* and not everyone has a good reason to deny it.
> 
> >Again I ran into:
> >
> >btrfs kworker thread uses up 100% of a Sandybridge core for minutes on random
> >write into big file
> >https://bugzilla.kernel.org/show_bug.cgi?id=90401
> 
> Not sure about the guidelines for other filesystems, but it will attract more devs'
> attention if it is posted to the mailing list.
> 
> >
> >
> >No matter whether SLES 12 uses it as default for root, no matter whether
> >Fujitsu and Facebook use it: I will not let this onto any customer machine
> >without lots and lots of underprovisioning and rigorous free space 
> >monitoring.
> >Actually I will renew my recommendations in my trainings to be careful with
> >BTRFS.
> >
> > From my experience the monitoring would check for:
> >
> >merkaba:~> btrfs fi show /home
> >Label: 'home'  uuid: […]
> > Total devices 2 FS bytes used 156.31GiB
> > devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
> > devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
> >
> >If "used" is same as "size" then make big fat alarm. It is not sufficient for
> >it to happen. It can run for quite some time just fine without any issues, 
> >but
> >I never have seen a kworker thread using 100% of one core for extended period
> >of time blocking everything else on the fs without this condition being met.
> >
> 
> And some special advice on the device size from myself:
> Don't use devices over 100G but less than 500G.
> Over 100G leads btrfs to use big chunks, where data chunks can be at
> most 10G and metadata 1G.
> 
> I have seen a lot of users with about 100~200G devices hit unbalanced
> chunk allocation (a 10G data chunk easily takes the last available space and
> leaves later metadata nowhere to be stored)

Maybe we should tune things so the size of the chunk is based on the
space remaining instead of the total space?

> 
> And unfortunately, your fs is already in the dangerous zone.
> (And you are using RAID1, which means it's the same as one 170G btrfs with
> SINGLE data/meta)
> 
> >
> >In addition to that, the last time I tried, scrub aborted on any of my BTRFS
> >filesystems. I reported this in another thread here, which got completely ignored so
> >far. I think I would have to go back to the 4.2 kernel to make this work.

We'll pick this thread up again, the ones that get fixed the fastest are
the ones that we can easily reproduce.  The rest need a lot of think
time.

-chris


Re: Still not production ready

2015-12-15 Thread Qu Wenruo



Chris Mason wrote on 2015/12/15 16:59 -0500:

On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:



Martin Steigerwald wrote on 2015/12/13 23:35 +0100:

Hi!

For me it is still not production ready.


Yes, this is the *FACT* and not everyone has a good reason to deny it.


Again I ran into:

btrfs kworker thread uses up 100% of a Sandybridge core for minutes on random
write into big file
https://bugzilla.kernel.org/show_bug.cgi?id=90401


Not sure about the guidelines for other filesystems, but it will attract more devs'
attention if it is posted to the mailing list.




No matter whether SLES 12 uses it as default for root, no matter whether
Fujitsu and Facebook use it: I will not let this onto any customer machine
without lots and lots of underprovisioning and rigorous free space monitoring.
Actually I will renew my recommendations in my trainings to be careful with
BTRFS.

 From my experience the monitoring would check for:

merkaba:~> btrfs fi show /home
Label: 'home'  uuid: […]
 Total devices 2 FS bytes used 156.31GiB
 devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
 devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home

If "used" is same as "size" then make big fat alarm. It is not sufficient for
it to happen. It can run for quite some time just fine without any issues, but
I never have seen a kworker thread using 100% of one core for extended period
of time blocking everything else on the fs without this condition being met.



And some special advice on the device size from myself:
Don't use devices over 100G but less than 500G.
Over 100G leads btrfs to use big chunks, where data chunks can be at
most 10G and metadata 1G.

I have seen a lot of users with about 100~200G devices hit unbalanced
chunk allocation (a 10G data chunk easily takes the last available space and
leaves later metadata nowhere to be stored)


Maybe we should tune things so the size of the chunk is based on the
space remaining instead of the total space?


I submitted such a patch before.
David pointed out that such behavior would cause a lot of small
fragmented chunks in the last several GB,
which may make balance behavior less predictable than before.

At least, we could just change the current 10% chunk size limit to 5% to
make the problem harder to trigger.
It's a simple and easy solution.

Another cause of the problem is that we underestimated the chunk size change
for filesystems at the borderline of big chunks.

For 99G, the chunk size limit is 1G, and it needs 99 data chunks to
fully cover the fs.
But for 100G, it only needs 10 chunks to cover the fs,
and it needs to be 990G to match that number again.

The sudden drop in chunk count is the root cause.

So we'd better reconsider both the big chunk size limit and the chunk size
limit to find a balanced solution for it.


Thanks,
Qu




And unfortunately, your fs is already in the dangerous zone.
(And you are using RAID1, which means it's the same as one 170G btrfs with
SINGLE data/meta)



In addition to that, the last time I tried, scrub aborted on any of my BTRFS
filesystems. I reported this in another thread here, which got completely ignored so
far. I think I would have to go back to the 4.2 kernel to make this work.


We'll pick this thread up again, the ones that get fixed the fastest are
the ones that we can easily reproduce.  The rest need a lot of think
time.

-chris







Re: Still not production ready

2015-12-15 Thread Qu Wenruo



Liu Bo wrote on 2015/12/15 17:53 -0800:

On Wed, Dec 16, 2015 at 09:20:45AM +0800, Qu Wenruo wrote:



Chris Mason wrote on 2015/12/15 16:59 -0500:

On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:



Martin Steigerwald wrote on 2015/12/13 23:35 +0100:

Hi!

For me it is still not production ready.


Yes, this is the *FACT* and not everyone has a good reason to deny it.


Again I ran into:

btrfs kworker thread uses up 100% of a Sandybridge core for minutes on random
write into big file
https://bugzilla.kernel.org/show_bug.cgi?id=90401


Not sure about the guidelines for other filesystems, but it will attract more devs'
attention if it is posted to the mailing list.




No matter whether SLES 12 uses it as default for root, no matter whether
Fujitsu and Facebook use it: I will not let this onto any customer machine
without lots and lots of underprovisioning and rigorous free space monitoring.
Actually I will renew my recommendations in my trainings to be careful with
BTRFS.

 From my experience the monitoring would check for:

merkaba:~> btrfs fi show /home
Label: 'home'  uuid: […]
 Total devices 2 FS bytes used 156.31GiB
 devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
 devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home

If "used" is same as "size" then make big fat alarm. It is not sufficient for
it to happen. It can run for quite some time just fine without any issues, but
I never have seen a kworker thread using 100% of one core for extended period
of time blocking everything else on the fs without this condition being met.



And some special advice on the device size from myself:
Don't use devices over 100G but less than 500G.
Over 100G leads btrfs to use big chunks, where data chunks can be at
most 10G and metadata 1G.

I have seen a lot of users with about 100~200G devices hit unbalanced
chunk allocation (a 10G data chunk easily takes the last available space and
leaves later metadata nowhere to be stored)


Maybe we should tune things so the size of the chunk is based on the
space remaining instead of the total space?


I submitted such a patch before.
David pointed out that such behavior would cause a lot of small fragmented
chunks in the last several GB,
which may make balance behavior less predictable than before.


At least, we could just change the current 10% chunk size limit to 5% to make
the problem harder to trigger.
It's a simple and easy solution.

Another cause of the problem is that we underestimated the chunk size change for
filesystems at the borderline of big chunks.

For 99G, the chunk size limit is 1G, and it needs 99 data chunks to fully
cover the fs.
But for 100G, it only needs 10 chunks to cover the fs,
and it needs to be 990G to match that number again.


max_stripe_size is fixed at 1GB and the chunk size is stripe_size * 
data_stripes,
may I know how your partition gets a 10GB chunk?


Oh, it seems that I remembered the wrong size.
After checking the code, yes you're right.
A stripe won't be larger than 1G, so my assumption above is totally wrong.

And the problem is not in the 10% limit.

Please forget it.

Thanks,
Qu




Thanks,

-liubo




The sudden drop in chunk count is the root cause.

So we'd better reconsider both the big chunk size limit and the chunk size limit
to find a balanced solution for it.

Thanks,
Qu




And unfortunately, your fs is already in the dangerous zone.
(And you are using RAID1, which means it's the same as one 170G btrfs with
SINGLE data/meta)



In addition to that, the last time I tried, scrub aborted on any of my BTRFS
filesystems. I reported this in another thread here, which got completely ignored so
far. I think I would have to go back to the 4.2 kernel to make this work.


We'll pick this thread up again, the ones that get fixed the fastest are
the ones that we can easily reproduce.  The rest need a lot of think
time.

-chris







Re: Still not production ready

2015-12-14 Thread Duncan
Qu Wenruo posted on Mon, 14 Dec 2015 15:32:02 +0800 as excerpted:

> Oh, my poor English... :(

Well, as I said, native English speakers commonly enough mis-negate...

The real issue seems to be that English simply lacks proper support for 
the double-negatives feature that people keep wanting to use, despite the 
fact that it yields an officially undefined result that compilers (people 
reading/hearing) don't quite know what to do with, with actual results 
often throwing warnings and generally changing from compiler to 
compiler. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Still not production ready

2015-12-14 Thread Austin S. Hemmelgarn

On 2015-12-14 14:08, Chris Murphy wrote:

On Mon, Dec 14, 2015 at 5:10 AM, Duncan <1i5t5.dun...@cox.net> wrote:

Qu Wenruo posted on Mon, 14 Dec 2015 15:32:02 +0800 as excerpted:


Oh, my poor English... :(


Well, as I said, native English speakers commonly enough mis-negate...

The real issue seems to be that English simply lacks proper support for
the double-negatives feature that people keep wanting to use, despite the
fact that it yields an officially undefined result that compilers (people
reading/hearing) don't quite know what to do with, with actual results
often throwing warnings and generally changing from compiler to
compiler. =:^)


It's a trap! Haha. Yeah like you say, it's not a matter of poor
English. Qu writes very understandable English. Officially in English
the negatives should cancel, which is different in many other
languages where additional negatives amplify. But even native English
speakers have dialects where it amplifies, rather than cancels. So I'd
consider the double or multiple negative in English as a
colloquialism. And a trap!


Some days I really wish Esperanto or Interlingua had actually caught on...

Or even Lojban, at least then the language would be more like the 
systems being discussed, even if it would be a serious pain to learn and 
use.



Re: Still not production ready

2015-12-14 Thread Chris Murphy
On Mon, Dec 14, 2015 at 5:10 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> Qu Wenruo posted on Mon, 14 Dec 2015 15:32:02 +0800 as excerpted:
>
>> Oh, my poor English... :(
>
> Well, as I said, native English speakers commonly enough mis-negate...
>
> The real issue seems to be that English simply lacks proper support for
> the double-negatives feature that people keep wanting to use, despite the
> fact that it yields an officially undefined result that compilers (people
> reading/hearing) don't quite know what to do with, with actual results
> often throwing warnings and generally changing from compiler to
> compiler. =:^)

It's a trap! Haha. Yeah like you say, it's not a matter of poor
English. Qu writes very understandable English. Officially in English
the negatives should cancel, which is different in many other
languages where additional negatives amplify. But even native English
speakers have dialects where it amplifies, rather than cancels. So I'd
consider the double or multiple negative in English as a
colloquialism. And a trap!


-- 
Chris Murphy


still kworker at 100% cpu in all of device size allocated with chunks situations with write load (was: Re: Still not production ready)

2015-12-14 Thread Martin Steigerwald
On Sunday, 13 December 2015, 15:19:14 CET, Marc MERLIN wrote:
> On Sun, Dec 13, 2015 at 11:35:08PM +0100, Martin Steigerwald wrote:
> > Hi!
> > 
> > For me it is still not production ready. Again I ran into:
> > 
> > btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> > random write into big file
> > https://bugzilla.kernel.org/show_bug.cgi?id=90401
> 
> Sorry you're having issues. I haven't seen this before myself.
> I couldn't find the kernel version you're using in your Email or the bug
> you filed (quick scan).
> 
> That's kind of important :)

I definitely know this much. :) It happened with 4.3 yesterday. The other
kernel version was 3.18. The information should be in the bug report: 3.18
as mentioned in the Kernel Version field, and 4.3 as I mentioned in the last
comment of the bug report.

The scrubbing issue exists since 4.3, I think; I believe I have also seen it with
4.4-rc2/rc4, but I didn't go back then to check more thoroughly. I didn't report
the scrubbing issue in bugzilla yet, as I got no feedback on my mailing list
posts so far. I will bump that thread in a moment and suggest we discuss the free
space issue here and the scrubbing issue in the other thread. I went back to 4.3
because 4.4-rc2/4 does not even boot on my machine most of the time. I also
reported that (a BTRFS-unrelated issue).

Thanks,
-- 
Martin


still kworker at 100% cpu in all of device size allocated with chunks situations with write load (was: Re: Still not production ready)

2015-12-14 Thread Martin Steigerwald
On Monday, 14 December 2015, 10:08:16 CET, Qu Wenruo wrote:
> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> > Hi!
> > 
> > For me it is still not production ready.
> 
> Yes, this is the *FACT* and not everyone has a good reason to deny it.
> 
> > Again I ran into:
> > 
> > btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> > random write into big file
> > https://bugzilla.kernel.org/show_bug.cgi?id=90401
> 
> Not sure about the guidelines for other filesystems, but it will attract more devs'
> attention if it is posted to the mailing list.

I did, as mentioned in the bug report:

BTRFS free space handling still needs more work: Hangs again
Martin Steigerwald | 26 Dec 14:37 2014
http://permalink.gmane.org/gmane.comp.file-systems.btrfs/41790

> > No matter whether SLES 12 uses it as default for root, no matter whether
> > Fujitsu and Facebook use it: I will not let this onto any customer machine
> > without lots and lots of underprovisioning and rigorous free space
> > monitoring. Actually I will renew my recommendations in my trainings to
> > be careful with BTRFS.
> > 
> >  From my experience the monitoring would check for:
> > merkaba:~> btrfs fi show /home
> > Label: 'home'  uuid: […]
> > 
> >  Total devices 2 FS bytes used 156.31GiB
> >  devid1 size 170.00GiB used 164.13GiB path
> >  /dev/mapper/msata-home
> >  devid2 size 170.00GiB used 164.13GiB path
> >  /dev/mapper/sata-home
> > 
> > If "used" is same as "size" then make big fat alarm. It is not sufficient
> > for it to happen. It can run for quite some time just fine without any
> > issues, but I never have seen a kworker thread using 100% of one core for
> > extended period of time blocking everything else on the fs without this
> > condition being met.
> And some special advice on the device size from myself:
> Don't use devices over 100G but less than 500G.
> Over 100G leads btrfs to use big chunks, where data chunks can be
> at most 10G and metadata 1G.
> 
> I have seen a lot of users with about 100~200G devices hit
> unbalanced chunk allocation (a 10G data chunk easily takes the last
> available space and leaves later metadata nowhere to be stored)

Interesting, but in my case there is still quite some free space in the already
allocated metadata chunks. Anyway, I did have ENOSPC issues when trying to
balance the chunks.

> And unfortunately, your fs is already in the dangerous zone.
> (And you are using RAID1, which means it's the same as one 170G btrfs
> with SINGLE data/meta)

Well, I know that for any FS it is not recommended to let it fill up completely
and to leave at least about 10-15% free, and while it is not 10-15% anymore, it is
still a whopping 11-12 GiB of free space. I would accept somewhat slower operation
in this case, but not a kworker at 100% for about 10-30 seconds, blocking
everything else going on on the filesystem. For whatever reason Plasma
seems to access the fs on almost every action I do with it, so during that time
not even panels slide out anymore, nor does the activity switcher work.
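
The quickest way I know to see both numbers at once, i.e. how much of the devices
is still unallocated versus how much is merely free inside already allocated
chunks, is something like this (just a sketch; it assumes a btrfs-progs new enough
to have "btrfs fi usage", otherwise comparing "fi show" and "fi df" by hand gives
the same picture):

merkaba:~> btrfs fi usage /home
# "Device unallocated" is what the chunk allocator can still hand out;
# "Free (estimated)" additionally counts the free space inside existing chunks,
# which is exactly the space BTRFS has to hunt for once "used" equals "size"
# in "btrfs fi show".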

> > In addition to that, the last time I tried, scrub aborted on any of my BTRFS
> > filesystems. I reported this in another thread here, which got completely ignored so
> > far. I think I would have to go back to the 4.2 kernel to make this work.
> 
> Unfortunately, this happens a lot, even when you post it to the mailing list.
> Devs here are always busy locating bugs, adding new features, or
> enhancing current behavior.
> 
> So *PLEASE* be patient about the slow responses.

Okay, thanks at least for the acknowledgement of this. I try to be even more 
patient.
 
> BTW, you may not want to revert to 4.2 until some bug fix is backported
> to 4.2.
> The qgroup rework in 4.2 broke delayed refs and caused some scrub
> bugs. (My fault.)

Hm, well, scrubbing does not work for me either, but only since 4.3/4.4-rc2/4. I just
bumped the thread:

Re: [4.3-rc4] scrubbing aborts before finishing

by replying a third time to it (not a fourth, I miscounted :).

> > I am not going to bother to go into more detail on any of this, as I get
> > the impression that my bug reports and feedback get ignored. So I spare
> > myself the time to do this work for now.
> > 
> > 
> > The only thing I wonder now is whether this all could be because my /home is
> > already more than one and a half years old. Maybe newly created filesystems are
> > created in a way that prevents these issues? But it already has a nice
> > global reserve:
> > 
> > merkaba:~> btrfs fi df /
> > Data, RAID1: total=27.98GiB, used=24.07GiB
> > System, RAID1: total=19.00MiB, used=16.00KiB
> > Metadata, RAID1: total=2.00GiB, used=536.80MiB
> > GlobalReserve, single: total=192.00MiB, used=0.00B

Re: Still not production ready

2015-12-13 Thread Marc MERLIN
On Sun, Dec 13, 2015 at 11:35:08PM +0100, Martin Steigerwald wrote:
> Hi!
> 
> For me it is still not production ready. Again I ran into:
> 
> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on random 
> write into big file
> https://bugzilla.kernel.org/show_bug.cgi?id=90401
 
Sorry you're having issues. I haven't seen this before myself.
I couldn't find the kernel version you're using in your Email or the bug
you filed (quick scan).

That's kind of important :)

Marc
 
> No matter whether SLES 12 uses it as default for root, no matter whether 
> Fujitsu and Facebook use it: I will not let this onto any customer machine 
> without lots and lots of underprovisioning and rigorous free space 
> monitoring. 
> Actually I will renew my recommendations in my trainings to be careful with 
> BTRFS.
> 
> From my experience the monitoring would check for:
> 
> merkaba:~> btrfs fi show /home
> Label: 'home'  uuid: […]
> Total devices 2 FS bytes used 156.31GiB
> devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
> devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
> 
> If "used" is same as "size" then make big fat alarm. It is not sufficient for 
> it to happen. It can run for quite some time just fine without any issues, 
> but 
> I never have seen a kworker thread using 100% of one core for extended period 
> of time blocking everything else on the fs without this condition being met.
> 
> 
> In addition to that, the last time I tried, scrub aborted on any of my BTRFS
> filesystems. I reported this in another thread here, which got completely ignored so
> far. I think I would have to go back to the 4.2 kernel to make this work.
> 
> 
> I am not going to bother to go into more detail on any of this, as I get the
> impression that my bug reports and feedback get ignored. So I spare myself the
> time to do this work for now.
> 
> 
> The only thing I wonder now is whether this all could be because my /home is already
> more than one and a half years old. Maybe newly created filesystems are created
> in a way that prevents these issues? But it already has a nice global reserve:
> 
> merkaba:~> btrfs fi df /
> Data, RAID1: total=27.98GiB, used=24.07GiB
> System, RAID1: total=19.00MiB, used=16.00KiB
> Metadata, RAID1: total=2.00GiB, used=536.80MiB
> GlobalReserve, single: total=192.00MiB, used=0.00B
> 
> 
> Actually when I see that this free space thing is still not fixed for good I 
> wonder whether it is fixable at all. Is this an inherent issue of BTRFS or 
> more generally COW filesystem design?
> 
> I think it got somewhat better. It took much longer to come into that state 
> again than last time, but still, blocking like this is *no* option for a 
> *production ready* filesystem.
> 
> 
> 
> I am seriously considering switching to XFS for my production laptop again,
> because I never saw any of these free space issues with any of the XFS or Ext4
> filesystems I used in the last 10 years.
> 
> Thanks,
> -- 
> Martin

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Still not production ready

2015-12-13 Thread Martin Steigerwald
Hi!

For me it is still not production ready. Again I ran into:

btrfs kworker thread uses up 100% of a Sandybridge core for minutes on random 
write into big file
https://bugzilla.kernel.org/show_bug.cgi?id=90401


No matter whether SLES 12 uses it as default for root, no matter whether 
Fujitsu and Facebook use it: I will not let this onto any customer machine 
without lots and lots of underprovisioning and rigorous free space monitoring. 
Actually I will renew my recommendations in my trainings to be careful with 
BTRFS.

From my experience the monitoring would check for:

merkaba:~> btrfs fi show /home
Label: 'home'  uuid: […]
Total devices 2 FS bytes used 156.31GiB
devid1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
devid2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home

If "used" is same as "size" then make big fat alarm. It is not sufficient for 
it to happen. It can run for quite some time just fine without any issues, but 
I never have seen a kworker thread using 100% of one core for extended period 
of time blocking everything else on the fs without this condition being met.
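
A monitoring check along those lines could be as simple as this (just a sketch; it
assumes the "btrfs fi show" output keeps the format above and that both values are
printed in GiB):

btrfs fi show /home | awk '
    /devid/ {
        size = $4 + 0; used = $6 + 0;     # numeric coercion strips the GiB suffix
        if (used >= size - 1)             # less than ~1 GiB left unallocated
            printf "ALARM: %s fully allocated (%s of %s)\n", $8, $6, $4
    }'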


In addition to that, the last time I tried, scrub aborted on any of my BTRFS
filesystems. I reported this in another thread here, which got completely ignored so
far. I think I would have to go back to the 4.2 kernel to make this work.


I am not going to bother to go into more detail on any of this, as I get the
impression that my bug reports and feedback get ignored. So I spare myself the
time to do this work for now.


The only thing I wonder now is whether this all could be because my /home is already
more than one and a half years old. Maybe newly created filesystems are created
in a way that prevents these issues? But it already has a nice global reserve:

merkaba:~> btrfs fi df /
Data, RAID1: total=27.98GiB, used=24.07GiB
System, RAID1: total=19.00MiB, used=16.00KiB
Metadata, RAID1: total=2.00GiB, used=536.80MiB
GlobalReserve, single: total=192.00MiB, used=0.00B


Actually, when I see that this free space thing is still not fixed for good, I
wonder whether it is fixable at all. Is this an inherent issue of BTRFS or,
more generally, of COW filesystem design?

I think it got somewhat better. It took much longer to come into that state 
again than last time, but still, blocking like this is *no* option for a 
*production ready* filesystem.



I am seriously considering switching to XFS for my production laptop again, because
I never saw any of these free space issues with any of the XFS or Ext4
filesystems I used in the last 10 years.

Thanks,
-- 
Martin


Re: Still not production ready

2015-12-13 Thread Duncan
Qu Wenruo posted on Mon, 14 Dec 2015 10:08:16 +0800 as excerpted:

> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
>> Hi!
>>
>> For me it is still not production ready.
> 
> Yes, this is the *FACT* and not everyone has a good reason to deny it.

In the above sentence, I /think/ you (Qu) agree with Martin (and I) that 
btrfs shouldn't be considered production ready... yet, and the first part 
of the sentence makes it very clear that you feel strongly about the 
*FACT*, but the second half of the sentence (after *FACT*) doesn't parse 
well in English, thus leaving the entire sentence open to interpretation, 
tho it's obvious either way that you feel strongly about it. =:^\

At the risk of getting it completely wrong, what I /think/ you meant to 
say is (as expanded in typically Duncan fashion =:^)...

Yes, this is the *FACT*, though some people have reasons to deny it.

Presumably, said reasons would include the fact that various distros are 
trying to sell enterprise support contracts to customers very eager to 
have the features that btrfs provides, and said customers are willing to 
pay for assurances that the solutions they're buying are "production 
ready", whether that's actually the case or not, presumably because said 
payment is (in practice) simply ensuring there's someone else to pin the 
blame on if things go bad.

And the demonstration of that would be the continued fact that people 
otherwise unnecessarily continue to pay rather large sums of money for 
that very assurance, when in practice, they'd get equal or better support 
not worrying about that payment, but instead actually making use of free-
of-cost resources such as this list.


[Linguistic analysis, see frequent discussion of this topic at Language 
Log, which I happen to subscribe to as I find this sort of thing 
interesting, for more commentary and examples of the same general issue: 
http://languagelog.net ]

The problem with the sentence as originally written, is that English 
doesn't deal well with multi-negation, sometimes considering each 
negation an inversion of the previous (as do most programming languages 
and thus programmers), while other times or as read/heard/interpreted by 
others repeated negation may be considered a strengthening of the 
original negation.

Regardless, mis-negation due to speaker/writer confusion is quite common 
even among native English speakers/writers.

The negating words in question here are "not" and "deny".  If you will 
note, my rewrite kept "deny", but rewrote the "not" out of the sentence, 
so there's only one negative to worry about, making the meaning much 
clearer as the reader's mind isn't left trying to figure out what the 
speaker meant with the double-negative (mistake? deliberate canceling out 
of the first negative with the second? deliberate intensifier?)  and thus 
unable to be sure one way or the other what was meant.

And just in case there would have been doubt, the explanation then makes 
doubly obvious what I think your intent was by expanding on it.  Of 
course that's easy to do as I entirely agree.

OTOH if I'm mistaken as to your intent and you meant it the other way... 
well then you'll need to do the explaining as then the implication is 
that some people have good reasons to deny it and you agree with them, 
but without further expansion, I wouldn't know where you're trying to go 
with that claim.


Just in case there's any doubt left of my own opinion on the original 
claim of not production ready in the above discussion, let me be 
explicit:  I (too) agree with Martin (and I think with Qu) that btrfs 
isn't yet production ready.  But I don't believe you'll find many on the 
list taking issue with that, as I think everybody on-list agrees, btrfs 
/isn't/ production ready.  Certainly pretty much just that has been 
repeatedly stated in individualized style by many posters including 
myself, and I've yet to see anyone take serious issue with it.

>> No matter whether SLES 12 uses it as default for root, no matter
>> whether Fujitsu and Facebook use it: I will not let this onto any
>> customer machine without lots and lots of underprovisioning and
>> rigorous free space monitoring.
>> Actually I will renew my recommendations in my trainings to be careful
>> with BTRFS.

... And were I to put money on it, my money would be on every regular on-
list poster 100% agreeing with that. =:^)

>>
>>  From my experience the monitoring would check for:
>>
>> merkaba:~> btrfs fi show /home
>>  Label: 'home'  uuid: […]
>>  Total devices 2 FS bytes used 156.31GiB
>>  devid1 size 170.00GiB used 164.13GiB path /dev/[path1]
>>  devid2 size 170.00GiB used 164.13GiB path /dev/[path2]
>>
>> If "used" is same as "size" then make big fat alarm. It

Re: Still not production ready

2015-12-13 Thread Qu Wenruo



Duncan wrote on 2015/12/14 06:21 +:

Qu Wenruo posted on Mon, 14 Dec 2015 10:08:16 +0800 as excerpted:


Martin Steigerwald wrote on 2015/12/13 23:35 +0100:

Hi!

For me it is still not production ready.


Yes, this is the *FACT* and not everyone has a good reason to deny it.


In the above sentence, I /think/ you (Qu) agree with Martin (and I) that
btrfs shouldn't be considered production ready... yet, and the first part
of the sentence makes it very clear that you feel strongly about the
*FACT*, but the second half of the sentence (after *FACT*) doesn't parse
well in English, thus leaving the entire sentence open to interpretation,
tho it's obvious either way that you feel strongly about it. =:^\


Oh, my poor English... :(

The latter half is just in case someone considers btrfs to be stable in some
respects.




At the risk of getting it completely wrong, what I /think/ you meant to
say is (as expanded in typically Duncan fashion =:^)...

Yes, this is the *FACT*, though some people have reasons to deny it.


Right! That's what I want to say!!



Presumably, said reasons would include the fact that various distros are
trying to sell enterprise support contracts to customers very eager to
have the features that btrfs provides, and said customers are willing to
pay for assurances that the solutions they're buying are "production
ready", whether that's actually the case or not, presumably because said
payment is (in practice) simply ensuring there's someone else to pin the
blame on if things go bad.

And the demonstration of that would be the continued fact that people
otherwise unnecessarily continue to pay rather large sums of money for
that very assurance, when in practice, they'd get equal or better support
not worrying about that payment, but instead actually making use of free-
of-cost resources such as this list.


[Linguistic analysis, see frequent discussion of this topic at Language
Log, which I happen to subscribe to as I find this sort of thing
interesting, for more commentary and examples of the same general issue:
http://languagelog.net ]

The problem with the sentence as originally written, is that English
doesn't deal well with multi-negation, sometimes considering each
negation an inversion of the previous (as do most programming languages
and thus programmers), while other times or as read/heard/interpreted by
others repeated negation may be considered a strengthening of the
original negation.

Regardless, mis-negation due to speaker/writer confusion is quite common
even among native English speakers/writers.

The negating words in question here are "not" and "deny".  If you will
note, my rewrite kept "deny", but rewrote the "not" out of the sentence,
so there's only one negative to worry about, making the meaning much
clearer as the reader's mind isn't left trying to figure out what the
speaker meant with the double-negative (mistake? deliberate canceling out
of the first negative with the second? deliberate intensifier?)  and thus
unable to be sure one way or the other what was meant.

And just in case there would have been doubt, the explanation then makes
doubly obvious what I think your intent was by expanding on it.  Of
course that's easy to do as I entirely agree.

OTOH if I'm mistaken as to your intent and you meant it the other way...
well then you'll need to do the explaining as then the implication is
that some people have good reasons to deny it and you agree with them,
but without further expansion, I wouldn't know where you're trying to go
with that claim.


Just in case there's any doubt left of my own opinion on the original
claim of not production ready in the above discussion, let me be
explicit:  I (too) agree with Martin (and I think with Qu) that btrfs
isn't yet production ready.  But I don't believe you'll find many on the
list taking issue with that, as I think everybody on-list agrees, btrfs
/isn't/ production ready.  Certainly pretty much just that has been
repeatedly stated in individualized style by many posters including
myself, and I've yet to see anyone take serious issue with it.


No matter whether SLES 12 uses it as default for root, no matter
whether Fujitsu and Facebook use it: I will not let this onto any
customer machine without lots and lots of underprovisioning and
rigorous free space monitoring.
Actually I will renew my recommendations in my trainings to be careful
with BTRFS.


... And were I to put money on it, my money would be on every regular on-
list poster 100% agreeing with that. =:^)



  From my experience the monitoring would check for:

merkaba:~> btrfs fi show /home
  Label: 'home'  uuid: […]
  Total devices 2 FS bytes used 156.31GiB
  devid1 size 170.00GiB used 164.13GiB path /dev/[path1]
  devid2 size 170.00GiB used 164.13GiB path /dev/[path2]

If "used" is same as "size" then make big fat alarm. It is not
sufficient for it to happen.