First: Happy New Year to you! Second: Take your time. I know it's the holidays for many. For me it means I easily have time to follow up on this.
On Wednesday, 16 December 2015, 09:20:45 CET, Qu Wenruo wrote:
> Chris Mason wrote on 2015/12/15 16:59 -0500:
> > On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> >> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> >>> Hi!
> >>>
> >>> For me it is still not production ready.
> >>
> >> Yes, this is the *FACT* and not everyone has a good reason to deny it.
> >>
> >>> Again I ran into:
> >>>
> >>> btrfs kworker thread uses up 100% of a Sandybridge core for minutes
> >>> on random write into big file
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=90401
> >>
> >> Not sure about the guidelines for other filesystems, but it will
> >> attract more devs' attention if it can be posted to the mailing list.
> >>
> >>> No matter whether SLES 12 uses it as default for root, no matter
> >>> whether Fujitsu and Facebook use it: I will not let this onto any
> >>> customer machine without lots and lots of underprovisioning and
> >>> rigorous free space monitoring. Actually I will renew my
> >>> recommendations in my trainings to be careful with BTRFS.
> >>>
> >>> From my experience the monitoring would check for:
> >>>
> >>> merkaba:~> btrfs fi show /home
> >>> Label: 'home'  uuid: […]
> >>>         Total devices 2 FS bytes used 156.31GiB
> >>>         devid 1 size 170.00GiB used 164.13GiB path /dev/mapper/msata-home
> >>>         devid 2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
> >>>
> >>> If "used" is the same as "size", raise a big fat alarm. That condition
> >>> alone is not sufficient for the issue to happen: the filesystem can run
> >>> just fine for quite some time. But I have never seen a kworker thread
> >>> use 100% of one core for an extended period, blocking everything else
> >>> on the fs, without this condition being met.
> >>
> >> And some special advice on device size from myself:
> >> Don't use devices over 100G but less than 500G.
> >> Over 100G leads btrfs to use big chunks, where data chunks can be at
> >> most 10G and metadata chunks 1G.
> >>
> >> I have seen a lot of users with about 100~200G devices hit unbalanced
> >> chunk allocation (a 10G data chunk easily takes the last available
> >> space and leaves later metadata nowhere to be stored).
> >
> > Maybe we should tune things so the size of the chunk is based on the
> > space remaining instead of the total space?
>
> Submitted such a patch before.
> David pointed out that such behavior will cause a lot of small fragmented
> chunks in the last several GB, which may make balance behavior less
> predictable than before.
>
> At least, we can just change the current 10% chunk size limit to 5% to
> make the problem harder to trigger. It's a simple and easy solution.
>
> Another cause of the problem is that we underestimated the chunk size
> change for filesystems at the borderline of big chunks.
>
> For 99G, the chunk size limit is 1G, and it needs 99 data chunks to fully
> cover the fs. But for 100G, it only needs 10 chunks to cover the fs, and
> the fs would need to be 990G to reach that chunk count again.
>
> The sudden drop in chunk count is the root cause.
>
> So we'd better reconsider both the big chunk threshold and the chunk size
> limit to find a balanced solution.

Did you come to any conclusion here?

Is there anything I can change with my home BTRFS filesystem to try to find
out what works? The challenge here is that the issue doesn't happen under
well-defined circumstances. So far I only know the necessary condition, but
not the sufficient condition for it to happen.
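To make the sudden drop Qu describes more concrete to myself, I sketched the
sizing rule in Python as I understand it from the discussion above (1G data
chunks below the ~100G borderline, at most 10% of the filesystem capped at
10G above it). This is only an illustration of the numbers quoted above, not
the kernel's actual allocator code:

# Illustration of the chunk sizing rule as described in this thread,
# not the actual btrfs implementation. All sizes in GiB.

def max_data_chunk_gib(fs_size_gib):
    # Below the "big chunk" borderline (~100G per the discussion) the data
    # chunk size stays at 1G; above it, chunks may grow to 10% of the
    # filesystem size, capped at 10G.
    if fs_size_gib < 100:
        return 1.0
    return min(10.0, fs_size_gib * 0.10)

for size in (99, 100, 200, 990):
    chunk = max_data_chunk_gib(size)
    print(f"{size:4d} GiB fs -> {chunk:4.1f} GiB data chunks "
          f"-> ~{size / chunk:.0f} chunks to cover the fs")

# Output:
#   99 GiB fs ->  1.0 GiB data chunks -> ~99 chunks to cover the fs
#  100 GiB fs -> 10.0 GiB data chunks -> ~10 chunks to cover the fs
#  200 GiB fs -> 10.0 GiB data chunks -> ~20 chunks to cover the fs
#  990 GiB fs -> 10.0 GiB data chunks -> ~99 chunks to cover the fs

The jump from 99 chunks to 10 chunks right at the borderline is the
discontinuity Qu refers to.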
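For reference, the check my monitoring does can be sketched roughly like this
in Python. It assumes GiB units exactly as in the btrfs fi show output quoted
above, and the 1 GiB slack threshold is an arbitrary choice of mine; treat it
as a rough sketch rather than a finished tool:

# Rough sketch of the "used == size" monitoring check described above.
# Assumes "btrfs filesystem show" reports the devid lines in GiB, as in
# the output quoted earlier; other unit suffixes are not handled.

import re
import subprocess
import sys

DEVID_RE = re.compile(
    r"devid\s+\d+\s+size\s+([\d.]+)GiB\s+used\s+([\d.]+)GiB\s+path\s+(\S+)")

def check(mountpoint, slack_gib=1.0):
    out = subprocess.run(["btrfs", "filesystem", "show", mountpoint],
                         capture_output=True, text=True, check=True).stdout
    alarms = []
    for size, used, path in DEVID_RE.findall(out):
        if float(size) - float(used) <= slack_gib:
            alarms.append(f"{path}: {used} of {size} GiB allocated to chunks")
    return alarms

if __name__ == "__main__":
    problems = check(sys.argv[1] if len(sys.argv) > 1 else "/home")
    for problem in problems:
        print("ALARM:", problem)
    sys.exit(1 if problems else 0)

As noted above, this only catches the necessary condition, not the sufficient
one, so it will also warn on filesystems that are still working fine.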
Another user ran into the issue and reported his findings in the bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=90401#c14

Thanks,
-- 
Martin