Re: btrfs check help
> On Nov 26, 2015, at 10:03 PM, Vincent Olivier <vinc...@up4.com> wrote:
>
>> On Nov 25, 2015, at 8:44 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>>
>> Vincent Olivier wrote on 2015/11/25 11:51 -0500:
>>> I should probably point out that there is 64GB of RAM on this machine and it's a dual Xeon processor (LGA2011-3) system. Also, only Btrfs is served via Samba, and the kernel panic was caused by Btrfs (as per what I remember from the log on the screen just before I rebooted) and happened in the middle of the night when zero (0) clients were connected.
>>>
>>> You will find below the full "btrfs check" log for each device, in the order it is listed by "btrfs fi show".
>>
>> There is really no need to do such a thing: btrfs manages multiple devices, so calling btrfsck on any one of them is enough as long as the filesystem is not hugely damaged.
>>
>>> Can I get a strong confirmation that I should run with the "--repair" option on each device? Thanks.
>>
>> YES.
>>
>> The inode nbytes fix is *VERY* safe as long as it's the only error.
>>
>> Although that may not be all that convincing, since the inode nbytes fix code was written by me, and authors always tend to believe their code is good. But at least some other users with more complicated problems (involving inode nbytes errors) have fixed them with it.
>>
>> The final decision is still yours anyway.
>
> I will do it on the first device from the "fi show" output and report.

OK, this doesn't look good. I ran --repair and then check again, and it looks even worse. Please help.

[root@3dcpc5 ~]# btrfs check --repair /dev/sdk
enabling repair mode
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
reset nbytes for ino 1341670 root 5
reset nbytes for ino 1341670 root 11406
warning line 3653
checking csums
checking root refs
found 19343374874998 bytes used err is 0
total csum bytes: 18863243900
total tree bytes: 27413118976
total fs tree bytes: 4455694336
total extent tree bytes: 3077373952
btree space waste bytes: 2882193883
file data blocks allocated: 19461564538880
 referenced 20155355832320

[root@3dcpc5 ~]# btrfs check /dev/sdk
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
checking free space cache
block group 53328591454208 has wrong amount of free space
failed to load free space cache for block group 53328591454208
block group 53329665196032 has wrong amount of free space
failed to load free space cache for block group 53329665196032
Wanted offset 58836887044096, found 58836887011328
Wanted offset 58836887044096, found 58836887011328
cache appears valid but isnt 58836887011328
Wanted offset 60505481887744, found 60505481805824
Wanted offset 60505481887744, found 60505481805824
cache appears valid but isnt 60505481805824
Wanted bytes 16384, found 81920 for off 60979001966592
Wanted bytes 1073725440, found 81920 for off 60979001966592
cache appears valid but isnt 60979001950208
Wanted offset 61297908056064, found 61297908006912
Wanted offset 61297908056064, found 61297908006912
cache appears valid but isnt 61297903271936
Wanted bytes 32768, found 16384 for off 61711301296128
Wanted bytes 1066319872, found 16384 for off 61711301296128
cache appears valid but isnt 61711293874176
There is no free space entry for 62691824041984-62691824058368
There is no free space entry for 62691824041984-62692693901312
cache appears valid but isnt 62691620159488
There is no free space entry for 63723505205248-63723505221632
There is no free space entry for 63723505205248-63724559794176
cache appears valid but isnt 63723486052352
Wanted bytes 32768, found 16384 for off 64746920902656
Wanted bytes 914849792, found 16384 for off 64746920902656
cache appears valid but isnt 64746762010624
There is no free space entry for 65770368401408-65770368434176
There is no free space entry for 65770368401408-6577710720
cache appears valid but isnt 65770037968896
Wanted offset 66758954270720, found 66758954221568
Wanted offset 66758954270720, found 66758954221568
cache appears valid but isnt 66758954188800
block group 70204591702016 has wrong amount of free space
failed to load free space cache for block group 70204591702016
block group 70205665443840 has wrong amount of free space
failed to load free space cache for block group 70205665443840
block group 70206739185664 has wrong amount of free space
failed to load free space cache for block group 70206739185664
Wanted offset 70216543715328, found 70216543698944
Wanted offset 70216543715328, found 70216543698944
cache appears valid b
Re: btrfs check help
> On Nov 25, 2015, at 8:44 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>
> Vincent Olivier wrote on 2015/11/25 11:51 -0500:
>> I should probably point out that there is 64GB of RAM on this machine and it's a dual Xeon processor (LGA2011-3) system. Also, only Btrfs is served via Samba, and the kernel panic was caused by Btrfs (as per what I remember from the log on the screen just before I rebooted) and happened in the middle of the night when zero (0) clients were connected.
>>
>> You will find below the full "btrfs check" log for each device, in the order it is listed by "btrfs fi show".
>
> There is really no need to do such a thing: btrfs manages multiple devices, so calling btrfsck on any one of them is enough as long as the filesystem is not hugely damaged.
>
>> Can I get a strong confirmation that I should run with the "--repair" option on each device? Thanks.
>
> YES.
>
> The inode nbytes fix is *VERY* safe as long as it's the only error.
>
> Although that may not be all that convincing, since the inode nbytes fix code was written by me, and authors always tend to believe their code is good. But at least some other users with more complicated problems (involving inode nbytes errors) have fixed them with it.
>
> The final decision is still yours anyway.

I will do it on the first device from the "fi show" output and report.

Thanks,

Vincent
Re: btrfs check help
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdn
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdl
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdc
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdr
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdf
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sde
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [.]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdd
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

Checking filesystem on /dev/sdb
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
 referenced 20138885959680

> On Nov 24, 2015, at 3:32 PM, Hugo Mills <h...@carfax.org.uk> wrote:
>
> On Tue, Nov 24, 2015 at 03:28:28PM -0500, Austin S Hemmelgarn wrote:
>> On 2015-11-24 12:06, Vincent Olivier wrote:
>>> Hi,
>>>
>>> Woke up this morning with a kernel panic (for which I do not have details). Please find below the output for btrfs check. Is this normal? What should I do? Arch Linux 4.2.5. Btrfs-utils 4.3.1. 17x4TB RAID10.
>> You get bonus points for being on a reasonably up-to-date kernel and userspace :)
>>
>> This is actually a pretty tame check result for a filesystem that's been through kernel panic. I think everything listed here is safe for check to fix, but I would suggest waiting until the devs provide opinions before actually running with --repair. I would also suggest comparing resul
btrfs check help
Hi,

Woke up this morning with a kernel panic (for which I do not have details). Please find below the output for btrfs check. Is this normal? What should I do? Arch Linux 4.2.5. Btrfs-utils 4.3.1. 17x4TB RAID10.

Regards,

Vincent

[root@3dcpc5 ~]# btrfs check /dev/sdk
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
checking free space cache
checking fs roots
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328809638262 bytes used err is 1
total csum bytes: 18849042724
total tree bytes: 27389886464
total fs tree bytes: 4449746944
total extent tree bytes: 3075457024
btree space waste bytes: 2880474254
file data blocks allocated: 19430708535296
 referenced 20123773407232
Re: FYIO: A rant about btrfs
Hi,

> On Sep 16, 2015, at 11:20 AM, Austin S Hemmelgarn wrote:
>
> On 2015-09-16 10:43, M G Berberich wrote:
>> Hello,
>>
>> just for information. I stumbled upon a rant about btrfs performance:
>>
>> http://blog.pgaddict.com/posts/friends-dont-let-friends-use-btrfs-for-oltp

I read it too.

> It is worth noting a few things that were done incorrectly in this testing:
> 1. _NEVER_ turn off write barriers (the nobarrier mount option); doing so subtly breaks the data integrity guarantees of _ALL_ filesystems, but especially so on COW filesystems like BTRFS. With this off, you have a much higher chance that a power loss will cause data loss. It shouldn't be turned off unless you are also turning off write-caching in the hardware, or know for certain that no write-reordering is done by the hardware (and almost all modern hardware does write-reordering for performance reasons).

But can the "nobarrier" mount option affect performance negatively for Btrfs (and not only data integrity)?

> 2. He provides no comparison with any other filesystem with TRIM support turned on (it is very likely that all filesystems would demonstrate such performance drops; based on that graph, it looks like the device doesn't support asynchronous trim commands).

I think he means, by the text surrounding the only graph that mentions TRIM, that this exact same test on the other filesystems he benchmarked yielded much better results.

> 3. He's testing it for a workload that is a known and documented problem for BTRFS, and claiming that this means it isn't worth considering as a general-usage filesystem. Most people don't run RDBMS servers on their systems, and as such, such a workload is not worth considering for most people.

Apparently RDBMS being a problem on Btrfs is neither known nor documented enough (he's right about the contrast with claiming publicly that Btrfs is indeed production ready).

My view on this is that having one filesystem to rule them all (all storage technologies, all use cases) is unrealistic. Also, the time when you could put your NAS on an old i386 with 3MB of RAM is over. Compression, checksumming, COW, snapshotting, quotas, etc. are all computationally intensive features. In 2015, block storage has become computationally intensive.

How about saying non-root Btrfs RAID10 is the best choice for a Samba NAS on rotational HDDs with no SMR (my use case)? For root and RDBMS, I use ext4 on an M.2 SSD, with a sane initramfs and the most recent stable kernel. I am happy with the performance and delighted with the features Btrfs provides.

I think it is much more productive to document and compare the most successful Btrfs deployments rather than trying to find bugs and bottlenecks for use cases that are the development focus of other filesystems. I don't know, I might not make a lot of sense here, but on top of refactoring the Gotchas, I would be happy to start a successful-deployment-story section on the wiki and maybe improve my usage of Btrfs along the way (who else here is using Btrfs in a similar fashion?).

> His points about the degree of performance jitter are valid however, as are the complaints of apparent CPU-intensive stalls in the BTRFS code, and I occasionally see both on my own systems.

Me too.

My two cents is that focusing on improving performance for Btrfs-optimal use cases is much more interesting than bringing in new features like automatically turning COW off for RDBMS usage or debugging TRIM support.
Vincent
Re: FYIO: A rant about btrfs
> On Sep 16, 2015, at 2:22 PM, Austin S Hemmelgarn <ahferro...@gmail.com> wrote:
>
> On 2015-09-16 12:51, Vincent Olivier wrote:
>> Hi,
>>
>>> On Sep 16, 2015, at 11:20 AM, Austin S Hemmelgarn <ahferro...@gmail.com> wrote:
>>>
>>> On 2015-09-16 10:43, M G Berberich wrote:
>>>> Hello,
>>>>
>>>> just for information. I stumbled upon a rant about btrfs performance:
>>>>
>>>> http://blog.pgaddict.com/posts/friends-dont-let-friends-use-btrfs-for-oltp
>> I read it too.
>>> It is worth noting a few things that were done incorrectly in this testing:
>>> 1. _NEVER_ turn off write barriers (the nobarrier mount option); doing so subtly breaks the data integrity guarantees of _ALL_ filesystems, but especially so on COW filesystems like BTRFS. With this off, you have a much higher chance that a power loss will cause data loss. It shouldn't be turned off unless you are also turning off write-caching in the hardware, or know for certain that no write-reordering is done by the hardware (and almost all modern hardware does write-reordering for performance reasons).
>> But can the "nobarrier" mount option affect performance negatively for Btrfs (and not only data integrity)?
> Using it improves performance for every filesystem on Linux that supports it. This does not mean that it is _EVER_ a good idea to do so. This mount option is one of the few things on my list of things that I will _NEVER_ personally provide support to people for, because it almost guarantees that you will lose data if the system dies unexpectedly (even if it's for a reason other than power loss).

OK, fine. Let it be made clearer then (on the Btrfs wiki): nobarrier is an absolute no-go. Case closed.

>>> 2. He provides no comparison with any other filesystem with TRIM support turned on (it is very likely that all filesystems would demonstrate such performance drops; based on that graph, it looks like the device doesn't support asynchronous trim commands).
>> I think he means, by the text surrounding the only graph that mentions TRIM, that this exact same test on the other filesystems he benchmarked yielded much better results.
> Possibly, but there are also known issues with TRIM/DISCARD on BTRFS in 4.0, and his claim is still baseless unless he actually provides a reference for it.

Same as above: TRIM/DISCARD officially not recommended in production until further notice?

>>> 3. He's testing it for a workload that is a known and documented problem for BTRFS, and claiming that this means it isn't worth considering as a general-usage filesystem. Most people don't run RDBMS servers on their systems, and as such, such a workload is not worth considering for most people.
>> Apparently RDBMS being a problem on Btrfs is neither known nor documented enough (he's right about the contrast with claiming publicly that Btrfs is indeed production ready).
> OK, maybe not documented, but RDBMS falls under 'Large files with highly random access patterns and heavy RMW usage', which is a known issue for BTRFS, and also applies to VM images.

This guy is no idiot. If it wasn't clear enough for him, it's not clear enough, period.

>>> His points about the degree of performance jitter are valid however, as are the complaints of apparent CPU-intensive stalls in the BTRFS code, and I occasionally see both on my own systems.
>> Me too. My two cents is that focusing on improving performance for Btrfs-optimal use cases is much more interesting than bringing in new features like automatically turning COW off for RDBMS usage or debugging TRIM support.
> It depends. BTRFS is still not feature-complete with respect to the overall intent when it was started (raid56 and qgroups being the two big issues at the moment), and attempting to optimize things tends to introduce bugs, which we have quite enough of already without people adding more (and they still seem to be breeding like rabbits).

I would just like a clear statement from a dev lead saying: until we are feature-complete (with a finite list of features to complete), the focus will be on feature completion and not on optimizing already-implemented features. Ideally with an ETA on when optimization will become more of a priority than it is today.

> That said, my systems (which are usually doing mostly CPU- or memory-bound tasks, and not I/O bound like the aforementioned benchmarks were testing) run no slower than they did with ext4 as t
Re: errors while "btrfs receive"
It's ~900GiB. Sorry. I'm on 4.1.6 Arch Linux.

-Original Message-
From: "Duncan" <1i5t5.dun...@cox.net>
Sent: Tuesday, September 1, 2015 03:30
To: linux-btrfs@vger.kernel.org
Subject: Re: errors while "btrfs receive"

Vincent Olivier posted on Mon, 31 Aug 2015 14:34:02 -0400 as excerpted:

> i'm doing a ~900TiB receive on a 6x4TB RAID0
>
> "fi show", "device scan" all fail and report "unable to connect to /dev/sdX"
>
> is it normal?

Can't answer your direct question as my use-case doesn't include send/receive, but...

~900TiB receive on a 24TB (6x4TB) raid0? Did you mean ~900GiB, or did you miss the decimal point, or... I mean, yeah, btrfs does have the compress= option, but I don't think it's going to compress 900 TiB to fit in under 24! That's likely to be some seriously lossy compression!

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
errors while "btrfs receive"
hi,

i'm doing a ~900TiB receive on a 6x4TB RAID0

"fi show", "device scan" all fail and report "unable to connect to /dev/sdX"

is it normal?

thanks,

Vincent
Re: Response to Bcachefs Claims
I'm still parsing through the multi-device advice. I will be back on this when I'm done. And I'll probably switch distro to Arch Linux, which seems the way to go if one is using cutting-edge kernel features like Btrfs.

As for the work on Gotchas/Known Issues on the Btrfs wiki, I also think that the best way is to start with the Gotchas page and put a more prominent link to it on the home page. I would restructure the Gotchas page in the following ways:

* Add the mention "Known Issues" close to the title;
* Only keep current issues/gotchas (current stable kernel, current userspace utilities release), and archive all others on a separate page;
* Group by the smallest feature encompassing the issue: multi-device, quotas, subvolumes, compression, conversion from extX, interactions with other things like LVM, MD, encryption, etc.;
* Anything that is new and not as thoroughly tested as other features should be listed there as well (I think), until there is a consensus (to be defined) on it being reliable enough to be taken out of the list. Or maybe that should go on another page?
* Provide links to HOWTOs or best practices for the features discussed (multi-device gotchas should link to a multi-device HOWTO).

I will be thinking about it more before doing anything, and I still welcome ideas.

Thanks!

Vincent
Response to Bcachefs Claims
Hi,

I have been using Btrfs for almost a year now, with a 16x4TB RAID10 and its 8x4TB RAID0 backup (using incremental snapshot diffs). I have always tried to stay at the latest stable kernel (currently 4.1.6), but I might be moving to Fedora 22 because CentOS 7 has significant incompatibilities with the 4.1.x kernel series.

I have seen the news about Bcachefs aiming to be Btrfs-complete while being extX-stable. What are the chances Bcachefs beats Btrfs at being the Linux kernel's next official filesystem? I chose Btrfs over ZFS because it seemed like the only next-gen heir to ext4/xfs.

I have been having a few problems with Btrfs myself. Only one remains unresolved: I haven't found the best way to mount Btrfs at boot time. LABEL= won't work, for known reasons (I don't understand, however, why mount can't do its own device scan transparently). UUID= won't work, for unknown reasons (I haven't got a reply on this; maybe it's the same as LABEL=). And I will use /dev/* in fstab for stability reasons. Right now I'm mounting the filesystem manually after a device scan, picking up the first device that shows up in the "fi show" run. I can live with that, but I suppose that things like this contribute to the feeling that Btrfs is actually still experimental, contrary to claims that it is production-ready.

For my own sake and others', I would like to maintain (if nobody is already working on that or needs any help) a centralized, human-readable digest of known issues that would be featured prominently on top of the Btrfs wiki. I would merge the Gotchas page and the various known-issues pages (including the various multi-device mount gotchas here and there).

Answers? Comments? Help?

Thanks,

Vincent
Re: mount btrfs takes 30 minutes, btrfs check runs out of memory
It appears that it might be related to label/uuid fstab boot mounting instead. When I mount manually, without the "noauto,x-systemd.automount" options, using the first device I get from "btrfs fi show" after a "btrfs device scan", I never get the problem.

Does this sound familiar? I thought I was safe with a UUID mount in fstab… I can (temporarily) live with manually mounting this filesystem, but I would appreciate being able to mount it at boot time via fstab…

thanks

vincent

-Original Message-
From: Vincent Olivier vinc...@up4.com
Sent: Thursday, August 13, 2015 22:42
To: Duncan 1i5t5.dun...@cox.net
Cc: linux-btrfs@vger.kernel.org
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory

I'll try without autodefrag anyways tomorrow just to make sure. And then file a bug report too with however it decides to behave.

Vincent

-Original Message-
From: Duncan 1i5t5.dun...@cox.net
Sent: Thursday, August 13, 2015 20:30
To: linux-btrfs@vger.kernel.org
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory

Chris Murphy posted on Thu, 13 Aug 2015 17:19:41 -0600 as excerpted:

> Well I think others have suggested 3000 snapshots and quite a few things will get very slow. But then also you have autodefrag, and I forget the interaction of this with many snapshots since the snapshot-aware defrag code was removed.

Autodefrag shouldn't have any snapshot mount-time-related interaction, with snapshot-aware defrag disabled. The interaction between defrag (auto or not) and snapshots will be additional data space usage, since with snapshot-aware disabled, defrag only works with the current copy, thus forcing it to COW the extents elsewhere while not freeing the old extents as they're still referenced by the snapshots, but it shouldn't affect mount time.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
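[Editor's note: a minimal sketch of the manual workaround described above, for readers following along; the device name and mount point are placeholders, not values from this thread.]

# make the kernel aware of all members of the multi-device filesystem
btrfs device scan
# note the first device listed for the filesystem in question
btrfs filesystem show
# then mount using that device (options as used elsewhere in this thread)
mount -o noatime,compress=zlib /dev/sdb /mnt/backup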
Re: mount btrfs takes 30 minutes, btrfs check runs out of memory
Hi,

I think I might be having this problem too. 12 x 4TB RAID10 (original mkfs, not converted from ext or whatnot). Says it has ~6TiB left. CentOS 7. Dual Xeon CPU. 32GB RAM. ELRepo kernel 4.1.5. Fstab options:

noatime,autodefrag,compress=zlib,space_cache,nossd,noauto,x-systemd.automount

Sometimes (not all the time) when I cd or ls the mount point, it will not return within 5 minutes (I never let it run more than 5 minutes before rebooting); then I reboot and it takes between 10-30s. Well, as I'm writing this, it's already been more than 10 minutes. I don't have the problem when I mount manually without the noauto,x-systemd.automount options.

Can anyone help? Thanks.

Vincent

-Original Message-
From: Austin S Hemmelgarn ahferro...@gmail.com
Sent: Wednesday, August 5, 2015 07:30
To: John Ettedgui john.etted...@gmail.com
Cc: Qu Wenruo quwen...@cn.fujitsu.com, btrfs linux-btrfs@vger.kernel.org
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory

On 2015-08-04 13:36, John Ettedgui wrote:
> On Tue, Aug 4, 2015 at 4:28 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote:
>> On 2015-08-04 00:58, John Ettedgui wrote:
>>> On Mon, Aug 3, 2015 at 8:01 PM, Qu Wenruo quwen...@cn.fujitsu.com wrote:
>>>> Although the best practice is staying away from such a converted fs, either using a pure, newly created btrfs, or converting back to ext* before any balance.
>>> Unfortunately I don't have enough hard drive space to do a clean btrfs, so my only way to use btrfs for that partition was a conversion.
>> If you could get your hands on a decent-sized flash drive (32G or more), you could do an incremental conversion offline. The steps would look something like this:
>> 1. Boot the system into a LiveCD or something similar that doesn't need to run from your regular root partition (SystemRescueCD would be my personal recommendation, although if you go that way, make sure to boot the alternative kernel, as it's a lot newer than the standard ones).
>> 2. Plug in the flash drive, format it as BTRFS.
>> 3. Mount both your old partition and the flash drive somewhere.
>> 4. Start copying files from the old partition to the flash drive.
>> 5. When you hit ENOSPC on the flash drive, unmount the old partition, shrink it down to the minimum size possible, and create a new partition in the free space produced by doing so.
>> 6. Add the new partition to the BTRFS filesystem on the flash drive.
>> 7. Repeat steps 4-6 until you have copied everything.
>> 8. Wipe the old partition, and add it to the BTRFS filesystem.
>> 9. Run a full balance on the new BTRFS filesystem.
>> 10. Delete the partition from step 5 that is closest to the old partition (via btrfs device delete), then resize the old partition to fill the space that the deleted partition took up.
>> 11. Repeat steps 9-10 until the only remaining partitions in the new BTRFS filesystem are the old one and the flash drive.
>> 12. Delete the flash drive from the BTRFS filesystem.
>> This takes some time and coordination, but it does work reliably as long as you are careful (I've done it before on multiple systems).
> I suppose I could do that even without the flash drive, as I have some free space anyway, but moving TBs of data with GBs of free space will take days, plus the repartitioning. It'd probably be easier to start with a 1TB drive or something.
> Is this currently my best bet, as conversion is not as good as I thought? I believe my other 2 partitions also come from conversion, though I may have rebuilt them later from scratch.
> Thank you!
> John

Yeah, you're probably better off getting a TB disk and starting with that. In theory it is possible to automate the process, but I would advise against that if at all possible; it's a lot easier to recover from an error if you're doing it manually.
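[Editor's note: for readers who want the 12-step procedure above in a more concrete shape, here is a very rough, untested sketch of the incremental-migration idea; all device names, filesystem types, and mount points are placeholders, and the partition shrinking/creation steps are deliberately left manual, as Austin advises.]

# 1-3: from a live environment, format the spare drive as btrfs and mount both
mkfs.btrfs /dev/sdX1
mount /dev/sdX1 /mnt/new
mount -o ro /dev/sdY1 /mnt/old           # old ext* partition, mounted read-only

# 4: copy until the new filesystem runs out of space
rsync -aHAX /mnt/old/ /mnt/new/ || true   # rsync stops with ENOSPC errors

# 5-6: (manual) shrink the old fs, carve a new partition /dev/sdY2 out of the
# freed space, then grow the btrfs filesystem onto it
btrfs device add /dev/sdY2 /mnt/new

# 7-9: repeat copy/shrink/add until everything is moved, then rebalance
btrfs balance start /mnt/new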
Re: mount btrfs takes 30 minutes, btrfs check runs out of memory
I'll try without autodefrag anyways tomorrow just to make sure. And then file a bug report too with however it decides to behave.

Vincent

-Original Message-
From: Duncan 1i5t5.dun...@cox.net
Sent: Thursday, August 13, 2015 20:30
To: linux-btrfs@vger.kernel.org
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory

Chris Murphy posted on Thu, 13 Aug 2015 17:19:41 -0600 as excerpted:

> Well I think others have suggested 3000 snapshots and quite a few things will get very slow. But then also you have autodefrag, and I forget the interaction of this with many snapshots since the snapshot-aware defrag code was removed.

Autodefrag shouldn't have any snapshot mount-time-related interaction, with snapshot-aware defrag disabled. The interaction between defrag (auto or not) and snapshots will be additional data space usage, since with snapshot-aware disabled, defrag only works with the current copy, thus forcing it to COW the extents elsewhere while not freeing the old extents as they're still referenced by the snapshots, but it shouldn't affect mount time.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
Re: mount btrfs takes 30 minutes, btrfs check runs out of memory
I have 2 snapshots a few days apart for incrementally backing up the volume, but that's it. I'll try without autodefrag tomorrow.

Vincent

-Original Message-
From: Chris Murphy li...@colorremedies.com
Sent: Thursday, August 13, 2015 19:19
To: Btrfs BTRFS linux-btrfs@vger.kernel.org
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory

On Thu, Aug 13, 2015 at 4:38 PM, Vincent Olivier vinc...@up4.com wrote:

> Hi, I think I might be having this problem too. 12 x 4TB RAID10 (original mkfs, not converted from ext or whatnot). Says it has ~6TiB left. CentOS 7. Dual Xeon CPU. 32GB RAM. ELRepo kernel 4.1.5. Fstab options:
> noatime,autodefrag,compress=zlib,space_cache,nossd,noauto,x-systemd.automount

Well I think others have suggested 3000 snapshots and quite a few things will get very slow. But then also you have autodefrag, and I forget the interaction of this with many snapshots since the snapshot-aware defrag code was removed.

I'd say file a bug with the full details of the hardware, from the ground up to the Btrfs filesystem. And include as an attachment dmesg with sysrq+t during this hang. Usually I see "t" asked for if there's just slowness/delays, and "w" if there's already a kernel message saying there's a blocked task for 120 seconds.

-- 
Chris Murphy
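[Editor's note: a minimal way to capture the task dump Chris asks for might look like the following; the output filename is a placeholder, and this assumes the magic sysrq interface is enabled on the system.]

echo 1 > /proc/sys/kernel/sysrq      # ensure sysrq is enabled (assumption)
echo t > /proc/sysrq-trigger         # dump all task states into the kernel log
# or "echo w" for blocked (uninterruptible) tasks only
dmesg > dmesg-sysrq-t.txt            # attach this file to the bug report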
systemd: Timed out waiting for device dev-disk-by…
Hi,

(Sorry if this gets sent twice: one of my mail relays is misbehaving today.)

50% of the time when booting, the system goes into safe mode because my 12x4TB RAID10 btrfs is taking too long to mount from fstab. When I comment it out of fstab and mount it manually, it's all good. I don't like that. Is there a way to increase the timer or something?

Thanks,

Vincent
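[Editor's note: one option worth trying, sketched purely as an illustration; the UUID, mount point, and timeout value are placeholders. systemd honours a per-entry device timeout in fstab, which lengthens how long it waits for the device before giving up.]

# /etc/fstab -- hedged example only, not a tested recommendation from the thread
UUID=<fs-uuid>  /mnt/raid10  btrfs  noatime,compress=zlib,x-systemd.device-timeout=300s  0  0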
Re: Send/Receive Use Case
Actually, I have another question: is it possible for a RAID0 fs to "receive" from a "sending" RAID10? Or do they need to be of the same replication scheme too?

On Jun 27, 2015, at 10:34 AM, Vincent Olivier vinc...@up4.com wrote:

> ok i'll go home and rethink my life then ;)
>
> On Jun 27, 2015, at 10:21 AM, Hugo Mills h...@carfax.org.uk wrote:
>
>> On Sat, Jun 27, 2015 at 10:04:28AM -0400, Vincent Olivier wrote:
>>> Hi,
>>>
>>> There are 4 things I'm not sure about re: send/receive.
>>>
>>> 1) Is it possible to first copy things onto a filesystem using rsync and then use send/receive? And to subsequently mix rsync and send/receive, provided that snapshots are made accordingly?
>>
>> Probably. It depends on exactly how you want to use them.
>>
>>> 2) Is it possible to "send" a snapshot diff to disk and then "receive" it from the said disk into a remote filesystem? I have two very large and physically distant btrfs filesystems. It would be more economical to just dump snapshot diffs to disk for transport instead of over the network.
>>
>> Yes, that's perfectly possible.
>>
>>> 3) How are "conflicts" handled by send/receive, if at all?
>>
>> There are no conflicts possible, due to the requirement that all the subvolumes involved in the send/receive process be read-only. (Actually, that's not quite true -- you can make a subvolume read/write, and then read-only again. In that case, the receive will probably fail, leaving the received subvolume in a partially-created state.)
>>
>>> 4) If a file is created, modified and then deleted in between two snapshots, is it ignored by send/receive, or does send/receive "re-enact" the journal exactly?
>>
>> It'll be ignored. The FS doesn't keep track of how it reached a particular state -- only what that state is.
>>
>> Hugo.
>>
>> -- 
>> Hugo Mills | IMPROVE YOUR ORGANISMS!! hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 | Subject line of spam email
Re: Send/Receive Use Case
This is GOOD news. Thanks!

On Jul 10, 2015, at 3:11 PM, Hugo Mills h...@carfax.org.uk wrote:

> On Fri, Jul 10, 2015 at 03:03:27PM -0400, Vincent Olivier wrote:
>> Actually, I have another question: is it possible for a RAID0 fs to "receive" from a "sending" RAID10? Or do they need to be of the same replication scheme too?
>
> Yes, definitely. I do my backups from RAID-1 to single. The send stream format is based on files, not on the underlying raw storage.
>
> Hugo.
>
>> [earlier send/receive thread, quoted in full in the previous message, trimmed]
>
> -- 
> Hugo Mills | UNIX: Japanese brand of food containers hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 |
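[Editor's note: following up on question 2 in this thread (sending a snapshot diff to a disk for physical transport), a minimal sketch of how that can be done; the snapshot names, stream file name, and mount points are placeholders, not taken from the thread.]

# sending side: write an incremental stream to a file on a transport disk
btrfs send -p /pool/snap-2015-07-01 -f /media/transport/diff.stream /pool/snap-2015-07-10

# receiving side, after physically moving the disk:
btrfs receive -f /media/transport/diff.stream /backup/pool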
Send/Receive Use Case
Hi,

There are 4 things I'm not sure about re: send/receive.

1) Is it possible to first copy things onto a filesystem using rsync and then use send/receive? And to subsequently mix rsync and send/receive, provided that snapshots are made accordingly?

2) Is it possible to "send" a snapshot diff to disk and then "receive" it from the said disk into a remote filesystem? I have two very large and physically distant btrfs filesystems. It would be more economical to just dump snapshot diffs to disk for transport instead of over the network.

3) How are "conflicts" handled by send/receive, if at all?

4) If a file is created, modified and then deleted in between two snapshots, is it ignored by send/receive, or does send/receive "re-enact" the journal exactly?

Thanks,

Vincent
Re: Send/Receive Use Case
ok i'll go home and rethink my life then ;)

On Jun 27, 2015, at 10:21 AM, Hugo Mills h...@carfax.org.uk wrote:

> On Sat, Jun 27, 2015 at 10:04:28AM -0400, Vincent Olivier wrote:
>> Hi,
>>
>> There are 4 things I'm not sure about re: send/receive.
>>
>> 1) Is it possible to first copy things onto a filesystem using rsync and then use send/receive? And to subsequently mix rsync and send/receive, provided that snapshots are made accordingly?
>
> Probably. It depends on exactly how you want to use them.
>
>> 2) Is it possible to "send" a snapshot diff to disk and then "receive" it from the said disk into a remote filesystem? I have two very large and physically distant btrfs filesystems. It would be more economical to just dump snapshot diffs to disk for transport instead of over the network.
>
> Yes, that's perfectly possible.
>
>> 3) How are "conflicts" handled by send/receive, if at all?
>
> There are no conflicts possible, due to the requirement that all the subvolumes involved in the send/receive process be read-only. (Actually, that's not quite true -- you can make a subvolume read/write, and then read-only again. In that case, the receive will probably fail, leaving the received subvolume in a partially-created state.)
>
>> 4) If a file is created, modified and then deleted in between two snapshots, is it ignored by send/receive, or does send/receive "re-enact" the journal exactly?
>
> It'll be ignored. The FS doesn't keep track of how it reached a particular state -- only what that state is.
>
> Hugo.
>
> -- 
> Hugo Mills | IMPROVE YOUR ORGANISMS!! hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4 | Subject line of spam email
Re: RAID10 Balancing Request for Comments and Advices
On Jun 17, 2015, at 9:27 AM, Hugo Mills h...@carfax.org.uk wrote:

> On Wed, Jun 17, 2015 at 09:13:08AM -0400, Vincent Olivier wrote:
>> On Jun 16, 2015, at 8:14 PM, Chris Murphy li...@colorremedies.com wrote:
>>> On Tue, Jun 16, 2015 at 5:58 PM, Duncan 1i5t5.dun...@cox.net wrote:
>>>> On a current kernel, unlike older ones, btrfs actually automates entirely-empty chunk reclaim, so this problem doesn't occur anything close to as often as it used to. However, it's still possible to have mostly (but not entirely) empty chunks that btrfs won't automatically reclaim. A balance can be used to rewrite and combine these mostly empty chunks, reclaiming the space saved. This is what Hugo was recommending.
>>> Yes, as little as a -dusage=5 (data chunks that are 5% or less full) can clear the problem and is very fast, seconds. Possibly a bit longer, many seconds or single-digit minutes, is -dusage=15. I haven't done a full balance in forever.
>> Yes, on this 80% full 6x4TB RAID10, -dusage=15 took 2 seconds and "relocated 0 out of 3026 chunks". Out of curiosity, I had to use -dusage=90 to have it relocate only 1 chunk, and it took less than 30 seconds. So I put a -dusage=25 in the weekly cron just before the scrub.
> In most cases, all you need to do is clean up one data chunk to give the metadata enough space to work in. Instead of manually iterating through several values of usage= until you get a useful response, you can use limit=n to stop after n successful block group relocations.

Nice! Will do that instead! Thanks.
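[Editor's note: as a concrete illustration of Hugo's limit= suggestion above; the mount point is a placeholder. This relocates at most three block groups and then stops, instead of iterating through usage= values.]

btrfs balance start -dlimit=3 /mountpoint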
Re: RAID10 Balancing Request for Comments and Advices
On Jun 16, 2015, at 8:14 PM, Chris Murphy li...@colorremedies.com wrote:

> On Tue, Jun 16, 2015 at 5:58 PM, Duncan 1i5t5.dun...@cox.net wrote:
>> On a current kernel, unlike older ones, btrfs actually automates entirely-empty chunk reclaim, so this problem doesn't occur anything close to as often as it used to. However, it's still possible to have mostly (but not entirely) empty chunks that btrfs won't automatically reclaim. A balance can be used to rewrite and combine these mostly empty chunks, reclaiming the space saved. This is what Hugo was recommending.
>
> Yes, as little as a -dusage=5 (data chunks that are 5% or less full) can clear the problem and is very fast, seconds. Possibly a bit longer, many seconds or single-digit minutes, is -dusage=15. I haven't done a full balance in forever.

Yes, on this 80% full 6x4TB RAID10, -dusage=15 took 2 seconds and "relocated 0 out of 3026 chunks". Out of curiosity, I had to use -dusage=90 to have it relocate only 1 chunk, and it took less than 30 seconds. So I put a -dusage=25 in the weekly cron just before the scrub.

FYI. Thanks for your help.
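[Editor's note: the weekly "balance just before the scrub" arrangement described above might look roughly like this in a root crontab; the mount point and times are placeholders, not values given in the thread.]

# weekly light balance, then a foreground scrub of the same filesystem
0 3 * * 0  btrfs balance start -dusage=25 /mnt/raid10
0 4 * * 0  btrfs scrub start -B /mnt/raid10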
Re: RAID10 Balancing Request for Comments and Advices
On Jun 16, 2015, at 7:58 PM, Duncan 1i5t5.dun...@cox.net wrote:

> Vincent Olivier posted on Tue, 16 Jun 2015 09:34:29 -0400 as excerpted:
>
>> On Jun 16, 2015, at 8:25 AM, Hugo Mills h...@carfax.org.uk wrote:
>>
>>> On Tue, Jun 16, 2015 at 08:09:17AM -0400, Vincent Olivier wrote:
>>>> My first question is this: is it normal to have "single" blocks? Why not only RAID10? I don't remember the exact mkfs options I used, but I certainly didn't ask for "single", so this is unexpected.
>>>
>>> Yes. It's an artefact of the way that mkfs works. If you run a balance on those chunks, they'll go away. (btrfs balance start -dusage=0 -musage=0 /mountpoint)
>>
>> Thanks! I did, and it did go away, except for the "GlobalReserve, single: total=512.00MiB, used=0.00B". But I suppose this is a permanent fixture, right?
>
> Yes. GlobalReserve is for short-term btrfs-internal use, reserved for times when btrfs needs to (temporarily) allocate some space in order to free space, etc. It's always single, and you'll rarely see anything but 0 used, except perhaps in the middle of a balance or something.

Got it. Thanks. Is there any way to put that on another device, say, an SSD? I am thinking of backing up this RAID10 on a 2x8TB device-managed SMR RAID1 and I want to minimize random write operations (noatime et al.). I will start a new thread for that, maybe, but first: is there something substantial I can read about btrfs+SMR? Or should I avoid SMR+btrfs?

>>> For maintenance, I would suggest running a scrub regularly, to check for various forms of bitrot. Typical frequencies for a scrub are once a week or once a month -- opinions vary (as do runtimes).
>>
>> Yes. I cronned it weekly for now. Takes about 5 hours. Is it automatically corrected on RAID10, since a copy of it exists within the filesystem? What happens for RAID0?
>
> For raid10 (and the raid1 I use), yes, it's corrected, from the other existing copy, assuming it's good, tho if there are metadata checksum errors, there may be corresponding unverified checksums as well, where the verification couldn't be done because the metadata containing the checksums was bad. Thus, if there are errors found and corrected, and you see unverified errors as well, rerun the scrub, so the newly corrected metadata can now be used to verify the previously unverified errors.

OK then, rule of thumb: re-run the scrub on "unverified checksum error(s)". I have yet to see checksum errors but will keep it in mind.

> I'm presently getting a lot of experience with this, as one of the ssds in my raid1 is gradually failing and rewriting sectors. Generally what happens is that the ssd will take too long, triggering a SATA reset (30 second timeout), and btrfs will call that an error. The scrub then rewrites the bad copy on the unreliable device with the good copy from the more reliable device, with the write triggering a sector relocation on the bad device. The newly written copy then checks out good, but if it was metadata, it very likely contained checksums for several other blocks, which couldn't be verified because the block containing their checksums was itself bad. Typically I'll see dozens to a couple hundred unverified errors for every bad metadata block rewritten in this way. Rerunning the scrub then either verifies or fixes the previously unverified blocks, tho sometimes one of those in turn ends up bad, and if it's a metadata block, I may end up rerunning the scrub another time or two, until everything checks out.
>
> FWIW, on the bad device, smartctl -A reports (excerpted):
>
> ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>   5 Reallocated_Sector_Ct   0x0032  098   098   036    Old_age  Always  -           259
> 182 Erase_Fail_Count_Total  0x0032  100   100   000    Old_age  Always  -           132
>
> While on the paired good device:
>
>   5 Reallocated_Sector_Ct   0x0032  253   253   036    Old_age  Always  -           0
> 182 Erase_Fail_Count_Total  0x0032  253   253   000    Old_age  Always  -           0
>
> Meanwhile, smartctl -H has already warned once that the device is failing, tho it went back to passing status again, but as of now it's saying failing, again. The attribute that actually registers as failing, again from the bad device, followed by the good, is:
>
>   1 Raw_Read_Error_Rate     0x000f  001   001   006    Pre-fail Always  FAILING_NOW 3081
>   1 Raw_Read_Error_Rate     0x000f  160   159   006    Pre-fail Always  -           41
>
> When it's not actually reporting failing, the FAILING_NOW status is replaced with IN_THE_PAST. 250 Read_Error_Retry_Rate is the other attribute of interest, with values of 100 current and worst for both devices, threshold 0, but a raw value of 2488 for the good device and over 17,000,000 for the failing device. But with the cooked value never moving from 100 and with no real guidance on how to interpret the raw values, while
RAID10 Balancing Request for Comments and Advices
Hello,

I have a CentOS 7 machine with the latest EPEL kernel-ml (4.0.5) and a 6-disk 4TB HGST RAID10 btrfs volume, with the following mount options:

noatime,compress=zlib,space_cache 0 2

"btrfs filesystem df" gives:

Data, RAID10: total=7.08TiB, used=7.02TiB
Data, single: total=8.00MiB, used=0.00B
System, RAID10: total=7.88MiB, used=656.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID10: total=9.19GiB, used=7.56GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

My first question is this: is it normal to have "single" blocks? Why not only RAID10? I don't remember the exact mkfs options I used, but I certainly didn't ask for "single", so this is unexpected.

My second question is: what is the best device add / balance sequence to use if I want to add 2 more disks to this RAID10 volume? Also, is a balance necessary at all since I'm adding a pair?

My third question is: given that this filesystem is an offline backup for another RAID0 volume with SMB sharing, what is the best maintenance schedule as long as it is offline? For now, I only have a weekly cron scrub, but I think that the priority is to have it balanced after a send/receive or rsync, to optimize storage space availability (over performance). Is there a "light" balancing method recommended in this case?

My fourth question, still within the same context: are there best practices when using smartctl for periodically testing (long test, short test) btrfs RAID devices?

Thanks!

Vincent
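[Editor's note: the smartctl question is not answered later in this thread; for illustration only, a common SMART self-test routine might look like the following. Device names and cadence are placeholders, and this reflects general SMART practice rather than anything btrfs-specific stated in the thread.]

smartctl -t short /dev/sdX   # quick self-test, e.g. weekly
smartctl -t long /dev/sdX    # extended self-test, e.g. monthly
smartctl -H /dev/sdX         # overall health verdict afterwards
smartctl -A /dev/sdX         # watch attributes such as Reallocated_Sector_Ct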
Re: RAID10 Balancing Request for Comments and Advices
On Jun 16, 2015, at 8:25 AM, Hugo Mills h...@carfax.org.uk wrote:

> On Tue, Jun 16, 2015 at 08:09:17AM -0400, Vincent Olivier wrote:
>> "btrfs filesystem df" gives:
>>
>> Data, RAID10: total=7.08TiB, used=7.02TiB
>> Data, single: total=8.00MiB, used=0.00B
>> System, RAID10: total=7.88MiB, used=656.00KiB
>> System, single: total=4.00MiB, used=0.00B
>> Metadata, RAID10: total=9.19GiB, used=7.56GiB
>> Metadata, single: total=8.00MiB, used=0.00B
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> My first question is this: is it normal to have "single" blocks? Why not only RAID10? I don't remember the exact mkfs options I used, but I certainly didn't ask for "single", so this is unexpected.
>
> Yes. It's an artefact of the way that mkfs works. If you run a balance on those chunks, they'll go away. (btrfs balance start -dusage=0 -musage=0 /mountpoint)

Thanks! I did, and it did go away, except for the "GlobalReserve, single: total=512.00MiB, used=0.00B". But I suppose this is a permanent fixture, right?

>> My second question is: what is the best device add / balance sequence to use if I want to add 2 more disks to this RAID10 volume? Also, is a balance necessary at all since I'm adding a pair?
>
> Add both devices first, then balance. For a RAID-1 filesystem, adding two devices wouldn't need a balance to get full usage out of the new devices. However, you've got RAID-10, so the most you'd be able to get on the FS without a balance is four times the remaining space on one of the existing disks.
>
> The chunk allocator for RAID-10 will allocate as many chunks as it can in an even number across all the devices, omitting the device with the smallest free space if there's an odd number of devices. It must have space on at least four devices, so adding two devices means that it'll have to have free space on at least two of the existing ones (and will try to use all of them). So yes, unless you're adding four devices, a rebalance is required here.

It is perfectly clear and logical that 1+0 works on four devices at a time.

>> My third question is: given that this filesystem is an offline backup for another RAID0 volume with SMB sharing, what is the best maintenance schedule as long as it is offline? For now, I only have a weekly cron scrub, but I think that the priority is to have it balanced after a send/receive or rsync, to optimize storage space availability (over performance). Is there a "light" balancing method recommended in this case?
>
> You don't need to balance after send/receive or rsync. If you find that you have lots of data space allocated but not used (the first line in btrfs fi df, above), *and* metadata close to usage (within, say, 700 MiB), *and* no unallocated space (btrfs fi show), then it's worth running a filtered balance with -dlimit=3 or some similar small value to free up some space that the metadata can expand into. Other than that, it's pretty much entirely pointless.

OK, thanks. Is there a btrfs-utils way of automating the "if less than 1Gb free, do balance -dlimit=3"?

> For maintenance, I would suggest running a scrub regularly, to check for various forms of bitrot. Typical frequencies for a scrub are once a week or once a month -- opinions vary (as do runtimes).

Yes. I cronned it weekly for now. Takes about 5 hours. Is it automatically corrected on RAID10, since a copy of it exists within the filesystem? What happens for RAID0?

Thanks!

V
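[Editor's note: the automation question above does not get an answer in this thread. A rough, untested sketch of one way to do it from cron follows; the mount point is a placeholder, and the text parsed out of "btrfs filesystem usage" is an assumption that should be verified against the local btrfs-progs version before relying on it.]

#!/bin/sh
# balance a few data chunks when unallocated space drops below ~1GiB
MNT=/mountpoint
unalloc=$(btrfs filesystem usage -b "$MNT" | awk '/unallocated:/ {print $3; exit}')
if [ "$unalloc" -lt $((1024 * 1024 * 1024)) ]; then
    btrfs balance start -dlimit=3 "$MNT"
fi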