Re: Kernel crash on mount after SMR disk trouble
10.6.2016, 23.20, Henk Slager wrote:
> On Sat, May 14, 2016 at 10:19 AM, Jukka Larja wrote:
>> In short:
>>
>> I added two 8TB Seagate Archive SMR disks to a btrfs pool and tried to delete one of the old disks. After some errors I ended up with a file system that can be mounted read-only, but crashes the kernel if mounted normally. Tried btrfs check --repair (which noted that the space cache needs to be zeroed) and zeroing the space cache (via mount parameter), but that didn't change anything.
>>
>> Longer version:
>>
>> I was originally running Debian Jessie with some pretty recent kernel (maybe 4.4), but somewhat older btrfs tools. After the trouble started, I tried
>
> You should at least have kernel 4.4; the critical patch for supporting this drive was added in 4.4-rc3 or 4.4-rc4, I don't remember exactly. Only if you somehow disable NCQ completely in your Linux system (kernel and more), or use a HW chipset/bridge that does that for you, might it work.

After the crash I tracked the issue somewhat and found a discussion about a very similar issue (starting with drives failing with dd or badblocks and ending, after several patches, with drives working in everything except maybe Btrfs in certain cases). As far as I could tell, the 4.5 kernel has all the patches from that discussion, but I may have missed something that wasn't mentioned there.

>> updating (now running kernel 4.5.1 and tools 4.4.1). I checked the new disks with badblocks (no problems found), but based on some googling, Seagate's SMR disks seem to have various problems, so the root cause is probably one type or another of disk errors.
>
> Seagate provides a special variant of the Linux ext4 filesystem that should then play well with their SMR drive. Also the advice is to not use this drive in an array setup; the risk is way too high that they can't keep up with the demands of the higher layers and then get resets or their FW crashes. You should also have had a look at your system's and drive's timeouts (see scterc).
>
> To summarize: adding those drives to a btrfs raid array is asking for trouble.

Increasing timeouts didn't help with the drive. The array freezes when the drive drops out, then there's a crash when the timeout occurs. It doesn't matter if the drive has come back in the meantime (the drive doesn't return with the same /dev/sdX, though I don't know if that matters for Btrfs).

I always thought that the problem with these drives was supposed to be bad performance and a worse than usual ability to handle power going out. My use case is quite light from a bytes-written point of view, so I didn't expect trouble. Of course, doing the initial add + balance isn't light at all. What I didn't expect is what's essentially write errors. A pity, since the disks are dirt cheap compared to the alternatives and I really don't care about performance.

> I am using 1 such drive with an Intel J1900 SoC (Atom, SATA2) and it works, although I still get the typical error occasionally. As it is just a btrfs receive target, just 1 fs dup/dup/single for the whole drive, all CoW, it survives those lockups or crashes; I just restart the board+drive. In general, reading back multi-TB ro snapshots works fine and is on par with Gbps LAN speeds.

I'll probably test those drives as a target for DVR backups, when I get them out of the array (still waiting for new drives with which to start over. Then I just tear down the old array).

> Indeed the kernel should not crash in such a case. It is not clear if you run a 4.5.1 or 4.5.0 kernel in kernel.org terminology, but newer than 4.5.x probably does not help in this case. You could try to mount with usebackuproot and then see if you can get it writable, after setting long timeout values for the drive. If it works, then remove those 2 SMRs from the array ASAP.

I understand that usebackuproot requires kernel >= 4.6. I probably won't be installing a custom kernel, but if I still have the array in its current state when 4.6 becomes available in Debian Stretch, I'll give it a try.
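[For readers landing here from a search: the timeout alignment discussed above (drive-side SCT ERC vs. the kernel's SCSI command timeout) can be sketched roughly as below. The 7-second ERC cap and 180-second kernel timeout are illustrative assumptions, not values anyone in this thread confirmed, and with no argument the script only prints what it would do.]

```shell
#!/bin/sh
# Dry-run sketch of aligning drive error recovery with the kernel's SCSI
# command timeout. Device name and the 7 s / 180 s values are illustrative
# assumptions, not settings confirmed in this thread. With no argument the
# script only prints what it would do.
DEV="$1"

run() {
    if [ -n "$DEV" ]; then "$@"; else echo "would run: $*"; fi
}

run smartctl -l scterc "$DEV"        # show current SCT ERC settings
run smartctl -l scterc,70,70 "$DEV"  # cap read/write error recovery at 7.0 s

# The kernel gives up on a command after 30 s by default; an SMR drive
# reshuffling zones can stall far longer, so raise the per-device timeout.
if [ -n "$DEV" ]; then
    echo 180 > "/sys/block/${DEV#/dev/}/device/timeout"
else
    echo "would write 180 to /sys/block/sdX/device/timeout"
fi
```

The point is only that the drive's internal error recovery should give up well before the kernel's command timeout fires, otherwise the device gets reset mid-recovery and drops out of the array.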
-- 
...Elämälle vierasta toimintaa... ("Activity foreign to life")
Jukka Larja, jla...@iki.fi, 0407679919

"... on paper looked like a great chip (10 GFs at 1.2 GHZ whith 35W"
"It's a mystery to me why people continue to use silicon - processors on paper are always faster and cooler :-)"
- lubemark and Richard Cownie on RWT forums -

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
recent complete stalls of btrfs (4.6.0-rc4+) -- any advice?
Dear BTRFS developers,

First of all -- thanks for developing BTRFS! So far it has served really well, where others were falling (or failing) behind in my initial evaluation (http://datalad.org/test_fs_analysis.html). With btrbk, backups are a breeze. But it still does fail completely for me at times, unfortunately.

I know that I should upgrade the kernel, and I will now... but I thought to share this incident(s) report since it might be of some value. Running Debian jessie but with a manually built kernel. btrfs is extensively used for a metadata-heavy partition (lots of symlinks, lots of directories with a single file in them -- heavy use of git-annex); snapshots are taken regularly, etc.

Setup -- btrfs on top of software raids:

# btrfs fi show /mnt/btrfs/
Label: 'tank'  uuid: b5fe7f5e-3478-4293-a42c-bf9ca26ea724
        Total devices 4 FS bytes used 21.07TiB
        devid    2 size 10.92TiB used 5.30TiB path /dev/md10
        devid    3 size 10.92TiB used 5.30TiB path /dev/md11
        devid    4 size 10.92TiB used 5.30TiB path /dev/md12
        devid    5 size 10.92TiB used 5.30TiB path /dev/md13

Within the last 5 days, the beast has stalled twice. The last signs were:

* 20160605 -- kernel kaboomed at btrfs level

smaug login: [3675876.734400] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: a03d0354
[3675876.745680] CPU: 9 PID: 651474 Comm: git Tainted: G W IO 4.6.0-rc4+ #1
[3675876.753272] Hardware name: Supermicro X10DRi/X10DRI-T, BIOS 1.0b 09/17/2014
[3675876.760431] 0086 5e62edd4 813098f5 817cd080
[3675876.768104] 880036f23da8 811701af 881e0010 880036f23db8
[3675876.775763] 880036f23d50 5e62edd4 880036f23d88 a03d0354
[3675876.783426] Call Trace:
[3675876.786057] [] ? dump_stack+0x5c/0x77
[3675876.791575] [] ? panic+0xdf/0x226
[3675876.796812] [] ? btrfs_add_link+0x384/0x3e0 [btrfs]
[3675876.803549] [] ? __stack_chk_fail+0x17/0x30
[3675876.809610] [] ? btrfs_add_link+0x384/0x3e0 [btrfs]
[3675876.816391] [] ? btrfs_link+0x143/0x220 [btrfs]
[3675876.822802] [] ? vfs_link+0x1af/0x280
[3675876.828331] [] ? SyS_link+0x22a/0x260
[3675876.833859] [] ? entry_SYSCALL_64_fastpath+0x1e/0xa8
[3675876.840740] Kernel Offset: disabled
[3675876.854050] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: a03d0354

* 20160610 -- again, a different kaboom

[443370.085059] CPU: 10 PID: 1044513 Comm: git-annex Tainted: G W IO 4.6.0-rc4+ #1
[443370.093268] Hardware name: Supermicro X10DRi/X10DRI-T, BIOS 1.0b 09/17/2014
[443370.100356] task: 8806c463d0c0 ti: 8808f9dc8000 task.ti: 8808f9dc8000
[443370.107953] RIP: 0010:[] [] 0x88090f67be10
[443370.115761] RSP: 0018:8808f9dcbe18 EFLAGS: 00010292
[443370.121187] RAX: 88103fd95fc0 RBX: 8808f9dcc000 RCX:
[443370.128438] RDX: RSI: 8806c463d0c0 RDI: 88103fd95fc0
[443370.135693] RBP: 8808f9dcbe30 R08: 8808f9dc8000 R09:
[443370.142940] R10: 000a R11: R12: 881035beedc8
[443370.150184] R13: 880ff1106800 R14: 88123d6c R15: 88123d6c0068
[443370.157432] FS: 7f0ab3d83740() GS:88103fd8() knlGS:
[443370.165645] CS: 0010 DS: ES: CR0: 80050033
[443370.171512] CR2: 88090f67be10 CR3: 000cf7516000 CR4: 001406e0
[443370.178758] Stack:
[443370.180880] 88069dda93c0 a0358700 88069dda93c0 880f
[443370.188490] 8806c463d0c0 810bb560 8808f9dcbe48 8808f9dcbe48
[443370.196107] d5ce3509 88069dda93c0 0001 8806a64835c8
[443370.203726] Call Trace:
[443370.206310] [] ? btrfs_commit_transaction+0x350/0xa30 [btrfs]
[443370.213826] [] ? wait_woken+0x90/0x90
[443370.219280] [] ? btrfs_sync_file+0x2fb/0x3d0 [btrfs]
[443370.226012] [] ? do_fsync+0x38/0x60
[443370.231267] [] ? SyS_fdatasync+0xf/0x20
[443370.236870] [] ? entry_SYSCALL_64_fastpath+0x1e/0xa8
[443370.243604] Code: 88 ff ff 21 67 5b 81 ff ff ff ff 00 00 6c 3d 12 88 ff ff dd 77 35 a0 ff ff ff ff 00 00 00 00 00 00 00 00 40 e0 91 4b 08 88 ff ff <60> b5 0b 81 ff ff ff ff f0 fd 61 8a 0c 88 ff ff 18 7c 79 3e 00
[443370.264107] RIP [] 0x88090f67be10
[443370.271044] RSP
[443370.276177] CR2: 88090f67be10
[443370.284979] ---[ end trace 2c4b690b49d17ebd ]---

And for the last case, here are more details: a dmesg capture showing apparently other tracebacks and errors logged before, which might be of help:
http://www.onerussian.com/tmp/dmesg-nonet.20160610.txt

Are those issues something that was fixed since 4.6.0-rc4+, or should I be on the lookout for them to come back? What other information should I provide if I run into them again, to help you troubleshoot/fix it?

P.S. Please CC me the replies

-- Yaro
Re: Cannot balance FS (No space left on device)
On 06/11/2016 12:10 AM, ojab // wrote:
> On Fri, Jun 10, 2016 at 9:56 PM, Hans van Kranenburg wrote:
>> You can work around it by either adding two disks (like Henk said), or by temporarily converting some chunks to single. Just enough to get some free space on the first two disks to get a balance going that can fill the third one. You don't have to convert all of your data or metadata to single!
>>
>> Something like:
>>
>> btrfs balance start -v -dconvert=single,limit=10 /mnt/xxx/
>
> Unfortunately it fails even if I set limit=1:
>
> $ sudo btrfs balance start -v -dconvert=single,limit=1 /mnt/xxx/
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x120): converting, target=281474976710656, soft is off, limit=1
> ERROR: error during balancing '/mnt/xxx/': No space left on device
> There may be more info in syslog - try dmesg | tail

Ah, apparently the balance operation *always* wants to allocate some new empty space before starting to look more closely at the task you give it... This means that it's trying to allocate a new set of RAID0 chunks first... and that's exactly the opposite of what we want to accomplish here.

If you really can add only one extra device now, there's always a more dirty way to get the job done. What you can do, for example, is:

- partition the new disk into two partitions
- add them both to the filesystem (btrfs doesn't know both block devices are on the same physical disk, ghehe)
- convert a small number of data blocks to single
- then device delete the third disk again so the single chunks move back to the first two disks
- add the third disk back as one whole block device
- etc... :D

Moo,
-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
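[The "dirty" sequence above, collected as a sketch. /dev/sdd and /mnt/xxx are placeholders, not devices from this thread, and the script only prints the commands rather than executing them, since every step is destructive.]

```shell
#!/bin/sh
# Dry-run sketch of the "partition trick" described above. /dev/sdd and
# /mnt/xxx are placeholders; remove the echo prefix to actually run it,
# and only after reviewing every step.
DO="echo would run:"

$DO parted -s /dev/sdd mklabel gpt
$DO parted -s /dev/sdd mkpart primary 0% 50%
$DO parted -s /dev/sdd mkpart primary 50% 100%
# btrfs now sees two independent block devices on one physical disk
$DO btrfs device add /dev/sdd1 /dev/sdd2 /mnt/xxx/
# convert a few data chunks to single; they can land on the new halves
$DO btrfs balance start -dconvert=single,limit=10 /mnt/xxx/
# delete the halves so the single chunks migrate back to the full disks
$DO btrfs device delete /dev/sdd1 /mnt/xxx/
$DO btrfs device delete /dev/sdd2 /mnt/xxx/
# finally re-add the disk as one whole device
$DO btrfs device add /dev/sdd /mnt/xxx/
```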
Re: Allocator behaviour during device delete
On 06/10/2016 09:58 PM, Hans van Kranenburg wrote:
> On 06/10/2016 09:26 PM, Henk Slager wrote:
>> On Thu, Jun 9, 2016 at 3:54 PM, Brendan Hide wrote:
>>> On 06/09/2016 03:07 PM, Austin S. Hemmelgarn wrote:
>>>> OK, I'm pretty sure I know what was going on in this case. Your assumption that device delete uses the balance code is correct, and that is why you see what's happening happening. There are two key bits that are missing though:
>>>> 1. Balance will never allocate chunks when it doesn't need to.
>>
>> In relation to discussions w.r.t. enospc and a device full of chunks, I saw this statement 1. and I see different behavior with kernel 4.6.0, tools 4.5.3. On an idle fs with some fragmentation, I did balance -dusage=5; it completes successfully and leaves a new empty chunk (highest vaddr). Then balance -dusage=6 does 2 chunks with that usage level:
>> - the zero-filled last chunk is replaced with a new empty chunk (higher vaddr)
>> - the 2 usage=6 chunks are gone
>> - one chunk with the lowest vaddr saw its usage increase from 47 to 60
>> - several metadata chunks have changed slightly in usage
>
> I noticed the same thing, kernel 4.5.4, progs 4.4.1.
>
> When balance starts doing anything (so relocating >= 1 chunks, not when relocating 0), it first creates a new empty chunk. Even if all data that is balanced away is added to already existing chunks, the new empty one is still always left behind.
>
> When doing balance again with dusage=0, or repeatedly doing so, each time a new empty chunk is created, and then the previous empty one is removed, bumping up the start vaddr of the new chunk by 1GB each time.

Well, there it is:

commit 2c9fe835525896077e7e6d8e416b97f2f868edef
http://www.spinics.net/lists/linux-btrfs/msg47679.html

First the "I find it somewhat awkward that we always allocate a new data block group no matter what." section, and then the answer below:

"2: for filesystem with data, we have to create target-chunk in balance operation, this patch only make "creating-chunk" earlier"

^^ This overlooks the case in which creating a new chunk is not necessary at all, because all data can be appended to existing ones?

This also prevents ojab, in the latest thread here, from converting some chunks to single when his devices with RAID0 are full, because balance forcibly tries to create new empty RAID0 space first, which is not going to be used at all, and which is the opposite of the intended behaviour...

-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
Re: Cannot balance FS (No space left on device)
On 06/10/2016 11:33 PM, ojab // wrote:
> On Fri, Jun 10, 2016 at 9:00 PM, Henk Slager wrote:
>> I have seldom seen an fs so full, very regular numbers :)
>>
>> But can you provide the output of this script:
>> https://github.com/knorrie/btrfs-heatmap/blob/master/show_usage.py
>>
>> It gives better info w.r.t. devices and it is then easier to say what has to be done.
>>
>> But you have btrfs raid0 data (2 stripes) and raid1 metadata, and they both want 2 devices currently, and there is only one device with room for your 2G chunks. So in theory you need 2 empty devices added for a balance to succeed. If you can allow reduced redundancy for some time, you could shrink the fs used space on hdd1 to half, same for the partition itself, add an hdd2 partition and add that to the fs. Or just add another HDD. Then your 50Gb of deletions could come into effect if you start balancing. Also have a look at the balance stripe filters, I would say.
>
> Output of show_usage.py:
> https://gist.githubusercontent.com/ojab/850276af6ff3aa566b8a3ce6ec444521/raw/4d77e02d556ed0edb0f9823259f145f65e80bc66/gistfile1.txt
>
> Looks like I only have smaller spare drives at the moment (largest is 100GB), is it ok to use? Or is there some minimal drive size needed for my setup?

You can work around it by either adding two disks (like Henk said), or by temporarily converting some chunks to single. Just enough to get some free space on the first two disks to get a balance going that can fill the third one. You don't have to convert all of your data or metadata to single!

Something like:

btrfs balance start -v -dconvert=single,limit=10 /mnt/xxx/

Newly allocated chunks will go to the third disk, because it has the most free space. After this, you can convert the single data back to raid0:

btrfs balance start -v -dconvert=raid0,soft /mnt/xxx/

soft is important, because it only touches everything that is not raid0 yet.

And in the end there should be a few GB of free space on the first two disks, so you can do the big balance to spread all data over the three disks, just:

btrfs balance start -v -dusage=100 /mnt/xxx/

Review the commands before doing anything, as I haven't tested this here. The man page for btrfs-balance contains all the info :) Looking at btrfs balance status, btrfs fi show etc. in another terminal while it's working is always nice, so you see what's happening, and you can always stop it when you think it has moved around enough data with btrfs balance cancel.

Moo,
-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
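[The three balance steps above, collected into one sketch. The mountpoint is a placeholder and this mirrors untested advice from the thread; wrapping it in a function means nothing runs by accident, and calling it without arguments only prints usage.]

```shell
#!/bin/sh
# Sketch of the three-step recovery described above. The mountpoint is a
# placeholder; review each step before using it, as per the thread this
# sequence is untested advice.
rebalance_onto_new_disk() {
    MNT="$1"
    if [ -z "$MNT" ]; then
        echo "usage: rebalance_onto_new_disk <mountpoint>"
        return 1
    fi
    # 1. convert a handful of data chunks to single; new chunks land on
    #    the empty third disk, freeing space on the two full ones
    btrfs balance start -v -dconvert=single,limit=10 "$MNT" &&
    # 2. soft convert: only touches chunks that are not raid0 yet
    btrfs balance start -v -dconvert=raid0,soft "$MNT" &&
    # 3. full data balance to spread chunks over all three disks
    btrfs balance start -v -dusage=100 "$MNT"
}

# Example (would really run balance): rebalance_onto_new_disk /mnt/xxx/
```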
Re: Cannot balance FS (No space left on device)
On Fri, Jun 10, 2016 at 8:04 PM, ojab // wrote:
> [Please CC me since I'm not subscribed to the list]
>
> Hi,
> I've tried to `/usr/bin/btrfs fi defragment -r` my btrfs partition, but it failed w/ "No space left on device" and now I can't get any free space on that partition (deleting some files or adding a new device doesn't help). During defrag I used the `space_cache=v2` mount option, but have remounted the FS w/ the `clear_cache` flag since then. Also I've deleted about 50Gb of files and added a new 250Gb disk since then:
>
> $ df -h /mnt/xxx/
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdc1       2,1T  1,8T   37G  99% /mnt/xxx
> $ sudo /usr/bin/btrfs fi show
> Label: none  uuid: 8a65465d-1a8c-4f80-abc6-c818c38567c3
>         Total devices 3 FS bytes used 1.78TiB
>         devid    1 size 931.51GiB used 931.51GiB path /dev/sdc1
>         devid    2 size 931.51GiB used 931.51GiB path /dev/sdb1
>         devid    3 size 230.41GiB used 0.00B path /dev/sdd1
> $ sudo /usr/bin/btrfs fi usage /mnt/xxx/
> Overall:
>     Device size:          2.04TiB
>     Device allocated:     1.82TiB
>     Device unallocated:   230.41GiB
>     Device missing:       0.00B
>     Used:                 1.78TiB
>     Free (estimated):     267.23GiB (min: 152.03GiB)
>     Data ratio:           1.00
>     Metadata ratio:       2.00
>     Global reserve:       512.00MiB (used: 0.00B)
>
> Data,RAID0: Size:1.81TiB, Used:1.78TiB
>    /dev/sdb1  928.48GiB
>    /dev/sdc1  928.48GiB
>
> Metadata,RAID1: Size:3.00GiB, Used:2.30GiB
>    /dev/sdb1  3.00GiB
>    /dev/sdc1  3.00GiB
>
> System,RAID1: Size:32.00MiB, Used:176.00KiB
>    /dev/sdb1  32.00MiB
>    /dev/sdc1  32.00MiB
>
> Unallocated:
>    /dev/sdb1  1.01MiB
>    /dev/sdc1  1.00MiB
>    /dev/sdd1  230.41GiB
>
> $ sudo /usr/bin/btrfs balance start -dusage=66 /mnt/xxx/
> Done, had to relocate 0 out of 935 chunks
> $ sudo /usr/bin/btrfs balance start -dusage=67 /mnt/xxx/
> ERROR: error during balancing '/mnt/xxx/': No space left on device
> There may be more info in syslog - try dmesg | tail
>
> I assume that there is something wrong with the metadata, since I can copy files to the FS.
> I'm on the 4.6.2 vanilla kernel and using btrfs-progs-4.6; btrfs-debugfs output can be found here:
> https://gist.githubusercontent.com/ojab/1a8b1f83341403a169a8e66995c7c3da/raw/61621d22f706d7543a93a3d005415543af9a0db0/gistfile1.txt
> Any hint what else can I try to fix the issue?

I have seldom seen an fs so full, very regular numbers :)

But can you provide the output of this script:
https://github.com/knorrie/btrfs-heatmap/blob/master/show_usage.py

It gives better info w.r.t. devices and it is then easier to say what has to be done.

But you have btrfs raid0 data (2 stripes) and raid1 metadata, and they both want 2 devices currently, and there is only one device with room for your 2G chunks. So in theory you need 2 empty devices added for a balance to succeed. If you can allow reduced redundancy for some time, you could shrink the fs used space on hdd1 to half, same for the partition itself, add an hdd2 partition and add that to the fs. Or just add another HDD. Then your 50Gb of deletions could come into effect if you start balancing. Also have a look at the balance stripe filters, I would say.
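[A back-of-the-envelope check of why balance hits ENOSPC here, using the Unallocated figures from the fi usage output quoted above. The 1 GiB per-device stripe size is an assumption for illustration: the exact chunk size doesn't matter, since two of the three devices have only about 1 MiB unallocated.]

```shell
#!/bin/sh
# A raid0 data chunk needs a free stripe on 2 different devices. The
# unallocated figures (in MiB) are copied from the fi usage output in
# the thread; the 1 GiB stripe size is an illustrative assumption.
N=$(awk 'BEGIN {
    unalloc["sdb1"] = 1.01            # MiB
    unalloc["sdc1"] = 1.00            # MiB
    unalloc["sdd1"] = 230.41 * 1024   # 230.41 GiB in MiB
    stripe_mib = 1024
    n = 0
    for (d in unalloc)
        if (unalloc[d] >= stripe_mib)
            n++
    print n
}')
echo "devices with room for a new raid0 stripe: $N (need 2)"
```

Only the newly added sdd1 qualifies, so any balance that must first allocate a fresh raid0 chunk fails before moving a single byte.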
Re: [PATCH] Btrfs-progs: add check-only option for balance
Hi,

Correct me if I'm wrong,

On 06/09/2016 11:46 PM, Ashish Samant wrote:
> +/* return 0 if balance can remove a data block group, otherwise return 1 */
> +static int search_data_bgs(const char *path)
> +{
> +	struct btrfs_ioctl_search_args args;
> +	struct btrfs_ioctl_search_key *sk;
> +	struct btrfs_ioctl_search_header *header;
> +	struct btrfs_block_group_item *bg;
> +	unsigned long off = 0;
> +	DIR *dirstream = NULL;
> +	int e;
> +	int fd;
> +	int i;
> +	u64 total_free = 0;
> +	u64 min_used = (u64)-1;
> +	u64 free_of_min_used = 0;
> +	u64 bg_of_min_used = 0;
> +	u64 flags;
> +	u64 used;
> +	int ret = 0;
> +	int nr_data_bgs = 0;
> +
> +	fd = btrfs_open_dir(path, &dirstream, 1);
> +	if (fd < 0)
> +		return 1;
> +
> +	memset(&args, 0, sizeof(args));
> +	sk = &args.key;
> +
> +	sk->tree_id = BTRFS_EXTENT_TREE_OBJECTID;
> +	sk->min_objectid = sk->min_offset = sk->min_transid = 0;
> +	sk->max_objectid = sk->max_offset = sk->max_transid = (u64)-1;
> +	sk->max_type = sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> +	sk->nr_items = 65536;

This search returns not only block group information, but also everything else. You're first retrieving the complete extent tree to userspace, in buffers...

> +
> +	while (1) {
> +		ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> +		e = errno;
> +		if (ret < 0) {
> +			fprintf(stderr, "ret %d error '%s'\n", ret,
> +				strerror(e));
> +			return ret;
> +		}
> +		/*
> +		 * it should not happen.
> +		 */
> +		if (sk->nr_items == 0)
> +			break;
> +
> +		off = 0;
> +		for (i = 0; i < sk->nr_items; i++) {
> +			header = (struct btrfs_ioctl_search_header *)(args.buf + off);
> +			off += sizeof(*header);
> +			if (header->type == BTRFS_BLOCK_GROUP_ITEM_KEY) {

...and then just throwing 99.99% of the results away again. This is going to take a phenomenal amount of effort on a huge filesystem, copying unnecessary data around between the kernel and your program.

The first thing I learned myself when starting to play with the search ioctl is that the search doesn't happen in some kind of 3-dimensional space. You can't just filter on a type of object when walking the tree.

http://logs.tvrrug.org.uk/logs/%23btrfs/2016-02-13.html#2016-02-13T22:32:52

The sk->max_type = sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY only makes the search space start somewhere halfway objid 0 and end halfway objid max, including all other possible values for the type field for all objids in between.

> +				bg = (struct btrfs_block_group_item *)
> +					(args.buf + off);
> +				flags = btrfs_block_group_flags(bg);
> +				if (flags & BTRFS_BLOCK_GROUP_DATA) {
> +					nr_data_bgs++;
> +					used = btrfs_block_group_used(bg);
> +					printf("block_group %15llu (len %11llu used %11llu)\n",
> +					       header->objectid,
> +					       header->offset, used);
> +					total_free += header->offset - used;
> +					if (min_used >= used) {
> +						min_used = used;
> +						free_of_min_used = header->offset - used;
> +						bg_of_min_used = header->objectid;
> +					}
> +				}
> +			}
> +
> +			off += header->len;
> +			sk->min_objectid = header->objectid;
> +			sk->min_type = header->type;
> +			sk->min_offset = header->offset;

When the following is a part of your extent tree...

	key (289406976 EXTENT_ITEM 19193856) itemoff 15718 itemsize 53
		extent refs 1 gen 11 flags DATA
		extent data backref root 5 objectid 258 offset 0 count 1
	key (289406976 BLOCK_GROUP_ITEM 1073741824) itemoff 15694 itemsize 24
		block group used 24612864 chunk_objectid 256 flags DATA

...and when the extent_item just manages to squeeze in as last result into the current result buffer from the ioctl...

...then your search key looks like (289406976 168 19193856) after copying the values from the last seen object...

> +		}
> +		sk->nr_items = 65536;
> +
Re: Kernel crash on mount after SMR disk trouble
On Sat, May 14, 2016 at 10:19 AM, Jukka Larjawrote: > In short: > > I added two 8TB Seagate Archive SMR disk to btrfs pool and tried to delete > one of the old disks. After some errors I ended up with file system that can > be mounted read-only, but crashes the kernel if mounted normally. Tried > btrfs check --repair (which noted that space cache needs to be zeroed) and > zeroing space cache (via mount parameter), but that didn't change anything. > > Longer version: > > I was originally running Debian Jessie with some pretty recent kernel (maybe > 4.4), but somewhat older btrfs tools. After the trouble started, I tried You should at least have kernel 4.4, the critical patch for supporting this drive was added in 4.4-rc3 or 4.4-rc4, i dont remember exactly. Only if you somehow disable NCQ completely in your linux system (kernel and more) or use a HW chipset/bridge that does that for you it might work. > updating (now running Kernel 4.5.1 and tools 4.4.1). I checked the new disks > with badblocks (no problems found), but based on some googling, Seagate's > SMR disks seem to have various problems, so the root cause is probably one > type or another of disk errors. Seagate provides a special variant of the linux ext4 fs system that should then play well with their SMR drive. Also the advice is to not use this drive in a array setup; the risk is way to high that they can't keep up with the demands of the higher layers and then get resets or their FW crashes. You should have had also have a look at your system's and drive timeouts (see scterc). To summarize: adding those drives to an btrfs raid array is asking for trouble. I am using 1 such drive with an Intel J1900 SoC (Atom, SATA2) and it works, although I get still the typical error occasionally. As it is just a btrfs receive target, just 1 fs dup/dup/single for the whole drive, all CoW, it survives those lockups or crashes, I just restart the board+drive. 
In general, reading back multi-TB ro snapshots works fine and is on par with Gbps LAN speeds. > Here's the output of btrfs fi show: > > Label: none uuid: 8b65962d-0982-449b-ac6f-1acc8397ceb9 > Total devices 12 FS bytes used 13.15TiB > devid1 size 3.64TiB used 3.36TiB path /dev/sde1 > devid2 size 3.64TiB used 3.36TiB path /dev/sdg1 > devid3 size 3.64TiB used 3.36TiB path /dev/sdh1 > devid4 size 3.64TiB used 3.34TiB path /dev/sdf1 > devid5 size 1.82TiB used 1.44TiB path /dev/sdi1 > devid6 size 1.82TiB used 1.54TiB path /dev/sdl1 > devid7 size 1.82TiB used 1.51TiB path /dev/sdk1 > devid8 size 1.82TiB used 1.54TiB path /dev/sdj1 > devid9 size 3.64TiB used 3.31TiB path /dev/sdb1 > devid 10 size 3.64TiB used 3.36TiB path /dev/sda1 > devid 11 size 7.28TiB used 168.00GiB path /dev/sdc1 > devid 12 size 7.28TiB used 168.00GiB path /dev/sdd1 > > Last two devices (11 and 12) are the new disks. After adding them, I first > copied some new data in (about 130 GBs), which seemed to go fine. Then I > tried to remove disk 5. After some time (about 30 GiBs written to 11 and > 12), there were some errors and disk 11 or 12 dropped out and fs went > read-only. After some trouble-shooting (googling), I decided the new disks > were too iffy to trust and tried to remove them. > > I don't remember exactly what errors I got, but device delete operation was > interrupted due to errors at least once or twice, before more serious > trouble began. In between the attempts I updated the HBA's (an LSI 9300) > firmware. After final device delete attempt the end result was that > attempting to mount causes kernel to crash. I then tried updating kernel and > running check --repair, but that hasn't helped. Mounting read-only seems to > work perfectly, but I haven't tried copying everything to /dev/null or > anything like that (just few files). 
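The per-device numbers in a listing like the one above can be cross-checked mechanically. This is a small sketch (not an official tool) that parses `btrfs fi show`-style "devid ... size ... used ..." lines and sums the allocated space; the field layout is assumed from the quoted output:

```python
import re

def parse_fi_show(text):
    """Pull (devid, size, used) out of `btrfs fi show` style output.

    Sizes are normalized to TiB; only the TiB/GiB suffixes seen in the
    quoted listing are handled.
    """
    unit = {"TiB": 1.0, "GiB": 1.0 / 1024}
    devices = []
    pattern = (r"devid\s*(\d+)\s+size\s+([\d.]+)(TiB|GiB)"
               r"\s+used\s+([\d.]+)(TiB|GiB)")
    for m in re.finditer(pattern, text):
        devices.append((int(m.group(1)),
                        float(m.group(2)) * unit[m.group(3)],
                        float(m.group(4)) * unit[m.group(5)]))
    return devices

# Three of the twelve devices from the listing above:
listing = """
devid 1 size 3.64TiB used 3.36TiB path /dev/sde1
devid 11 size 7.28TiB used 168.00GiB path /dev/sdc1
devid 12 size 7.28TiB used 168.00GiB path /dev/sdd1
"""
devs = parse_fi_show(listing)
total_alloc = sum(used for _, _, used in devs)  # chunk-allocated, not data-used
```

Remember that "used" here is space allocated to chunks per device, not data actually stored, which is exactly the distinction discussed later in this thread.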
> > The log of the crash (it is very repeatable) can be seen here: > http://jane.aarghimedes.fi/~jlarja/tempe/btrfs-trouble/btrfs_crash_log.txt > > Snipped from start of that: > > touko 12 06:41:22 jane kernel: BTRFS info (device sda1): disk space caching > is enabled > touko 12 06:41:24 jane kernel: BTRFS info (device sda1): bdev /dev/sdd1 > errs: wr 0, rd 0, flush 1, corrupt 0, gen 0 > touko 12 06:41:39 jane kernel: BUG: unable to handle kernel NULL pointer > dereference at 01f0 > touko 12 06:41:39 jane kernel: IP: [] > can_overcommit+0x1e/0xf0 [btrfs] > touko 12 06:41:39 jane kernel: PGD 0 > touko 12 06:41:39 jane kernel: Oops: [#1] SMP > > My dmesg log is here: > http://jane.aarghimedes.fi/~jlarja/tempe/btrfs-trouble/dmesg.log > > Other information: > Linux jane 4.5.0-1-amd64 #1 SMP Debian 4.5.1-1 (2016-04-14) x86_64 GNU/Linux > btrfs-progs v4.4.1 > > btrfs fi df /mnt/Allosaurus/ > Data, RAID1: total=13.13TiB, used=13.07TiB > Data, single: total=8.00MiB, used=0.00B > System, RAID1:
[GIT PULL] Btrfs
Hi Linus,

My for-linus-4.7 branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus-4.7

has some fixes and some new self tests for btrfs. The self tests are usually disabled in the .config file (unless you're doing btrfs dev work), and this bunch is meant to find problems with the 64K page size patches. Jeff has a patch to help people see if they are using the hardware-assist crc32c module, which really helps us nail down problems when people ask why crcs are using so much CPU. Otherwise, it's small fixes.

Feifei Xu (8) commits (+475/-361):
    Btrfs: test_check_exists: Fix infinite loop when searching for free space entries (+2/-2)
    Btrfs: self-tests: Execute page straddling test only when nodesize < PAGE_SIZE (+30/-19)
    Btrfs: self-tests: Use macros instead of constants and add missing newline (+31/-18)
    Btrfs: self-tests: Support testing all possible sectorsizes and nodesizes (+32/-22)
    Btrfs: self-tests: Fix extent buffer bitmap test fail on BE system (+11/-1)
    Btrfs: Fix integer overflow when calculating bytes_per_bitmap (+7/-7)
    Btrfs: self-tests: Fix test_bitmaps fail on 64k sectorsize (+7/-1)
    Btrfs: self-tests: Support non-4k page size (+355/-291)

Liu Bo (3) commits (+104/-15):
    Btrfs: clear uptodate flags of pages in sys_array eb (+2/-0)
    Btrfs: add validadtion checks for chunk loading (+67/-15)
    Btrfs: add more validation checks for superblock (+35/-0)

Josef Bacik (1) commits (+1/-0):
    Btrfs: end transaction if we abort when creating uuid root

Jeff Mahoney (1) commits (+9/-2):
    btrfs: advertise which crc32c implementation is being used at module load

Vinson Lee (1) commits (+1/-1):
    btrfs: Use __u64 in exported linux/btrfs.h.
Total: (14) commits (+590/-379)

 fs/btrfs/ctree.c                       |   6 +-
 fs/btrfs/disk-io.c                     |  20 +-
 fs/btrfs/disk-io.h                     |   2 +-
 fs/btrfs/extent_io.c                   |  10 +-
 fs/btrfs/extent_io.h                   |   4 +-
 fs/btrfs/free-space-cache.c            |  18 +-
 fs/btrfs/hash.c                        |   5 +
 fs/btrfs/hash.h                        |   1 +
 fs/btrfs/super.c                       |  57 --
 fs/btrfs/tests/btrfs-tests.c           |   6 +-
 fs/btrfs/tests/btrfs-tests.h           |  27 +--
 fs/btrfs/tests/extent-buffer-tests.c   |  13 +-
 fs/btrfs/tests/extent-io-tests.c       |  86 ++---
 fs/btrfs/tests/free-space-tests.c      |  76 +---
 fs/btrfs/tests/free-space-tree-tests.c |  30 +--
 fs/btrfs/tests/inode-tests.c           | 344 ++---
 fs/btrfs/tests/qgroup-tests.c          | 111 ++-
 fs/btrfs/volumes.c                     | 109 +--
 include/uapi/linux/btrfs.h             |   2 +-
 19 files changed, 569 insertions(+), 358 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Allocator behaviour during device delete
On 06/10/2016 09:26 PM, Henk Slager wrote:
On Thu, Jun 9, 2016 at 3:54 PM, Brendan Hide wrote:
On 06/09/2016 03:07 PM, Austin S. Hemmelgarn wrote:
OK, I'm pretty sure I know what was going on in this case. Your assumption that device delete uses the balance code is correct, and that is why you see what's happening happening. There are two key bits that are missing though:
1. Balance will never allocate chunks when it doesn't need to.

In relation to discussions w.r.t. enospc and a device full of chunks, I saw this 1st statement, but I see different behavior with kernel 4.6.0, tools 4.5.3. On an idle fs with some fragmentation, I did balance -dusage=5; it completes successfully and leaves a new empty chunk (highest vaddr). Then balance -dusage=6 does 2 chunks with that usage level:
- the zero-filled last chunk is replaced with a new empty chunk (higher vaddr)
- the 2 usage=6 chunks are gone
- one chunk with the lowest vaddr saw its usage increase from 47 to 60
- several metadata chunks have changed slightly in usage

I noticed the same thing, kernel 4.5.4, progs 4.4.1. When balance starts doing anything (so relocating >= 1 chunks, not when relocating 0), it first creates a new empty chunk. Even if all data that is balanced away is added to already existing chunks, the new empty one is still always left behind. When doing balance again with dusage=0, or repeatedly doing so, each time a new empty chunk is created, and then the previous empty one is removed, bumping up the start vaddr of the new chunk by 1GB each time.

--
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
Re: Allocator behaviour during device delete
On Thu, Jun 9, 2016 at 3:54 PM, Brendan Hidewrote: > > > On 06/09/2016 03:07 PM, Austin S. Hemmelgarn wrote: >> >> On 2016-06-09 08:34, Brendan Hide wrote: >>> >>> Hey, all >>> >>> I noticed this odd behaviour while migrating from a 1TB spindle to SSD >>> (in this case on a LUKS-encrypted 200GB partition) - and am curious if >>> this behaviour I've noted below is expected or known. I figure it is a >>> bug. Depending on the situation, it *could* be severe. In my case it was >>> simply annoying. >>> >>> --- >>> Steps >>> >>> After having added the new device (btrfs dev add), I deleted the old >>> device (btrfs dev del) >>> >>> Then, whilst waiting for that to complete, I started a watch of "btrfs >>> fi show /". Note that the below is very close to the output at the time >>> - but is not actually copy/pasted from the output. >>> Label: 'tricky-root' uuid: bcbe47a5-bd3f-497a-816b-decb4f822c42 Total devices 2 FS bytes used 115.03GiB devid1 size 0.00GiB used 298.06GiB path /dev/sda2 devid2 size 200.88GiB used 0.00GiB path /dev/mapper/cryptroot >>> >>> >>> >>> devid1 is the old disk while devid2 is the new SSD >>> >>> After a few minutes, I saw that the numbers have changed - but that the >>> SSD still had no data: >>> Label: 'tricky-root' uuid: bcbe47a5-bd3f-497a-816b-decb4f822c42 Total devices 2 FS bytes used 115.03GiB devid1 size 0.00GiB used 284.06GiB path /dev/sda2 devid2 size 200.88GiB used 0.00GiB path /dev/mapper/cryptroot >>> >>> >>> The "FS bytes used" amount was changing a lot - but mostly stayed near >>> the original total, which is expected since there was very little >>> happening other than the "migration". >>> >>> I'm not certain of the exact point where it started using the new disk's >>> space. I figure that may have been helpful to pinpoint. :-/ >> >> OK, I'm pretty sure I know what was going on in this case. Your >> assumption that device delete uses the balance code is correct, and that >> is why you see what's happening happening. 
There are two key bits that are missing though:
>> 1. Balance will never allocate chunks when it doesn't need to.

In relation to discussions w.r.t. enospc and a device full of chunks, I saw this 1st statement, but I see different behavior with kernel 4.6.0, tools 4.5.3. On an idle fs with some fragmentation, I did balance -dusage=5; it completes successfully and leaves a new empty chunk (highest vaddr). Then balance -dusage=6 does 2 chunks with that usage level:
- the zero-filled last chunk is replaced with a new empty chunk (higher vaddr)
- the 2 usage=6 chunks are gone
- one chunk with the lowest vaddr saw its usage increase from 47 to 60
- several metadata chunks have changed slightly in usage

It could be a 2-step data move, but from just the states before and after balance I can't prove that.

>> 2. The space usage listed in fi show is how much space is allocated to chunks, not how much is used in those chunks.
>>
>> In this case, based on what you've said, you had a lot of empty or mostly empty chunks. As a result of this, the device delete was both copying data, and consolidating free space. If you have a lot of empty or mostly empty chunks, it's not unusual for a device delete to look like this until you start hitting chunks that have actual data in them. The primary point of this behavior is that it makes it possible to directly switch to a smaller device without having to run a balance and then a resize before replacing the device, and then resize again afterwards.
>
> Thanks, Austin. Your explanation is along the lines of my thinking though.
>
> The new disk should have had *some* data written to it at that point, as it started out at over 600GiB in allocation (should have probably mentioned that already). Consolidating or not, I would consider data being written to the old disk to be a bug, even if it is considered minor.
>
> I'll set up a reproducible test later today to prove/disprove the theory.
:)
>
> --
> Brendan Hide
> http://swiftspirit.co.za/
> http://www.webafrica.co.za/?AFF1E97
Re: Replacing drives with larger ones in a 4 drive raid1
This is somewhat off topic but...

9.6.2016, 18.20, Duncan wrote:
Are those the 8 TB SMR "archive" drives? I haven't been following the issue very closely, but be aware that there were serious issues with those drives a few kernels back, and that while those issues are now fixed, the drives themselves operate rather differently than normal drives, and simply don't work well in normal usage.

Either the issues were not fixed, or LSI Logic / Symbios Logic SAS3008 is incompatible with the drives (and an older model of theirs, which I don't have anymore), as well as Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05). I haven't been able to get the disks to fail with any other load but Btrfs. However, with that they fail spectacularly. They drop out and make enough mess to corrupt things beyond repair. (See https://www.spinics.net/lists/linux-btrfs/msg55218.html for more info.)

There's a slight chance that I missed some relevant kernel update. When I get new disks and can get the array fixed (it still only mounts read-only), I'll do some testing with the SMR drives. If they work, that's great, but at the moment I wouldn't buy them for Btrfs use even if the workload or environmental characteristics weren't a problem.

--
...Elämälle vierasta toimintaa... ("...Activity foreign to life...")
Jukka Larja, roskak...@aarghimedes.fi

"Are we feeling better then?" "I'm naming all the stars." "You can't see the stars, love. That's the ceiling. Also, it's day." "I can see them. But I've named them all the same name, and there's terrible confusion..." - Spike & Drusilla, Buffy the Vampire Slayer
Cannot balance FS (No space left on device)
[Please CC me since I'm not subscribed to the list]

Hi, I've tried to `/usr/bin/btrfs fi defragment -r` my btrfs partition, but it failed w/ "No space left on device" and now I can't get any free space on that partition (deleting some files or adding a new device doesn't help). During defrag I've used the `space_cache=v2` mount option, but remounted the FS w/ the `clear_cache` flag since then. Also I've deleted about 50Gb of files and added a new 250Gb disk since then:

>$ df -h /mnt/xxx/
>Filesystem      Size  Used Avail Use% Mounted on
>/dev/sdc1       2,1T  1,8T   37G  99% /mnt/xxx
>$ sudo /usr/bin/btrfs fi show
>Label: none  uuid: 8a65465d-1a8c-4f80-abc6-c818c38567c3
>Total devices 3 FS bytes used 1.78TiB
>devid 1 size 931.51GiB used 931.51GiB path /dev/sdc1
>devid 2 size 931.51GiB used 931.51GiB path /dev/sdb1
>devid 3 size 230.41GiB used 0.00B path /dev/sdd1
>$ sudo /usr/bin/btrfs fi usage /mnt/xxx/
>Overall:
>    Device size:         2.04TiB
>    Device allocated:    1.82TiB
>    Device unallocated:  230.41GiB
>    Device missing:      0.00B
>    Used:                1.78TiB
>    Free (estimated):    267.23GiB (min: 152.03GiB)
>    Data ratio:          1.00
>    Metadata ratio:      2.00
>    Global reserve:      512.00MiB (used: 0.00B)
>
>Data,RAID0: Size:1.81TiB, Used:1.78TiB
>   /dev/sdb1  928.48GiB
>   /dev/sdc1  928.48GiB
>
>Metadata,RAID1: Size:3.00GiB, Used:2.30GiB
>   /dev/sdb1  3.00GiB
>   /dev/sdc1  3.00GiB
>
>System,RAID1: Size:32.00MiB, Used:176.00KiB
>   /dev/sdb1  32.00MiB
>   /dev/sdc1  32.00MiB
>
>Unallocated:
>   /dev/sdb1  1.01MiB
>   /dev/sdc1  1.00MiB
>   /dev/sdd1  230.41GiB
>$ sudo /usr/bin/btrfs balance start -dusage=66 /mnt/xxx/
>Done, had to relocate 0 out of 935 chunks
>$ sudo /usr/bin/btrfs balance start -dusage=67 /mnt/xxx/
>ERROR: error during balancing '/mnt/xxx/': No space left on device
>There may be more info in syslog - try dmesg | tail

I assume that there is something wrong with metadata, since I can copy files to the FS.
I'm on 4.6.2 vanilla kernel and using btrfs-progs-4.6, btrfs-debugfs output can be found here: https://gist.githubusercontent.com/ojab/1a8b1f83341403a169a8e66995c7c3da/raw/61621d22f706d7543a93a3d005415543af9a0db0/gistfile1.txt. Any hint what else can I try to fix the issue?

//wbr ojab
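The jump above, from `-dusage=66` relocating nothing to `-dusage=67` hitting ENOSPC, suggests every remaining data chunk is fuller than 66%. The filter selection can be sketched as plain arithmetic over (length, used) pairs such as the ones in btrfs-debugfs output. This is my reading of the filter semantics, not the kernel code, and the boundary behaviour (strict vs. inclusive comparison, the usage=0 special case) has varied between kernel versions:

```python
def chunks_selected_by_dusage(chunks, limit_percent):
    """Return the block groups a `balance -dusage=N` pass would consider.

    `chunks` is a list of (length, used) byte pairs. A chunk qualifies
    when its usage percentage is at or below the filter limit.
    (Sketch of the filter semantics; the kernel uses a special-cased,
    version-dependent comparison.)
    """
    return [(length, used) for length, used in chunks
            if used * 100 <= length * limit_percent]

GiB = 1024 ** 3
chunks = [(GiB, int(GiB * 0.66)),  # ~66% full: qualifies for -dusage=66
          (GiB, int(GiB * 0.95)),  # 95% full: skipped
          (GiB, 0)]                # empty: always qualifies
picked = chunks_selected_by_dusage(chunks, 66)
```

Running such a calculation over the full btrfs-debugfs dump would show which threshold can relocate anything at all before metadata space runs out.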
Re: [PATCH] Btrfs-progs: add check-only option for balance
Hi all,

On 2016-06-09 23:46, Ashish Samant wrote:
> From: Liu Bo
>
> This aims to decide whether a balance can reduce the number of data block groups and if it can, this shows the '-dvrange' block group's objectid.
>
> With this, you can run 'btrfs balance start -c mnt' or 'btrfs balance start --check-only mnt'
>
> --
> $ btrfs balance start -c /mnt/btrfs
> Checking data block groups...
> block_group 12582912 (len 8388608 used 786432)
> block_group 1103101952 (len 1073741824 used 536870912)
> block_group 2176843776 (len 1073741824 used 1073741824)
> total bgs 3 total_free 544473088 min_used bg 12582912 has (min_used 786432 free 7602176)
> run 'btrfs balance start -dvrange=12582912..12582913 your_mnt'
>
> $ btrfs balance start -dvrange=12582912..12582913 /mnt/btrfs
> Done, had to relocate 1 out of 5 chunks
>
> $ btrfs balance start -c /mnt/btrfs
> Checking data block groups...
> block_group 1103101952 (len 1073741824 used 537395200)
> block_group 2176843776 (len 1073741824 used 1073741824)
> total bgs 2 total_free 536346624 min_used bg 1103101952 has (min_used 537395200 free 536346624)
> --
>
> So you now know how to babysit your btrfs in a smart way.

I think that it is an excellent tool. However I have some suggestions, most of them from a user interface POV:

1) This should be a real command; it doesn't make sense at all that this command is a "sub command" of "btrfs bal start". I have two suggestions about that:
a) we could add a new sub-command to the "balance" family, something like "btrfs bal analysis", where we could put some suggestions for a good balance
b) we could add a new sub-command to the "inspect" family. We could also add some features, like showing the other block groups (system and metadata), and printing their profile, i.e.
# btrfs inspect block-group-analysis /
Type      Mode    Start        Len       Used
Data      single  83806388224  1.00GiB   945.64MiB
Data      single  84880130048  1.00GiB   890.60MiB
Data      single  85953871872  1.00GiB   818.18MiB
Data      single  87027613696  1.00GiB   835.58MiB
Data      single  88101355520  1.00GiB   1023.91MiB
System    single  89175097344  32.00MiB  16.00KiB
Metadata  single  89208651776  1.00GiB   614.88MiB
[...]

Further options could be added, like showing only the most empty chunks, sorting by the Used value, filtering by type and/or profile.

2) From a readability POV, I suggest using the pretty_size() function to display a more readable "len" and "used".

3) For the same reason, I suggest switching to a "tabular" format, like my example: it doesn't make sense to write "block_group/len/used" on every line...

4) When the usual balance command fails because of ENOSPC, we could suggest using this new command.

More notes below.

BR
G.Baroncelli

>
> Signed-off-by: Liu Bo
> Signed-off-by: Ashish Samant
> ---
>  cmds-balance.c | 127 +++-
>  1 files changed, 126 insertions(+), 1 deletions(-)
>
> diff --git a/cmds-balance.c b/cmds-balance.c
> index 8f3bf5b..e2aab6c 100644
> --- a/cmds-balance.c
> +++ b/cmds-balance.c
> @@ -493,6 +493,116 @@ out:
>  	return ret;
>  }
>
> +/* return 0 if balance can remove a data block group, otherwise return 1 */
> +static int search_data_bgs(const char *path)
> +{
> +	struct btrfs_ioctl_search_args args;
> +	struct btrfs_ioctl_search_key *sk;
> +	struct btrfs_ioctl_search_header *header;
> +	struct btrfs_block_group_item *bg;
> +	unsigned long off = 0;
> +	DIR *dirstream = NULL;
> +	int e;
> +	int fd;
> +	int i;
> +	u64 total_free = 0;
> +	u64 min_used = (u64)-1;
> +	u64 free_of_min_used = 0;
> +	u64 bg_of_min_used = 0;
> +	u64 flags;
> +	u64 used;
> +	int ret = 0;
> +	int nr_data_bgs = 0;
> +
> +	fd = btrfs_open_dir(path, &dirstream, 1);
> +	if (fd < 0)
> +		return 1;
> +
> +	memset(&args, 0, sizeof(args));
> +	sk = &args.key;
> +
> +	sk->tree_id = BTRFS_EXTENT_TREE_OBJECTID;
> +	sk->min_objectid = sk->min_offset = sk->min_transid = 0;
> +	sk->max_objectid = sk->max_offset = sk->max_transid = (u64)-1;
> +	sk->max_type = sk->min_type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> +	sk->nr_items = 65536;
> +
> +	while (1) {
> +		ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> +		e = errno;
> +		if (ret < 0) {
> +			fprintf(stderr, "ret %d error '%s'\n", ret,
> +				strerror(e));
> +			return ret;
> +		}
> +		/*
> +		 * it should not happen.
> +		 */
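As the earlier note about `sk->max_type = sk->min_type` points out, the tree-search ioctl's key range is one lexicographic (objectid, type, offset) interval, so the type bound only clips the two ends, and each call must advance the search key one position past the last returned item or that item is returned again. The continuation step is pure key arithmetic; here is a sketch of it (no ioctl involved, just the carry logic):

```python
U64_MAX = 2 ** 64 - 1

def next_search_key(objectid, key_type, offset):
    """Advance a btrfs tree-search key one position past (objectid, type, offset).

    Keys are ordered lexicographically, so 'plus one' carries from
    offset into type and then into objectid. Returns None when the
    last possible key has already been reached.
    """
    if offset < U64_MAX:
        return (objectid, key_type, offset + 1)
    if key_type < 255:           # the type field is a single byte
        return (objectid, key_type + 1, 0)
    if objectid < U64_MAX:
        return (objectid + 1, 0, 0)
    return None                  # exhausted the whole key space

BTRFS_BLOCK_GROUP_ITEM_KEY = 192  # item type from the on-disk format
# Continue a scan after the BLOCK_GROUP_ITEM at (289406976, 192, 1073741824):
key = next_search_key(289406976, BTRFS_BLOCK_GROUP_ITEM_KEY, 1073741824)
```

Copying the last seen key verbatim, as the patch does, omits this "plus one" step, which is exactly the situation the extent-tree example above describes.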
Re: fsck: to repair or not to repair
On Fri, Jun 10, 2016 at 7:22 PM, Adam Borowski wrote:
> On Fri, Jun 10, 2016 at 01:12:42PM -0400, Austin S. Hemmelgarn wrote:
>> On 2016-06-10 12:50, Adam Borowski wrote:
>> > And, as of coreutils 8.25, the default is no reflink, with "never" not being recognized even as a way to avoid an alias. As far as I remember, this applies to every past version with support for reflinks too.
>>
>> Odd, I could have sworn that was an option...
>>
>> And I do know there was talk at least at one point of adding it and switching to reflink=auto by default.
>
> Yes please!
>
> It's hard to come up with a good reason for not reflinking when it's possible -- the only one I see is if you have a nocow VM and want to slightly improve speed at a cost of lots of disk space. And even then, there's cat a >b for that.

For a nocow VM image file, reflink anyhow does not work, so cp --reflink=auto would then just duplicate the whole thing, effectively doing a 'cp --reflink=never' (never works for --sparse), either silently or with a warning/note. For a cow VM image file, the only thing I do and want w.r.t. cp is reflink=always, so I also vote for auto on by default. If you want to 'defrag' a VM image file, using cat or dd and enough RAM does a better and faster job than cp or btrfs manual defrag.

> And the cost on non-btrfs non-unmerged-xfs is a single syscall per file, that's utterly negligible compared to actually copying the data.
>
> --
> An imaginary friend squared is a real enemy.
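Whether a given image file is actually nocow can be checked with the same `FS_IOC_GETFLAGS` ioctl that `lsattr` uses. A small sketch; the constants are the standard linux/fs.h values for 64-bit Linux, and anything non-Linux or a filesystem without inode flags simply reports "unknown":

```python
import fcntl
import os
import struct
import tempfile

FS_IOC_GETFLAGS = 0x80086601   # _IOR('f', 1, long) on 64-bit Linux
FS_NOCOW_FL = 0x00800000       # the 'C' attribute shown by lsattr/chattr

def is_nocow(path):
    """Return True/False for the NOCOW attribute, or None when the
    filesystem does not support inode flags at all."""
    with open(path, "rb") as f:
        buf = bytearray(struct.calcsize("l"))
        try:
            fcntl.ioctl(f.fileno(), FS_IOC_GETFLAGS, buf)
        except OSError:
            return None  # e.g. tmpfs: no inode flag support
        (inode_flags,) = struct.unpack("l", bytes(buf))
        return bool(inode_flags & FS_NOCOW_FL)

# Demo on a scratch file; on ext4 this yields False, on tmpfs None.
fd, scratch = tempfile.mkstemp()
os.close(fd)
result = is_nocow(scratch)
os.unlink(scratch)
```

Knowing up front that an image is nocow tells you a reflinking copy will be refused and a full data copy is the only option, which is the tradeoff being discussed here.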
Re: fsck: to repair or not to repair
On 2016-06-10 13:22, Adam Borowski wrote:
On Fri, Jun 10, 2016 at 01:12:42PM -0400, Austin S. Hemmelgarn wrote:
On 2016-06-10 12:50, Adam Borowski wrote:
And, as of coreutils 8.25, the default is no reflink, with "never" not being recognized even as a way to avoid an alias. As far as I remember, this applies to every past version with support for reflinks too.
Odd, I could have sworn that was an option...
And I do know there was talk at least at one point of adding it and switching to reflink=auto by default.
Yes please!
It's hard to come up with a good reason for not reflinking when it's possible -- the only one I see is if you have a nocow VM and want to slightly improve speed at a cost of lots of disk space. And even then, there's cat a >b for that.

There are other arguments, the most common one being not changing user-visible behavior. There are (misguided) people who expect copying a file to mean you have two distinct copies of that file. OTOH, it's not too hard to set up a system to do this, you just put:

alias cp='cp --reflink=auto'

into your bashrc (or something similar into whatever other shell you use). I've been doing this since cp added support for it.

And the cost on non-btrfs non-unmerged-xfs is a single syscall per file, that's utterly negligible compared to actually copying the data.

Actually, IIRC, it's an ioctl, not a syscall, which can be kind of expensive (I don't know how much more expensive, but ioctls are usually more expensive than syscalls). Other things to keep in mind though that may impact this (either way):
1. There are other filesystems that support reflinks (OCFS2 and ZFS come immediately to mind).
2. Most of the filesystems that support reflinks are used more in enterprise situations, where the bit about not changing user-visible behavior is a much stronger argument.
3. Even in enterprise situations, reflink-capable filesystems are still unusual outside of petabyte-scale data storage.
4.
Last I checked, the most widely used filesystem that supports reflinks (ZFS) uses a different ioctl interface for them than most other Linux filesystems, which means more checking is needed than just calling one ioctl.
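The `--reflink=auto` behaviour under discussion is straightforward to emulate: attempt the clone ioctl and fall back to a byte copy when the filesystem refuses. A sketch using the `FICLONE` ioctl (the constant is the standard linux/fs.h value; btrfs and reflink-enabled XFS accept it, most other filesystems return an error, which is the single-call probe mentioned above):

```python
import fcntl
import shutil

FICLONE = 0x40049409  # _IOW(0x94, 9, int) from linux/fs.h

def copy_auto_reflink(src, dst):
    """Copy src to dst, sharing extents via FICLONE when the filesystem
    allows it, and falling back to a plain data copy otherwise --
    roughly what `cp --reflink=auto` does."""
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return "reflinked"
        except OSError:
            pass  # EOPNOTSUPP/EINVAL/EXDEV: clone not possible here
        shutil.copyfileobj(fsrc, fdst)  # fsrc is still at offset 0
        return "copied"
```

On a filesystem without clone support the function silently degrades to a normal copy, which is exactly the user-visible-behaviour question the thread is debating.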
Re: fsck: to repair or not to repair
On Fri, Jun 10, 2016 at 01:12:42PM -0400, Austin S. Hemmelgarn wrote:
> On 2016-06-10 12:50, Adam Borowski wrote:
> > And, as of coreutils 8.25, the default is no reflink, with "never" not being recognized even as a way to avoid an alias. As far as I remember, this applies to every past version with support for reflinks too.
> >
> Odd, I could have sworn that was an option...
>
> And I do know there was talk at least at one point of adding it and switching to reflink=auto by default.

Yes please!

It's hard to come up with a good reason for not reflinking when it's possible -- the only one I see is if you have a nocow VM and want to slightly improve speed at a cost of lots of disk space. And even then, there's cat a >b for that.

And the cost on non-btrfs non-unmerged-xfs is a single syscall per file, that's utterly negligible compared to actually copying the data.

--
An imaginary friend squared is a real enemy.
Re: fsck: to repair or not to repair
On 2016-06-10 12:50, Adam Borowski wrote:
On Fri, Jun 10, 2016 at 08:54:36AM -0700, Nikolaus Rath wrote:
On Jun 10 2016, "Austin S. Hemmelgarn" wrote:
JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid it making reflinks.
I would have expected so, but at least in coreutils 8.23 the only valid options are "never" and "auto" (at least according to cp --help and the manpage).
Where do you get "never" from?

cp: invalid argument ‘never’ for ‘--reflink’
Valid arguments are:
  - ‘auto’
  - ‘always’
Try 'cp --help' for more information.

And, as of coreutils 8.25, the default is no reflink, with "never" not being recognized even as a way to avoid an alias. As far as I remember, this applies to every past version with support for reflinks too.

Odd, I could have sworn that was an option...

And I do know there was talk at least at one point of adding it and switching to reflink=auto by default.
Re: btrfs filesystem keeps allocating new chunks for no apparent reason
On Thu, Jun 9, 2016 at 5:41 PM, Duncan <1i5t5.dun...@cox.net> wrote: > Hans van Kranenburg posted on Thu, 09 Jun 2016 01:10:46 +0200 as > excerpted: > >> The next question is what files these extents belong to. To find out, I >> need to open up the extent items I get back and follow a backreference >> to an inode object. Might do that tomorrow, fun. >> >> To be honest, I suspect /var/log and/or the file storage of mailman to >> be the cause of the fragmentation, since there's logging from postfix, >> mailman and nginx going on all day long in a slow but steady tempo. >> While using btrfs for a number of use cases at work now, we normally >> don't use it for the root filesystem. And the cases where it's used as >> root filesystem don't do much logging or mail. > > FWIW, that's one reason I have a dedicated partition (and filesystem) for > logs, here. (The other reason is that should something go runaway log- > spewing, I get a warning much sooner when my log filesystem fills up, not > much later, with much worse implications, when the main filesystem fills > up!) > >> And no, autodefrag is not in the mount options currently. Would that be >> helpful in this case? > > It should be helpful, yes. Be aware that autodefrag works best with > smaller (sub-half-gig) files, however, and that it used to cause > performance issues with larger database and VM files, in particular. I don't know why you relate filesize and autodefrag. Maybe because you say '... used to cause ...'. autodefrag detects random writes and then tries to defrag a certain range. Its scope size is 256K as far as I see from the code and over time you see VM images that are on a btrfs fs (CoW, hourly ro snapshots) having a lot of 256K (or a bit less) sized extents according to what filefrag reports. I once wanted to try and change the 256K to 1M or even 4M, but I haven't come to that. A 32G VM image would consist of 131072 extents for 256K, 32768 extents for 1M, 8192 extents for 4M. 
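The extent counts quoted above follow directly from dividing the image size by the defrag target extent size. A quick check of the arithmetic (assuming every extent comes out at exactly the target size, which real images only approximate, since filefrag reports "256K or a bit less"):

```python
def extent_count(image_bytes, extent_bytes):
    """Extents needed if the file is carved into equal extents of the
    given size (assumes the extent size divides the image size evenly)."""
    return image_bytes // extent_bytes

KiB = 1024
GiB = 1024 ** 3
image = 32 * GiB  # the 32G VM image from the discussion
counts = {size: extent_count(image, size)
          for size in (256 * KiB, 1024 * KiB, 4096 * KiB)}
# maps the 256K/1M/4M target sizes to their extent counts
```

This shows why raising the autodefrag target from 256K would cut the extent count by 4x per doubling-squared step, at the cost of rewriting more data per random write.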
> There used to be a warning on the wiki about that, that was recently > removed, so apparently it's not the issue that it was, but you might wish > to monitor any databases or VMs with gig-plus files to see if it's going > to be a performance issue, once you turn on autodefrag. For very active databases, I don't know what the effects are, with or without autodefrag ( either on SSD and/or HDD). At least on HDD-only, so no persistent SSD caching and noautodefrag, VMs will result in unacceptable performance soon. > The other issue with autodefrag is that if it hasn't been on and things > are heavily fragmented, it can at first drive down performance as it > rewrites all these heavily fragmented files, until it catches up and is > mostly dealing only with the normal refragmentation load. I assume you mean that one only gets a performance drop if you actually do new writes to the fragmented files since autodefrag on. It shouldn't start defragging by itself AFAIK. > Of course the > best way around that is to run autodefrag from the first time you mount > the filesystem and start writing to it, so it never gets overly > fragmented in the first place. For a currently in-use and highly > fragmented filesystem, you have two choices, either backup and do a fresh > mkfs.btrfs so you can start with a clean filesystem and autodefrag from > the beginning, or doing manual defrag. > > However, be aware that if you have snapshots locking down the old extents > in their fragmented form, a manual defrag will copy the data to new > extents without releasing the old ones as they're locked in place by the > snapshots, thus using additional space. Worse, if the filesystem is > already heavily fragmented and snapshots are locking most of those > fragments in place, defrag likely won't help a lot, because the free > space as well will be heavily fragmented. So starting off with a clean > and new filesystem and using autodefrag from the beginning really is your > best bet. 
If it is about a multi-TB fs, I think the most important thing is to have enough unfragmented free space available, and hopefully at the beginning of the device if it is a flat HDD. Maybe a balance -ddrange=1M..<20% of device> can do that, I haven't tried.
Re: fsck: to repair or not to repair
On Jun 10 2016, Adam Borowski wrote:
> On Fri, Jun 10, 2016 at 08:54:36AM -0700, Nikolaus Rath wrote:
>> On Jun 10 2016, "Austin S. Hemmelgarn" wrote:
>> > JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid it making reflinks.
>>
>> I would have expected so, but at least in coreutils 8.23 the only valid options are "never" and "auto" (at least according to cp --help and the manpage).
>
> Where do you get "never" from?

I meant to write "always" (as in my second mail; I thought I hit "cancel" quickly enough).

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: fsck: to repair or not to repair
On Fri, Jun 10, 2016 at 08:54:36AM -0700, Nikolaus Rath wrote:
> On Jun 10 2016, "Austin S. Hemmelgarn" wrote:
> > JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid
> > it making reflinks.
>
> I would have expected so, but at least in coreutils 8.23 the only valid
> options are "never" and "auto" (at least according to cp --help and the
> manpage).

Where do you get "never" from?

  cp: invalid argument ‘never’ for ‘--reflink’
  Valid arguments are:
    - ‘auto’
    - ‘always’
  Try 'cp --help' for more information.

And, as of coreutils 8.25, the default is no reflink, with "never" not
being recognized even as a way to override an alias. As far as I
remember, this applies to every past version with support for reflinks
too.

--
An imaginary friend squared is a real enemy.
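Since acceptance of `--reflink=never` varies across coreutils releases (the versions discussed in this thread reject it), a script that wants a guaranteed non-reflinked copy can probe for it and fall back to a plain copy, which on those releases never reflinks anyway unless an alias or wrapper adds `--reflink` behind your back:

```shell
#!/bin/sh
# Probe whether this cp accepts --reflink=never; fall back to plain cp.
src=$(mktemp)
dst=$(mktemp -u)
echo "some data" > "$src"

if cp --reflink=never "$src" "$dst" 2>/dev/null; then
    echo "cp supports --reflink=never"
else
    # Older coreutils: plain cp does not reflink by default.
    cp "$src" "$dst"
    echo "fell back to plain cp"
fi

# Either way, the destination must be a byte-identical copy.
cmp -s "$src" "$dst" && echo "copy verified"
```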
Re: [PATCH 2/2] btrfs: prefix fsid to all trace events
On Thu, Jun 09, 2016 at 07:48:01PM -0400, je...@suse.com wrote:
> From: Jeff Mahoney
>
> When using trace events to debug a problem, it's impossible to determine
> which file system generated a particular event. This patch adds a
> macro to prefix standard information to the head of a trace event.
>
> The extent_state alloc/free events are all that's left without an
> fs_info available.

Looks good to me.

Reviewed-by: Liu Bo

Thanks,

-liubo

> Signed-off-by: Jeff Mahoney
> ---
>  fs/btrfs/delayed-ref.c       |   9 +-
>  fs/btrfs/extent-tree.c       |  10 +-
>  fs/btrfs/qgroup.c            |  19 +--
>  fs/btrfs/qgroup.h            |   9 +-
>  fs/btrfs/super.c             |   2 +-
>  include/trace/events/btrfs.h | 282 ---
>  6 files changed, 182 insertions(+), 149 deletions(-)
>
> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
> index 430b368..e7b1ec0 100644
> --- a/fs/btrfs/delayed-ref.c
> +++ b/fs/btrfs/delayed-ref.c
> @@ -606,7 +606,8 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
>  	qrecord->num_bytes = num_bytes;
>  	qrecord->old_roots = NULL;
>
> -	qexisting = btrfs_qgroup_insert_dirty_extent(delayed_refs,
> +	qexisting = btrfs_qgroup_insert_dirty_extent(fs_info,
> +						     delayed_refs,
>  						     qrecord);
>  	if (qexisting)
>  		kfree(qrecord);
> @@ -615,7 +616,7 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
>  	spin_lock_init(&head_ref->lock);
>  	mutex_init(&head_ref->mutex);
>
> -	trace_add_delayed_ref_head(ref, head_ref, action);
> +	trace_add_delayed_ref_head(fs_info, ref, head_ref, action);
>
>  	existing = htree_insert(&delayed_refs->href_root,
>  				&head_ref->href_node);
> @@ -682,7 +683,7 @@ add_delayed_tree_ref(struct btrfs_fs_info *fs_info,
>  	ref->type = BTRFS_TREE_BLOCK_REF_KEY;
>  	full_ref->level = level;
>
> -	trace_add_delayed_tree_ref(ref, full_ref, action);
> +	trace_add_delayed_tree_ref(fs_info, ref, full_ref, action);
>
>  	ret = add_delayed_ref_tail_merge(trans, delayed_refs, head_ref, ref);
>
> @@ -739,7 +740,7 @@ add_delayed_data_ref(struct btrfs_fs_info *fs_info,
>  	full_ref->objectid = owner;
>  	full_ref->offset = offset;
>
> -	trace_add_delayed_data_ref(ref, full_ref, action);
> +	trace_add_delayed_data_ref(fs_info, ref, full_ref, action);
>
>  	ret = add_delayed_ref_tail_merge(trans, delayed_refs, head_ref, ref);
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 689d25a..ecb68bb 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2194,7 +2194,7 @@ static int run_delayed_data_ref(struct btrfs_trans_handle *trans,
>  	ins.type = BTRFS_EXTENT_ITEM_KEY;
>
>  	ref = btrfs_delayed_node_to_data_ref(node);
> -	trace_run_delayed_data_ref(node, ref, node->action);
> +	trace_run_delayed_data_ref(root->fs_info, node, ref, node->action);
>
>  	if (node->type == BTRFS_SHARED_DATA_REF_KEY)
>  		parent = ref->parent;
> @@ -2349,7 +2349,7 @@ static int run_delayed_tree_ref(struct btrfs_trans_handle *trans,
>  		   SKINNY_METADATA);
>
>  	ref = btrfs_delayed_node_to_tree_ref(node);
> -	trace_run_delayed_tree_ref(node, ref, node->action);
> +	trace_run_delayed_tree_ref(root->fs_info, node, ref, node->action);
>
>  	if (node->type == BTRFS_SHARED_BLOCK_REF_KEY)
>  		parent = ref->parent;
> @@ -2413,7 +2413,8 @@ static int run_one_delayed_ref(struct btrfs_trans_handle *trans,
>  	 */
>  	BUG_ON(extent_op);
>  	head = btrfs_delayed_node_to_head(node);
> -	trace_run_delayed_ref_head(node, head, node->action);
> +	trace_run_delayed_ref_head(root->fs_info, node, head,
> +				   node->action);
>
>  	if (insert_reserved) {
>  		btrfs_pin_extent(root, node->bytenr,
> @@ -8316,7 +8317,8 @@ static int record_one_subtree_extent(struct btrfs_trans_handle *trans,
>
>  	delayed_refs = &trans->transaction->delayed_refs;
>  	spin_lock(&delayed_refs->lock);
> -	if (btrfs_qgroup_insert_dirty_extent(delayed_refs, qrecord))
> +	if (btrfs_qgroup_insert_dirty_extent(trans->root->fs_info,
> +					     delayed_refs, qrecord))
>  		kfree(qrecord);
>  	spin_unlock(&delayed_refs->lock);
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 9d4c05b..13e28d8 100644
> @@ -1453,9 +1453,10 @@ int
btrfs_qgroup_prepare_account_extents(struct > btrfs_trans_handle *trans, > return ret; > } > > -struct
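For context on where the trace events touched by this patch end up: btrfs trace events are consumed through the standard kernel tracefs interface. A minimal sketch (requires root and a kernel with tracing enabled; the script skips quietly otherwise):

```shell
#!/bin/sh
# Sketch: enable and read the btrfs trace events (add_delayed_ref_head,
# run_delayed_data_ref, etc.) via tracefs. Needs root; skips otherwise.
TRACEFS=/sys/kernel/debug/tracing

if [ -w "$TRACEFS/trace" ]; then
    # Turn on every event in the btrfs group.
    echo 1 > "$TRACEFS/events/btrfs/enable"
    # ... generate some btrfs activity here, then inspect the buffer:
    head -n 20 "$TRACEFS/trace"
    # Turn the events back off.
    echo 0 > "$TRACEFS/events/btrfs/enable"
else
    echo "tracefs not writable here; run as root on a tracing-enabled kernel"
fi
```

With this patch applied, each event line carries the filesystem's fsid, so output from multiple mounted btrfs filesystems can be told apart.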
Re: fsck: to repair or not to repair
On Jun 10 2016, "Austin S. Hemmelgarn" wrote:
> JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid
> it making reflinks.

I would have expected so, but at least in coreutils 8.23 the only valid
options are "always" and "auto" (at least according to cp --help and the
manpage).

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: fsck: to repair or not to repair
On Jun 10 2016, "Austin S. Hemmelgarn" wrote:
> JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid
> it making reflinks.

I would have expected so, but at least in coreutils 8.23 the only valid
options are "never" and "auto" (at least according to cp --help and the
manpage).

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
[PATCH] btrfs-progs: doc: add missing newline in btrfs-convert
Signed-off-by: Noah Massey
---
 Documentation/btrfs-convert.asciidoc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/btrfs-convert.asciidoc b/Documentation/btrfs-convert.asciidoc
index 28f9a39..ab3577d 100644
--- a/Documentation/btrfs-convert.asciidoc
+++ b/Documentation/btrfs-convert.asciidoc
@@ -90,6 +90,7 @@ are supported by old kernels. To disable a feature, prefix it with '^'.
 To see all available features that btrfs-convert supports run:
 +
 +btrfs-convert -O list-all+
++
 -p|--progress::
 show progress of conversion, on by default
 --no-progress::
--
2.8.1
Btrfs progs release 4.6
Hi,

btrfs-progs 4.6 has been released (no changes since rc1). The biggest
change is the btrfs-convert rewrite. The delayed release was caused by
more testing, as there were some late fixes to the code, although the
patchset had been in the development branch for a long time. Apart from
that, the usual load of small fixes and improvements.

* convert - major rewrite:
  * fix a long-standing bug that led to mixing data blocks into metadata
    block groups
  * the workaround was to do a full balance after conversion, which was
    recommended practice anyway
  * explicitly set the lowest supported version of e2fsprogs to 1.41
* provide and install a udev rules file that addresses problems with
  device mapper devices, renames after removal
* send: new option: quiet
* dev usage: report slack space (device size minus filesystem area on
  the device)
* image: support DUP
* build: short options to enable debugging builds
* other:
  * code cleanups
  * build fixes
  * more tests and other enhancements

Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git

Shortlog:

Anand Jain (2):
      btrfs-progs: makefile: add clean-all to the usage
      btrfs-progs: clean up commands.h

David Sterba (22):
      btrfs-progs: build: add support for debugging builds
      btrfs-progs: docs: compression is disabled with nodatasum/nodatacow
      btrfs-progs: device usage: report slack space
      btrfs-progs: makefile: add target for testing installation
      btrfs-progs: drop O_CREATE from open_ctree_fs_info
      btrfs-progs: fix type mismatch in backtrace dumping functions
      btrfs-progs: switch to common message helpers in utils.c
      btrfs-progs: tests: convert, run md5sum with sudo helper
      btrfs-progs: tests: run rollback after conversion
      btrfs-progs: tests: convert: dump all superblocks after conversion
      btrfs-progs: tests: document cli-tests in readme
      btrfs-progs: use wider int type in btrfs_min_global_blk_rsv_size
      btrfs-progs: tests: move convert helpers to a separate file
      btrfs-progs: tests: convert: separate ext2 tests
      btrfs-progs: tests: convert: separate ext3 tests
      btrfs-progs: tests: convert: separate ext4 tests
      btrfs-progs: tests: clean up the test driver of convert tests
      btrfs-progs: tests: convert: set common variables
      btrfs-progs: tests: unify test drivers
      btrfs-progs: tests: 004-ext2-backup-superblock-ranges, drop unnecessary root privs
      btrfs-progs: tests: 004-ext2-backup-superblock-ranges, use common helpers for image loop
      Btrfs progs v4.6

Jeff Mahoney (1):
      btrfs-progs: udev: add rules for dm devices

Lu Fengqi (2):
      btrfs-progs: tests: add 020-extent-ref-cases
      btrfs-progs: make btrfs-image restore to support dup

M G Berberich (1):
      btrfs-progs: send: add quiet option

Merlin Hartley (1):
      btrfs-progs: doc: fix typo in btrfs-subvolume

Nicholas D Steeves (1):
      btrfs-progs: typo review of strings and comments

Qu Wenruo (36):
      btrfs-progs: Enhance tree block check by checking empty leaf or node
      btrfs-progs: Return earlier for previous item
      btrfs-progs: convert-tests: Add test for backup superblock migration
      btrfs-progs: corrupt-block: Add support to corrupt extent for skinny metadata
      btrfs-progs: utils: Introduce new pseudo random API
      btrfs-progs: Use new random number API
      btrfs-progs: convert-tests: Add support for custom test scripts
      btrfs-progs: convert-tests: Add test case for backup superblock migration
      btrfs-progs: convert: add compatibility layer for e2fsprogs < 1.42
      btrfs-progs: convert: Introduce functions to read used space
      btrfs-progs: convert: Introduce new function to remove reserved ranges
      btrfs-progs: convert: Introduce function to calculate the available space
      btrfs-progs: utils: Introduce new function for convert
      btrfs-progs: Introduce function to setup temporary superblock
      btrfs-progs: Introduce function to setup temporary tree root
      btrfs-progs: Introduce function to setup temporary chunk root
      btrfs-progs: Introduce function to initialize device tree
      btrfs-progs: Introduce function to initialize fs tree
      btrfs-progs: Introduce function to initialize csum tree
      btrfs-progs: Introduce function to setup temporary extent tree
      btrfs-progs: Introduce function to create convert data chunks
      btrfs-progs: extent-tree: Introduce function to find the first overlapping extent
      btrfs-progs: extent-tree: Enhance btrfs_record_file_extent
      btrfs-progs: convert: Introduce new function to create converted image
      btrfs-progs: convert: Introduce function to migrate reserved ranges
      btrfs-progs: convert: Enhance record_file_blocks to handle reserved ranges
      btrfs-progs: convert: Introduce init_btrfs_v2 function.
      btrfs-progs: Introduce
Re: fsck: to repair or not to repair
On 2016-06-09 23:40, Nikolaus Rath wrote:
> On May 11 2016, Nikolaus Rath wrote:
>> Hello,
>>
>> I recently ran btrfsck on one of my file systems, and got the following
>> messages:
>>
>> checking extents
>> checking free space cache
>> checking fs roots
>> root 5 inode 3149867 errors 400, nbytes wrong
>> root 5 inode 3150237 errors 400, nbytes wrong
>> root 5 inode 3150238 errors 400, nbytes wrong
>> root 5 inode 3150242 errors 400, nbytes wrong
>> root 5 inode 3150260 errors 400, nbytes wrong
>> [ lots of similar messages with different inode numbers ]
>> root 5 inode 15595011 errors 400, nbytes wrong
>> root 5 inode 15595016 errors 400, nbytes wrong
>> Checking filesystem on /dev/mapper/vg0-nikratio_crypt
>> UUID: 8742472d-a9b0-4ab6-b67a-5d21f14f7a38
>> found 263648960636 bytes used err is 1
>> total csum bytes: 395314372
>> total tree bytes: 908644352
>> total fs tree bytes: 352735232
>> total extent tree bytes: 95039488
>> btree space waste bytes: 156301160
>> file data blocks allocated: 675209801728
>>  referenced 410351722496
>> Btrfs v3.17
>>
>> Can someone explain to me the risk that I run by attempting a repair,
>> and (conversely) what I put at stake when continuing to use this file
>> system as-is?
>
> To follow up on this: after finding out which files were affected (using
> btrfs inspect-internal), I was able to fix the problem without using
> btrfsck, by simply copying the data, deleting the file, and restoring it:
>
> cat affected-files.txt | while read -r name; do
>     rsync -a "${name}" "/backup/location/${name}"
>     rm -f "${name}"
>     cp -a "/backup/location/${name}" "${name}"
> done
>
> (I used rsync to avoid cp making use of reflinks.)
>
> After this procedure, btrfsck reported no more problems.

JFYI, if you're using GNU cp, you can pass '--reflink=never' to avoid
it making reflinks.
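The back-up/delete/restore loop quoted above can be demonstrated end to end on throwaway files. This is a runnable sketch, not the original procedure: the temp directories stand in for the affected files and backup location, and plain `cp -a` stands in for `rsync -a` only to keep the demo dependency-free:

```shell
#!/bin/sh
# Demo of the rewrite loop: back each file up, remove it, restore it,
# so the restored copy is freshly written (no shared/reflinked extents).
work=$(mktemp -d)     # stand-in for the directory with affected files
backup=$(mktemp -d)   # stand-in for /backup/location
printf 'file one\n' > "$work/a.txt"
printf 'file two\n' > "$work/b.txt"
ls "$work" > "$work.list"   # stand-in for affected-files.txt

while read -r name; do
    cp -a "$work/$name" "$backup/$name"   # back up (rsync -a in the mail)
    rm -f "$work/$name"                   # drop the old inode
    cp -a "$backup/$name" "$work/$name"   # restore as newly written data
done < "$work.list"

cmp -s "$work/a.txt" "$backup/a.txt" && echo "round-trip ok"
```

On btrfs, the restoring `cp` would itself need to not reflink (hence rsync in the original mail, or `--reflink=never` where the installed coreutils accepts it).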
Managing storage (incl. Btrfs) on Linux with openATTIC
Hi there,

if you're using Btrfs on Linux for file serving purposes, I'd like to
invite you to take a look at our open source storage management project
"openATTIC": http://openattic.org/

We provide a web UI and RESTful API to create CIFS/NFS shares on top of
Btrfs and other file systems, including monitoring and snapshots. Other
file systems like ext4, XFS or ZFS are supported, too. We also support
sharing block volumes via iSCSI and Fibre Channel via LIO, and we are
currently working on adding Ceph management and monitoring support as
well.

openATTIC 2.0 is currently in development and we're looking for more
testers and feedback. Packages for Debian/Ubuntu, RHEL/CentOS and SUSE
are available via apt/yum repos.

For the time being, we don't yet support all the nifty Btrfs features
(e.g. RAID levels), but you can already use openATTIC to manage (e.g.
create and snapshot) and monitor Btrfs file systems via the web UI. We
plan to further extend the Btrfs functionality incrementally with each
release. Some use cases we have in mind are documented here:
https://wiki.openattic.org/display/OP/openATTIC+Storage+Management+Use+Cases

So if you're looking for a free (GPLv2) storage management tool that
supports your favorite file system, we'd be glad if you gave openATTIC a
try!

Thanks, and sorry for the noise,

Lenz
--
Lenz Grimmer - http://www.lenzg.net/
Re: How to map extents to files
At 06/02/2016 10:56 PM, Nikolaus Rath wrote:
> On Jun 02 2016, Qu Wenruo wrote:
>> At 06/02/2016 11:06 AM, Nikolaus Rath wrote:
>>> Hello,
>>>
>>> For one of my btrfs volumes, btrfsck reports a lot of the following
>>> warnings:
>>> [...]
>>> checking extents
>>> bad extent [138477568, 138510336), type mismatch with chunk
>>> bad extent [140091392, 140148736), type mismatch with chunk
>>> bad extent [140148736, 140201984), type mismatch with chunk
>>> bad extent [140836864, 140865536), type mismatch with chunk
>>> [...]
>>>
>>> Is there a way to discover which files are affected by this (in
>>> particular so that I can take a look at them before and after a
>>> btrfsck --repair)?
>>
>> Which version is the progs? If the fs is not converted from ext2/3/4,
>> it may be a false alert.
>
> Version is 4.4.1. The fs may very well have been converted from ext4,
> but I can't tell for sure.

Sorry for the late reply.

For such a case, btrfsck --repair is unable to fix it, as btrfs-progs is
not able to balance extents. Normally, a full balance would fix it.

I would try updating btrfs-progs to 4.5 and rechecking, to see if it's a
false alert. If not, then remove unused snapshots and do the full
balance. Deleting unused snapshots is recommended because with too many
snapshots, balance may be quite slow.

Thanks,
Qu
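The snapshot-cleanup-then-full-balance advice above can be sketched as follows. The mountpoint and the example snapshot path are hypothetical placeholders, and the balance runs only if the path is really a mountpoint:

```shell
#!/bin/sh
# Sketch: drop unused snapshots, then run a full balance so extents get
# rewritten into chunks of the matching type. /mnt/btrfs is a placeholder.
MNT=/mnt/btrfs

if mountpoint -q "$MNT"; then
    # List snapshots to decide which ones are disposable:
    btrfs subvolume list -s "$MNT"
    # Delete the ones you no longer need, e.g.:
    # btrfs subvolume delete "$MNT/snapshots/2016-01-01"
    # Then rewrite all chunks (can take many hours on a multi-TB fs):
    btrfs balance start --full-balance "$MNT"
else
    echo "skipping: $MNT not mounted (edit the placeholder first)"
fi
```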