Re: Regression with crc32c selection?
$ uname -a
Linux nas 4.17.0-1-amd64 #1 SMP Debian 4.17.8-1 (2018-07-20) x86_64 GNU/Linux

$ dmesg | grep Btrfs
[8.168408] Btrfs loaded, crc32c=crc32c-intel

$ lsmod | grep crc32
crc32_pclmul 16384 0
libcrc32c 16384 1 btrfs
crc32c_generic 16384 0
crc32c_intel 24576 2

$ grep CRC /boot/config-4.17.0-1-amd64
# CONFIG_PCIE_ECRC is not set
# CONFIG_W1_SLAVE_DS2433_CRC is not set
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRYPTO_CRC32C_INTEL=m
CONFIG_CRYPTO_CRC32=m
CONFIG_CRYPTO_CRC32_PCLMUL=m
CONFIG_CRYPTO_CRCT10DIF=y
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC_T10DIF=y
CONFIG_CRC_ITU_T=m
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC4 is not set
CONFIG_CRC7=m
CONFIG_LIBCRC32C=m
CONFIG_CRC8=m

On Mon, 23 Jul 2018 at 16:14, Holger Hoffstätte wrote:
>
> Hi,
>
> While backporting a bunch of fixes to my own 4.16.x tree
> (4.17 had a few too many bugs for my taste) I also ended up merging:
>
> df91f56adce1f: libcrc32c: Add crc32c_impl function
> 9678c54388b6a: btrfs: Remove custom crc32c init code
>
> ..which AFAIK went into 4.17 and seemed harmless enough; after fixing up
> a trivial context conflict it builds, runs, all good..except that btrfs
> (apparently?) no longer uses the preferred crc32c-intel module, but the
> crc32c-generic one instead.
>
> In order to rule out any mistakes on my part I built 4.18.0-rc6 and it
> seems to have the same problem:
>
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x1 gen() 11267 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x1 xor() 8110 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x2 gen() 13409 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x2 xor() 9137 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x4 gen() 15884 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: sse2x4 xor() 10579 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: using algorithm sse2x4 gen() 15884 MB/s
> Jul 23 15:55:09 ragnarok kernel: raid6: xor() 10579 MB/s, rmw enabled
> Jul 23 15:55:09 ragnarok kernel: raid6: using ssse3x2 recovery algorithm
> Jul 23 15:55:09 ragnarok kernel: xor: automatically using best checksumming function avx
> Jul 23 15:55:09 ragnarok kernel: Btrfs loaded, crc32c=crc32c-generic
>
> I understand that the new crc32c_impl() function changed from
> crypto_tfm_alg_driver_name() to crypto_shash_driver_name() - could this
> be the reason? The module is loaded just fine, but apparently not used:
>
> $lsmod | grep crc32
> crc32_pclmul 16384 0
> crc32c_intel 24576 0
>
> In other words, is this supposed to happen or is my kernel config somehow
> no longer right? It worked before and doesn't look too wrong:
>
> $grep CRC /etc/kernels/kernel-config-x86_64-4.18.0-rc6
> # CONFIG_PCIE_ECRC is not set
> CONFIG_CRYPTO_CRC32C=y
> CONFIG_CRYPTO_CRC32C_INTEL=m
> CONFIG_CRYPTO_CRC32=m
> CONFIG_CRYPTO_CRC32_PCLMUL=m
> # CONFIG_CRYPTO_CRCT10DIF is not set
> CONFIG_CRC_CCITT=m
> CONFIG_CRC16=y
> # CONFIG_CRC_T10DIF is not set
> CONFIG_CRC_ITU_T=y
> CONFIG_CRC32=y
> # CONFIG_CRC32_SELFTEST is not set
> CONFIG_CRC32_SLICEBY8=y
> # CONFIG_CRC32_SLICEBY4 is not set
> # CONFIG_CRC32_SARWATE is not set
> # CONFIG_CRC32_BIT is not set
> # CONFIG_CRC4 is not set
> # CONFIG_CRC7 is not set
> CONFIG_LIBCRC32C=y
> # CONFIG_CRC8 is not set
>
> Ultimately btrfs (and everything else) works, but the process of how
> the kernel selects a crc32c implementation seems rather mysterious to me. :/
>
> Any insights welcome. If it's a regression I can gladly test fixes.
>
> cheers
> Holger

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
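For anyone comparing the two setups above: the implementations registered with the kernel crypto API, and their priorities, are listed in /proc/crypto, and as far as I know the API generally hands out the highest-priority implementation that is registered at the time of the request. The following is only a sketch of how that listing could be inspected from user space. It is not kernel code, and best_driver() is a hypothetical helper name invented for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan text in /proc/crypto format ("key : value" lines, one block per
 * implementation) and copy the driver name of the highest-priority
 * entry whose "name" field equals alg into best.
 * Returns that priority, or -1 if no entry matches.
 * Hypothetical helper for illustration only.
 */
int best_driver(const char *text, const char *alg, char *best, size_t best_len)
{
	char name[64] = "", driver[64] = "";
	int best_prio = -1;
	const char *line = text;

	while (*line) {
		const char *end = strchr(line, '\n');
		size_t len = end ? (size_t)(end - line) : strlen(line);
		char buf[128], key[32], val[64];

		/* Work on a NUL-terminated copy of the current line. */
		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;
		memcpy(buf, line, len);
		buf[len] = '\0';

		if (sscanf(buf, "%31s : %63s", key, val) == 2) {
			if (strcmp(key, "name") == 0) {
				strcpy(name, val);
			} else if (strcmp(key, "driver") == 0) {
				strcpy(driver, val);
			} else if (strcmp(key, "priority") == 0 &&
				   strcmp(name, alg) == 0 &&
				   atoi(val) > best_prio) {
				best_prio = atoi(val);
				snprintf(best, best_len, "%s", driver);
			}
		}
		line = end ? end + 1 : line + len;
	}
	return best_prio;
}
```

Feeding it the contents of /proc/crypto with alg set to "crc32c" shows which driver would win among those currently registered. It says nothing about which one a user such as libcrc32c picked earlier during boot, before a module was loaded.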
Re: Ongoing Btrfs stability issues
On 9 March 2018 at 20:05, Alex Adriaanse wrote:
>
> Yes, we have PostgreSQL databases running these VMs that put a heavy I/O load
> on these machines.

Dump the databases and recreate them with --data-checksums and the Btrfs
No_COW attribute.

You can add this to /etc/postgresql-common/createcluster.conf in
Debian/Ubuntu if you use pg_createcluster:

initdb_options = '--data-checksums'
Re: btrfs-progs - failed btrfs replace on RAID1 seems to have left things in a wrong state
On 1 December 2017 at 08:18, Duncan <1i5t5.dun...@cox.net> wrote:
>
> When udev sees a device it triggers
> a btrfs device scan, which lets btrfs know which devices belong to which
> individual btrfs. But once it associates a device with a particular
> btrfs, there's nothing to unassociate it -- the only way to do that on
> a running kernel is to successfully complete a btrfs device remove or
> replacement... and your replace didn't complete due to error.
>
> Of course the other way to do it is to reboot, fresh kernel, fresh
> btrfs state, and it learns again what devices go with which btrfs
> when the appearing devices trigger the udev rule that triggers a
> btrfs scan.

Or reload the btrfs module.
Re: A partially failing disk in raid0 needs replacement
On 14 November 2017 at 09:36, Klaus Agnoletti wrote:
>
> How do you guys think I should go about this?

I'd clone the disk with GNU ddrescue.

https://www.gnu.org/software/ddrescue/
Re: Please help with exact actions for raid1 hot-swap
On 10 September 2017 at 08:33, Marat Khalili <m...@rqc.ru> wrote:
> It doesn't need replaced disk to be readable, right?

Only enough to be mountable, which it already is, so your read errors
on /dev/sdb aren't a problem.

> Then what prevents same procedure to work without a spare bay?

It is basically the same procedure but with a bunch of gotchas due to
bugs and odd behaviour. Only having one shot at it, before it can only
be mounted read-only, is especially problematic (will be fixed in
Linux 4.14).
Re: Please help with exact actions for raid1 hot-swap
On 9 September 2017 at 12:05, Marat Khalili wrote:
> Forgot to add, I've got a spare empty bay if it can be useful here.

That makes it much easier since you don't have to mount it degraded,
with the risks involved.

Add and partition the disk.

# btrfs replace start /dev/sdb7 /dev/sdc(?)7 /mnt/data

Remove the old disk when it is done.
Re: Please help with exact actions for raid1 hot-swap
On 9 September 2017 at 09:46, Marat Khalili wrote:
>
> Dear list,
>
> I'm going to replace one hard drive (partition actually) of a btrfs raid1.
> Can you please spell out exactly what I need to do in order to get my filesystem
> working as RAID1 again after replacement, exactly as it was before? I saw
> some bad examples of drive replacement in this list so I am afraid to just
> follow random instructions on wiki, and putting this system out of action
> even temporarily would be very inconvenient.

I recently replaced both disks in a two disk Btrfs raid1 to increase
capacity and took some notes.

Using systemd? systemd will automatically unmount a degraded disk and
ruin your one chance to replace the disk as long as Btrfs has the bug
where it notes single chunks and one disk missing and refuses to mount
degraded again.

Comment out your mount in fstab and run "systemctl daemon-reload". The
mount file in /var/run/systemd/generator/ will be removed. (Is there a
better way?)

Unmount the volume.

# hdparm -Y /dev/sdb
# echo 1 > /sys/block/sdb/device/delete

Replace the disk. Create partitions etc. You might have to restart
smartd, if using it.

Make Btrfs forget the old device. It will otherwise think the old disk
is still there. (Is there a better way?)

# rmmod btrfs; modprobe btrfs
# btrfs device scan

# mount -o degraded /dev/sda7 /mnt/data
# btrfs device usage /mnt/data

# btrfs replace start <devid> /dev/sdbX /mnt/data
# btrfs replace status /mnt/data

Convert single or dup chunks to raid1.

# btrfs balance start -fv -dconvert=raid1,soft -mconvert=raid1,soft -sconvert=raid1,soft /mnt/data

Unmount, restore fstab, reload systemd again, mount.

>
> For this filesystem:
>
>> $ sudo btrfs fi show /dev/sdb7
>> Label: 'data' uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
>> Total devices 2 FS bytes used 106.23GiB
>> devid 1 size 2.71TiB used 126.01GiB path /dev/sda7
>> devid 2 size 2.71TiB used 126.01GiB path /dev/sdb7
>> $ grep /mnt/data /proc/mounts
>> /dev/sda7 /mnt/data btrfs rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
>> $ sudo btrfs fi df /mnt/data
>> Data, RAID1: total=123.00GiB, used=104.57GiB
>> System, RAID1: total=8.00MiB, used=48.00KiB
>> Metadata, RAID1: total=3.00GiB, used=1.67GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>> $ uname -a
>> Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> I've got this in dmesg:
>
>> [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0 action 0x0
>> [ +0.51] ata6.00: irq_stat 0x4008
>> [ +0.29] ata6.00: failed command: READ FPDMA QUEUED
>> [ +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag 3 ncq 57344 in
>>          res 41/40:00:68:6c:f3/00:00:79:00:00/40 Emask 0x409 (media error)
>> [ +0.94] ata6.00: status: { DRDY ERR }
>> [ +0.26] ata6.00: error: { UNC }
>> [ +0.001195] ata6.00: configured for UDMA/133
>> [ +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
>> [ +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error [current] [descriptor]
>> [ +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read error - auto reallocate failed
>> [ +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00 00 79 f3 6c 50 00 00 00 70 00 00
>> [ +0.03] blk_update_request: I/O error, dev sdb, sector 2045996136
>> [ +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
>> [ +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
>> [ +0.77] ata6: EH complete
>
> There's still 1 in Current_Pending_Sector line of smartctl output as of now,
> so it probably won't heal by itself.
>
> --
>
> With Best Regards,
> Marat Khalili
[PATCH] btrfs-progs: device usage: don't calculate slack on missing device
Print
Device slack: 0.00B
instead of
Device slack: 16.00EiB

Signed-off-by: Patrik Lundquist <patrik.lundqu...@gmail.com>
---
 cmds-fi-usage.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cmds-fi-usage.c b/cmds-fi-usage.c
index 101a0c4..6c846c1 100644
--- a/cmds-fi-usage.c
+++ b/cmds-fi-usage.c
@@ -1040,6 +1040,7 @@ void print_device_sizes(struct device_info *devinfo, unsigned unit_mode)
 		pretty_size_mode(devinfo->device_size, unit_mode));
 	printf(" Device slack: %*s%10s\n",
 		(int)(20 - strlen("Device slack")), "",
-		pretty_size_mode(devinfo->device_size - devinfo->size,
+		pretty_size_mode(devinfo->device_size > 0 ?
+			devinfo->device_size - devinfo->size : 0,
 			unit_mode));
 }
--
2.14.1
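The 16.00EiB figure comes from unsigned wrap-around: a missing device reports a device size of 0, and subtracting the filesystem size from 0 in 64-bit unsigned arithmetic lands just below 2^64 bytes. A minimal stand-alone sketch of the guard follows. The function names are made up for this illustration and are not part of btrfs-progs.

```c
#include <stdint.h>

/*
 * Slack is the part of the block device not covered by the filesystem.
 * A missing device reports device_size == 0, so report zero slack
 * instead of letting the subtraction wrap around.
 * Hypothetical helper, mirroring the condition used in the patch.
 */
uint64_t device_slack(uint64_t device_size, uint64_t fs_size)
{
	return device_size > 0 ? device_size - fs_size : 0;
}

/* The unguarded subtraction, kept only to show the wrap-around. */
uint64_t device_slack_unguarded(uint64_t device_size, uint64_t fs_size)
{
	return device_size - fs_size;
}
```

With device_size 0 and a 2 GiB fs_size, the unguarded version returns 2^64 minus 2 GiB, which a pretty-printer rounds to 16.00EiB.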
Re: checksum error in metadata node - best way to move root fs to new drive?
On 10 August 2016 at 23:21, Chris Murphy wrote:
>
> I'm using LUKS, aes xts-plain64, on six devices. One is using mixed-bg
> single device. One is dsingle mdup. And then 2x2 mraid1 draid1. I've
> had zero problems. The two computers these run on do have aesni
> support. Aging wise, they're all at least a year old. But I've been
> using Btrfs on LUKS for much longer than that.

FWIW:
I've had 5 spinning disks with LUKS + Btrfs raid1 for 1.5 years. Also
xts-plain64 with AES-NI acceleration. No problems so far. Not using
Btrfs compression.
Re: Status of SMR with BTRFS
On 21 July 2016 at 15:34, Chris Murphy wrote:
>
> Do programs have a way to communicate what portion of a data file is
> modified, so that only changed blocks are COW'd? When I change a
> single pixel in a 400MiB image and do a save (to overwrite the
> original file), it takes just as long to overwrite as to write it out
> as a new file. It'd be neat if that could be optimized but I don't see
> it being the case at the moment.

Programs can choose to seek within a file and only overwrite changed
parts, like BitTorrent (use NOCOW or defrag files like that).

Paint programs usually compress the changed image on save, so most of
the file is changed anyway. If it's a raw image file, just writing
the changed pixels should work, but that would require a comparison
with the original image (or a per-pixel change history), so I doubt
anyone cares to implement it at the application level.
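As a rough illustration of "seek within a file and only overwrite changed parts", here is a small sketch using the POSIX pwrite() call. patch_file() is a hypothetical helper name, and the sketch assumes the application already knows which byte range changed. On a COW filesystem only the blocks touched by that range should need new extents.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Overwrite len bytes at byte offset off in an existing file, leaving
 * the rest of the file untouched. Returns 0 on success, -1 on error.
 * Hypothetical helper for illustration only.
 */
int patch_file(const char *path, off_t off, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY);	/* no O_TRUNC: keep existing data */

	if (fd < 0)
		return -1;
	while (len > 0) {
		/* pwrite() may write fewer bytes than asked; loop until done. */
		ssize_t n = pwrite(fd, p, len, off);

		if (n <= 0) {
			close(fd);
			return -1;
		}
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return close(fd);
}
```

A raw image editor could call this once per changed scanline segment instead of rewriting the whole file.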
Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair
On 7 May 2016 at 18:11, Niccolò Belli wrote:
> Which kind of hardware issue? I did a full memtest86 check, a full
> smartmontools extended check and even a badblocks -wsv.
> If this is really a hardware issue that we can identify I would be more than
> happy because Dell will replace my laptop and this nightmare will be finally
> over. I'm open to suggestions.

Well, your hardware differs from a lot of successful installations.

Are you using any power management tweaks?
Re: scrub: Tree block spanning stripes, ignored
On 7 April 2016 at 17:33, Ivan P wrote:
>
> After running btrfsck --readonly again, the output is:
>
> ===
> Checking filesystem on /dev/sdb
> UUID: 013cda95-8aab-4cb2-acdd-2f0f78036e02
> checking extents
> checking free space cache
> block group 632463294464 has wrong amount of free space
> failed to load free space cache for block group 632463294464

Mount once with option "clear_cache" and check again.
[PATCH] btrfs-progs: device stats: Print devid instead of null
Print e.g. "[devid:4].write_io_errs 6" instead of
"[(null)].write_io_errs 6" when device is missing.

Signed-off-by: Patrik Lundquist <patrik.lundqu...@gmail.com>
---
 cmds-device.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/cmds-device.c b/cmds-device.c
index b17b6c6..7616c43 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -447,6 +447,13 @@ static int cmd_device_stats(int argc, char **argv)
 
 		canonical_path = canonicalize_path((char *)path);
 
+		/* No path when device is missing. */
+		if (!canonical_path) {
+			canonical_path = malloc(32);
+			snprintf(canonical_path, 32,
+				 "devid:%llu", args.devid);
+		}
+
 		if (args.nr_items >= BTRFS_DEV_STAT_WRITE_ERRS + 1)
 			printf("[%s].write_io_errs %llu\n",
 			       canonical_path,
--
2.8.0.rc3
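The fallback can be shown in isolation. The sketch below is not the btrfs-progs code: device_label() is a hypothetical helper, and unlike the patch it writes into a caller-provided buffer rather than calling malloc(), which sidesteps the unchecked allocation.

```c
#include <stdio.h>

/*
 * Fill label with the device path, or with "devid:<id>" when the path
 * is unknown (NULL), as happens for a missing device.
 * Returns label so the call can be used directly as a printf argument.
 * Hypothetical helper for illustration only.
 */
const char *device_label(const char *path, unsigned long long devid,
			 char *label, size_t label_len)
{
	if (path)
		snprintf(label, label_len, "%s", path);
	else
		snprintf(label, label_len, "devid:%llu", devid);
	return label;
}
```

A caller would then print, for example, "[%s].write_io_errs %llu" with device_label(canonical_path, devid, buf, sizeof(buf)) as the first argument.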
Re: bad metadata crossing stripe boundary
On 2 April 2016 at 20:31, Kai Krakow wrote:
> On Sat, 2 Apr 2016 11:44:32 +0200,
> Marc Haber wrote:
>
>> On Sat, Apr 02, 2016 at 11:03:53AM +0200, Kai Krakow wrote:
>> > On Fri, 1 Apr 2016 07:57:25 +0200,
>> > Marc Haber wrote:
>> > > On Thu, Mar 31, 2016 at 11:16:30PM +0200, Kai Krakow wrote:
>> [...]
>> [...]
>> [...]
>> > >
>> > > I cryptsetup luksFormat'ted the partition before I mkfs.btrfs'ed
>> > > it. That should do a much better job than wipefsing it, shouldn't
>> > > it?
>> >
>> > Not sure how luksFormat works. If it encrypts what is already on the
>> > device, it would also encrypt orphan superblocks.
>>
>> It overwrites the LUKS metadata including the symmetric key that was
>> used to encrypt the existing data. Short of Shor's Algorithm and
>> Quantum Computers, after that operation it is no longer possible to
>> even guess what was on the disk before.
>
> If it was encrypted before... ;-)

What does wipefs -n find?
Re: attempt to mount after crash during rebalance hard crashes server
On 29 March 2016 at 22:46, Chris Murphy wrote:
> On Tue, Mar 29, 2016 at 2:21 PM, Warren, Daniel wrote:
>> Greetings all,
>>
>> I'm running 4.4.0 from deb sid
>>
>> btrfs fi sh http://pastebin.com/QLTqSU8L
>> kernel panic http://pastebin.com/aBF6XmzA
>
> Panic shows:
> CPU: 0 PID: 153 Comm: kworker/u8:13 Not tainted 3.16-2-amd64 #1 Debian
> 3.16.3-2

That kernel is from 2014-09-20, long before even Jessie was released.

Current Sid is 4.4.6.
Re: Possible Raid Bug
On 28 March 2016 at 05:54, Anand Jain <anand.j...@oracle.com> wrote:
>
> On 03/26/2016 07:51 PM, Patrik Lundquist wrote:
>>
>> # btrfs device stats /mnt
>>
>> [/dev/sde].write_io_errs 11
>> [/dev/sde].read_io_errs 0
>> [/dev/sde].flush_io_errs 2
>> [/dev/sde].corruption_errs 0
>> [/dev/sde].generation_errs 0
>>
>> The old counters are back. That's good, but wtf?
>
>
> No. I doubt if they are old counters. The steps above didn't
> show old error counts, but since you have created a file
> test3 so there will be some write_io_errors, which we don't
> see after the balance. So I doubt if they are old counter
> but instead they are new flush errors.

No, /mnt/test3 doesn't generate errors, only 'single' block groups.

The old counters seem to be cached somewhere and replace doesn't reset
them everywhere.

One more time with more device stats and I've upgraded the kernel to
Linux debian 4.5.0-trunk-amd64 #1 SMP Debian 4.5-1~exp1 (2016-03-20)
x86_64 GNU/Linux

# mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# mount /dev/sdb /mnt; dmesg | tail

# touch /mnt/test1; sync; btrfs device usage /mnt

Only raid10 profiles.

# echo 1 >/sys/block/sde/device/delete; dmesg | tail

[ 426.831037] sd 5:0:0:0: [sde] Synchronizing SCSI cache
[ 426.831517] sd 5:0:0:0: [sde] Stopping disk
[ 426.845199] ata6.00: disabled

We lost a disk.

# touch /mnt/test2; sync; dmesg | tail

[ 467.126471] BTRFS error (device sde): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 467.127386] BTRFS error (device sde): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[ 467.128125] BTRFS error (device sde): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[ 467.128640] BTRFS error (device sde): bdev /dev/sde errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[ 467.129215] BTRFS error (device sde): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
[ 467.129331] BTRFS warning (device sde): lost page write due to IO error on /dev/sde
[ 467.129334] BTRFS error (device sde): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
[ 467.129420] BTRFS warning (device sde): lost page write due to IO error on /dev/sde
[ 467.129422] BTRFS error (device sde): bdev /dev/sde errs: wr 6, rd 0, flush 1, corrupt 0, gen 0

We've got write errors on the lost disk.

# btrfs device usage /mnt

No 'single' profiles because we haven't remounted yet.

# btrfs device stat /mnt

[/dev/sde].write_io_errs 6
[/dev/sde].read_io_errs 0
[/dev/sde].flush_io_errs 1
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0

# reboot
# wipefs -a /dev/sde; reboot

# mount -o degraded /dev/sdb /mnt; dmesg | tail

[ 52.876897] BTRFS info (device sdb): allowing degraded mounts
[ 52.876901] BTRFS info (device sdb): disk space caching is enabled
[ 52.876902] BTRFS: has skinny extents
[ 52.878008] BTRFS warning (device sdb): devid 4 uuid 231d7892-3f31-40b5-8dff-baf8fec1a8aa is missing
[ 52.879057] BTRFS info (device sdb): bdev (null) errs: wr 6, rd 0, flush 1, corrupt 0, gen 0

# btrfs device usage /mnt

Still only raid10 profiles.

# btrfs device stat /mnt

[(null)].write_io_errs 6
[(null)].read_io_errs 0
[(null)].flush_io_errs 1
[(null)].corruption_errs 0
[(null)].generation_errs 0

/dev/sde is now called "(null)". Print device id instead? E.g.
"[devid:4].write_io_errs 6"

# touch /mnt/test3; sync; btrfs device usage /mnt

/dev/sdb, ID: 1
   Device size: 2.00GiB
   Data,single: 624.00MiB
   Data,RAID10: 102.38MiB
   Metadata,RAID10: 102.38MiB
   System,RAID10: 4.00MiB
   Unallocated: 1.19GiB

/dev/sdc, ID: 2
   Device size: 2.00GiB
   Data,RAID10: 102.38MiB
   Metadata,RAID10: 102.38MiB
   System,single: 32.00MiB
   System,RAID10: 4.00MiB
   Unallocated: 1.76GiB

/dev/sdd, ID: 3
   Device size: 2.00GiB
   Data,RAID10: 102.38MiB
   Metadata,single: 256.00MiB
   Metadata,RAID10: 102.38MiB
   System,RAID10: 4.00MiB
   Unallocated: 1.55GiB

missing, ID: 4
   Device size: 0.00B
   Data,RAID10: 102.38MiB
   Metadata,RAID10: 102.38MiB
   System,RAID10: 4.00MiB
   Unallocated: 1.80GiB

Now we've got 'single' profiles on all devices except the missing one.
Replace missing device before unmount or get stuck with a read-only
filesystem.

# btrfs device stat /mnt

Same as before. Only old errors on the missing device.

# btrfs replace start -B 4 /dev/sde /mnt; dmesg | tail

[ 1268.598652] BTRFS info (device sdb): dev_replace from (devid 4) to /dev/sde started
[ 1268.615601] BTRFS info (device sdb): dev_replace from (devid 4) to /dev/sde finished

# btrfs device stats /mnt

[/dev/sde].write_io_errs 0
[/dev/sde].read_io_errs 0
[/dev/sde].flush_io_errs 0
Re: Possible Raid Bug
So with the lessons learned:

# mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# mount /dev/sdb /mnt; dmesg | tail

# touch /mnt/test1; sync; btrfs device usage /mnt

Only raid10 profiles.

# echo 1 >/sys/block/sde/device/delete

We lost a disk.

# touch /mnt/test2; sync; dmesg | tail

We've got write errors.

# btrfs device usage /mnt

No 'single' profiles because we haven't remounted yet.

# reboot
# wipefs -a /dev/sde; reboot

# mount -o degraded /dev/sdb /mnt; dmesg | tail

# btrfs device usage /mnt

Still only raid10 profiles.

# touch /mnt/test3; sync; btrfs device usage /mnt

Now we've got 'single' profiles. Replace now or get hosed.

# btrfs replace start -B 4 /dev/sde /mnt; dmesg | tail

# btrfs device stats /mnt

[/dev/sde].write_io_errs 0
[/dev/sde].read_io_errs 0
[/dev/sde].flush_io_errs 0
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0

We didn't inherit the /dev/sde error count. Is that a bug?

# btrfs balance start -dconvert=raid10,soft -mconvert=raid10,soft -sconvert=raid10,soft -vf /mnt; dmesg | tail

# btrfs device usage /mnt

Back to only 'raid10' profiles.

# umount /mnt; mount /dev/sdb /mnt; dmesg | tail

# btrfs device stats /mnt

[/dev/sde].write_io_errs 11
[/dev/sde].read_io_errs 0
[/dev/sde].flush_io_errs 2
[/dev/sde].corruption_errs 0
[/dev/sde].generation_errs 0

The old counters are back. That's good, but wtf?

# btrfs device stats -z /dev/sde

Give /dev/sde a clean bill of health. Won't warn when mounting again.
Re: Possible Raid Bug
On 25 March 2016 at 18:20, Stephen Williams wrote:
>
> Your information below was very helpful and I was able to recreate the
> Raid array. However my initial question still stands - What if the
> drives dies completely? I work in a Data center and we see this quite a
> lot where a drive is beyond dead - The OS will literally not detect it.

That's currently a weakness of Btrfs. I don't know how people deal
with it in production. I think Anand Jain is working on improving it.

> At this point would the Raid10 array be beyond repair? As you need the
> drive present in order to mount the array in degraded mode.

Right... let's try it again but a little bit differently.

# mount /dev/sdb /mnt

Let's drop the disk.

# echo 1 >/sys/block/sde/device/delete

[ 3669.024256] sd 5:0:0:0: [sde] Synchronizing SCSI cache
[ 3669.024934] sd 5:0:0:0: [sde] Stopping disk
[ 3669.037028] ata6.00: disabled

# touch /mnt/test3
# sync

[ 3845.960839] BTRFS error (device sdb): bdev /dev/sde errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 3845.961525] BTRFS error (device sdb): bdev /dev/sde errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[ 3845.962738] BTRFS error (device sdb): bdev /dev/sde errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[ 3845.963038] BTRFS error (device sdb): bdev /dev/sde errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[ 3845.963422] BTRFS error (device sdb): bdev /dev/sde errs: wr 4, rd 0, flush 1, corrupt 0, gen 0
[ 3845.963686] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde
[ 3845.963691] BTRFS error (device sdb): bdev /dev/sde errs: wr 5, rd 0, flush 1, corrupt 0, gen 0
[ 3845.963932] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde
[ 3845.963941] BTRFS error (device sdb): bdev /dev/sde errs: wr 6, rd 0, flush 1, corrupt 0, gen 0

# umount /mnt

[ 4095.276831] BTRFS error (device sdb): bdev /dev/sde errs: wr 7, rd 0, flush 1, corrupt 0, gen 0
[ 4095.278368] BTRFS error (device sdb): bdev /dev/sde errs: wr 8, rd 0, flush 1, corrupt 0, gen 0
[ 4095.279152] BTRFS error (device sdb): bdev /dev/sde errs: wr 8, rd 0, flush 2, corrupt 0, gen 0
[ 4095.279373] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde
[ 4095.279377] BTRFS error (device sdb): bdev /dev/sde errs: wr 9, rd 0, flush 2, corrupt 0, gen 0
[ 4095.279609] BTRFS warning (device sdb): lost page write due to IO error on /dev/sde
[ 4095.279612] BTRFS error (device sdb): bdev /dev/sde errs: wr 10, rd 0, flush 2, corrupt 0, gen 0

# mount -o degraded /dev/sdb /mnt

[ 4608.113751] BTRFS info (device sdb): allowing degraded mounts
[ 4608.113756] BTRFS info (device sdb): disk space caching is enabled
[ 4608.113757] BTRFS: has skinny extents
[ 4608.116557] BTRFS info (device sdb): bdev /dev/sde errs: wr 6, rd 0, flush 1, corrupt 0, gen 0

# touch /mnt/test4
# sync

Writing to the filesystem works while the device is missing.
No new errors in dmesg after re-mounting degraded. Reboot to get back /dev/sde.

[    4.329852] BTRFS: device fsid 75737bea-d76c-42f5-b0e6-7d346e38610d devid 4 transid 26 /dev/sde
[    4.330157] BTRFS: device fsid 75737bea-d76c-42f5-b0e6-7d346e38610d devid 3 transid 31 /dev/sdd
[    4.330511] BTRFS: device fsid 75737bea-d76c-42f5-b0e6-7d346e38610d devid 2 transid 31 /dev/sdc
[    4.330865] BTRFS: device fsid 75737bea-d76c-42f5-b0e6-7d346e38610d devid 1 transid 31 /dev/sdb

/dev/sde transid is lagging behind, of course.

# wipefs -a /dev/sde
# btrfs device scan
# mount -o degraded /dev/sdb /mnt

[  507.248621] BTRFS info (device sdb): allowing degraded mounts
[  507.248626] BTRFS info (device sdb): disk space caching is enabled
[  507.248628] BTRFS: has skinny extents
[  507.252815] BTRFS info (device sdb): bdev /dev/sde errs: wr 6, rd 0, flush 1, corrupt 0, gen 0
[  507.252919] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
[  507.278277] BTRFS: open_ctree failed

Well, that was unexpected! Reboot again.

# mount -o degraded /dev/sdb /mnt

[   94.368514] BTRFS info (device sdd): allowing degraded mounts
[   94.368519] BTRFS info (device sdd): disk space caching is enabled
[   94.368521] BTRFS: has skinny extents
[   94.370909] BTRFS warning (device sdd): devid 4 uuid 8549a275-f663-4741-b410-79b49a1d465f is missing
[   94.372170] BTRFS info (device sdd): bdev (null) errs: wr 6, rd 0, flush 1, corrupt 0, gen 0
[   94.372284] BTRFS: missing devices(1) exceeds the limit(0), writeable mount is not allowed
[   94.395021] BTRFS: open_ctree failed

No go.

# mount -o degraded,ro /dev/sdb /mnt
# btrfs device stats /mnt
[/dev/sdb].write_io_errs   0
[/dev/sdb].read_io_errs    0
[/dev/sdb].flush_io_errs   0
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
[/dev/sdc].write_io_errs   0
[/dev/sdc].read_io_errs    0
[/dev/sdc].flush_io_errs   0
[/dev/sdc].corruption_errs 0
[/dev/sdc].generation_errs 0
[/dev/sdd].write_io_errs   0
[/dev/sdd].read_io_errs    0
[/dev/sdd].flush_io_errs   0
[/dev/sdd].corruption_errs 0
Re: Possible Raid Bug
On Debian Stretch with Linux 4.4.6, btrfs-progs 4.4 in VirtualBox
5.0.16 with 4*2GB VDIs:

# mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
# mount /dev/sdb /mnt
# touch /mnt/test
# umount /mnt

Everything fine so far.

# wipefs -a /dev/sde

*reboot*

# mount /dev/sdb /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

# dmesg | tail
[   85.979655] BTRFS info (device sdb): disk space caching is enabled
[   85.979660] BTRFS: has skinny extents
[   85.982377] BTRFS: failed to read the system array on sdb
[   85.996793] BTRFS: open_ctree failed

Not very informative! An information regression?

# mount -o degraded /dev/sdb /mnt
# dmesg | tail
[  919.899071] BTRFS info (device sdb): allowing degraded mounts
[  919.899075] BTRFS info (device sdb): disk space caching is enabled
[  919.899077] BTRFS: has skinny extents
[  919.903216] BTRFS warning (device sdb): devid 4 uuid 8549a275-f663-4741-b410-79b49a1d465f is missing

# touch /mnt/test2
# ls -l /mnt/
total 0
-rw-r--r-- 1 root root 0 mar 25 15:17 test
-rw-r--r-- 1 root root 0 mar 25 15:42 test2

# btrfs device remove missing /mnt
ERROR: error removing device 'missing': unable to go below four devices on raid10

As expected.

# btrfs replace start -B missing /dev/sde /mnt
ERROR: source device must be a block device or a devid

Would have been nice if missing worked here too. Maybe it does in
btrfs-progs 4.5?

# btrfs replace start -B 4 /dev/sde /mnt
# dmesg | tail
[ 1618.170619] BTRFS info (device sdb): dev_replace from (devid 4) to /dev/sde started
[ 1618.184979] BTRFS info (device sdb): dev_replace from (devid 4) to /dev/sde finished

Repaired!

# umount /mnt
# mount /dev/sdb /mnt
# dmesg | tail
[ 1729.917661] BTRFS info (device sde): disk space caching is enabled
[ 1729.917665] BTRFS: has skinny extents

All in all it works just fine with Linux 4.4.6.
Re: RAID-1 refuses to balance large drive
On 23 March 2016 at 20:33, Chris Murphy wrote:
>
> On Wed, Mar 23, 2016 at 1:10 PM, Brad Templeton wrote:
> >
> > I am surprised to hear it said that having the mixed sizes is an odd
> > case.
>
> Not odd as in wrong, just uncommon compared to other arrangements being
> tested.

I think mixed drive sizes in raid1 is a killer feature for a home NAS,
where you replace an old smaller drive with the latest and largest
when you need more storage.

My raid1 currently consists of 6TB+3TB+3*2TB.
Re: Possible Raid Bug
On 25 March 2016 at 12:49, Stephen Williams wrote:
>
> So catch 22, you need all the drives otherwise it won't let you mount,
> But what happens if a drive dies and the OS doesn't detect it? BTRFS
> wont allow you to mount the raid volume to remove the bad disk!

Version of Linux and btrfs-progs?

You can't have a raid10 with less than 4 devices so you need to add a
new device before deleting the missing. That is of course still a
problem with a read-only fs.

btrfs replace is also the recommended way to replace a failed device
nowadays. The wiki is outdated.
Re: Major HDD performance degradation on btrfs receive
On 23 February 2016 at 18:26, Marc MERLIN wrote:
>
> I'm currently doing a very slow defrag to see if it'll help (looks like
> it's going to take days).
> I'm doing this:
> for i in dir1 dir2 debian32 debian64 ubuntu dir4 ; do echo $i; time btrfs fi
> defragment -v -r $i; done

[snip]

> Also, should I try running defragment -r from cron from time to time?

I find the default threshold a bit low and defragment daily with "-t
1m" to combat heavy random write fragmentation.

Once in a while I defrag e.g. VM disk images with "-t 128m" but find
higher thresholds mostly a waste of time.

YMMV.

> But, just to be clear, is there a way I missed to see how fragmented my
> filesystem is without running filefrag on millions of files and parsing
> the output?

I don't think so, and filefrag is slow with heavily fragmented files
because ioctl(FS_IOC_FIEMAP) is called many times with a buffer which
only fits 292 fiemap_extents.
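For anyone wondering where a number like 292 comes from, here is a minimal sketch of the arithmetic. It assumes a 16 KiB ioctl buffer, which is what I recall filefrag using but is not stated above, and the two helper names are made up for illustration. The struct sizes come from <linux/fiemap.h>.

```c
#include <stddef.h>
#include <linux/fiemap.h>

/*
 * Hypothetical helper: how many struct fiemap_extent records fit in an
 * ioctl buffer of buf_bytes once the fixed struct fiemap header at the
 * start of the buffer has been accounted for.
 */
static size_t extents_per_call(size_t buf_bytes)
{
	if (buf_bytes < sizeof(struct fiemap))
		return 0;
	return (buf_bytes - sizeof(struct fiemap)) /
	       sizeof(struct fiemap_extent);
}

/*
 * Hypothetical helper: lower bound on the number of FS_IOC_FIEMAP calls
 * needed to walk nr_extents extents with a buffer of buf_bytes.
 */
static size_t fiemap_calls_needed(size_t nr_extents, size_t buf_bytes)
{
	size_t per_call = extents_per_call(buf_bytes);

	if (per_call == 0)
		return 0;
	/* Round up: a partially filled last buffer still costs one call. */
	return (nr_extents + per_call - 1) / per_call;
}
```

With a 16384-byte buffer this gives 292 extents per call, so a file with 100000 extents needs at least 343 calls under these assumptions.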
Re: RAID1 disk upgrade method
On 30 January 2016 at 15:50, Patrik Lundquist <patrik.lundqu...@gmail.com> wrote:
> On 29 January 2016 at 13:14, Austin S. Hemmelgarn <ahferro...@gmail.com>
> wrote:
>>
>> Last I checked, Seagate's 'NAS' drives and whatever they've re-branded their
>> other enterprise line as, as well as WD's 'Red' drives support both SCT ERC
>> and FUA, but I don't know about any other brands (most of the Hitachi,
>> Toshiba, and Samsung drives I've seen do not support FUA).
>
> I don't know about WD Red Pro but my WD Reds don't support FUA.
>
> Can I list supported commands with something like hdparm? I'm curious
> about a WD Re in a LSI RAID.

No FUA in WD Re either.

[20312.701155] scsi 4:0:0:0: Direct-Access ATA WDC WD5003ABYZ-0 1S03 PQ: 0 ANSI: 5
[20312.701453] sd 4:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[20312.701454] sd 4:0:0:0: Attached scsi generic sg2 type 0
[20312.701603] sd 4:0:0:0: [sdb] Write Protect is off
[20312.701609] sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[20312.701663] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[20312.712396] sd 4:0:0:0: [sdb] Attached SCSI disk
Re: "WARNING: device 0 not present" during scrub?
On 30 January 2016 at 12:59, Christian Pernegger wrote:
>
> This is on a 1-month-old Debian stable (jessie) install and yes, I
> know that means the kernel and btrfs-progs are ancient

apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64

Or something like that for the image name. Unfortunately there's no
stable backport of btrfs-tools (as they call btrfs-progs).

https://tracker.debian.org/pkg/linux
Re: cannot repair filesystem
On 1 January 2016 at 16:44, Jan Koester wrote:
>
> Hi,
>
> if I try to repair filesystem got I'am assert. I use Raid6.
>
> Linux dibsi 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt4-3~bpo70+1
> (2015-02-12) x86_64 GNU/Linux

Raid6 wasn't completed until Linux 3.19 and I wouldn't call it stable yet.

https://btrfs.wiki.kernel.org/index.php/RAID56

I suggest you upgrade from Wheezy to Jessie and install the latest
backports kernel and latest btrfs-progs from Git (there's no
stable-bpo for btrfs-tools) if you want to use raid56.
Re: Kernel 3.19 and still "disk full" even though 'btrfs fi df" reports enough room left?
On 19 November 2015 at 06:58, Roman Mamedov wrote:
>
> On Wed, 18 Nov 2015 19:53:03 +0100
> linux-btrfs.tebu...@xoxy.net wrote:
>
> > $ uname -a
> > Linux neptun 3.19.0-31-generic #36~14.04.1-Ubuntu SMP Thu Oct 8
> > 10:21:08 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

[...]

> So my suggestion would be to try a newer kernel from www.kernel.org: if the
> problem disappears at 4.1 then just keep on using that, or 4.3 if you have to,
> but otherwise that one might be a bit too new to start using right away.

Give http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.13-wily/ a try.

wget -e robots=off -r -l1 -np -nd -A '*all.deb','*generic*amd64.deb' http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.13-wily/
Re: More memory more jitters?
On 14 November 2015 at 15:11, CHENG Yuk-Pong, Daniel wrote:
>
> Background info:
>
> I am running a heavy-write database server with 96GB ram. In the worse
> case it cause multi minutes of high cpu loads. Systemd keeping kill
> and restarting services, and old job don't die because they stuck in
> uninterruptable wait... etc.
>
> Tried with nodatacow, but it seems only affect new file. It is not an
> subvolume option either...

How about nocow (chattr +C) on the database directories? You will have
to copy the files to make nocow versions of them.
Re: Btrfs/RAID5 became unmountable after SATA cable fault
On 6 November 2015 at 10:03, Janos Toth F. wrote:
>
> Although I updated the firmware of the drives. (I found an IMPORTANT
> update when I went there to download SeaTools, although there was no
> change log to tell me why this was important). This might changed the
> error handling behavior of the drive...?

I've had Seagate drives not reporting errors until I updated the
firmware. They tended to timeout instead. Got a shitload of SMART
errors after I updated, but they still didn't handle errors very well
(became unresponsive).
Re: Removing bad hdd from btrfs volume
On 7 August 2015 at 00:17, Peter Foley <pefol...@pefoley.com> wrote:
> Hi,
>
> I have an btrfs volume that spans multiple disks (no raid, just
> single), and earlier this morning I hit some hardware problems with
> one of the disks.
> I tried btrfs dev del /dev/sda1 /, but btrfs was unable to migrate the
> 1gb that appears to be causing the read errors.
> See http://sprunge.us/aeZC

You might want to try to save as much as possible from the failing
disk with the help of GNU ddrescue. Either by copying sda to a
replacement disk or by copying sda1 to a file for loopback mounting.

Unmount filesystem before copying and remove sda before you mount with
the copy.
Re: Inappropriate ioctl for device
On 25 July 2015 at 10:56, Mojtaba <ker...@rp2.org> wrote:
> System is debian wheezy or Jessie.
>
> This is Debian Jessie:
> root@s2:/# uname -a
> Linux s2 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux

That's a way too old kernel to be running Btrfs on. You should be
running on at least the Jessie 3.16 kernel.
[PATCH] btrfs-progs: defrag: fix threshold overflow again
Commit dedb1ebeee847e3c4d71e14d0c1077887630e44a broke commit
96cfbbf0ea9fce7ecaa9e03964474f407f6e76ab.

Casting a thresh value greater than (u32)-1 simply truncates bits while
the desired value is (u32)-1 for max defrag threshold.

I.e. "btrfs fi defrag -t 4g" is trimmed/truncated to 0 and "-t 5g" to
1073741824.

Also added a missing newline.

Signed-off-by: Patrik Lundquist <patrik.lundqu...@gmail.com>
---
 cmds-filesystem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 800aa4d..00a3f78 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1172,8 +1172,9 @@ static int cmd_defrag(int argc, char **argv)
 			thresh = parse_size(optarg);
 			if (thresh > (u32)-1) {
 				fprintf(stderr,
-			"WARNING: target extent size %llu too big, trimmed to %u",
+			"WARNING: target extent size %llu too big, trimmed to %u\n",
 					thresh, (u32)-1);
+				thresh = (u32)-1;
 			}
 			defrag_global_fancy_ioctl = 1;
 			break;
-- 
2.1.4
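To make the truncation concrete, here is a small standalone sketch, not btrfs-progs code. The function names are invented for illustration and it uses the <stdint.h> types in place of the u32/u64 typedefs. It contrasts what a bare cast does with the saturating behaviour the patch is after.

```c
#include <stdint.h>

/*
 * What a plain (u32) cast does: keep only the low 32 bits, so any
 * multiple of 4 GiB wraps around to 0.
 */
static uint32_t thresh_truncated(uint64_t thresh)
{
	return (uint32_t)thresh;
}

/*
 * What the patch intends: anything that does not fit in 32 bits is
 * clamped to the largest u32 value, i.e. (u32)-1.
 */
static uint32_t thresh_clamped(uint64_t thresh)
{
	return thresh > UINT32_MAX ? UINT32_MAX : (uint32_t)thresh;
}
```

Feeding 4 GiB (4ULL << 30) through the bare cast yields 0 and 5 GiB yields 1073741824, matching the numbers in the commit message, while the clamped version returns 4294967295 for both.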
[PATCH] btrfs-progs: defrag: remove unused variable
A leftover from when recursive defrag was added.

Signed-off-by: Patrik Lundquist <patrik.lundqu...@gmail.com>
---
 cmds-filesystem.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 00a3f78..1b7b4c1 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1131,7 +1131,6 @@ static int cmd_defrag(int argc, char **argv)
 	int i;
 	int recursive = 0;
 	int ret = 0;
-	struct btrfs_ioctl_defrag_range_args range;
 	int e = 0;
 	int compress_type = BTRFS_COMPRESS_NONE;
 	DIR *dirstream;
@@ -1189,7 +1188,7 @@ static int cmd_defrag(int argc, char **argv)
 	if (check_argc_min(argc - optind, 1))
 		usage(cmd_defrag_usage);
 
-	memset(&defrag_global_range, 0, sizeof(range));
+	memset(&defrag_global_range, 0, sizeof(defrag_global_range));
 	defrag_global_range.start = start;
 	defrag_global_range.len = len;
 	defrag_global_range.extent_thresh = (u32)thresh;
-- 
2.1.4
Re: counting fragments takes more time than defragmenting
On 14 July 2015 at 21:15, Hugo Mills <h...@carfax.org.uk> wrote:
> On Tue, Jul 14, 2015 at 09:09:00PM +0200, Patrik Lundquist wrote:
>> On 14 July 2015 at 20:41, Hugo Mills <h...@carfax.org.uk> wrote:
>>> On Tue, Jul 14, 2015 at 01:57:07PM +0200, Patrik Lundquist wrote:
>>>> On 24 June 2015 at 12:46, Duncan <1i5t5.dun...@cox.net> wrote:
>>>>>
>>>>> Regardless of whether 1 or huge -t means maximum defrag, however, the
>>>>> nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
>>>>> should be considered ideally defragged at 31 extents. This is a
>>>>> departure from ext4, which AFAIK in theory has no extent upper limit, so
>>>>> should be able to do that 30 GiB file in a single extent.
>>>>>
>>>>> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
>>>>> still indicates at least some remaining fragmentation.
>>>>
>>>> So I converted the VMware VMDK file to a VirtualBox VDI file:
>>>>
>>>> -rw--- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
>>>> -rw--- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi
>>>>
>>>> $ filefrag Windows7.vdi
>>>> Windows7.vdi: 15 extents found
>>>>
>>>> $ btrfs filesystem defragment -t 3g Windows7.vdi
>>>> $ filefrag Windows7.vdi
>>>> Windows7.vdi: 24 extents found
>>>>
>>>> How can it be less than 28 extents with a chunk size of 1 GiB?
>>>
>>> I _think_ the fragment size will be limited by the block group
>>> size. This is not the same as the chunk size for some RAID levels --
>>> for example, RAID-0, a block group can be anything from 2 to n chunks
>>> (across the same number of devices), where each chunk is 1 GiB, so
>>> potentially you could have arbitrary-sized block groups. The same
>>> would apply to RAID-10, -5 and -6.
>>>
>>> (Note, I haven't verified this, but it makes sense based on what I
>>> know of the internal data structures).
>>
>> It's a raid1 filesystem, so the block group ought to be the same size
>> as the chunk, right?
>
> Yes.
>
>> A 2GiB block group would suffice to explain it though.
>
> Not with RAID-1 -- I'd expect the block group size to be 1 GiB.

So I had a look at the filefrag source and filefrag actually doesn't
print the number of extents but the number of disk fragments.
Contiguously allocated extents count as one fragment.

"Windows7.vdi: 47 extents found" is really 213 extents over 47 disk fragments.

But I have one 2GiB extent, according to filefrag -v, so the question
remains. :-)
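As a rough illustration of the extents-versus-fragments distinction, here is a simplified sketch. It is not filefrag's actual code: the struct and function names are hypothetical, and it only looks at physical contiguity. An extent starts a new fragment only when it does not begin right where the previous one ended on disk.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down extent record: where it lives on disk. */
struct extent {
	uint64_t physical;	/* starting byte offset on disk */
	uint64_t length;	/* length in bytes */
};

/*
 * Count disk fragments in an extent list given in file order.
 * Physically contiguous neighbours are folded into one fragment.
 */
static size_t count_fragments(const struct extent *ext, size_t n)
{
	size_t frags = 0;

	for (size_t i = 0; i < n; i++) {
		/* First extent, or a gap/jump from the previous one. */
		if (i == 0 ||
		    ext[i].physical != ext[i - 1].physical + ext[i - 1].length)
			frags++;
	}
	return frags;
}
```

So a file can have many more extents than reported fragments, which is how 213 extents can show up as 47.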
Re: counting fragments takes more time than defragmenting
On 24 June 2015 at 12:46, Duncan <1i5t5.dun...@cox.net> wrote:
>
> Regardless of whether 1 or huge -t means maximum defrag, however, the
> nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
> should be considered ideally defragged at 31 extents. This is a
> departure from ext4, which AFAIK in theory has no extent upper limit, so
> should be able to do that 30 GiB file in a single extent.
>
> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
> still indicates at least some remaining fragmentation.

So I converted the VMware VMDK file to a VirtualBox VDI file:

-rw--- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
-rw--- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi

$ filefrag Windows7.vdi
Windows7.vdi: 15 extents found

$ btrfs filesystem defragment -t 3g Windows7.vdi
$ filefrag Windows7.vdi
Windows7.vdi: 24 extents found

How can it be less than 28 extents with a chunk size of 1 GiB?

E2fsprogs version 1.42.12
Re: counting fragments takes more time than defragmenting
On 14 July 2015 at 20:41, Hugo Mills <h...@carfax.org.uk> wrote:
> On Tue, Jul 14, 2015 at 01:57:07PM +0200, Patrik Lundquist wrote:
>> On 24 June 2015 at 12:46, Duncan <1i5t5.dun...@cox.net> wrote:
>>>
>>> Regardless of whether 1 or huge -t means maximum defrag, however, the
>>> nominal data chunk size of 1 GiB means that 30 GiB file you mentioned
>>> should be considered ideally defragged at 31 extents. This is a
>>> departure from ext4, which AFAIK in theory has no extent upper limit, so
>>> should be able to do that 30 GiB file in a single extent.
>>>
>>> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
>>> still indicates at least some remaining fragmentation.
>>
>> So I converted the VMware VMDK file to a VirtualBox VDI file:
>>
>> -rw--- 1 plu plu 28845539328 jul 13 13:36 Windows7-disk1.vmdk
>> -rw--- 1 plu plu 28993126400 jul 13 14:04 Windows7.vdi
>>
>> $ filefrag Windows7.vdi
>> Windows7.vdi: 15 extents found
>>
>> $ btrfs filesystem defragment -t 3g Windows7.vdi
>> $ filefrag Windows7.vdi
>> Windows7.vdi: 24 extents found
>>
>> How can it be less than 28 extents with a chunk size of 1 GiB?
>
> I _think_ the fragment size will be limited by the block group
> size. This is not the same as the chunk size for some RAID levels --
> for example, RAID-0, a block group can be anything from 2 to n chunks
> (across the same number of devices), where each chunk is 1 GiB, so
> potentially you could have arbitrary-sized block groups. The same
> would apply to RAID-10, -5 and -6.
>
> (Note, I haven't verified this, but it makes sense based on what I
> know of the internal data structures).

It's a raid1 filesystem, so the block group ought to be the same size
as the chunk, right?

A 2GiB block group would suffice to explain it though.
Re: Can't remove missing device
On 10 July 2015 at 06:05, None None <whocares0...@freemail.hu> wrote:
> According to dmesg sda returns bad data but the smart values for it seem fine.
>
> # smartctl -a /dev/sda
...
> SMART Self-test log structure revision number 1
> No self-tests have been logged. [To run self-tests, use: smartctl -t]

Run smartctl -t long /dev/sda
[PATCH] btrfs-progs: inspect: Fix out of bounds string termination.
Signed-off-by: Patrik Lundquist <patrik.lundqu...@gmail.com>
---
 cmds-inspect.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-inspect.c b/cmds-inspect.c
index 053cf8e..aafe37d 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -293,7 +293,7 @@ static int cmd_subvolid_resolve(int argc, char **argv)
 		goto out;
 	}
 
-	path[PATH_MAX] = '\0';
+	path[PATH_MAX-1] = '\0';
 	printf("%s\n", path);
 
 out:
-- 
2.1.4
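For context on why the index matters: a char array of PATH_MAX bytes has valid indices 0 through PATH_MAX-1, so writing to path[PATH_MAX] lands one byte past the end. A minimal standalone sketch of the safe pattern follows. The helper name is invented for illustration and this is not the btrfs-progs code.

```c
#include <limits.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096	/* fallback if <limits.h> does not provide it */
#endif

/*
 * Hypothetical helper: copy src into a PATH_MAX-sized buffer and make
 * sure it is NUL-terminated inside the buffer. Returns the resulting
 * string length.
 */
static size_t terminate_path(char path[PATH_MAX], const char *src)
{
	/* strncpy() leaves the buffer unterminated if src is too long. */
	strncpy(path, src, PATH_MAX);
	/* Last valid index; path[PATH_MAX] would be out of bounds. */
	path[PATH_MAX - 1] = '\0';
	return strlen(path);
}
```

An over-long source string ends up truncated to PATH_MAX-1 characters rather than overrunning the buffer.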
Re: counting fragments takes more time than defragmenting
On 25 June 2015 at 06:01, Duncan <1i5t5.dun...@cox.net> wrote:
> Patrik Lundquist posted on Wed, 24 Jun 2015 14:05:57 +0200 as excerpted:
>
>> On 24 June 2015 at 12:46, Duncan <1i5t5.dun...@cox.net> wrote:
>
> If it's uint32 limited, either kill everything above that in both the
> documentation and code, or alias everything above that to 3G (your next
> paragraph) or whatever.

My simple overflow patch yesterday fixes the problem, so 4G or larger
is max instead of 0.

>>> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150
>>> extents still indicates at least some remaining fragmentation.
>>
>> I gave it another shot but I've now got 154 extents instead. :-)
>
> Is it possible there's simply no gig-size free-space holes in the
> filesystem allocation, so it simply /can't/ defrag further than that,
> because there's no place to allocate whole-gig data chunks at a time?

I would guess so, without allocating new chunks. Defrag can probably
be smarter and avoid rewriting extents if it means splitting them
(unless the compression flag is set and it must rewrite everything).
Re: counting fragments takes more time than defragmenting
On 24 June 2015 at 05:20, Marc MERLIN <m...@merlins.org> wrote:
> Hello again,
>
> Just curious, is anyone seeing similar things with big VM images or other
> DBs?
> I forgot to mention that my vdi file is 88GB.
>
> It's surprising that it took longer to count the fragments than to actually
> defragment the file.
> Or that it took 3 defrag runs to get down to 11K extents from 104K.
>
> Are others seeing similar things?

Filefrag is pretty much instant for my 30GB (150 extents) virtual
disk, no CoW on file, no snapshots on volume.

But what doesn't make sense to me is btrfs fi defrag; the -t option says

  -t size   defragment only files at least size bytes big

The -t value goes into struct
btrfs_ioctl_defrag_range_args.extent_thresh which is documented as

/*
 * any extent bigger than this will be considered
 * already defragged. Use 0 to take the kernel default
 * Use 1 to say every single extent must be rewritten
 */

Default extent_thresh is 256K.

I can't see how 1 would say every single extent must be rewritten. On
the contrary; 1 skips every extent.

The compress flag even sets extent_thresh=(u32)-1 to force a rewrite.

Marc, try

  btrfs fi defrag -t 4294967295 Win7.vdi

for maximum defrag and time filefrag again with fewer extents.

/Patrik

> Marc
>
> On Thu, Jun 04, 2015 at 05:42:45PM +0900, Marc MERLIN wrote:
>> Hi Chris,
>>
>> After our quick chat, I gave it a shot on 3.19.6, and things are better
>> than last time I tried.
>>
>> legolas:/var/local/nobck/VirtualBox VMs# lsattr Win7/
>> ---C Win7/Logs
>> ---C Win7/Snapshots
>> ---C Win7/Win7.vdi
>> ---C Win7/Win7.png
>> ---C Win7/autotune1.png
>> ---C Win7/new_autotune2.png
>> ---C Win7/Win7.vbox-prev
>> ---C Win7/Win7.vbox
>>
>> But I have snapshots of that subvolume, so obviously that gets in the
>> way of disabling COW.
>>
>> I had a look, and I have 100K fragments. That took 10mn to figure out:
>> legolas:/var/local/nobck/VirtualBox VMs/Win7# filefrag Win7.vdi
>> Win7.vdi: 104306 extents found
>>
>> This first filefrag run took about 10mn to count all the fragments on my SSD.
>> That feels a bit slow, but maybe the userland tool is doing things in
>> suboptimal ways.
>>
>> Defrag actually worked (mostly) and wasn't too slow. It used to take hours
>> not to finish, and now it worked in 3mn:
>> legolas:/var/local/nobck/VirtualBox VMs/Win7# time btrfs fi defrag Win7.vdi
>> real 3m43.807s
>> user 0m0.000s
>> sys 0m44.044s
>>
>> This is definitely better than before. Note that it's not fully defragged, but
>> close enough.
>> Each subsequent run, filefrag is faster, and defrag is still faster than
>> filefrag:
>> legolas:/var/local/nobck/VirtualBox VMs/Win7# time filefrag Win7.vdi
>> Win7.vdi: 11428 extents found
>> real 2m42.090s
>> user 0m0.000s
>> sys 2m37.308s
>>
>> legolas:/var/local/nobck/VirtualBox VMs/Win7# time btrfs fi defrag Win7.vdi
>> real 0m7.483s
>> user 0m0.000s
>> sys 0m2.672s
>>
>> legolas:/var/local/nobck/VirtualBox VMs/Win7# time filefrag Win7.vdi
>> Win7.vdi: 11132 extents found
>> real 0m22.525s
>> user 0m0.000s
>> sys 0m22.264s
>>
>> It's a bit unexpected that I still have 10k fragments after 2 defrag runs,
>> but it's better than 100k :)
>>
>> Marc
>> --
>> A mouse is a device used to point at the xterm you want to type in - A.S.R.
>> Microsoft is to operating systems what McDonalds is to gourmet cooking
>> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
>
> --
> A mouse is a device used to point at the xterm you want to type in - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/
Re: counting fragments takes more time than defragmenting
On 24 June 2015 at 12:46, Duncan <1i5t5.dun...@cox.net> wrote:
> Patrik Lundquist posted on Wed, 24 Jun 2015 10:28:09 +0200 as excerpted:
>
> AFAIK, it's set huge to defrag everything,

It's set to 256K by default.

> Assuming set a huge -t to defrag to the maximum extent possible is
> correct, that means -t 1G should be exactly as effective as -t 1T...

1G is actually more effective because 1T overflows the uint32
extent_thresh field, so 1T, 0, and 256K are currently the same.

3G is the largest value that works with -t as expected (disregarding
the man page) and is easy to type.

> But btrfs or ext4, 31 extents ideal or a single extent ideal, 150 extents
> still indicates at least some remaining fragmentation.

I gave it another shot but I've now got 154 extents instead. :-)
[PATCH] btrfs-progs: Fix defrag threshold overflow
btrfs fi defrag -t 1T overflows the u32 thresh variable, so the default threshold is used instead of the maximum.

Signed-off-by: Patrik Lundquist patrik.lundqu...@gmail.com
---
 cmds-filesystem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 530f815..72bb45b 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1127,7 +1127,7 @@ static int cmd_defrag(int argc, char **argv)
 	int flush = 0;
 	u64 start = 0;
 	u64 len = (u64)-1;
-	u32 thresh = 0;
+	u64 thresh = 0;
 	int i;
 	int recursive = 0;
 	int ret = 0;
@@ -1186,7 +1186,7 @@ static int cmd_defrag(int argc, char **argv)
 	memset(&defrag_global_range, 0, sizeof(range));
 	defrag_global_range.start = start;
 	defrag_global_range.len = len;
-	defrag_global_range.extent_thresh = thresh;
+	defrag_global_range.extent_thresh = thresh > (u32)-1 ? (u32)-1 : (u32)thresh;
 	if (compress_type) {
 		defrag_global_range.flags |= BTRFS_DEFRAG_RANGE_COMPRESS;
 		defrag_global_range.compress_type = compress_type;
--
2.1.4
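The wrap-around that the patch addresses can be reproduced outside btrfs-progs. What follows is only a minimal Python sketch, not the btrfs-progs code: parse_size, truncate_u32 and clamp_u32 are hypothetical helpers that model storing a 64-bit value in a u32 variable and the saturating assignment from the patch.

```python
U32_MAX = 0xFFFFFFFF  # corresponds to (u32)-1 in the patch

# Binary size suffixes, as assumed for the -t argument in this sketch.
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Parse a size such as '256K' or '1T' into bytes (hypothetical helper)."""
    text = text.strip().upper()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def truncate_u32(value):
    """Model the old behaviour: only the low 32 bits survive."""
    return value & U32_MAX


def clamp_u32(value):
    """Model the patched behaviour: saturate at U32_MAX instead of wrapping."""
    return U32_MAX if value > U32_MAX else value


for arg in ("256K", "1G", "3G", "4G", "1T"):
    size = parse_size(arg)
    print(f"-t {arg:>4}: truncated={truncate_u32(size):>10} "
          f"clamped={clamp_u32(size):>10}")
```

In this model 1T and 4G both truncate to 0, which the thread says is treated like the 256K default, while 3G still fits in 32 bits. With clamping, any oversized value simply becomes the largest representable threshold.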
Re: btrfs performance - ssd array
On 12 January 2015 at 15:54, Austin S Hemmelgarn ahferro...@gmail.com wrote:
> Another thing to consider is that the kernel's default I/O scheduler and the default parameters for that I/O scheduler are almost always suboptimal for SSDs, and this tends to show far more with BTRFS than anything else. Personally I've found that using the CFQ I/O scheduler with the following parameters works best for a majority of SSDs:
> 1. slice_idle=0
> 2. back_seek_penalty=1
> 3. back_seek_max set equal to the size in sectors of the device
> 4. nr_requests and quantum set to the hardware command queue depth
>
> You can easily set these persistently for a given device with a udev rule like this:
> KERNEL=="sda", SUBSYSTEM=="block", ACTION=="add", ATTR{queue/scheduler}="cfq", ATTR{queue/iosched/back_seek_penalty}="1", ATTR{queue/iosched/back_seek_max}="device_size", ATTR{queue/iosched/quantum}="128", ATTR{queue/iosched/slice_idle}="0", ATTR{queue/nr_requests}="128"
>
> Make sure to replace '128' in the rule with whatever the command queue depth is for the device in question (It's usually 128 or 256, occasionally more), and device_size with the size of the device in kibibytes.

So is it size in sectors of the device or size of the device in kibibytes for back_seek_max? :-)
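The distinction matters because the two readings differ by a factor of two. The sketch below does not settle which unit back_seek_max expects; it only converts between them, assuming the sector count comes from /sys/block/<dev>/size, which is expressed in 512-byte units. The helper names and the example sector count are made up for illustration.

```python
SYSFS_SECTOR_BYTES = 512  # /sys/block/<dev>/size counts 512-byte units


def sectors_to_kib(sectors):
    """Convert a 512-byte sector count to kibibytes."""
    return sectors * SYSFS_SECTOR_BYTES // 1024


def kib_to_sectors(kib):
    """Convert kibibytes to a 512-byte sector count."""
    return kib * 1024 // SYSFS_SECTOR_BYTES


# Hypothetical device: 488397168 sectors (about 250 GB).
example_sectors = 488397168
print(example_sectors, "sectors =", sectors_to_kib(example_sectors), "KiB")
```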
Btrfs on top of LUKS (dm-crypt)
Hi,

I've been looking at recommended cryptsetup options for Btrfs and I have one question:

Marc uses "cryptsetup luksFormat --align-payload=1024" directly on a disk partition and not on e.g. a striped mdraid. Is there a Btrfs reason for that alignment?

http://marc.merlins.org/perso/btrfs/post_2014-04-27_Btrfs-Multi-Device-Dmcrypt.html

Thanks,
Patrik
Re: BTRFS free space handling still needs more work: Hangs again
On 28 December 2014 at 13:03, Martin Steigerwald mar...@lichtvoll.de wrote:
> BTW, I found that the Oracle blog didn't work at all for me. I completed a cycle of defrag, sdelete -c and VBoxManage compact, [...] and it apparently did *nothing* to reduce the size of the file.

They've changed the argument to -z; sdelete -z.
Re: A note on spotting bugs [Was: ENOSPC after conversion]
On 12 December 2014 at 14:29, Robert White rwh...@pobox.com wrote:
> You yourself even found the annotation in the wiki that said you should have e4defragged the system before conversion.

There's no mention of e4defrag on the Btrfs wiki; it says to btrfs defrag before balance to avoid ENOSPC, as the last step of conversion.

> What you are experiencing is a little vexing, but it's not a bug. It's not even a huge problem. And if you'd stop banging your head against it it wouldn't be any sort of problem at all. Neither of us can change these facts.

I stopped banging my head several emails ago. I understand the problem and I will start over.

> I feel your pain man, but thats about it.

I'm in no pain, it has been interesting. No data loss. No hurry.

> What more can I do?

The conversion wiki is lacking. It would be great if someone (maybe you?) could expand upon the drawbacks of conversion.

> What is it that you want?

Nothing more.
ENOSPC after conversion [Was: Fixing Btrfs Filesystem Full Problems typo?]
I'll reboot the thread with a recap and my latest findings.

* Half full 3TB disk converted from ext4 to Btrfs, after first verifying it with fsck.
* Undo subvolume deleted after being happy with the conversion.
* Recursive defrag.
* Full balance, that ended with "98 enospc errors during balance."

In that order, nothing in between. No snapshots or other subvolumes. Loads of real free space.

Btrfs check reports a clean filesystem.

Btrfs balance -musage=100 -dusage=99 works, but not -dusage=100.

Conversion of metadata (~1.55 GiB) to DUP worked fine.

A theory, based on the error messages, is that some of the converted files, even after defrag, still have extents larger than 1GiB and hence don't fit in a native Btrfs extent.

Running defrag several more times and balance again doesn't help.

An error looks like:

BTRFS info (device sdc1): relocating block group 1821099687936 flags 1
BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
BTRFS: space_info 1 has 4773171200 free, is not full
BTRFS: space_info total=1494648619008, used=1489775505408, pinned=0, reserved=99700736, may_use=2102390784, readonly=241664

The following script returned 46 filenames (looking up the block group in the error):

grep -B 1 "BTRFS error" /var/log/syslog | grep relocating | cut -d ' ' -f 14 | \
while read block
do
    echo "Block group: $block"
    btrfs inspect-internal logical-resolve "$block" /mnt
done

The files range from 41KiB to 6.6GiB in size, which doesn't seem to support the theory of too large extents.

Moving the 46 files to another disk (no errors reported) and running balance again resulted in "64 enospc errors during balance" - down from 98 errors.

Running the above script again gives this error for about half of the block groups:

ioctl ret=-1, error: No such file or directory

I had no such errors the first time I looked up block groups.

What's the next step in zeroing in on the bug, before I start over? And I will start over.
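The grep/cut pipeline above depends on the block group number sitting in field 14 of each syslog line, which changes with the log prefix. As a hedged alternative, here is a minimal Python sketch that pulls out the same numbers by pattern instead of by field position. failed_block_groups is a hypothetical helper and the sample lines are copied from the error quoted in this thread; it only parses text and does not call btrfs.

```python
import re

RELOCATING = re.compile(r"relocating block group (\d+)")


def failed_block_groups(log_lines):
    """Return block groups whose 'relocating' line is directly followed by
    a 'BTRFS error' line, mirroring grep -B 1 "BTRFS error" | grep relocating.
    """
    failed = []
    previous = None
    for line in log_lines:
        if "BTRFS error" in line and previous is not None:
            match = RELOCATING.search(previous)
            if match:
                failed.append(int(match.group(1)))
        previous = line
    return failed


sample = [
    "BTRFS info (device sdc1): relocating block group 1821099687936 flags 1",
    "BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920",
    "BTRFS: space_info 1 has 4773171200 free, is not full",
]
print(failed_block_groups(sample))
```

Each returned number could then be passed to btrfs inspect-internal logical-resolve as in the shell loop.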
Re: Fixing Btrfs Filesystem Full Problems typo?
On 11 December 2014 at 09:42, Robert White rwh...@pobox.com wrote:
> On 12/10/2014 05:36 AM, Patrik Lundquist wrote:
>> On 10 December 2014 at 13:17, Robert White rwh...@pobox.com wrote:
>>> On 12/09/2014 11:19 PM, Patrik Lundquist wrote:
>>>
>>> BUT FIRST UNDERSTAND: you do _not_ need to balance a newly converted filesystem. That is, the recommended balance (and recursive defrag) is _not_ a useability issue, its an efficiency issue.
>>
>> But if I can't start with an efficient filesystem I'd rather start over now/soon. I intend to add four more old disks for a RAID1 and it will be problematic to start over later on (I'd have to buy new, large disks).
>
> Nope, not an issue. When you add the space and rebalance with the conversions by adding all those other disks and such it will _completely_ _obliterate_ the current balance.

But if the issue is too large extents, why would they fit on any added btrfs space?

> You are cleaning the house before the maid comes.

Indeed, as a health check. And the patient is slightly ill.

> If you are going to add four more volumes, if those volumes are big enough just make a new filesystem on them then copy the files over.

As it looks now, I will, but I also think there's a bug which I'm trying to zero in on.

>> I deleted the subvolume after being satisfied with the conversion, defragged recursively, and balanced. In that order.
>
> Yea, but your file system is full and you are out of space so get on with the adding space.

I don't think it is full. balance -musage=100 -dusage=99 completes with ~1.5TB free space. The remaining unbalanced data is using full or close to full blocks. Still can't speak for contiguous space though.

> (looking back through my mail spool) You haven't sent the output of /bin/df or btrfs fi df yet, I'd like to see what those two commands say.

I have posted these before, but not /bin/df (no access at the moment).

btrfs fi show
Label: none uuid: 770fe01d-6a45-42b9-912e-e8f8b413f6a4
Total devices 1 FS bytes used 1.35TiB
devid 1 size 2.73TiB used 1.36TiB path /dev/sdc1

btrfs fi df /mnt
Data, single: total=1.35TiB, used=1.35TiB
System, single: total=32.00MiB, used=112.00KiB
Metadata, single: total=3.00GiB, used=1.55GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

btrfs check /dev/sdc1
Checking filesystem on /dev/sdc1
UUID: 770fe01d-6a45-42b9-912e-e8f8b413f6a4
found 825003219475 bytes used err is 0
total csum bytes: 1452612464
total tree bytes: 1669943296
total fs tree bytes: 39600128
total extent tree bytes: 52903936
btree space waste bytes: 79921034
file data blocks allocated: 1487627730944
 referenced 1487627730944

>>> This would be quadruply true if you'd tweaked the block group ratios when you made the original file system.
>>
>> Ext4 created with defaults, but I think it has been completely full at one time.
>
> Did you use e4defrag before you did the conversion or is this the result of converting chaos most profound?

Didn't use e4defrag.

>>> Think of the time and worry you'd have saved if you'd copied the thing in the first place. 8-)
>>
>> But then I wouldn't learn as much. :-)
>
> Learning not to cut corners is a lesson... 8-)

This is more of an experiment than cutting corners, but yeah.

> TRUTH BE TOLD :: After two very eventful conversions not too long ago I just don't do those any more. The total amount of time I saved by not copying the files was in the negative numbers before I just copied the files onto an external media and reformatted and restored.

Conversion probably should be discouraged on the wiki then.

> It's like a choose-your-own-adventure book! 8-)

I like that! :-)
Re: Fixing Btrfs Filesystem Full Problems typo?
On 11 December 2014 at 05:13, Duncan 1i5t5.dun...@cox.net wrote:
> Patrik correct me if I have this wrong, but filling in the history as I believe I have it...

You're right Duncan, except it began as a private question about an error in a blog and went from there. Not that it matters, except the subject is not very fitting anymore and I tried to reboot the thread with a summary since it's getting a bit hard to find the facts.
Re: ENOSPC after conversion [Was: Fixing Btrfs Filesystem Full Problems typo?]
On 11 December 2014 at 11:18, Robert White rwh...@pobox.com wrote:
> So far I don't see a bug.

Fair enough, let's call it a huge problem with btrfs convert. I think it warrants a note in the wiki.

> On 12/11/2014 12:18 AM, Patrik Lundquist wrote:
>> Running defrag several more times and balance again doesn't help.
>
> That sounds correct as defrag defrags files, it does not reallocate extents.

From https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3

"A notable caveat is that a balance can fail with ENOSPC if the defragment is skipped. This is usually due to large extents on ext being larger than the maximum size btrfs normally operates with (1 GB). A defrag of all large files will avoid this:"

I interpreted it as breaking down large extents and reallocating them, thus avoiding my current situation.

> There's a good chance that if you balanced again and again the number of no space errors might decrease. With only one 2-ish gig empty slot sliding around like one of those puzzles where you have to sort the numbers from 1 to 15 by sliding them around in the 4x4=16 element grid.

I was never fond of those puzzles.

> The first step is admitting that you _don't_ have a problem.

I've got 99 problems and balance is one of them (the others are block groups). :-)

Of course the filesystem is in a problematic state after the conversion, even if it's not a bug. ~1.5TB of free space and yet out of space and it can't be fixed with a balance. It might not be wrong per se but it's very problematic from a user perspective.

Anyway, this thread has turned up lots of good information.

> You are _not_ out of space in which to create files. (or so I presume, you still haven't posted the output of /bin/df or btrfs filesystem df).

I'm not; creating new files works.

$ df
Filesystem      1K-blocks       Used  Available Use% Mounted on
/dev/sdc1      2930265088 1402223656 1526389096  48% /mnt

$ btrfs fi df /mnt
Data, single: total=1.41TiB, used=1.30TiB
System, DUP: total=32.00MiB, used=124.00KiB
Metadata, DUP: total=2.50GiB, used=1.49GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

> Your next step is to either add storage in accordance with your plan of adding four more volumes to make a RAID (as expressed elsewhere), or make a clean filesystem and copy your files over.

I've already decided to start over with a clean filesystem to get rid of the ext4 legacy. I'm only curious about how to solve the balance problem, and now I know how.
Re: A note on spotting bugs [Was: ENOSPC after conversion]
On 11 December 2014 at 23:00, Robert White rwh...@pobox.com wrote:
> On 12/11/2014 12:18 AM, Patrik Lundquist wrote:
>> * Full balance, that ended with "98 enospc errors during balance."
>
> Assuming that quote is an actual quote from the output of the balance...

It is, from dmesg.

> Bugs are unexpected things that cause failures and/or damage.

Not all errors are as pretty as

BTRFS info (device sdc1): relocating block group 1756675178496 flags 1
BTRFS error (device sdc1): allocation failed flags 1, wanted 1272844288
BTRFS: space_info 1 has 13703077888 free, is not full
BTRFS: space_info total=1504312295424, used=1487622750208, pinned=0, reserved=2986196992, may_use=1308749824, readonly=270336

some are

BTRFS info (device sdc1): relocating block group 1780297498624 flags 1
[ cut here ]
WARNING: CPU: 2 PID: 11094 at /build/linux-Y9HjRe/linux-3.16.7/fs/btrfs/extent-tree.c:7280 btrfs_alloc_free_block+0x219/0x450 [btrfs]()
BTRFS: block rsv returned -28
Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc btrfs xor nls_utf8 nls_cp437 vfat fat kvm_intel raid6_pq kvm crc32_pclmul jc42 coretemp ghash_clmulni_intel iTCO_wdt ipmi_watchdog iTCO_vendor_support aesni_intel joydev aes_x86_64 efi_pstore lrw gf128mul evdev glue_helper ast ablk_helper lpc_ich cryptd ttm pcspkr efivars mfd_core i2c_i801 drm_kms_helper drm tpm_tis tpm acpi_cpufreq i2c_ismt shpchp button processor thermal_sys ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler autofs4 ext4 crc16 mbcache jbd2 sg sd_mod crc_t10dif crct10dif_generic hid_generic usbhid hid ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel igb libata ehci_pci i2c_algo_bit xhci_hcd ehci_hcd i2c_core dca scsi_mod ptp usbcore pps_core usb_common
CPU: 2 PID: 11094 Comm: btrfs Tainted: GW 3.16.0-4-amd64 #1 Debian 3.16.7-2
Hardware name: Supermicro A1SAi/A1SAi, BIOS 1.0c 02/27/2014
 0009 81506b43 88032779f780 81065717 88032d68a640 88032779f7d0 1000 8803117df480 8106577c a0536338 0020
Call Trace:
 [81506b43] ? dump_stack+0x41/0x51
 [81065717] ? warn_slowpath_common+0x77/0x90
 [8106577c] ? warn_slowpath_fmt+0x4c/0x50
 [a04a8b09] ? btrfs_alloc_free_block+0x219/0x450 [btrfs]
 [81142bf6] ? free_hot_cold_page_list+0x46/0x90
 [a04dc5c8] ? read_extent_buffer+0xc8/0x120 [btrfs]
 [a0492c31] ? btrfs_copy_root+0x101/0x2e0 [btrfs]
 [a05032d1] ? create_reloc_root+0x201/0x2d0 [btrfs]
 [a0509398] ? btrfs_init_reloc_root+0x98/0xb0 [btrfs]
 [a04b9564] ? record_root_in_trans+0xa4/0xf0 [btrfs]
 [a04ba95f] ? btrfs_record_root_in_trans+0x3f/0x70 [btrfs]
 [a04bb940] ? start_transaction+0x90/0x560 [btrfs]
 [a04c605a] ? btrfs_evict_inode+0x33a/0x4d0 [btrfs]
 [811bf0ec] ? evict+0xac/0x170
 [a04c0762] ? btrfs_run_delayed_iputs+0xd2/0xf0 [btrfs]
 [a04bb812] ? btrfs_commit_transaction+0x922/0x9c0 [btrfs]
 [a04bb940] ? start_transaction+0x90/0x560 [btrfs]
 [a0504ea4] ? prepare_to_relocate+0xf4/0x1b0 [btrfs]
 [a0509e72] ? relocate_block_group+0x42/0x670 [btrfs]
 [a050a667] ? btrfs_relocate_block_group+0x1c7/0x2d0 [btrfs]
 [a04e0432] ? btrfs_relocate_chunk.isra.27+0x62/0x700 [btrfs]
 [a04928d1] ? btrfs_set_path_blocking+0x31/0x70 [btrfs]
 [a0497d8d] ? btrfs_search_slot+0x4ad/0xad0 [btrfs]
 [a04d1fd5] ? btrfs_get_token_64+0x55/0xf0 [btrfs]
 [a04e355b] ? btrfs_balance+0x82b/0xe80 [btrfs]
 [a04eaba4] ? btrfs_ioctl_balance+0x154/0x500 [btrfs]
 [a04ef89c] ? btrfs_ioctl+0x58c/0x2b10 [btrfs]
 [811670f1] ? handle_mm_fault+0xa91/0x11a0
 [810562a1] ? __do_page_fault+0x1d1/0x4e0
 [8116afc1] ? vma_link+0xb1/0xc0
 [811b788f] ? do_vfs_ioctl+0x2cf/0x4b0
 [811b7af1] ? SyS_ioctl+0x81/0xa0
 [8150ecc8] ? page_fault+0x28/0x30
 [8150cc2d] ? system_call_fast_compare_end+0x10/0x15
---[ end trace 880987d36ae50245 ]---
BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
BTRFS: space_info 1 has 8384299008 free, is not full
BTRFS: space_info total=1500017328128, used=1491533037568, pinned=0, reserved=99807232, may_use=2147475456, readonly=184320
Re: Fixing Btrfs Filesystem Full Problems typo?
On 10 December 2014 at 13:17, Robert White rwh...@pobox.com wrote:
> On 12/09/2014 11:19 PM, Patrik Lundquist wrote:
>
> BUT FIRST UNDERSTAND: you do _not_ need to balance a newly converted filesystem. That is, the recommended balance (and recursive defrag) is _not_ a useability issue, its an efficiency issue.

But if I can't start with an efficient filesystem I'd rather start over now/soon. I intend to add four more old disks for a RAID1 and it will be problematic to start over later on (I'd have to buy new, large disks).

I deleted the subvolume after being satisfied with the conversion, defragged recursively, and balanced. In that order.

> Because you made a backup and everything yes?

Shh!

> So anyway. Your system isn't bugged or broken it's full but its a fragmented fullness that has lots of free sectors but insufficient contiguous free sectors, so it cannot satisfy the request.

It's a half full 3TB disk. There _is_ space, somewhere. I can't speak for contiguous space though.

>> I don't know how to interpret the space_info error. Why is only 4773171200 (4,4GiB) free? Can I inspect block group 1821099687936 to try to find out what makes it problematic?
>>
>> BTRFS info (device sdc1): relocating block group 1821099687936 flags 1
>> BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
>> BTRFS: space_info 1 has 4773171200 free, is not full
>> BTRFS: space_info total=1494648619008, used=1489775505408, pinned=0, reserved=99700736, may_use=2102390784, readonly=241664
>
> So it was looking for a single chunk 2013265920 bytes long and it couldn't find one because all the spaces were smaller and there was no room to make a new suitable space.
>
> The problem is that it wanted 2013265920 bytes and the system as a whole had no way to satisfy that desire. It asked for something just shy of two gigs as a single extent. That's a tough order on a full platter.
>
> Since your entire free size is 2102390784 that is an attempt to allocate about 80% of your free space as one contiguous block. That's never going to happen. 8-)

What about "space_info 1 has 4773171200 free"? Besides the other 1,5TB free space.

> I don't even know if 2GiB is normally a legal size for an extent. My understanding is that data is allocated in 1G chunks, so I'd expect all extents to be smaller than 1G.

The 'summary' after the failed balances is always something like "98 enospc errors" which now makes me suspect that I have 98 files with extents larger than 1GiB that the defrag didn't take care of.

So if I can find out which files have >1GiB extents I can then copy them back and forth to solve the problem. Maybe running defrag more times can also solve it? Can I get a list of fragmented files?

Suppose an old file with 2GiB extent isn't fragmented, will btrfs defrag still try to defrag it?

> After a quick glance at the btrfs-convert, it looks like it might make some pretty atypical extents if the underlying donor filesystem needed them. It wouldn't have had a choice. So it's easily within the realm of reason that you'd have some really fascinating data as a result of converting a nearly full EXT4 file system of the Terabyte+ size.

It was about half full at conversion.

> This would be quadruply true if you'd tweaked the block group ratios when you made the original file system.

Ext4 created with defaults, but I think it has been completely full at one time.

> So since you have nice backups... you should probably drop the ext2_saved subvolume and then get on with your life for good or ill.

Done before defrag and balance attempts.

> Think of the time and worry you'd have saved if you'd copied the thing in the first place. 8-)

But then I wouldn't learn as much. :-)

>>> P.S. you should re-balance your System and Metadata as DUP for now. Two copies of that stuff is better than one as right now you have no real recovery path for that stuff. If you didn't make that change on purpose it probably got down-revved from DUP automagically when you tried to RAID it.
>>
>> Good point. Maybe btrfs-convert should do that by default? I don't think it has ever been DUP.
>
> Eyup.

And the metadata is now DUP. That's ~1.5GB extra metadata that was allocated just fine after the failed balance.
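To make the numbers in the allocation failure easier to read, here is a minimal Python sketch that parses the message quoted in this thread and converts the byte counts to GiB. parse_allocation_failure is a hypothetical helper. The relation it checks (free = total - used - pinned - reserved - readonly) is only an observation that holds for the messages posted in this thread, not a statement about how the kernel computes the figure.

```python
import re

GIB = 1 << 30

MESSAGE = """\
BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
BTRFS: space_info 1 has 4773171200 free, is not full
BTRFS: space_info total=1494648619008, used=1489775505408, pinned=0, \
reserved=99700736, may_use=2102390784, readonly=241664
"""


def parse_allocation_failure(text):
    """Extract 'wanted', 'free' and the key=value counters from the message."""
    wanted = int(re.search(r"wanted (\d+)", text).group(1))
    free = int(re.search(r"has (\d+) free", text).group(1))
    fields = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", text)}
    return wanted, free, fields


wanted, free, f = parse_allocation_failure(MESSAGE)
derived = f["total"] - f["used"] - f["pinned"] - f["reserved"] - f["readonly"]

print(f"wanted : {wanted / GIB:.3f} GiB")
print(f"free   : {free / GIB:.3f} GiB")
print(f"derived: {derived / GIB:.3f} GiB")
```

The requested allocation works out to 1.875 GiB, which is larger than the 1 GiB figure discussed above, and the reported "free" value matches the difference of the other counters in the same message.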
Re: Fixing Btrfs Filesystem Full Problems typo?
On 10 December 2014 at 14:11, Duncan 1i5t5.dun...@cox.net wrote:
> From there... I've never used it but I /think/ btrfs inspect-internal logical-resolve should let you map the 182109... address to a filename. From there, moving that file out of the filesystem and back in should eliminate that issue.

btrfs inspect-internal logical-resolve 1821099687936 /mnt gives me the filename, and it's only a 54175-byte file.

> Assuming no snapshots still contain the file, of course, and that the ext* saved subvolume has already been deleted.

Got no snapshots or subvolumes. Keeping it simple for now.
Re: Fixing Btrfs Filesystem Full Problems typo?
On 10 December 2014 at 13:47, Duncan 1i5t5.dun...@cox.net wrote:
> The recursive btrfs defrag after deleting the saved ext* subvolume _should_ have split up any such 1 GiB extents so balance could deal with them, but either it failed for some reason on at least one such file, or there's some other weird corner-case going on, very likely something else having to do with the conversion.

I've run defrag several times again and it doesn't do anything additional.

> Patrik, assuming no btrfs snapshots yet, can you do a du --all --block-size=1M | sort -n (or similar), then take a look at all results over 1024 (1 GiB since the du specified 1 MiB blocks), and see if it's reasonable to move all those files out of the filesystem and back?

Good idea, but it's quite a lot of files. I'd rather start over.

But I've identified 46 files from Btrfs errors in syslog and will try to move them to another disk. They range from 41KiB to 6.6GiB in size.

Is btrfs-debug-tree -e useful in finding problematic files?
Re: Fixing Btrfs Filesystem Full Problems typo?
On 10 December 2014 at 23:28, Robert White rwh...@pobox.com wrote:
> On 12/10/2014 10:56 AM, Patrik Lundquist wrote:
>> On 10 December 2014 at 14:11, Duncan 1i5t5.dun...@cox.net wrote:
>>> Assuming no snapshots still contain the file, of course, and that the ext* saved subvolume has already been deleted.
>>
>> Got no snapshots or subvolumes. Keeping it simple for now.
>
> Does that mean that you have already manually removed the subvolume that was automatically created by btrfs-convert?

Yes.
Re: Fixing Btrfs Filesystem Full Problems typo?
On 24 November 2014 at 13:35, Patrik Lundquist patrik.lundqu...@gmail.com wrote:
> On 24 November 2014 at 05:23, Duncan 1i5t5.dun...@cox.net wrote:
>> Patrik Lundquist posted on Sun, 23 Nov 2014 16:12:54 +0100 as excerpted:
>>
>>> The balance run now finishes without errors with usage=99 and I think I'll leave it at that. No RAID yet but will convert to RAID1.
>>
>> Converting between raid modes is done with a balance, so if you can't get that last bit to balance, you can't do a full conversion to raid1.
>
> Good point! It slipped my mind.
>
> I'll report back if incremental balances eventually solves the balance after conversion ENOSPC problem.

I'm having no luck with a full balance of the converted filesystem. Tried it again with Linux v3.18.0 and btrfs-progs v3.17.3.

What conclusions can be drawn from the following?

BTRFS info (device sdc1): relocating block group 1821099687936 flags 1
BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
BTRFS: space_info 1 has 4773171200 free, is not full
BTRFS: space_info total=1494648619008, used=1489775505408, pinned=0, reserved=99700736, may_use=2102390784, readonly=241664
BTRFS: block group 234109272064 has 5368709120 bytes, 5368709120 used 0 pinned 0 reserved
BTRFS info (device sdc1): block group has cluster?: no
BTRFS info (device sdc1): 0 blocks of free space at or bigger than bytes is
BTRFS: block group 242699206656 has 5368709120 bytes, 5368709120 used 0 pinned 0 reserved
BTRFS info (device sdc1): block group has cluster?: no
BTRFS info (device sdc1): 0 blocks of free space at or bigger than bytes is
BTRFS: block group 339335970816 has 5368709120 bytes, 5368705024 used 0 pinned 0 reserved
BTRFS critical (device sdc1): entry offset 344704675840, bytes 4096, bitmap no

Label: none uuid: 770fe01d-6a45-42b9-912e-e8f8b413f6a4
Total devices 1 FS bytes used 1.35TiB
devid 1 size 2.73TiB used 1.36TiB path /dev/sdc1

Data, single: total=1.35TiB, used=1.35TiB
System, single: total=32.00MiB, used=112.00KiB
Metadata, single: total=3.00GiB, used=1.55GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Checking filesystem on /dev/sdc1
UUID: 770fe01d-6a45-42b9-912e-e8f8b413f6a4
found 825003219475 bytes used err is 0
total csum bytes: 1452612464
total tree bytes: 1669943296
total fs tree bytes: 39600128
total extent tree bytes: 52903936
btree space waste bytes: 79921034
file data blocks allocated: 1487627730944
 referenced 1487627730944
Re: Fixing Btrfs Filesystem Full Problems typo?
On 10 December 2014 at 00:13, Robert White rwh...@pobox.com wrote:
> On 12/09/2014 02:29 PM, Patrik Lundquist wrote:
>>
>> Label: none uuid: 770fe01d-6a45-42b9-912e-e8f8b413f6a4
>> Total devices 1 FS bytes used 1.35TiB
>> devid 1 size 2.73TiB used 1.36TiB path /dev/sdc1
>>
>> Data, single: total=1.35TiB, used=1.35TiB
>> System, single: total=32.00MiB, used=112.00KiB
>> Metadata, single: total=3.00GiB, used=1.55GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Are you trying to convert a filesystem on a single device/partition to RAID 1?

Not yet. I'm stuck at the full balance after the conversion from ext4. I haven't added the disks for RAID1 and might need them for starting over instead.

A balance with -musage=100 -dusage=99 works but a full balance fails. It would be nice to nail the bug since the fs passes btrfs check and it seems to be a clear ENOSPC bug.

I don't know how to interpret the space_info error. Why is only 4773171200 (4,4GiB) free? Can I inspect block group 1821099687936 to try to find out what makes it problematic?

BTRFS info (device sdc1): relocating block group 1821099687936 flags 1
BTRFS error (device sdc1): allocation failed flags 1, wanted 2013265920
BTRFS: space_info 1 has 4773171200 free, is not full
BTRFS: space_info total=1494648619008, used=1489775505408, pinned=0, reserved=99700736, may_use=2102390784, readonly=241664

> P.S. you should re-balance your System and Metadata as DUP for now. Two copies of that stuff is better than one as right now you have no real recovery path for that stuff. If you didn't make that change on purpose it probably got down-revved from DUP automagically when you tried to RAID it.

Good point. Maybe btrfs-convert should do that by default? I don't think it has ever been DUP.
Re: scrub implies failing drive - smartctl blissfully unaware
On 25 November 2014 at 22:34, Phillip Susi ps...@ubuntu.com wrote:
> On 11/19/2014 7:05 PM, Chris Murphy wrote:
>> I'm not a hard drive engineer, so I can't argue either point. But consumer drives clearly do behave this way. On Linux, the kernel's default 30 second command timer eventually results in what look like link errors rather than drive read errors. And instead of the problems being fixed with the normal md and btrfs recovery mechanisms, the errors simply get worse and eventually there's data loss. Exhibits A, B, C, D - the linux-raid list is full to the brim of such reports and their solution.
>
> I have seen plenty of error logs of people with drives that do properly give up and return an error instead of timing out so I get the feeling that most drives are properly behaved. Is there a particular make/model of drive that is known to exhibit this silly behavior?

I had a couple of Seagate Barracuda 7200.11 (codename Moose) drives with seriously broken firmware. They never reported a read error AFAIK but began to time out instead. They wouldn't even respond after a link reset. I had to power cycle the disks.

Funny days with ddrescue. Got almost everything off them.
Re: scrub implies failing drive - smartctl blissfully unaware
On 25 November 2014 at 23:14, Phillip Susi ps...@ubuntu.com wrote:
> On 11/19/2014 6:59 PM, Duncan wrote:
>> The paper specifically mentioned that it wasn't necessarily the more expensive devices that were the best, either, but the ones that fared best did tend to have longer device-ready times. The conclusion was that a lot of devices are cutting corners on device-ready, gambling that in normal use they'll work fine, leading to an acceptable return rate, and evidently, the gamble pays off most of the time.
>
> I believe I read the same study and don't recall any such conclusion. Instead the conclusion was that the badly behaving drives aren't ordering their internal writes correctly and flushing their metadata from ram to flash before completing the write request. The problem was on the power *loss* side, not the power application.

I've found:

http://www.usenix.org/conference/fast13/technical-sessions/presentation/zheng
http://lkcl.net/reports/ssd_analysis.html

Are there any more studies?
Re: Fixing Btrfs Filesystem Full Problems typo?
On 23 November 2014 at 08:52, Duncan 1i5t5.dun...@cox.net wrote:
> [a whole lot]

Thanks for the long post, Duncan.

My venture into the finer details of balance began with converting an ext4 fs to btrfs and, after an initial defrag, having a full balance fail with about a third to go.

Consecutive full balances further reduced the number of chunks and got me closer to finishing without the infamous ENOSPC. After 3-4 full balance runs it failed with less than 8% to go.

The balance run now finishes without errors with usage=99 and I think I'll leave it at that. No RAID yet but will convert to RAID1.

Is it correct that there is no reason to ever do a 100% balance as routine maintenance? I mean if you really need that last 1% space you actually need a disk upgrade instead.

How about running a monthly maintenance job that uses bytes_used and dev_item.bytes_used from btrfs-show-super to approximate the balance need?

(dev_item.bytes_used - bytes_used) / bytes_used == extra device space used

The extra device space used after my balance usage=99 is 0.15%. It was 7.0% before I began tinkering with usage and ran into ENOSPC, and I think it is safe to assume that it was a lot more right after the fs conversion.

So let's iterate a balance run which begins with usage=0, increases in steps of 5 or 10, and stops at 90 or 99 or when the extra device space used is less than 1%.

Does it make sense?
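The maintenance job proposed above could be sketched roughly like this in Python. It is only a sketch under assumptions: the function names are made up for illustration, the two counters are supplied by a caller-provided hook (parsing btrfs-show-super output is not shown), and the step size, stop value and 1% threshold are simply the numbers floated in the message.

```python
def extra_device_space(dev_bytes_used: int, bytes_used: int) -> float:
    """(dev_item.bytes_used - bytes_used) / bytes_used, as a fraction."""
    if bytes_used <= 0:
        raise ValueError("bytes_used must be positive")
    return (dev_bytes_used - bytes_used) / bytes_used


def usage_steps(step: int = 10, stop: int = 99) -> list[int]:
    """usage= values to try in order: 0, step, 2*step, ..., ending at stop."""
    steps = list(range(0, stop, step))
    steps.append(stop)
    return steps


def iterate_balance(read_counters, run_balance, step=10, stop=99, threshold=0.01):
    """Run balance passes with rising usage= until the overhead is small.

    read_counters() must return (dev_item.bytes_used, bytes_used) and
    run_balance(usage) must perform one balance pass. Both are hypothetical
    hooks supplied by the caller; nothing here invokes btrfs itself.
    Returns the usage values that were actually run.
    """
    done = []
    for usage in usage_steps(step, stop):
        dev_used, used = read_counters()
        # Stop as soon as the extra device space drops below the threshold.
        if extra_device_space(dev_used, used) < threshold:
            break
        run_balance(usage)
        done.append(usage)
    return done
```

With made-up counters of 1070 and 1000, extra_device_space gives 0.07, i.e. the 7.0% situation described above. Re-reading the counters before every pass means the loop ends early once a cheap low-usage pass has already brought the overhead under 1%.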
Re: Fixing Btrfs Filesystem Full Problems typo?
On 22 November 2014 at 23:26, Marc MERLIN m...@merlins.org wrote:
> This one hurts my brain every time I think about it :)

I'm new to Btrfs so I may very well be wrong, since I haven't really read up on it. :-)

> So, the bigger the -dusage number, the more work btrfs has to do.

Agreed.

> -dusage=0 does almost nothing
> -dusage=100 effectively rebalances everything

And -dusage=0 effectively reclaims empty chunks, right?

> But saying "less than 95% full" for -dusage=95 would mean
> rebalancing everything that isn't almost full,

But isn't that what rebalance does? Rewriting chunks <=95% full into completely full chunks, effectively defragmenting chunks and most likely reducing the number of chunks.

A -dusage=0 rebalance reduced my number of chunks from 1173 to 998 and dev_item.bytes_used went from 1593466421248 to 1491460947968.

> Now, just to be sure, if I'm getting this right, if your filesystem is
> 55% full, you could rebalance all blocks that have less than 55% space
> free, and use -dusage=55

I realize that I interpret the usage parameter as operating on blocks (chunks? are they the same in this case?) that are <= 55% full while you interpret it as <= 55% free. Which is correct?
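The two readings of the usage parameter discussed above can be put side by side in a toy model. This is an illustration only, not btrfs code: the Chunk type, both selector functions and the sample chunks are invented here, and treating the boundary as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    size: int  # bytes allocated to the chunk
    used: int  # bytes actually occupied

    @property
    def percent_full(self) -> float:
        return 100.0 * self.used / self.size


def select_percent_full(chunks, usage):
    """Reading 1: pick chunks that are at most `usage` percent full."""
    return [c for c in chunks if c.percent_full <= usage]


def select_percent_free(chunks, usage):
    """Reading 2: pick chunks that are at most `usage` percent free."""
    return [c for c in chunks if 100.0 - c.percent_full <= usage]


GIB = 1 << 30
# Hypothetical sample: one empty, one half-full and one full chunk.
chunks = [Chunk(GIB, 0), Chunk(GIB, GIB // 2), Chunk(GIB, GIB)]

empty_only = select_percent_full(chunks, 0)  # just the empty chunk
full_only = select_percent_free(chunks, 0)   # just the full chunk
```

Under reading 1, usage=0 picks only the empty chunk, which fits the observation above that a -dusage=0 pass cut the chunk count from 1173 to 998. The reported drop in dev_item.bytes_used, 1593466421248 - 1491460947968 = 102005473280 bytes, is exactly 95 GiB.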