Re: List of known BTRFS Raid 5/6 Bugs?

2018-08-10 Thread Zygo Blaxell
On Sat, Aug 11, 2018 at 04:18:35AM +0200, erentheti...@mail.de wrote: > Write hole: > > > > The data will be readable until one of the data blocks becomes > > inaccessible (bad sector or failed disk). This is because it is only the > > parity block that is corrupted (old data blocks are still

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Andrei Borzenkov
On 10.08.2018 10:33, Tomasz Pala wrote: > On Fri, Aug 10, 2018 at 07:03:18 +0300, Andrei Borzenkov wrote: > >>> So - the limit set on any user >> >> Does btrfs support per-user quota at all? I am aware only of per-subvolume >> quotas. > > Well, this is a kind of deceptive word usage in

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Duncan
Chris Murphy posted on Fri, 10 Aug 2018 12:07:34 -0600 as excerpted: > But whether data is shared or exclusive seems potentially ephemeral, and > not something a sysadmin should even be able to anticipate let alone > individual users. Define "user(s)". Arguably, in the context of btrfs tool

Re: List of known BTRFS Raid 5/6 Bugs?

2018-08-10 Thread erenthetitan
Write hole: > The data will be readable until one of the data blocks becomes > inaccessible (bad sector or failed disk). This is because it is only the > parity block that is corrupted (old data blocks are still not modified > due to btrfs CoW), and the parity block is only required when
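The write hole behaviour described above can be illustrated with a toy XOR-parity stripe. This is a minimal sketch, not btrfs code: it assumes a two-data-block RAID5-style stripe, and the names `xor_blocks`, `good_parity` and `stale_parity` are made up for the illustration. It shows that direct reads of the data blocks never touch parity, while a rebuild from stale parity silently returns wrong data.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte (RAID5-style parity)."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Toy stripe with two data blocks; parity is the XOR of both.
d0 = b"old0"
d1 = b"old1"
good_parity = xor_blocks(d0, d1)

# Simulated write hole: parity was recomputed for a new d0 that never
# reached the disk, so the on-disk data blocks are still the old ones.
stale_parity = xor_blocks(b"new0", d1)

# Rebuilding d1 after "losing" it: correct with good parity,
# wrong with the stale parity left behind by the interrupted write.
rebuilt_ok = xor_blocks(d0, good_parity)
rebuilt_bad = xor_blocks(d0, stale_parity)

print(rebuilt_ok == d1)   # parity consistent: rebuild matches
print(rebuilt_bad == d1)  # parity stale: rebuild does not match
```

As long as both d0 and d1 stay readable, nothing in this model ever looks at the parity block, which is why the damage only shows up once a data block becomes inaccessible.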

Re: List of known BTRFS Raid 5/6 Bugs?

2018-08-10 Thread Zygo Blaxell
On Fri, Aug 10, 2018 at 06:55:58PM +0200, erentheti...@mail.de wrote: > Did i get you right? > Please correct me if i am wrong: > > Scrubbing seems to have been fixed, you only have to run it once. Yes. There is one minor bug remaining here: when scrub detects an error on any disk in a raid5/6

Re: List of known BTRFS Raid 5/6 Bugs?

2018-08-10 Thread erenthetitan
Did I get you right? Please correct me if I am wrong: Scrubbing seems to have been fixed, you only have to run it once. Hotplugging (temporary connection loss) is affected by the write hole bug, and will create undetectable errors every 16 TB (crc32 limitation). The write hole bug can affect
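One plausible reading of the "16 TB (crc32 limitation)" figure is simple arithmetic: a 32-bit checksum accepts a randomly corrupted block with probability 2^-32, so on average one bad block slips through per 2^32 corrupted blocks. The sketch below works that out under the assumption of 4 KiB checksummed blocks; the block size and the function name are assumptions for the illustration, not taken from the thread.

```python
CRC_BITS = 32          # width of the crc32 checksum
BLOCK_SIZE = 4096      # assumption: 4 KiB per checksummed block


def corrupted_bytes_per_false_match(crc_bits: int, block_size: int) -> int:
    """Expected amount of corrupted data per undetected (colliding) block."""
    return (2 ** crc_bits) * block_size


expected = corrupted_bytes_per_false_match(CRC_BITS, BLOCK_SIZE)
print(expected / 2 ** 40, "TiB")
```

Note that this counts corrupted data, not data written: blocks that are intact always pass their checksum and do not contribute to the estimate.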

Re: BUG: scheduling while atomic

2018-08-10 Thread Qu Wenruo
On 8/11/18 6:14 AM, James Courtier-Dutton wrote: > On 6 August 2018 at 07:26, Qu Wenruo wrote: >> >> >>> WARNING: CPU: 3 PID: 803 at >>> /build/linux-hwe-SYRsgd/linux-hwe-4.15.0/fs/btrfs/extent_map.c:77 >>> free_extent_map+0x78/0x90 [btrfs] >> >> Then it makes sense, as it's a WARN_ON() line,

Re: BUG: scheduling while atomic

2018-08-10 Thread James Courtier-Dutton
On 6 August 2018 at 07:26, Qu Wenruo wrote: > > >> WARNING: CPU: 3 PID: 803 at >> /build/linux-hwe-SYRsgd/linux-hwe-4.15.0/fs/btrfs/extent_map.c:77 >> free_extent_map+0x78/0x90 [btrfs] > > Then it makes sense, as it's a WARN_ON() line, showing one extent map is > still used. > > If it get freed,

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Austin S. Hemmelgarn
On 2018-08-10 14:07, Chris Murphy wrote: On Thu, Aug 9, 2018 at 5:35 PM, Qu Wenruo wrote: On 8/10/18 1:48 AM, Tomasz Pala wrote: On Tue, Jul 31, 2018 at 22:32:07 +0800, Qu Wenruo wrote: 2) Different limitations on exclusive/shared bytes Btrfs can set different limit on

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Austin S. Hemmelgarn
On 2018-08-10 14:21, Tomasz Pala wrote: On Fri, Aug 10, 2018 at 07:39:30 -0400, Austin S. Hemmelgarn wrote: I.e.: every shared segment should be accounted within quota (at least once). I think what you mean to say here is that every shared extent should be accounted to quotas for every

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
On Fri, Aug 10, 2018 at 07:39:30 -0400, Austin S. Hemmelgarn wrote: >> I.e.: every shared segment should be accounted within quota (at least once). > I think what you mean to say here is that every shared extent should be > accounted to quotas for every location it is reflinked from. IOW, that
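The difference between counting a shared extent once and counting it for every location that references it can be shown with a toy accounting model. This is only a sketch of the idea under discussion, not how the kernel's qgroup code works; the extent list, the subvolume names and the `account` function are hypothetical.

```python
from collections import defaultdict


def account(extents):
    """Return (referenced, exclusive) byte totals per subvolume.

    extents: iterable of (size_in_bytes, set_of_subvolumes_referencing_it).
    """
    referenced = defaultdict(int)
    exclusive = defaultdict(int)
    for size, owners in extents:
        for subvol in owners:
            referenced[subvol] += size      # charged to every referrer
            if len(owners) == 1:
                exclusive[subvol] += size   # charged only when unshared
    return dict(referenced), dict(exclusive)


extents = [
    (100, {"home"}),           # private to "home"
    (50, {"home", "snap1"}),   # shared via a snapshot
    (30, {"snap1"}),           # private to the snapshot
]
referenced, exclusive = account(extents)
print(referenced, exclusive)

# Deleting "snap1" turns the shared extent into an exclusive one,
# so the exclusive total of "home" grows without "home" writing anything.
after_delete = [(100, {"home"}), (50, {"home"})]
referenced2, exclusive2 = account(after_delete)
print(referenced2, exclusive2)
```

The second call illustrates why exclusive usage can change without any action by the subvolume's owner, while the referenced total for "home" stays the same.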

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Chris Murphy
On Thu, Aug 9, 2018 at 5:35 PM, Qu Wenruo wrote: > > > On 8/10/18 1:48 AM, Tomasz Pala wrote: >> On Tue, Jul 31, 2018 at 22:32:07 +0800, Qu Wenruo wrote: >> >>> 2) Different limitations on exclusive/shared bytes >>>Btrfs can set different limit on exclusive/shared bytes, further >>>

Re: [COMMAND HANGS] The command 'btrfs subvolume sync -s 2 xyz' can hangs.

2018-08-10 Thread Giuseppe Della Bianca
On Thursday, 9 August 2018 20:48:03 CEST, Jeff Mahoney wrote: > On 8/9/18 11:15 AM, Giuseppe Della Bianca wrote: > > Hi. > > > > My system: > > - Fedora 28 x86_64 > > - kernel-4.17.7-200 > > - btrfs-progs-4.15.1-1 > > > > The command 'btrfs subvolume sync -s 2 xyz' hangs in this case: >

Re: [RFC PATCH 00/17] btrfs zoned block device support

2018-08-10 Thread Qu Wenruo
On 8/10/18 9:32 PM, Hans van Kranenburg wrote: > On 08/10/2018 09:28 AM, Qu Wenruo wrote: >> >> >> On 8/10/18 2:04 AM, Naohiro Aota wrote: >>> This series adds zoned block device support to btrfs. >>> >>> [...] >> >> And this is the patch modifying extent allocator. >> >> Despite that, the

Re: [RFC PATCH 00/17] btrfs zoned block device support

2018-08-10 Thread Hans van Kranenburg
On 08/10/2018 09:28 AM, Qu Wenruo wrote: > > > On 8/10/18 2:04 AM, Naohiro Aota wrote: >> This series adds zoned block device support to btrfs. >> >> [...] > > And this is the patch modifying extent allocator. > > Despite that, the support zoned storage looks pretty interesting and > have

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Austin S. Hemmelgarn
On 2018-08-09 13:48, Tomasz Pala wrote: On Tue, Jul 31, 2018 at 22:32:07 +0800, Qu Wenruo wrote: 2) Different limitations on exclusive/shared bytes Btrfs can set different limit on exclusive/shared bytes, further complicating the problem. 3) Btrfs quota only accounts data/metadata

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Austin S. Hemmelgarn
On 2018-08-09 19:35, Qu Wenruo wrote: On 8/10/18 1:48 AM, Tomasz Pala wrote: On Tue, Jul 31, 2018 at 22:32:07 +0800, Qu Wenruo wrote: 2) Different limitations on exclusive/shared bytes Btrfs can set different limit on exclusive/shared bytes, further complicating the problem. 3)

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
On Fri, Aug 10, 2018 at 6:51 AM, Qu Wenruo wrote: > > > On 8/10/18 6:42 PM, Dan Merillat wrote: >> On Fri, Aug 10, 2018 at 6:05 AM, Qu Wenruo wrote: > > But considering your amount of block groups, mount itself may take some > time (before trying to resume balance). I'd believe it, a clean

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Qu Wenruo
On 8/10/18 6:42 PM, Dan Merillat wrote: > On Fri, Aug 10, 2018 at 6:05 AM, Qu Wenruo wrote: > >> >> Although not sure about the details, but the fs looks pretty huge. >> Tons of subvolume and its free space cache inodes. > > 11TB, 3 or so subvolumes and two snapshots I think. Not

Re: [PATCH 1/2] btrfs: assert for num_devices below 0

2018-08-10 Thread David Sterba
On Fri, Aug 10, 2018 at 01:53:20PM +0800, Anand Jain wrote: > In preparation to add helper function to deduce the num_devices with > replace running, use assert instead of bug_on and warn_on. > > Signed-off-by: Anand Jain Ok for the updated condition as it's going to be used in the new helper.

Re: [PATCH v5 2/2] btrfs: add helper btrfs_num_devices() to deduce num_devices

2018-08-10 Thread David Sterba
On Fri, Aug 10, 2018 at 01:53:21PM +0800, Anand Jain wrote: > When the replace is running the fs_devices::num_devices also includes > the replace device, however in some operations like device delete and > balance it needs the actual num_devices without the replace devices, so > now the function

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
On Fri, Aug 10, 2018 at 6:05 AM, Qu Wenruo wrote: > > Although not sure about the details, but the fs looks pretty huge. > Tons of subvolume and its free space cache inodes. 11TB, 3 or so subvolumes and two snapshots I think. Not particularly large for NAS. > But only 3 tree reloc trees,

How to ensure that a snapshot is not corrupted?

2018-08-10 Thread Cerem Cem ASLAN
Original question is here: https://superuser.com/questions/1347843 How can we be sure that a read-only snapshot is not corrupted due to a disk failure? Is the only way to calculate the checksums one after another and store them for later examination, or does BTRFS handle that on its own?

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Qu Wenruo
On 8/10/18 5:39 PM, Dan Merillat wrote: > On Fri, Aug 10, 2018 at 5:13 AM, Qu Wenruo wrote: >> >> >> On 8/10/18 4:47 PM, Dan Merillat wrote: >>> Unfortunately that doesn't appear to be it, a forced restart and >>> attempted to mount with skip_balance leads to the same thing. >> >> That's

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
E: Resending without the 500k attachment. On Fri, Aug 10, 2018 at 5:13 AM, Qu Wenruo wrote: > > > On 8/10/18 4:47 PM, Dan Merillat wrote: >> Unfortunately that doesn't appear to be it, a forced restart and >> attempted to mount with skip_balance leads to the same thing. > > That's strange. > >

Re: [PATCH] fstests: btrfs: Add test for corrupted orphan qgroup numbers

2018-08-10 Thread Qu Wenruo
On 8/10/18 5:42 PM, Eryu Guan wrote: > On Fri, Aug 10, 2018 at 05:10:29PM +0800, Qu Wenruo wrote: >> >> >> On 8/10/18 4:54 PM, Filipe Manana wrote: >>> On Fri, Aug 10, 2018 at 9:46 AM, Qu Wenruo wrote: On 8/9/18 5:26 PM, Filipe Manana wrote: > On Thu, Aug 9, 2018 at 8:45 AM,

Re: [PATCH] fstests: btrfs: Add test for corrupted orphan qgroup numbers

2018-08-10 Thread Eryu Guan
On Fri, Aug 10, 2018 at 05:10:29PM +0800, Qu Wenruo wrote: > > > On 8/10/18 4:54 PM, Filipe Manana wrote: > > On Fri, Aug 10, 2018 at 9:46 AM, Qu Wenruo wrote: > >> > >> > >> On 8/9/18 5:26 PM, Filipe Manana wrote: > >>> On Thu, Aug 9, 2018 at 8:45 AM, Qu Wenruo wrote: > This bug is

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
On Fri, Aug 10, 2018 at 15:55:46 +0800, Qu Wenruo wrote: >> The first thing about virtually every mechanism should be >> discoverability and reliability. I expect my quota not to change without >> my interaction. Never. How did you cope with this? >> If not - how are you going to explain such

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Qu Wenruo
On 8/10/18 4:47 PM, Dan Merillat wrote: > Unfortunately that doesn't appear to be it, a forced restart and > attempted to mount with skip_balance leads to the same thing. That's strange. Would you please provide the following output to determine whether we have any balance running? # btrfs

Re: [PATCH] fstests: btrfs: Add test for corrupted orphan qgroup numbers

2018-08-10 Thread Qu Wenruo
On 8/10/18 4:54 PM, Filipe Manana wrote: > On Fri, Aug 10, 2018 at 9:46 AM, Qu Wenruo wrote: >> >> >> On 8/9/18 5:26 PM, Filipe Manana wrote: >>> On Thu, Aug 9, 2018 at 8:45 AM, Qu Wenruo wrote: This bug is exposed by populating a high level qgroup, and then make it orphan (high

Re: [PATCH] fstests: btrfs: Add test for corrupted orphan qgroup numbers

2018-08-10 Thread Filipe Manana
On Fri, Aug 10, 2018 at 9:46 AM, Qu Wenruo wrote: > > > On 8/9/18 5:26 PM, Filipe Manana wrote: >> On Thu, Aug 9, 2018 at 8:45 AM, Qu Wenruo wrote: >>> This bug is exposed by populating a high level qgroup, and then make it >>> orphan (high level qgroup without child) >> >> Same comment as in

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
Unfortunately that doesn't appear to be it, a forced restart and attempted to mount with skip_balance leads to the same thing. 20 minutes in btrfs-transactio had a large burst of reads then started spinning the CPU with the disk idle. Is this recoverable? I could leave it for a day or so if it

Re: [PATCH] fstests: btrfs: Add test for corrupted orphan qgroup numbers

2018-08-10 Thread Qu Wenruo
On 8/9/18 5:26 PM, Filipe Manana wrote: > On Thu, Aug 9, 2018 at 8:45 AM, Qu Wenruo wrote: >> This bug is exposed by populating a high level qgroup, and then make it >> orphan (high level qgroup without child) > > Same comment as in the kernel patch: > > "That sentence is confusing. An

Re: [PATCH v2] fstests: btrfs: Add test for corrupted childless qgroup numbers

2018-08-10 Thread Filipe Manana
On Fri, Aug 10, 2018 at 3:20 AM, Qu Wenruo wrote: > This bug is exposed by populating a high level qgroup, and then make it > childless with old qgroup numbers, and finally do rescan. > > Normally rescan should zero out all qgroups' accounting number, but due > to a kernel bug which won't mark

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Qu Wenruo
On 8/10/18 3:40 PM, Dan Merillat wrote: > Kernel 4.17.9, 11tb BTRFS device (md-backed, not btrfs raid) > > I was testing something out and enabled quota groups and started getting > 2-5 minute long pauses where a btrfs-transaction thread spun at 100%. Looks pretty like a running balance and

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Qu Wenruo
On 8/10/18 3:17 PM, Tomasz Pala wrote: > On Fri, Aug 10, 2018 at 07:35:32 +0800, Qu Wenruo wrote: > >>> when limiting somebody's data space we usually don't care about the >>> underlying "savings" coming from any deduplicating technique - these are >>> purely bonuses for system owner, so he

Re: Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
[23084.426006] sysrq: SysRq : Show Blocked State [23084.426085] task PC stack pid father [23084.426332] mount D 0 4857 4618 0x0080 [23084.426403] Call Trace: [23084.426531] ? __schedule+0x2c3/0x830 [23084.426628] ? __wake_up_common+0x6f/0x120

Mount stalls indefinitely after enabling quota groups.

2018-08-10 Thread Dan Merillat
Kernel 4.17.9, 11tb BTRFS device (md-backed, not btrfs raid) I was testing something out and enabled quota groups and started getting 2-5 minute long pauses where a btrfs-transaction thread spun at 100%. Post-reboot the mount process spins at 100% CPU, occasionally yielding to a

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
On Fri, Aug 10, 2018 at 07:03:18 +0300, Andrei Borzenkov wrote: >> So - the limit set on any user > > Does btrfs support per-user quota at all? I am aware only of per-subvolume > quotas. Well, this is a kind of deceptive word usage in "post-truth" times. In this case both "user" and "quota"

Re: List of known BTRFS Raid 5/6 Bugs?

2018-08-10 Thread Zygo Blaxell
On Fri, Aug 10, 2018 at 03:40:23AM +0200, erentheti...@mail.de wrote: > I am searching for more information regarding possible bugs related to > BTRFS Raid 5/6. All sites i could find are incomplete and information > contradicts itself: > > The Wiki Raid 5/6 Page

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
On Fri, Aug 10, 2018 at 07:35:32 +0800, Qu Wenruo wrote: >> when limiting somebody's data space we usually don't care about the >> underlying "savings" coming from any deduplicating technique - these are >> purely bonuses for system owner, so he could do larger resource overbooking. > > In