Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Qu Wenruo
On 2019/7/6 at 1:13 PM, Vladimir Panteleev wrote: [...] >> I'm not sure if it's the degraded mount causing the problem, as the >> enospc_debug output looks like reserved/pinned/over-reserved space has >> taken up all space, while no new chunk gets allocated. > > The problem happens after replace-ing th

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Vladimir Panteleev
On 06/07/2019 05.01, Qu Wenruo wrote: After stubbing out btrfs_check_rw_degradable (because btrfs currently can't realize when it has all drives needed for RAID10), The point is, btrfs_check_rw_degradable() is already doing per-chunk level rw degradable checking. I would highly recommend not t

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Qu Wenruo
On 2019/7/5 at 12:39 PM, Vladimir Panteleev wrote: > Hi, > > I'm trying to convert a data=RAID10,metadata=RAID1 (4 disks) array to > RAID1 (2 disks). The array was less than half full, and I disconnected > two parity drives, leaving two that contained one copy of all data. Definitely not something

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Vladimir Panteleev
On 06/07/2019 02.38, Chris Murphy wrote: On Fri, Jul 5, 2019 at 6:05 PM Vladimir Panteleev wrote: Unfortunately, as mentioned before, that wasn't an option. I was performing this operation on a DM snapshot target backed by a file that certainly could not fit the result of a RAID10-to-RAID1 rebala

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Chris Murphy
On Fri, Jul 5, 2019 at 6:05 PM Vladimir Panteleev wrote: > On 05/07/2019 21.43, Chris Murphy wrote: > > But I can't tell from the > > above exactly when each drive was disconnected. In this scenario you > > need to convert to raid1 first, wait for that to complete successfully > > before you can
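
A minimal sketch of the order Chris describes, assuming a four-disk filesystem mounted at /mnt (the device paths are illustrative, not taken from the thread):

  # Convert data and metadata to raid1 while all four disks are still attached.
  btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

  # Only after the balance completes successfully, shrink the array;
  # each delete relocates that device's chunks before returning.
  btrfs device delete /dev/sdc /mnt
  btrfs device delete /dev/sdd /mnt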

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Vladimir Panteleev
Hi Chris, First, thank you very much for taking the time to reply. I greatly appreciate it. On 05/07/2019 21.43, Chris Murphy wrote: There's no parity on either raid10 or raid1. Right, thank you for the correction. Of course, I meant the duplicate copies of the RAID1 data. But I can't t

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Chris Murphy
On Fri, Jul 5, 2019 at 3:48 PM Chris Murphy wrote: > > We need to see a list of commands issued in order, along with the > physical connected state of each drive. I thought I understood what > you did from the previous email, but this paragraph contradicts my > understanding, especially when you s

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Chris Murphy
On Fri, Jul 5, 2019 at 4:20 AM Vladimir Panteleev wrote: > > On 05/07/2019 09.42, Andrei Borzenkov wrote: > > On Fri, Jul 5, 2019 at 7:45 AM Vladimir Panteleev > > wrote: > >> > >> Hi, > >> > >> I'm trying to convert a data=RAID10,metadata=RAID1 (4 disks) array to > >> RAID1 (2 disks). The array

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Chris Murphy
On Thu, Jul 4, 2019 at 10:39 PM Vladimir Panteleev wrote: > > Hi, > > I'm trying to convert a data=RAID10,metadata=RAID1 (4 disks) array to > RAID1 (2 disks). The array was less than half full, and I disconnected > two parity drives, leaving two that contained one copy of all data. There's no par

Re: snapshot rollback

2019-07-05 Thread Chris Murphy
On Fri, Jul 5, 2019 at 4:45 AM Ulli Horlacher wrote: > > > This is a conceptual btrfs question :-) > > I have this btrfs filesystem: > > root@xerus:~# mount | grep /test > /dev/sdd4 on /test type btrfs > (rw,relatime,space_cache,user_subvol_rm_allowed,subvolid=5,subvol=/) > > with some snapshots:

Re: syncfs() returns no error on fs failure

2019-07-05 Thread Martin Raiber
More research on this. It seems a generic error reporting mechanism for this is in the works: https://lkml.org/lkml/2018/6/1/640 . With regard to btrfs, one should always use BTRFS_IOC_SYNC, because only this one seems to wait for delalloc work to finish: https://patchwork.kernel.org/patch/2927491/ (five year
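
From the shell, one way to issue that ioctl is 'btrfs filesystem sync', which calls BTRFS_IOC_SYNC on the given mount point (the path here is an example):

  # Issues BTRFS_IOC_SYNC on /mnt; per the message above, this waits for
  # delalloc work where a plain syncfs() may not.
  btrfs filesystem sync /mnt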

Re: delete recursively subvolumes?

2019-07-05 Thread Hugo Mills
On Fri, Jul 05, 2019 at 09:56:39PM +0200, Ulli Horlacher wrote: > On Fri 2019-07-05 (19:51), Hugo Mills wrote: > > > > Is there a command/script/whatever to snapshot (copy) a subvolume which > > > contains (somewhere) other subvolumes? > > > > > > Example: > > > > > > root@xerus:/test# btrfs_sub

Re: delete recursively subvolumes?

2019-07-05 Thread Ulli Horlacher
On Fri 2019-07-05 (19:51), Hugo Mills wrote: > > Is there a command/script/whatever to snapshot (copy) a subvolume which > > contains (somewhere) other subvolumes? > > > > Example: > > > > root@xerus:/test# btrfs_subvolume_list /test/ | grep /tmp > > /test/tmp > > /test/tmp/xx/ss1 > > /test/tmp/

Re: delete recursively subvolumes?

2019-07-05 Thread Hugo Mills
On Fri, Jul 05, 2019 at 09:47:20PM +0200, Ulli Horlacher wrote: > On Fri 2019-07-05 (21:39), Ulli Horlacher wrote: > > > Is there a command/script/whatever to remove a subvolume which contains > > (somewhere) other subvolumes? > > ADD-ON QUESTION! :-) > > Is there a command/script/whatever to snaps

Re: delete recursively subvolumes?

2019-07-05 Thread Ulli Horlacher
On Fri 2019-07-05 (21:39), Ulli Horlacher wrote: > Is there a command/script/whatever to remove a subvolume which contains > (somewhere) other subvolumes? ADD-ON QUESTION! :-) Is there a command/script/whatever to snapshot (copy) a subvolume which contains (somewhere) other subvolumes? Example: r

delete recursively subvolumes?

2019-07-05 Thread Ulli Horlacher
I am a master in writing unnecessary software :-} "I have a great idea! I'll write a program for this!" (some time and many lines of code later) "ARGH... there is already such a program and it is better than mine!" This time I ask BEFORE I do the coding: Is there a command/script/whatever to remo
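
Until such a command exists, a hypothetical sketch of the usual workaround: list the nested subvolumes and delete them deepest-first, so no delete fails because of a child. It reuses the /test layout from this thread and assumes subvolume paths contain no spaces:

  fs=/test
  target=tmp    # subvolume to remove, relative to the top level

  btrfs subvolume list "$fs" |
  awk -v t="$target" '$NF == t || index($NF, t "/") == 1 { print $NF }' |
  sort -r |                               # reverse sort: children before parents
  while read -r path; do
      btrfs subvolume delete "$fs/$path"
  done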

Btrfs progs release 5.2

2019-07-05 Thread David Sterba
Hi, btrfs-progs version 5.2 has been released. Changes: Scrub status has been reworked: UUID: bf8720e0-606b-4065-8320-b48df2e8e669 Scrub started: Fri Jun 14 12:00:00 2019 Status: running Duration: 0:14:11 Time left: 0:04:04 ETA:

Re: [PATCH v2] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2019-07-05 Thread David Sterba
On Thu, Jul 04, 2019 at 11:51:38PM +, WenRuo Qu wrote: > >> @@ -1630,7 +1655,8 @@ int btrfs_run_delalloc_range(struct inode *inode, > >> struct page *locked_page, > >>} else if (BTRFS_I(inode)->flags & BTRFS_INODE_PREALLOC && !force_cow) { > >>ret = run_delalloc_nocow(inode, lo

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Filipe Manana
On Fri, Jul 5, 2019 at 3:26 PM Nikolay Borisov wrote: > > > > On 5.07.19 at 17:23, Filipe Manana wrote: > > On Fri, Jul 5, 2019 at 3:09 PM Nikolay Borisov wrote: > >> > >> > >> > >> On 5.07.19 at 13:42, Filipe Manana wrote: > >>> On Fri, Jul 5, 2019 at 11:01 AM Nikolay Borisov wrote: > >>>

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Nikolay Borisov
On 5.07.19 at 17:23, Filipe Manana wrote: > On Fri, Jul 5, 2019 at 3:09 PM Nikolay Borisov wrote: >> >> >> >> On 5.07.19 at 13:42, Filipe Manana wrote: >>> On Fri, Jul 5, 2019 at 11:01 AM Nikolay Borisov wrote: On 4.07.19 at 18:24, fdman...@kernel.org wrote: >

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Filipe Manana
On Fri, Jul 5, 2019 at 3:09 PM Nikolay Borisov wrote: > > > > On 5.07.19 at 13:42, Filipe Manana wrote: > > On Fri, Jul 5, 2019 at 11:01 AM Nikolay Borisov wrote: > >> > >> > >> > >> On 4.07.19 at 18:24, fdman...@kernel.org wrote: > >>> From: Filipe Manana > >>> > >>> If we failed to alloc

Re: [PATCH 3/5] Btrfs: fix inode cache waiters hanging on failure to start caching thread

2019-07-05 Thread Nikolay Borisov
On 4.07.19 at 18:24, fdman...@kernel.org wrote: > From: Filipe Manana > > If we fail to start the inode caching thread, we print an error message > and disable the inode cache, however we never wake up any waiters, so they > hang forever waiting for the caching to finish. Fix this by waking

syncfs() returns no error on fs failure

2019-07-05 Thread Martin Raiber
Hi, I realize this isn't a btrfs-specific problem, but syncfs() returns no error even on complete fs failure. The problem is (I think) that the return value of sb->s_op->sync_fs is being ignored in fs/sync.c. I kind of assumed it would return an error if it fails to write the file system changes to

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Nikolay Borisov
On 5.07.19 at 13:42, Filipe Manana wrote: > On Fri, Jul 5, 2019 at 11:01 AM Nikolay Borisov wrote: >> >> >> >> On 4.07.19 at 18:24, fdman...@kernel.org wrote: >>> From: Filipe Manana >>> >>> If we failed to allocate the data extent(s) for the inode space cache, we >>> were bailing out wi

Re: Need help with a lockdep splat, possibly perf related?

2019-07-05 Thread Josef Bacik
On Wed, Jul 03, 2019 at 11:12:10PM +0200, Peter Zijlstra wrote: > On Wed, Jul 03, 2019 at 09:54:06AM -0400, Josef Bacik wrote: > > Hello, > > > > I've been seeing a variation of the following splat recently and I have no > > earthly idea what it's trying to tell me. > > That you have a lock cycle

Re: snapshot rollback

2019-07-05 Thread Graham Cobb
On 05/07/2019 12:47, Remi Gauvin wrote: > On 2019-07-05 7:06 a.m., Ulli Horlacher wrote: > >> >> Ok, it seems my idea (replacing the original root subvolume with a >> snapshot) is not possible. >> > ... > It is common practice with installers now to mount your root and home on > a subvolume for e
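
Where that layout is in place, the rollback asked about in this thread reduces to a few commands. A hedged sketch, assuming the root filesystem lives in a subvolume named @ with a snapshot named @snap (both names are assumptions; the device comes from the original post):

  mount -o subvolid=5 /dev/sdd4 /mnt            # mount the top level
  mv /mnt/@ /mnt/@.broken                       # set the old root aside
  btrfs subvolume snapshot /mnt/@snap /mnt/@    # promote the snapshot
  btrfs subvolume delete /mnt/@.broken          # reclaim the space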

Re: snapshot rollback

2019-07-05 Thread Remi Gauvin
On 2019-07-05 7:06 a.m., Ulli Horlacher wrote: > > Ok, it seems my idea (replacing the original root subvolume with a > snapshot) is not possible. > Disclaimer: You probably want to wait at least 24 hours before trying my directions in case anyone has an important correction to make. You shou

[PATCH] btrfs/189: make the test work on systems with a page size greater than 4Kb

2019-07-05 Thread fdmanana
From: Filipe Manana The test currently uses offsets and lengths which are multiples of 4K, but not multiples of 64K (or any other page size between 4K and 64K). This makes the reflink calls fail with -EINVAL because reflink only operates on ranges that are aligned to the filesystem's block
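
For instance (a sketch with hypothetical file names, not the test's exact commands), with xfs_io:

  # 4K-aligned range: fails with -EINVAL on a filesystem with 64K blocks.
  xfs_io -c "reflink src 4096 0 8192" dst

  # 64K-aligned range: aligned for every block size from 4K up to 64K.
  xfs_io -c "reflink src 65536 0 131072" dst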

Re: snapshot rollback

2019-07-05 Thread Ulli Horlacher
On Fri 2019-07-05 (12:38), Ulli Horlacher wrote: > But (how) can I delete the original root subvol to free disk space? Found: https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-subvolume A freshly created filesystem is also a subvolume, called top-level, which internally has id 5. This subv

Re: [PATCH] btrfs: Ensure replaced device doesn't have pending chunk allocation

2019-07-05 Thread Greg KH
On Wed, Jul 03, 2019 at 12:45:49PM +0300, Nikolay Borisov wrote: > Recent FITRIM work, namely bbbf7243d62d ("btrfs: combine device update > operations during transaction commit") combined the way certain > operations are recorded in a transaction. As a result, an ASSERT was added > in dev_replace_fin

Re: [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk

2019-07-05 Thread Filipe Manana
On Fri, Jul 5, 2019 at 10:09 AM Nikolay Borisov wrote: > > > > On 4.07.19 at 18:24, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > If we are able to load an existing inode cache off disk, we set the state > > of the cache to BTRFS_CACHE_FINISHED, but we don't wake up anyone waitin

snapshot rollback

2019-07-05 Thread Ulli Horlacher
This is a conceptual btrfs question :-) I have this btrfs filesystem: root@xerus:~# mount | grep /test /dev/sdd4 on /test type btrfs (rw,relatime,space_cache,user_subvol_rm_allowed,subvolid=5,subvol=/) with some snapshots: root@xerus:~# btrfs subvolume list /test ID 736 gen 9722 top level 5

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Filipe Manana
On Fri, Jul 5, 2019 at 11:01 AM Nikolay Borisov wrote: > > > > On 4.07.19 at 18:24, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > If we failed to allocate the data extent(s) for the inode space cache, we > > were bailing out without releasing the previously reserved metadata. This

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Vladimir Panteleev
On 05/07/2019 09.42, Andrei Borzenkov wrote: On Fri, Jul 5, 2019 at 7:45 AM Vladimir Panteleev wrote: Hi, I'm trying to convert a data=RAID10,metadata=RAID1 (4 disks) array to RAID1 (2 disks). The array was less than half full, and I disconnected two parity drives, btrfs does not have dedic

[PATCH v3 2/2] Btrfs: fix ENOSPC errors, leading to transaction aborts, when cloning extents

2019-07-05 Thread fdmanana
From: Filipe Manana When cloning extents (or deduplicating) we create a transaction with a space reservation that considers we will drop or update a single file extent item of the destination inode (that we modify a single leaf). That is fine for the vast majority of scenarios, however it might h

Re: [PATCH v2] generic: test cloning large extents to a file with many small extents

2019-07-05 Thread Filipe Manana
On Fri, Jul 5, 2019 at 8:43 AM Eryu Guan wrote: > > On Fri, Jun 28, 2019 at 11:08:36PM +0100, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > Test that if we clone a file with some large extents into a file that has > > many small extents, when the fs is nearly full, the clone operatio

Re: [PATCH 2/5] Btrfs: fix inode cache block reserve leak on failure to allocate data space

2019-07-05 Thread Nikolay Borisov
On 4.07.19 at 18:24, fdman...@kernel.org wrote: > From: Filipe Manana > > If we failed to allocate the data extent(s) for the inode space cache, we > were bailing out without releasing the previously reserved metadata. This > was triggering the following warnings when unmounting a filesyste

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Andrei Borzenkov
On Fri, Jul 5, 2019 at 7:45 AM Vladimir Panteleev wrote: > > Hi, > > I'm trying to convert a data=RAID10,metadata=RAID1 (4 disks) array to > RAID1 (2 disks). The array was less than half full, and I disconnected > two parity drives, btrfs does not have dedicated parity drives; it is quite possibl

Re: [PATCH 1/5] Btrfs: fix hang when loading existing inode cache off disk

2019-07-05 Thread Nikolay Borisov
On 4.07.19 at 18:24, fdman...@kernel.org wrote: > From: Filipe Manana > > If we are able to load an existing inode cache off disk, we set the state > of the cache to BTRFS_CACHE_FINISHED, but we don't wake up anyone waiting > for the cache to be available. This means that anyone waiting fo

Re: [PATCH 1/5] btrfs-progs: mkfs: Apply the sectorsize user specified on 64k page size system

2019-07-05 Thread Qu Wenruo
On 2019/7/5 at 3:45 PM, Nikolay Borisov wrote: > > > On 5.07.19 at 10:26, Qu Wenruo wrote: >> [BUG] >> On aarch64 with 64k page size, mkfs.btrfs -s option doesn't work: >> $ mkfs.btrfs -s 4096 ~/10G.img -f >> btrfs-progs v5.1.1 >> See http://btrfs.wiki.kernel.org for more information. >>

Re: What are the maintenance recommendation ?

2019-07-05 Thread Pierre Couderc
On 7/3/19 6:37 AM, Zygo Blaxell wrote: On Sat, Jun 29, 2019 at 08:50:03PM +0200, Pierre Couderc wrote: 1- Is there a summary of btrfs recommendations for maintenance? I have read somewhere that a monthly btrfs scrub is recommended. 1. Scrub detects and (when using the DUP or RAID1/10/5/6 p
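
For reference, the scrub itself is a single command that fits a monthly cron job or systemd timer (the mount point is an example):

  btrfs scrub start /mnt     # runs in the background by default
  btrfs scrub status /mnt    # check progress and any errors found later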

Re: [PATCH 1/5] btrfs-progs: mkfs: Apply the sectorsize user specified on 64k page size system

2019-07-05 Thread Nikolay Borisov
On 5.07.19 at 10:26, Qu Wenruo wrote: > [BUG] > On aarch64 with 64k page size, mkfs.btrfs -s option doesn't work: > $ mkfs.btrfs -s 4096 ~/10G.img -f > btrfs-progs v5.1.1 > See http://btrfs.wiki.kernel.org for more information. > > Label: (null) > UUID:

Re: [PATCH v2] generic: test cloning large extents to a file with many small extents

2019-07-05 Thread Eryu Guan
On Fri, Jun 28, 2019 at 11:08:36PM +0100, fdman...@kernel.org wrote: > From: Filipe Manana > > Test that if we clone a file with some large extents into a file that has > many small extents, when the fs is nearly full, the clone operation does > not fail and produces the correct result. > > This

Re: [PATCH][next][V3] btrfs: fix memory leak of path on error return path

2019-07-05 Thread Nikolay Borisov
On 5.07.19 at 10:26, Colin King wrote: > From: Colin Ian King > > Currently, if the allocation of roots or tmp_ulist fails, the error handling > does not free up the allocation of path, causing a memory leak. Fix this and > other similar leaks by moving the call of btrfs_free_path from label o

[PATCH 5/5] btrfs-progs: convert-tests: Skip tests if kernel doesn't support subpage sized sector size

2019-07-05 Thread Qu Wenruo
Most convert tests need to mount the converted image, and both reiserfs and ext* use 4k block size; on a 32K page size system we can't mount them, which causes test failures. Skip most of the convert tests, except 007-unsupported-block-sizes, which should fail on all systems. Signed-off-by: Qu Wenruo

[PATCH 0/5] btrfs-progs: tests: Make 64K page size system happier

2019-07-05 Thread Qu Wenruo
Since I got another rockpro64, I could finally do some tests with aarch64 in 64K page size mode. (The first board has been working as a NAS for a while.) Unsurprisingly, there are several false test alerts in the btrfs-progs selftests. Although there is no existing CI service based on a 64K page sized system, we'

[PATCH 1/5] btrfs-progs: mkfs: Apply the sectorsize user specified on 64k page size system

2019-07-05 Thread Qu Wenruo
[BUG] On aarch64 with 64k page size, mkfs.btrfs -s option doesn't work: $ mkfs.btrfs -s 4096 ~/10G.img -f btrfs-progs v5.1.1 See http://btrfs.wiki.kernel.org for more information. Label: (null) UUID: c2a09334-aaca-4980-aefa-4b3e27390658 Node size:
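
One way to confirm which sector size mkfs actually used (the image path comes from the report above):

  btrfs inspect-internal dump-super ~/10G.img | grep -i sectorsize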

[PATCH 2/5] btrfs-progs: fsck-tests: Check if current kernel can mount fs with specified sector size

2019-07-05 Thread Qu Wenruo
[BUG] When running tests on platforms with a page size other than 4K (e.g. aarch64 can use 64K page size, or ppc64le), certain tests will fail like: [TEST/fsck] 012-leaf-corruption mount: /home/adam/btrfs-progs/tests/mnt: wrong fs type, bad option, bad superblock on /dev/loop4, missing codepag
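
A hypothetical sketch of the kind of probe the subject line describes, not the patch's actual helper: create an image with the requested sector size and see whether the running kernel can mount it, skipping the test if it can't:

  check_mountable_sectorsize() {
      local size="$1" img mnt ret=1
      img=$(mktemp)
      mnt=$(mktemp -d)
      truncate -s 1G "$img"
      if mkfs.btrfs -f -s "$size" "$img" >/dev/null 2>&1 &&
         mount -o loop "$img" "$mnt" 2>/dev/null; then
          umount "$mnt"
          ret=0
      fi
      rm -f "$img"
      rmdir "$mnt"
      return "$ret"
  }

  check_mountable_sectorsize 4096 || echo "skip: kernel cannot mount 4k sector size"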

[PATCH 3/5] btrfs-progs: mkfs-tests: Skip 010-minimal-size if we can't mount with 4k sector size

2019-07-05 Thread Qu Wenruo
[BUG] Test case 010-minimal-size fails on aarch64 with 64K page size: [TEST/mkfs] 010-minimal-size failed: /home/adam/btrfs-progs/mkfs.btrfs -f -n 4k -m single -d single /home/adam/btrfs-progs/tests//test.img test failed for case 010-minimal-size make: *** [Makefile:361: test-mkfs] E

[PATCH 4/5] btrfs-progs: misc-tests: Make test cases work or skipped on 64K page size system

2019-07-05 Thread Qu Wenruo
[BUG] The following test cases fails on aarch64 64K page size mode: [TEST/misc] 010-convert-delete-ext2-subvol failed: mount -t btrfs -o loop /home/adam/btrfs-progs/tests//test.img /home/adam/btrfs-progs/tests//mnt test failed for case 010-convert-delete-ext2-subvol make: *** [Makefi

[PATCH][next][V3] btrfs: fix memory leak of path on error return path

2019-07-05 Thread Colin King
From: Colin Ian King Currently, if the allocation of roots or tmp_ulist fails, the error handling does not free up the allocation of path, causing a memory leak. Fix this and other similar leaks by moving the call of btrfs_free_path from label out to label out_free_ulist. Kudos to David Sterba for

Re: [PATCH][next][V2] btrfs: fix memory leak of path on error return path

2019-07-05 Thread Colin Ian King
On 05/07/2019 02:30, Anand Jain wrote: > On 5/7/19 7:03 AM, Colin King wrote: >> From: Colin Ian King >> >> Currently, if the allocation of roots or tmp_ulist fails, the error >> handling >> does not free up the allocation of path, causing a memory leak. Fix >> this and >> other similar leaks by movi

Re: "kernel BUG" and segmentation fault with "device delete"

2019-07-05 Thread Vladimir Panteleev
On 05/07/2019 04.39, Vladimir Panteleev wrote: The process reached a point where the last missing device shows as containing 20 GB of RAID1 metadata. At this point, attempting to delete the device causes the operation to quickly fail with "No space left", followed by a "kernel BUG at fs/btrfs/r