Re: [PATCH 04/13] Kbuild: Rust support

2021-04-19 Thread David Sterba
On Fri, Apr 16, 2021 at 07:34:51PM +0200, Miguel Ojeda wrote:
> On Fri, Apr 16, 2021 at 3:38 PM Peter Zijlstra  wrote:
> >
> > So if I read all this right, rust compiles to .o and, like any other .o
> > file is then fed into objtool (for x86_64). Did you have any problems
> > with objtool? Does it generate correct ORC unwind information?
> 
> I opened an issue a while ago to take a closer look at the ORC
> unwinder etc., so it is in my radar (thanks for raising it up,
> nevertheless!).
> 
> Currently, causing a panic in a nested non-inlined function (f -> g ->
> h) in one of the samples with the ORC unwinder enabled gives me
> something like:
> 
> [0.903456]  rust_begin_unwind+0x9/0x10
> [0.903456]  ? _RNvNtCsbDqzXfLQacH_4core9panicking9panic_fmt+0x29/0x30
> [0.903456]  ? _RNvNtCsbDqzXfLQacH_4core9panicking5panic+0x44/0x50
> [0.903456]  ? _RNvCsbDqzXfLQacH_12rust_minimal1h+0x1c/0x20
> [0.903456]  ? _RNvCsbDqzXfLQacH_12rust_minimal1g+0x9/0x10
> [0.903456]  ? _RNvCsbDqzXfLQacH_12rust_minimal1f+0x9/0x10
> [0.903456]  ?
> _RNvXCsbDqzXfLQacH_12rust_minimalNtB2_11RustMinimalNtCsbDqzXfLQacH_6kernel12KernelModule4init+0x73/0x80
> [0.903456]  ? _RNvXsa_NtCsbDqzXfLQacH_4core3fmtbNtB5_5Debug3fmt+0x30/0x30
> [0.903456]  ? __rust_minimal_init+0x11/0x20

Are there plans to unmangle the symbols when printing stacks? c++filt
says:

  rust_begin_unwind+0x9/0x10
  ? core[8787f43e282added]::panicking::panic_fmt+0x29/0x30
  ? core[8787f43e282added]::panicking::panic+0x44/0x50
  ? rust_minimal[8787f43e282added]::h+0x1c/0x20
  ? rust_minimal[8787f43e282added]::g+0x9/0x10
  ? rust_minimal[8787f43e282added]::f+0x9/0x10
  ? ::init+0x73/0x80
  ? ::fmt+0x30/0x30
  ? __rust_minimal_init+0x11/0x20

for simple functions it's barely parseable but the following is hardly
readable

> _RNvXs5_NtCsbDqzXfLQacH_11rust_binder11range_allocNtB5_15DescriptorStateNtNtCsbDqzXfLQacH_4core3fmt5Debug3fmt+0x60/0x60

translates to

  ::fmt


Re: [PATCH v2] fs/btrfs: Fix uninitialized variable

2021-04-19 Thread David Sterba
On Sat, Apr 17, 2021 at 04:36:16PM +0100, Khaled ROMDHANI wrote:
> As reported by the Coverity static analysis.
> The variable zone is not initialized which
> may causes a failed assertion.
> 
> Addresses-Coverity: ("Uninitialized variables")
> Signed-off-by: Khaled ROMDHANI 
> ---
> v2: add a default case as proposed by David Sterba
> ---
>  fs/btrfs/zoned.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index eeb3ebe11d7a..82527308d165 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -143,6 +143,9 @@ static inline u32 sb_zone_number(int shift, int mirror)
>   case 0: zone = 0; break;
>   case 1: zone = 1ULL << (BTRFS_SB_LOG_FIRST_SHIFT - shift); break;
>   case 2: zone = 1ULL << (BTRFS_SB_LOG_SECOND_SHIFT - shift); break;
> + default:
> + zone = 0;

Well yeah but this is not a valid case at all, we'd rather catch that as
an assertion failure than letting is silently continue.


Re: [PATCH-next] fs/btrfs: Fix uninitialized variable

2021-04-16 Thread David Sterba
On Tue, Apr 13, 2021 at 02:06:04PM +0100, Khaled ROMDHANI wrote:
> The variable zone is not initialized. It
> may causes a failed assertion.

Failed assertion means the 2nd one checking that the result still fits
to 32bit type. That would mean that none of the cases were hit, but all
callers pass valid values.

It would be better to add a default: case to catch that explicitly,
though hitting that is considered 'will not happen'.


[GIT PULL] Btrfs fix for 5.12-rc7

2021-04-11 Thread David Sterba
From: David Sterba 

Hi,

here's one more patch that we'd like to get to 5.12 before release, it's
changing where and how the superblock is stored in the zoned mode. It is
an on-disk format change but so far there are no implications for users
as the proper mkfs support hasn't been merged and is waiting for the
kernel side to settle.

Until now, the superblocks were derived from the zone index, but zone
size can differ per device. This is changed to be based on fixed offset
values, to make it independent of the device zone size.

The work on that got a bit delayed, we discussed the exact locations to
support potential device sizes and usecases. (Partially delayed also due
to my vacation.) Having that in the same release where the zoned mode is
declared usable is highly desired, there are userspace projects that
need to be updated to recognize the feature.  Pushing that to the next
release would make things harder to test.

Please pull, thanks.



The following changes since commit c1d6abdac46ca8127274bea195d804e3f2cec7ee:

  btrfs: fix check_data_csum() error message for direct I/O (2021-03-18 
21:25:11 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-rc6-tag

for you to fetch changes up to 53b74fa990bf76f290aa5930abfcf37424a1a865:

  btrfs: zoned: move superblock logging zone location (2021-04-10 12:13:16 
+0200)


Naohiro Aota (1):
  btrfs: zoned: move superblock logging zone location

 fs/btrfs/zoned.c | 53 ++---
 1 file changed, 42 insertions(+), 11 deletions(-)


Re: [PATCH] btrfs: Use readahead_batch_length

2021-04-06 Thread David Sterba
On Sun, Mar 21, 2021 at 09:03:11PM +, Matthew Wilcox (Oracle) wrote:
> Implement readahead_batch_length() to determine the number of bytes in
> the current batch of readahead pages and use it in btrfs.
> 
> Signed-off-by: Matthew Wilcox (Oracle) 

Thanks, I'll take it through my tree as btrfs is probably the only user
of the new helper. The MM list hasn't been CCed, I've added it now but I
think the patch is trivial enough and does not need another ack, so it's
just for the record.

> ---
>  fs/btrfs/extent_io.c| 6 ++
>  include/linux/pagemap.h | 9 +
>  2 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index e9837562f7d6..97ac4ddb2857 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4875,10 +4875,8 @@ void extent_readahead(struct readahead_control *rac)
>   int nr;
>  
>   while ((nr = readahead_page_batch(rac, pagepool))) {
> - u64 contig_start = page_offset(pagepool[0]);
> - u64 contig_end = page_offset(pagepool[nr - 1]) + PAGE_SIZE - 1;
> -
> - ASSERT(contig_start + nr * PAGE_SIZE - 1 == contig_end);
> + u64 contig_start = readahead_pos(rac);
> + u64 contig_end = contig_start + readahead_batch_length(rac) - 1;
>  
>   contiguous_readpages(pagepool, nr, contig_start, contig_end,
>   &em_cached, &bio, &bio_flags, &prev_em_start);
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 2cbfd4c36026..92939afd4944 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -1174,6 +1174,15 @@ static inline unsigned int readahead_count(struct 
> readahead_control *rac)
>   return rac->_nr_pages;
>  }
>  
> +/**
> + * readahead_batch_length - The number of bytes in the current batch.
> + * @rac: The readahead request.
> + */
> +static inline loff_t readahead_batch_length(struct readahead_control *rac)
> +{
> + return rac->_batch_count * PAGE_SIZE;
> +}
> +
>  static inline unsigned long dir_pages(struct inode *inode)
>  {
>   return (unsigned long)(inode->i_size + PAGE_SIZE - 1) >>

--


Re: [PATCH] fs: btrfs: Remove repeated struct declaration

2021-04-06 Thread David Sterba
On Thu, Apr 01, 2021 at 04:03:39PM +0800, Wan Jiabing wrote:
> struct btrfs_inode is declared twice. One is declared at 67th line.
> The blew declaration is not needed. Remove the duplicate.
> struct btrfs_fs_info should be declared in the forward declarations.
> Move it to the forward declarations.

I've reworded the changelog a bit, patch added to misc-next, thanks.


Re: disk-io.c:undefined reference to `atomic64_set_386'

2021-04-06 Thread David Sterba
On Thu, Apr 01, 2021 at 12:34:38AM +0800, kernel test robot wrote:
> Hi Josef,
> 
> FYI, the error/warning still remains.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   5e46d1b78a03d52306f21f77a4e4a144b6d31486
> commit: 8260edba67a2e6bd5e709d32188e23aa22cb4a38 btrfs: make the init of 
> static elements in fs_info separate
> date:   1 year ago
> config: um-randconfig-r023-20210330 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> reproduce (this is a W=1 build):
> # 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8260edba67a2e6bd5e709d32188e23aa22cb4a38
> git remote add linus 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout 8260edba67a2e6bd5e709d32188e23aa22cb4a38
> # save the attached .config to linux build tree
> make W=1 ARCH=um 

All the reports regarding atomic64_*_386 are from ARCH=um build, so
it needs to be fixed there.


Re: [PATCH] btrfs: Remove useless call "zlib_inflateEnd"

2021-04-01 Thread David Sterba
On Tue, Mar 30, 2021 at 06:17:01PM +0800, Jiapeng Chong wrote:
> Fix the following whitescan warning:
> 
> Calling "zlib_inflateEnd(&workspace->strm)" is only useful for its
> return value, which is ignored.

According to the zlib API documentation in include/linux/zlib.h

301 extern int zlib_deflateEnd (z_streamp strm);
302 /*
303  All dynamically allocated data structures for this stream are freed.
304This function discards any unprocessed input and does not flush any
305pending output.
306
307  deflateEnd returns Z_OK if success, Z_STREAM_ERROR if the
308stream state was inconsistent, Z_DATA_ERROR if the stream was freed
309prematurely (some input or output was discarded). In the error case,
310msg may be set but then points to a static string (which must not be
311deallocated).
312 */

The first paragraph says it could free data, so the call needs to be
there. The return value could have some meaning as it could point out to
some inconsistency in zlib internal state but just deleting is IMO
wrong.


Re: [PATCH] [v2] btrfs: zoned: bail out in btrfs_alloc_chunk for bad input

2021-03-29 Thread David Sterba
On Tue, Mar 23, 2021 at 03:31:19PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> gcc complains that the ctl->max_chunk_size member might be used
> uninitialized when none of the three conditions for initializing it in
> init_alloc_chunk_ctl_policy_zoned() are true:
> 
> In function ‘init_alloc_chunk_ctl_policy_zoned’,
> inlined from ‘init_alloc_chunk_ctl’ at fs/btrfs/volumes.c:5023:3,
> inlined from ‘btrfs_alloc_chunk’ at fs/btrfs/volumes.c:5340:2:
> include/linux/compiler-gcc.h:48:45: error: ‘ctl.max_chunk_size’ may be used 
> uninitialized [-Werror=maybe-uninitialized]
>  4998 | ctl->max_chunk_size = min(limit, ctl->max_chunk_size);
>   |   ^~~
> fs/btrfs/volumes.c: In function ‘btrfs_alloc_chunk’:
> fs/btrfs/volumes.c:5316:32: note: ‘ctl’ declared here
>  5316 | struct alloc_chunk_ctl ctl;
>   |^~~
> 
> If we ever get into this condition, something is seriously
> wrong, so the same logic as in init_alloc_chunk_ctl_policy_regular()
> and a few other places should be applied. This avoids both further
> data corruption, and the compile-time warning.
> 
> Fixes: 1cd6121f2a38 ("btrfs: zoned: implement zoned chunk allocator")
> Link: https://lore.kernel.org/lkml/20210323132343.gf7...@twin.jikos.cz/
> Suggested-by: David Sterba 
> Signed-off-by: Arnd Bergmann 

Added to misc-next, thanks.


Re: [PATCH v2] btrfs: fix a potential hole-punching failure

2021-03-29 Thread David Sterba
On Thu, Mar 25, 2021 at 09:56:22AM +0800, bingjingc wrote:
> From: BingJing Chang 
> 
> In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole
> in a already existed hole."), existed holes can be skipped by calling
> find_first_non_hole() to adjust *start and *len. However, if the given
> len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
> *len will not be set to zero because (em->start + em->len) is less than
> (*start + *len). Then the ret will be 1 but the *len will not be set to
> 0. The propagated non-zero ret will result in fallocate failure.
> 
> In the while-loop of btrfs_replace_file_extents(), len is not updated
> every time before it calls find_first_non_hole(). That is, after
> btrfs_drop_extents() successfully drops the last non-hole file extent,
> it may fail with -ENOSPC when attempting to drop a file extent item
> representing a hole. The problem can happen. After it calls
> find_first_non_hole(), the cur_offset will be adjusted to be larger
> than or equal to end. However, since the len is not set to zero. The
> break-loop condition (ret && !len) will not meet. After it leaves the
> while-loop, fallocate will return 1, which is an unexpected return
> value.
> 
> We're not able to construct a reproducible way to let
> btrfs_drop_extents() fail with -ENOSPC after it drops the last non-hole
> file extent but with remaining holes left. However, it's quite easy to
> fix. We just need to update and check the len every time before we call
> find_first_non_hole(). To make the while loop more readable, we also
> pull the variable updates to the bottom of loop like this:
> while (cur_offset < end) {
> ...
> // update cur_offset & len
> // advance cur_offset & len in hole-punching case if needed
> }
> 
> Reported-by: Robbie Ko 
> Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
> already existed hole.")
> Reviewed-by: Robbie Ko 
> Reviewed-by: Chung-Chiang Cheng 
> Signed-off-by: BingJing Chang 

Thanks, added to misc-next.


Re: [PATCH] btrfs: Fix a typo

2021-03-26 Thread David Sterba
On Fri, Mar 26, 2021 at 06:29:32AM +0530, Bhaskar Chowdhury wrote:
> 
> s/reponsible/responsible/

So this is exactly what I don't want to see happen - one patch per typo.
I tried to explain it in the previous patch, please either fix all typos
under fs/btrfs or don't bother.


Re: [PATCH] btrfs: fixed rudimentary typos

2021-03-25 Thread David Sterba
On Thu, Mar 25, 2021 at 11:40:04PM +0530, Bhaskar Chowdhury wrote:
> On 13:49 Thu 25 Mar 2021, David Sterba wrote:
> >On Thu, Mar 25, 2021 at 09:51:13AM +0530, Bhaskar Chowdhury wrote:
> >>
> >> s/contaning/containing
> >> s/clearning/clearing/
> >
> >Have hou scanned the whole subdirectory for typos? We do typo fixing
> >about once a year in one big patch and won't fix them one by one.
> 
> Once a year You must be kidding! that is not good whatever the workflow
> you have .

No kidding. It's even worse, we get that every two years.

* 2016 0132761017e012ab4dc8584d679503f2ba26ca86
  33 files changed, 106 insertions(+), 105 deletions(-)

* 2018 52042d8e82ff50d40e76a275ac0b97aa663328b0
  25 files changed, 70 insertions(+), 69 deletions(-)

You can see the diffstat touches nearly all the files, almost hundred of
fixed typos per patch. Now compare that to sending 70-100 individual
patches. Time spent on any patch is not zero and for such trivial
changes it's not justified so the workflow is to do that in batches.
If you care about fixing typos in fs/btrfs/, please fix them all. I've
found about 50.


[GIT PULL] Btrfs fixes for 5.12-rc5

2021-03-25 Thread David Sterba
From: David Sterba 

Hi,

there are few fixes for issues that have some user visibility and are
simple enough for this time of development cycle.

Please pull thanks.

- a few fixes for rescue= mount option, adding more checks for missing
  trees

- fix sleeping in atomic context on qgroup deletion

- fix subvolume deletion on mount

- fix build with M= syntax

- fix checksum mismatch error message for direct io


The following changes since commit 485df75554257e883d0ce39bb886e8212349748e:

  btrfs: always pin deleted leaves when there are active tree mod log users 
(2021-03-16 20:32:22 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-rc4-tag

for you to fetch changes up to c1d6abdac46ca8127274bea195d804e3f2cec7ee:

  btrfs: fix check_data_csum() error message for direct I/O (2021-03-18 
21:25:11 +0100)


David Sterba (1):
  btrfs: fix build when using M=fs/btrfs

Filipe Manana (2):
  btrfs: fix subvolume/snapshot deletion not triggered on mount
  btrfs: fix sleep while in non-sleep context during qgroup removal

Johannes Thumshirn (1):
  btrfs: zoned: remove outdated WARN_ON in direct IO

Josef Bacik (3):
  btrfs: do not initialize dev stats if we have no dev_root
  btrfs: initialize device::fs_info always
  btrfs: do not initialize dev replace for bad dev root

Omar Sandoval (1):
  btrfs: fix check_data_csum() error message for direct I/O

 fs/btrfs/Makefile  | 10 ++
 fs/btrfs/dev-replace.c |  3 +++
 fs/btrfs/disk-io.c | 19 +--
 fs/btrfs/inode.c   | 18 +-
 fs/btrfs/qgroup.c  | 12 ++--
 fs/btrfs/volumes.c |  3 +++
 6 files changed, 48 insertions(+), 17 deletions(-)


Re: [PATCH] btrfs: fixed rudimentary typos

2021-03-25 Thread David Sterba
On Thu, Mar 25, 2021 at 09:51:13AM +0530, Bhaskar Chowdhury wrote:
> 
> s/contaning/containing
> s/clearning/clearing/

Have hou scanned the whole subdirectory for typos? We do typo fixing
about once a year in one big patch and won't fix them one by one.


Re: [PATCH AUTOSEL 5.11 26/44] btrfs: track qgroup released data in own variable in insert_prealloc_file_extent

2021-03-25 Thread David Sterba
On Thu, Mar 25, 2021 at 07:24:41AM -0400, Sasha Levin wrote:
> From: Qu Wenruo 
> 
> [ Upstream commit fbf48bb0b197e6894a04c714728c952af7153bf3 ]
> 
> There is a piece of weird code in insert_prealloc_file_extent(), which
> looks like:
> 
>   ret = btrfs_qgroup_release_data(inode, file_offset, len);
>   if (ret < 0)
>   return ERR_PTR(ret);
>   if (trans) {
>   ret = insert_reserved_file_extent(trans, inode,
> file_offset, &stack_fi,
> true, ret);
>   ...
>   }
>   extent_info.is_new_extent = true;
>   extent_info.qgroup_reserved = ret;
>   ...
> 
> Note how the variable @ret is abused here, and if anyone is adding code
> just after btrfs_qgroup_release_data() call, it's super easy to
> overwrite the @ret and cause tons of qgroup related bugs.
> 
> Fix such abuse by introducing new variable @qgroup_released, so that we
> won't reuse the existing variable @ret.
> 
> Signed-off-by: Qu Wenruo 
> Reviewed-by: David Sterba 
> Signed-off-by: David Sterba 
> Signed-off-by: Sasha Levin 

This patch is a preparatory work and does not make sense for backport
standalone. Either this one plus
https://lore.kernel.org/linux-btrfs/20210303104152.105877-2-...@suse.com/
or neither. And IIRC it does not apply directly and needs some
additional review before it can be backported to older code base, so it
has no CC: stable tags.


Re: [PATCH] btrfs: zoned: fix uninitialized max_chunk_size

2021-03-23 Thread David Sterba
On Tue, Mar 23, 2021 at 01:46:19PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The ctl->max_chunk_size member might be used uninitialized
> when none of the three conditions for initializing it in
> init_alloc_chunk_ctl_policy_zoned() are true:
> 
> In function ‘init_alloc_chunk_ctl_policy_zoned’,
> inlined from ‘init_alloc_chunk_ctl’ at fs/btrfs/volumes.c:5023:3,
> inlined from ‘btrfs_alloc_chunk’ at fs/btrfs/volumes.c:5340:2:
> include/linux/compiler-gcc.h:48:45: error: ‘ctl.max_chunk_size’ may be used 
> uninitialized [-Werror=maybe-uninitialized]
>  4998 | ctl->max_chunk_size = min(limit, ctl->max_chunk_size);
>   |   ^~~
> fs/btrfs/volumes.c: In function ‘btrfs_alloc_chunk’:
> fs/btrfs/volumes.c:5316:32: note: ‘ctl’ declared here
>  5316 | struct alloc_chunk_ctl ctl;
>   |^~~
> 
> Initialize it to UINT_MAX and rely on the min() expression to limit
> it.
> 
> Fixes: 1cd6121f2a38 ("btrfs: zoned: implement zoned chunk allocator")
> Signed-off-by: Arnd Bergmann 
> ---
> Note that the -Wmaybe-unintialized warning is globally disabled
> by default. For some reason I got this warning anyway when building
> this specific file with gcc-11.

The warning catches a theoretical case but this would not happen in
pracitce.  There are three bits to check and that covers all valid
options, but there should be a final else {} like is in
init_alloc_chunk_ctl_policy_regular that does not let the function
continue as that would mean there are worse problems.

btrfs_alloc_chunk
  init_alloc_chunk_ctl
init_alloc_chunk_ctl_policy_zoned

and btrfs_alloc_chunk validates the ctl->flags against
BTRFS_BLOCK_GROUP_TYPE_MASK, which is exactly the tree branches.

> ---
>  fs/btrfs/volumes.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index bc3b33efddc5..b42b423b6a10 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -4980,6 +4980,7 @@ static void init_alloc_chunk_ctl_policy_zoned(
>   u64 type = ctl->type;
>  
>   ctl->max_stripe_size = zone_size;
> + ctl->max_chunk_size = UINT_MAX;

This would allow the min() work but otherwise is not an expected to
happen at all.

>   if (type & BTRFS_BLOCK_GROUP_DATA) {
>   ctl->max_chunk_size = round_down(BTRFS_MAX_DATA_CHUNK_SIZE,
>zone_size);
> -- 
> 2.29.2


Re: [PATCH v2 04/18] btrfs: convert to miscattr

2021-03-23 Thread David Sterba
On Mon, Mar 22, 2021 at 03:49:02PM +0100, Miklos Szeredi wrote:
> Use the miscattr API to let the VFS handle locking, permission checking and
> conversion.
> 
> Signed-off-by: Miklos Szeredi 
> Cc: David Sterba 
> ---
>  fs/btrfs/ctree.h |   3 +
>  fs/btrfs/inode.c |   4 +
>  fs/btrfs/ioctl.c | 249 +--
>  3 files changed, 52 insertions(+), 204 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index bd659354d043..c79886675c16 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -3184,6 +3184,9 @@ void btrfs_update_inode_bytes(struct btrfs_inode *inode,
>  /* ioctl.c */
>  long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
>  long btrfs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long 
> arg);
> +int btrfs_miscattr_get(struct dentry *dentry, struct miscattr *ma);
> +int btrfs_miscattr_set(struct user_namespace *mnt_userns,
> +struct dentry *dentry, struct miscattr *ma);
>  int btrfs_ioctl_get_supported_features(void __user *arg);
>  void btrfs_sync_inode_flags_to_i_flags(struct inode *inode);
>  int __pure btrfs_is_empty_uuid(u8 *uuid);
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 2e1c282c202d..e21642f17396 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -10556,6 +10556,8 @@ static const struct inode_operations 
> btrfs_dir_inode_operations = {
>   .set_acl= btrfs_set_acl,
>   .update_time= btrfs_update_time,
>   .tmpfile= btrfs_tmpfile,
> + .miscattr_get   = btrfs_miscattr_get,
> + .miscattr_set   = btrfs_miscattr_set,
>  };
>  
>  static const struct file_operations btrfs_dir_file_operations = {
> @@ -10609,6 +10611,8 @@ static const struct inode_operations 
> btrfs_file_inode_operations = {
>   .get_acl= btrfs_get_acl,
>   .set_acl= btrfs_set_acl,
>   .update_time= btrfs_update_time,
> + .miscattr_get   = btrfs_miscattr_get,
> + .miscattr_set   = btrfs_miscattr_set,
>  };
>  static const struct inode_operations btrfs_special_inode_operations = {
>   .getattr= btrfs_getattr,
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 072e77726e94..5ce445a9a331 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -26,6 +26,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "ctree.h"
>  #include "disk-io.h"
>  #include "export.h"
> @@ -153,16 +154,6 @@ void btrfs_sync_inode_flags_to_i_flags(struct inode 
> *inode)
> new_fl);
>  }
>  
> -static int btrfs_ioctl_getflags(struct file *file, void __user *arg)
> -{
> - struct btrfs_inode *binode = BTRFS_I(file_inode(file));
> - unsigned int flags = btrfs_inode_flags_to_fsflags(binode->flags);
> -
> - if (copy_to_user(arg, &flags, sizeof(flags)))
> - return -EFAULT;
> - return 0;
> -}
> -
>  /*
>   * Check if @flags are a supported and valid set of FS_*_FL flags and that
>   * the old and new flags are not conflicting
> @@ -201,9 +192,34 @@ static int check_fsflags_compatible(struct btrfs_fs_info 
> *fs_info,
>   return 0;
>  }
>  
> -static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
> +bool btrfs_exclop_start(struct btrfs_fs_info *fs_info,
> + enum btrfs_exclusive_operation type)
>  {
> - struct inode *inode = file_inode(file);
> + return !cmpxchg(&fs_info->exclusive_operation, BTRFS_EXCLOP_NONE, type);
> +}
> +
> +void btrfs_exclop_finish(struct btrfs_fs_info *fs_info)
> +{
> + WRITE_ONCE(fs_info->exclusive_operation, BTRFS_EXCLOP_NONE);
> + sysfs_notify(&fs_info->fs_devices->fsid_kobj, NULL, 
> "exclusive_operation");
> +}

This function is moved around for no reason, it's not relevant for the
attributes in any way and is exported so there's no problem with
visibility eg. due to being static.

> +/*
> + * Set flags/xflags from the internal inode flags. The remaining items of
> + * fsxattr are zeroed.
> + */
> +int btrfs_miscattr_get(struct dentry *dentry, struct miscattr *ma)
> +{
> + struct btrfs_inode *binode = BTRFS_I(d_inode(dentry));
> +
> + miscattr_fill_flags(ma, btrfs_inode_flags_to_fsflags(binode->flags));
> + return 0;
> +}
> +
> +int btrfs_miscattr_set(struct user_namespace *mnt_userns,
> +struct dentry *dentry, struct miscattr *ma)
> +{
> + struct inode *inode = d_inode(dentry);
>   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>   struct btrfs_inode *binode = BTRFS_I(inode

Re: [PATCH v2 01/18] vfs: add miscattr ops

2021-03-23 Thread David Sterba
On Mon, Mar 22, 2021 at 03:33:38PM -0700, Darrick J. Wong wrote:
> On Mon, Mar 22, 2021 at 03:48:59PM +0100, Miklos Szeredi wrote:
> > --- a/Documentation/filesystems/vfs.rst
> > +++ b/Documentation/filesystems/vfs.rst
> > @@ -441,6 +441,9 @@ As of kernel 2.6.22, the following members are defined:
> >unsigned open_flag, umode_t create_mode);
> > int (*tmpfile) (struct user_namespace *, struct inode *, struct 
> > dentry *, umode_t);
> > int (*set_acl)(struct user_namespace *, struct inode *, struct 
> > posix_acl *, int);
> > +   int (*miscattr_set)(struct user_namespace *mnt_userns,
> > +   struct dentry *dentry, struct miscattr *ma);
> > +   int (*miscattr_get)(struct dentry *dentry, struct miscattr *ma);
> > };
> >  
> >  Again, all methods are called without any locks being held, unless
> > @@ -588,6 +591,18 @@ otherwise noted.
> > atomically creating, opening and unlinking a file in given
> > directory.
> >  
> > +``miscattr_get``
> 
> I wish this wasn't named "misc" because miscellaneous is vague.

It also adds yet another way to name all the attributes (the "N + 1st
standard" problem). So I'd rather reuse a term that's already known and
understood by users. And this is 'file attributes', eg. as noted in
chattr manual page "change file attributes on a Linux file system".
For clarity avoid any 'x' in the name so we easily distinguish that from
the extended attributes aka xattrs.

We can perhaps live with miscattrs in code as anybody who has ever
touched the flags/attrs interfaces knows what it is referring to.

> fileattr_get, perhaps?

That sounds about right to me.


[GIT PULL] Btrfs fixes for 5.12-rc4

2021-03-18 Thread David Sterba
From: David Sterba 

Hi,

there are still regressions being found and fixed in the zoned mode and
subpage code, the rest are fixes for bugs reported by users. Please
pull, thanks.

Regressions:

- subpage block support:
  - readahead works on the proper block size
  - fix last page zeroing

- zoned mode:
  - linked list corruption for tree log

Fixes:

- qgroup leak after falloc faiulre

- tree mod log and backref resolving
  - extent buffer cloning race when resolving backrefs
  - pin deleted leaves with active tree mod log users

- drop debugging flag from slab cache


The following changes since commit badae9c86979c459bd7d895d6d7ddc7a01131ff7:

  btrfs: zoned: do not account freed region of read-only block group as 
zone_unusable (2021-03-04 16:16:58 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-rc3-tag

for you to fetch changes up to 485df75554257e883d0ce39bb886e8212349748e:

  btrfs: always pin deleted leaves when there are active tree mod log users 
(2021-03-16 20:32:22 +0100)


David Sterba (1):
  btrfs: fix slab cache flags for free space tree bitmap

Filipe Manana (3):
  btrfs: zoned: fix linked list corruption after log root tree allocation 
failure
  btrfs: fix race when cloning extent buffer during rewind of an old root
  btrfs: always pin deleted leaves when there are active tree mod log users

Qu Wenruo (5):
  btrfs: fix wrong offset to zero out range beyond i_size
  btrfs: track qgroup released data in own variable in 
insert_prealloc_file_extent
  btrfs: fix qgroup data rsv leak caused by falloc failure
  btrfs: subpage: fix wild pointer access during metadata read failure
  btrfs: subpage: make readahead work properly

 fs/btrfs/ctree.c   |  2 ++
 fs/btrfs/extent-tree.c | 23 ++-
 fs/btrfs/extent_io.c   | 33 +++--
 fs/btrfs/inode.c   | 37 ++---
 fs/btrfs/reada.c   | 35 ++-
 fs/btrfs/tree-log.c|  8 
 6 files changed, 103 insertions(+), 35 deletions(-)


Re: [PATCH v2] btrfs: Use immediate assignment when referencing cc-option

2021-03-17 Thread David Sterba
On Wed, Mar 17, 2021 at 09:08:48AM -0700, Victor Erminpour wrote:
> Calling cc-option will use KBUILD_CFLAGS, which when lazy setting
> subdir-ccflags-y produces the following build error:
> 
> scripts/Makefile.lib:10: *** Recursive variable `KBUILD_CFLAGS' \
>   references itself (eventually).  Stop.
> 
> Use single := assignment for subdir-ccflags-y. The cc-option calls
> are done right away and we don't end up with KBUILD_CFLAGS
> referencing itself.
> 
> Signed-off-by: Victor Erminpour 
> ---
>  fs/btrfs/Makefile | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
> index b634c42115ea..583ca2028e08 100644
> --- a/fs/btrfs/Makefile
> +++ b/fs/btrfs/Makefile
> @@ -1,16 +1,16 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  # Subset of W=1 warnings
> +subdir-ccflags-y := $(call cc-option, -Wunused-but-set-variable) \
> + $(call cc-option, -Wunused-const-variable) \
> + $(call cc-option, -Wpacked-not-aligned) \
> + $(call cc-option, -Wstringop-truncation)

That's still overwriting the value unconditionally, the idea was a
separate variable

https://lore.kernel.org/linux-btrfs/20210317104313.17028-1-dste...@suse.com


Re: [PATCH] btrfs: Use immediate assignment when referencing cc-option

2021-03-17 Thread David Sterba
On Tue, Mar 16, 2021 at 03:46:10PM -0700, Victor Erminpour wrote:
> Calling cc-option will use KBUILD_CFLAGS, which when lazy setting
> subdir-ccflags-y produces the following build error:
> 
> scripts/Makefile.lib:10: *** Recursive variable `KBUILD_CFLAGS' \
>   references itself (eventually).  Stop.
> 
> Use := assignment to subdir-ccflags-y when referencing cc-option.
> This causes make to also evaluate += immediately, cc-option
> calls are done right away and we don't end up with KBUILD_CFLAGS
> referencing itself.
> 
> Signed-off-by: Victor Erminpour 
> ---
>  fs/btrfs/Makefile | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
> index b634c42115ea..3dba1336fa95 100644
> --- a/fs/btrfs/Makefile
> +++ b/fs/btrfs/Makefile
> @@ -7,10 +7,10 @@ subdir-ccflags-y += -Wmissing-format-attribute
>  subdir-ccflags-y += -Wmissing-prototypes
>  subdir-ccflags-y += -Wold-style-definition
>  subdir-ccflags-y += -Wmissing-include-dirs
> -subdir-ccflags-y += $(call cc-option, -Wunused-but-set-variable)
> -subdir-ccflags-y += $(call cc-option, -Wunused-const-variable)
> -subdir-ccflags-y += $(call cc-option, -Wpacked-not-aligned)
> -subdir-ccflags-y += $(call cc-option, -Wstringop-truncation)
> +subdir-ccflags-y := $(call cc-option, -Wunused-but-set-variable)
> +subdir-ccflags-y := $(call cc-option, -Wunused-const-variable)
> +subdir-ccflags-y := $(call cc-option, -Wpacked-not-aligned)
> +subdir-ccflags-y := $(call cc-option, -Wstringop-truncation)

But this overwrites all the previously accumulated values from += so
effectively there's only the last one, no? That's not what we want.


Re: [PATCH 0/4] btrfs: Convert more kmaps to kmap_local_page()

2021-03-16 Thread David Sterba
On Fri, Mar 12, 2021 at 12:05:00PM -0800, Ira Weiny wrote:
> On Fri, Mar 12, 2021 at 08:41:41PM +0100, David Sterba wrote:
> > On Tue, Feb 16, 2021 at 06:48:22PM -0800, ira.we...@intel.com wrote:
> > > From: Ira Weiny 
> > > 
> > > I am submitting these for 5.13.
> > > 
> > > Further work to remove more kmap() calls in favor of the 
> > > kmap_local_page() this
> > > series converts those calls which required more than a common pattern 
> > > which
> > > were covered in my previous series[1].  This is the second of what I hope 
> > > to be
> > > 3 series to fully convert btrfs.  However, the 3rd series is going to be 
> > > an RFC
> > > because I need to have more eyes on it before I'm sure about what to do.  
> > > For
> > > now this series should be good to go for 5.13.
> > > 
> > > Also this series converts the kmaps in the raid5/6 code which required a 
> > > fix to
> > > the kmap'ings which was submitted in [2].
> > 
> > Branch added to for-next and will be moved to the devel queue next week.
> > I've added some comments about the ordering requirement, that's
> > something not obvious. There's a comment under 1st patch but that's
> > trivial to fix if needed. Thanks.
> 
> I've replied to the first patch.  LMK if you want me to respin it.

No need to respin, patchset now in misc-next. Thanks.


Re: [PATCH 1/4] fs/btrfs: Convert kmap to kmap_local_page() using coccinelle

2021-03-16 Thread David Sterba
On Fri, Mar 12, 2021 at 12:03:14PM -0800, Ira Weiny wrote:
> On Fri, Mar 12, 2021 at 07:58:39PM +0100, David Sterba wrote:
> > On Tue, Feb 16, 2021 at 06:48:23PM -0800, ira.we...@intel.com wrote:
> > > --- a/fs/btrfs/lzo.c
> > > +++ b/fs/btrfs/lzo.c
> > > @@ -118,7 +118,7 @@ int lzo_compress_pages(struct list_head *ws, struct 
> > > address_space *mapping,
> > >   struct workspace *workspace = list_entry(ws, struct workspace, list);
> > >   int ret = 0;
> > >   char *data_in;
> > > - char *cpage_out;
> > > + char *cpage_out, *sizes_ptr;
> > >   int nr_pages = 0;
> > >   struct page *in_page = NULL;
> > >   struct page *out_page = NULL;
> > > @@ -258,10 +258,9 @@ int lzo_compress_pages(struct list_head *ws, struct 
> > > address_space *mapping,
> > >   }
> > >  
> > >   /* store the size of all chunks of compressed data */
> > > - cpage_out = kmap(pages[0]);
> > > - write_compress_length(cpage_out, tot_out);
> > > -
> > > - kunmap(pages[0]);
> > > + sizes_ptr = kmap_local_page(pages[0]);
> > > + write_compress_length(sizes_ptr, tot_out);
> > > + kunmap_local(sizes_ptr);
> > 
> > Why is not cpage_out reused for this mapping? I don't see any reason for
> > another temporary variable, cpage_out is not used at this point.
> 
> For this patch that is true.  However, I'm trying to convert the other kmaps 
> as
> well.  To do that I'll need cpage_out preserved for the final kunmap_local().
> 
> Unfortunately, the required nesting ordering of kmap_local_page() makes
> converting the other calls hacky at best.  I'm not sure what to do about them.
> The best I've come up with is doing a hacky extra unmap/remap to preserve the
> nesting.
> 
> Anyway, I'd prefer to leave this additional temp variable but I can certainly
> change if it you want.  The other conversions may never work/land.  :-/

Ok, no problem keeping the variable then. I've added a note to changelog
why it's there. The whole conversion sounds tricky so adding trivial
helper code is no big deal.


Re: [PATCH 0/4] btrfs: Convert more kmaps to kmap_local_page()

2021-03-12 Thread David Sterba
On Tue, Feb 16, 2021 at 06:48:22PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> I am submitting these for 5.13.
> 
> Further work to remove more kmap() calls in favor of the kmap_local_page() 
> this
> series converts those calls which required more than a common pattern which
> were covered in my previous series[1].  This is the second of what I hope to 
> be
> 3 series to fully convert btrfs.  However, the 3rd series is going to be an 
> RFC
> because I need to have more eyes on it before I'm sure about what to do.  For
> now this series should be good to go for 5.13.
> 
> Also this series converts the kmaps in the raid5/6 code which required a fix 
> to
> the kmap'ings which was submitted in [2].

Branch added to for-next and will be moved to the devel queue next week.
I've added some comments about the ordering requirement, that's
something not obvious. There's a comment under 1st patch but that's
trivial to fix if needed. Thanks.


Re: [PATCH 2/4] fs/btrfs: Convert raid5/6 kmaps to kmap_local_page()

2021-03-12 Thread David Sterba
On Tue, Feb 16, 2021 at 06:48:24PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> These kmaps are thread local and don't need to be atomic.  So they can use
> the more efficient kmap_local_page().  However, the mapping of pages in
> the stripes and the additional parity and qstripe pages are a bit
> trickier because the unmapping must occur in the opposite order from the
> mapping.  Furthermore, the pointer array in __raid_recover_end_io() may
> get reordered.
> 
> Convert these calls to kmap_local_page() taking care to reverse the
> unmappings of any page arrays as well as being careful with the mappings
> of any special pages such as the parity and qstripe pages.

Though there's one more allocation for the additional array, I don't see
a simpler way to avoid it and track the array reordering at a lower
memory cost. At least it's not in a performance critical code and the
array size is reasonably small.


Re: [PATCH 1/4] fs/btrfs: Convert kmap to kmap_local_page() using coccinelle

2021-03-12 Thread David Sterba
On Tue, Feb 16, 2021 at 06:48:23PM -0800, ira.we...@intel.com wrote:
> --- a/fs/btrfs/lzo.c
> +++ b/fs/btrfs/lzo.c
> @@ -118,7 +118,7 @@ int lzo_compress_pages(struct list_head *ws, struct 
> address_space *mapping,
>   struct workspace *workspace = list_entry(ws, struct workspace, list);
>   int ret = 0;
>   char *data_in;
> - char *cpage_out;
> + char *cpage_out, *sizes_ptr;
>   int nr_pages = 0;
>   struct page *in_page = NULL;
>   struct page *out_page = NULL;
> @@ -258,10 +258,9 @@ int lzo_compress_pages(struct list_head *ws, struct 
> address_space *mapping,
>   }
>  
>   /* store the size of all chunks of compressed data */
> - cpage_out = kmap(pages[0]);
> - write_compress_length(cpage_out, tot_out);
> -
> - kunmap(pages[0]);
> + sizes_ptr = kmap_local_page(pages[0]);
> + write_compress_length(sizes_ptr, tot_out);
> + kunmap_local(sizes_ptr);

Why is not cpage_out reused for this mapping? I don't see any reason for
another temporary variable, cpage_out is not used at this point.


Re: [PATCH 0/3] btrfs: Convert kmap/memset/kunmap to memzero_user()

2021-03-12 Thread David Sterba
On Thu, Mar 11, 2021 at 07:57:48AM -0800, Ira Weiny wrote:
> On Wed, Mar 10, 2021 at 03:58:36PM -0800, Andrew Morton wrote:
> > On Tue,  9 Mar 2021 13:21:34 -0800 ira.we...@intel.com wrote:
> > 
> > > Previously this was submitted to convert to zero_user()[1].  zero_user() 
> > > is not
> > > the same as memzero_user() and in fact some zero_user() calls may be 
> > > better off
> > > as memzero_user().  Regardless it was incorrect to convert btrfs to
> > > zero_user().
> > > 
> > > This series corrects this by lifting memzero_user(), converting it to
> > > kmap_local_page(), and then using it in btrfs.
> > 
> > This impacts btrfs more than MM.  I suggest the btrfs developers grab
> > it, with my
> 
> I thought David wanted you to take these this time?
> 
> "I can play the messenger again but now it seems a round of review is needed
> and with some testing it'll be possible in some -rc. At that point you may 
> take
> the patches via the mm tree, unless Linus is ok with a late pull."
> 
>   -- https://lore.kernel.org/lkml/20210224123049.gx1...@twin.jikos.cz/
> 
> But reading that again I'm not sure what he meant.

As Linus had some objections I was not sure it was still feasible for
the merge window, but this is now sorted. This new patchset does further
changes in MM and the btrfs part is a straightforward cleanup. I've
noticed Andrew added the patches to his queue which I'd prefer so I've
added my reviewed-by to the third patch. Thanks.


Re: [PATCH 3/3] btrfs: Use memzero_page() instead of open coded kmap pattern

2021-03-12 Thread David Sterba
On Tue, Mar 09, 2021 at 01:21:37PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> There are many places where kmap/memset/kunmap patterns occur.
> 
> Use the newly lifted memzero_page() to eliminate direct uses of kmap and
> leverage the new core functions use of kmap_local_page().
> 
> The development of this patch was aided by the following coccinelle
> script:
> 
> // 
> // SPDX-License-Identifier: GPL-2.0-only
> // Find kmap/memset/kunmap pattern and replace with memset*page calls
> //
> // NOTE: Offsets and other expressions may be more complex than what the 
> script
> // will automatically generate.  Therefore a catchall rule is provided to find
> // the pattern which then must be evaluated by hand.
> //
> // Confidence: Low
> // Copyright: (C) 2021 Intel Corporation
> // URL: http://coccinelle.lip6.fr/
> // Comments:
> // Options:
> 
> //
> // Then the memset pattern
> //
> @ memset_rule1 @
> expression page, V, L, Off;
> identifier ptr;
> type VP;
> @@
> 
> (
> -VP ptr = kmap(page);
> |
> -ptr = kmap(page);
> |
> -VP ptr = kmap_atomic(page);
> |
> -ptr = kmap_atomic(page);
> )
> <+...
> (
> -memset(ptr, 0, L);
> +memzero_page(page, 0, L);
> |
> -memset(ptr + Off, 0, L);
> +memzero_page(page, Off, L);
> |
> -memset(ptr, V, L);
> +memset_page(page, V, 0, L);
> |
> -memset(ptr + Off, V, L);
> +memset_page(page, V, Off, L);
> )
> ...+>
> (
> -kunmap(page);
> |
> -kunmap_atomic(ptr);
> )
> 
> // Remove any pointers left unused
> @
> depends on memset_rule1
> @
> identifier memset_rule1.ptr;
> type VP, VP1;
> @@
> 
> -VP ptr;
>   ... when != ptr;
> ? VP1 ptr;
> 
> //
> // Catch all
> //
> @ memset_rule2 @
> expression page;
> identifier ptr;
> expression GenTo, GenSize, GenValue;
> type VP;
> @@
> 
> (
> -VP ptr = kmap(page);
> |
> -ptr = kmap(page);
> |
> -VP ptr = kmap_atomic(page);
> |
> -ptr = kmap_atomic(page);
> )
> <+...
> (
> //
> // Some call sites have complex expressions within the memset/memcpy
> // The follow are catch alls which need to be evaluated by hand.
> //
> -memset(GenTo, 0, GenSize);
> +memzero_pageExtra(page, GenTo, GenSize);
> |
> -memset(GenTo, GenValue, GenSize);
> +memset_pageExtra(page, GenValue, GenTo, GenSize);
> )
> ...+>
> (
> -kunmap(page);
> |
> -kunmap_atomic(ptr);
> )
> 
> // Remove any pointers left unused
> @
> depends on memset_rule2
> @
> identifier memset_rule2.ptr;
> type VP, VP1;
> @@
> 
> -VP ptr;
>   ... when != ptr;
> ? VP1 ptr;
> 
> // 
> 
> Signed-off-by: Ira Weiny 

Reviewed-by: David Sterba 


Re: [PATCH] btrfs: turn btrfs_destroy_delayed_refs() into void function

2021-03-09 Thread David Sterba
On Tue, Mar 09, 2021 at 05:32:54PM +0800, Yang Li wrote:
> This function always return '0' and no callers use the return value.
> So make it a void function.
> 
> This eliminates the following coccicheck warning:
> ./fs/btrfs/disk-io.c:4522:5-8: Unneeded variable: "ret". Return "0" on
> line 4530
> 
> Reported-by: Abaci Robot 

Can you please tell your robot to ignore this warning, I'm getting tired
to explain the same thing again,

https://lore.kernel.org/linux-btrfs/20210302120708.gh7...@suse.cz/

this is like 5th attempt to blindly fix a tool warning without
understanding the code.


Re: [PATCH AUTOSEL 5.11 03/12] btrfs: subpage: fix the false data csum mismatch error

2021-03-08 Thread David Sterba
On Sun, Mar 07, 2021 at 08:57:37AM -0500, Sasha Levin wrote:
> From: Qu Wenruo 
> 
> [ Upstream commit c28ea613fafad910d08f67efe76ae552b1434e44 ]
> 
> [BUG]
> When running fstresss, we can hit strange data csum mismatch where the
> on-disk data is in fact correct (passes scrub).
> 
> With some extra debug info added, we have the following traces:
> 
>   0482us: btrfs_do_readpage: root=5 ino=284 offset=393216, submit force=0 
> pgoff=0 iosize=8192
>   0494us: btrfs_do_readpage: root=5 ino=284 offset=401408, submit force=0 
> pgoff=8192 iosize=4096
>   0498us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=393216 len=8192
>   0591us: btrfs_do_readpage: root=5 ino=284 offset=405504, submit force=0 
> pgoff=12288 iosize=36864
>   0594us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=401408 len=4096
>   0863us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=405504 
> len=36864
>   0933us: btrfs_verify_data_csum: root=5 ino=284 offset=393216 len=8192
>   0967us: btrfs_do_readpage: root=5 ino=284 offset=442368, skip beyond isize 
> pgoff=49152 iosize=16384
>   1047us: btrfs_verify_data_csum: root=5 ino=284 offset=401408 len=4096
>   1163us: btrfs_verify_data_csum: root=5 ino=284 offset=405504 len=36864
>   1290us: check_data_csum: !!! root=5 ino=284 offset=438272 pg_off=45056 !!!
>   7387us: end_bio_extent_readpage: root=5 ino=284 before pending_read_bios=0
> 
> [CAUSE]
> Normally we expect all submitted bio reads to only touch the range we
> specified, and under subpage context, it means we should only touch the
> range specified in each bvec.
> 
> But in data read path, inside end_bio_extent_readpage(), we have page
> zeroing which only takes regular page size into consideration.
> 
> This means for subpage if we have an inode whose content looks like below:
> 
>   0   16K 32K 48K 64K
>   |///|   |///|   |
> 
>   |//| = data needs to be read from disk
>   |  | = hole
> 
> And i_size is 64K initially.
> 
> Then the following race can happen:
> 
>   T1  |   T2
> +
> btrfs_do_readpage()   |
> |- isize = 64K;   |
> |  At this time, the isize is |
> |  64K|
> | |
> |- submit_extent_page()   |
> |  submit previous assembled bio|
> |  assemble bio for [0, 16K)  |
> | |
> |- submit_extent_page()   |
>submit read bio for [0, 16K) |
>assemble read bio for  |
>[32K, 48K) |
>   |
>   | btrfs_setsize()
>   | |- i_size_write(, 16K);
>   |Now i_size is only 16K
> end_io() for [0K, 16K)|
> |- end_bio_extent_readpage()  |
>|- btrfs_verify_data_csum()  |
>|  No csum error   |
>|- i_size = 16K;   |
>|- zero_user_segment(16K,  |
>   PAGE_SIZE); |
>   !!! We zeroed range |
>   !!! [32K, 48K)  |
>   | end_io for [32K, 48K)
>   | |- end_bio_extent_readpage()
>   ||- btrfs_verify_data_csum()
>   |   ! CSUM MISMATCH !
>   |   ! As the range is zeroed now !
> 
> [FIX]
> To fix the problem, make end_bio_extent_readpage() to only zero the
> range of bvec.
> 
> The bug only affects subpage read-write support, as for full read-only
> mount we can't change i_size thus won't hit the race condition.

Please drop this patch from autosel because of the above, this is in a
feature that's in progress and does not affect regular filesystems.


[GIT PULL] Btrfs fixes for 5.12-rc1, part 2

2021-03-05 Thread David Sterba
From: David Sterba 

Hi,

more regression fixes and stabilization. Please pull, thanks.

Regressions:

- zoned mode
  - count zone sizes in wider int types
  - fix space accounting for read-only block groups

- subpage: fix page tail zeroing

Fixes:

- fix spurious warning when remounting with free space tree

- fix warning when creating a directory with smack enabled

- ioctl checks for qgroup inheritance when creating a snapshot

- qgroup
  - fix missing unlock on error path in zero range
  - fix amount of released reservation on error
  - fix flushing from unsafe context with open transaction, potentially
deadlocking

- minor build warning fixes


The following changes since commit 6e37d245994189ba757df7dc2950a44d31421ac6:

  btrfs: zoned: fix deadlock on log sync (2021-02-22 18:08:48 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-rc1-tag

for you to fetch changes up to badae9c86979c459bd7d895d6d7ddc7a01131ff7:

  btrfs: zoned: do not account freed region of read-only block group as 
zone_unusable (2021-03-04 16:16:58 +0100)


Boris Burkov (1):
  btrfs: fix spurious free_space_tree remount warning

Dan Carpenter (1):
  btrfs: validate qgroup inherit for SNAP_CREATE_V2 ioctl

Filipe Manana (1):
  btrfs: fix warning when creating a directory with smack enabled

Naohiro Aota (2):
  btrfs: zoned: use sector_t for zone sectors
  btrfs: zoned: do not account freed region of read-only block group as 
zone_unusable

Nikolay Borisov (4):
  btrfs: unlock extents in btrfs_zero_range in case of quota reservation 
errors
  btrfs: free correct amount of space in 
btrfs_delayed_inode_reserve_metadata
  btrfs: export and rename qgroup_reserve_meta
  btrfs: don't flush from btrfs_delayed_inode_reserve_metadata

Qu Wenruo (1):
  btrfs: subpage: fix the false data csum mismatch error

Randy Dunlap (1):
  btrfs: ref-verify: use 'inline void' keyword ordering

 fs/btrfs/delayed-inode.c|  5 +++--
 fs/btrfs/extent_io.c| 21 -
 fs/btrfs/file.c |  5 -
 fs/btrfs/free-space-cache.c |  7 ++-
 fs/btrfs/inode.c|  2 +-
 fs/btrfs/ioctl.c| 19 ++-
 fs/btrfs/qgroup.c   |  8 
 fs/btrfs/qgroup.h   |  2 ++
 fs/btrfs/ref-verify.c   |  4 ++--
 fs/btrfs/super.c|  4 ++--
 fs/btrfs/xattr.c| 31 +++
 fs/btrfs/zoned.c|  4 ++--
 12 files changed, 87 insertions(+), 25 deletions(-)


Re: [PATCH] btrfs: Assign boolean values to a bool variable

2021-03-04 Thread David Sterba
On Wed, Mar 03, 2021 at 05:45:28PM +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./fs/btrfs/volumes.c:1462:10-11: WARNING: return of 0/1 in function
> 'dev_extent_hole_check_zoned' with return type bool.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Added to misc-next, thanks.


Re: [PATCH] btrfs: Remove unused variable ret

2021-03-02 Thread David Sterba
On Fri, Feb 19, 2021 at 02:18:58PM +0800, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./fs/btrfs/disk-io.c:4403:5-8: Unneeded variable: "ret". Return "0" on
> line 4411.

That maybe stops the check to report the warning but it's not the right
way to fix the code. The callees do not properly handle and propagate
errors so that needs to be fixed, and the return value propagated from
btrfs_destroy_delayed_refs.


Re: [PATCH] btrfs: turn btrfs_destroy_all_ordered_extents() into void functions

2021-03-02 Thread David Sterba
On Tue, Mar 02, 2021 at 04:57:56PM +0800, Yang Li wrote:
> These functions always return '0' and no callers use the return
> value. So make it a void function.

The reasoning needs to go the other way around: you can make a function
return void iff all callees also return void and don't do BUG_ON instead
of proper error handling.

Which in this case does not hold, because the function itself still does
BUG_ON.

> It fixes the following warning detected by coccinelle:
> ./fs/btrfs/disk-io.c:4522:5-8: Unneeded variable: "ret". Return "0" on
> line 4530

Yeah tools can identify the simple cases but fixing that properly needs
to extend to the whole callgraph and understand all the contexts where
local data are undone and errors propagated up the callchanin.


Re: [PATCH 34/44] tty: do not check tty_unregister_driver's return value

2021-03-02 Thread David Sterba
On Tue, Mar 02, 2021 at 07:22:04AM +0100, Jiri Slaby wrote:
> These drivers check tty_unregister_driver return value. But they don't
> handle a failure correctly (they free the driver in any case). So stop
> checking tty_unregister_driver return value and remove also the prints.
> 
> In the next patch, tty_unregister_driver's return type will be switched
> to void.
> 
> Signed-off-by: Jiri Slaby 
> Cc: Chris Zankel 
> Cc: Max Filippov 
> Cc: linux-xte...@linux-xtensa.org
> Cc: Jiri Kosina 
> Cc: David Sterba 
> ---

For

>  drivers/tty/ipwireless/tty.c| 7 +--

Acked-by: David Sterba 

Thanks.


[GIT PULL] Kmap conversions for 5.12, take 2

2021-03-01 Thread David Sterba
Hi,

this pull request contains changes regarding kmap API use and eg.
conversion from kmap_atomic to kmap_local_page.

Changes against v1 [1]:

- dropped patches using zero_user
- retested with my regular fstests-based workloads over the weekend

I'm keeping the original merge changelog as it seems to apply to v2 as
well.

Please pull, thanks.

[1] https://lore.kernel.org/lkml/cover.1614090658.git.dste...@suse.com/

The API belongs to memory management but to save cross-tree dependency
headaches we've agreed to take it through the btrfs tree because there
are some trivial conversions possible, while the rest will need some
time and getting the easy cases out of the way would be convenient.

The final patchset arrived shortly before merge window, which is not
perfect, but given that it's straightforward I don't think it's too
risky.

I've added it to my for-next branch and it's been in linux-next for more
than a week.  Meanwhile I've been testing it among my regular branches
with additional MM related debugging options.

The changes can be grouped:

- function exports, new helpers

- new VM_BUG_ON for additional verification; it's been discussed if it
  should be VM_BUG_ON or BUG_ON, the former was chosen due to
  performance reasons

- code replaced by relevant helpers


The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:

  Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git 
kmap-conversion-for-5.12

for you to fetch changes up to 80cc83842394e5ad3e93487359106aab3420bcb7:

  btrfs: use copy_highpage() instead of 2 kmaps() (2021-02-26 12:45:15 +0100)


Ira Weiny (6):
  mm/highmem: Lift memcpy_[to|from]_page to core
  mm/highmem: Convert memcpy_[to|from]_page() to kmap_local_page()
  mm/highmem: Introduce memcpy_page(), memmove_page(), and memset_page()
  mm/highmem: Add VM_BUG_ON() to mem*_page() calls
  btrfs: use memcpy_[to|from]_page() and kmap_local_page()
  btrfs: use copy_highpage() instead of 2 kmaps()

 fs/btrfs/compression.c  |  6 ++
 fs/btrfs/lzo.c  |  4 ++--
 fs/btrfs/raid56.c   | 10 +
 fs/btrfs/reflink.c  |  6 +-
 fs/btrfs/send.c |  7 ++-
 fs/btrfs/zlib.c |  5 ++---
 fs/btrfs/zstd.c |  6 ++
 include/linux/highmem.h | 56 +
 lib/iov_iter.c  | 14 -
 9 files changed, 68 insertions(+), 46 deletions(-)


[GIT PULL] Btrfs updates for 5.12-rc2

2021-03-01 Thread David Sterba
From: David Sterba 

Hi,

first batch of fixes that usually arrive during the merge window code
freeze. Regressions and stable material. Please pull, thanks.

Regressions:

- fix deadlock in log sync in zoned mode

- fix bugs in subpage mode still wrongly assuming sectorsize == page
  size

Fixes:

- fix missing kunmap of the Q stripe in RAID6

- block group fixes:
  - fix race between extent freeing/allocation when using bitmaps
  - avoid double put of block group when emptying cluster

- swapfile fixes:
  - fix swapfile writes vs running scrub
  - fix swapfile activation vs snapshot creation

- fix stale data exposure after cloning a hole with NO_HOLES enabled

- remove tree-checker check that does not work in case information from
  other leaves is necessary


The following changes since commit 9d294a685fbcb256ce8c5f7fd88a7596d0f52a8a:

  btrfs: zoned: enable to mount ZONED incompat flag (2021-02-09 02:52:24 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-rc1-tag

for you to fetch changes up to 6e37d245994189ba757df7dc2950a44d31421ac6:

  btrfs: zoned: fix deadlock on log sync (2021-02-22 18:08:48 +0100)


Filipe Manana (4):
  btrfs: avoid checking for RO block group twice during nocow writeback
  btrfs: fix race between writes to swap files and scrub
  btrfs: fix race between swap file activation and snapshot creation
  btrfs: fix stale data exposure after cloning a hole with NO_HOLES enabled

Ira Weiny (1):
  btrfs: fix raid6 qstripe kmap

Johannes Thumshirn (1):
  btrfs: zoned: fix deadlock on log sync

Josef Bacik (2):
  btrfs: tree-checker: do not error out if extent ref hash doesn't match
  btrfs: avoid double put of block group when emptying cluster

Nikolay Borisov (1):
  btrfs: fix race between extent freeing/allocation when using bitmaps

Qu Wenruo (2):
  btrfs: make btrfs_submit_compressed_read() subpage compatible
  btrfs: make check_compressed_csum() to be subpage compatible

 fs/btrfs/block-group.c  | 33 +++-
 fs/btrfs/block-group.h  |  9 +++
 fs/btrfs/compression.c  | 62 +++--
 fs/btrfs/ctree.h|  5 
 fs/btrfs/free-space-cache.c | 14 +-
 fs/btrfs/inode.c| 44 +++-
 fs/btrfs/raid56.c   | 21 ---
 fs/btrfs/reflink.c  | 18 +
 fs/btrfs/scrub.c|  9 ++-
 fs/btrfs/tree-checker.c | 16 +++-
 fs/btrfs/tree-log.c |  3 ---
 11 files changed, 175 insertions(+), 59 deletions(-)


Re: linux-next: manual merge of the akpm-current tree with the btrfs tree

2021-03-01 Thread David Sterba
On Fri, Feb 26, 2021 at 06:16:26AM +0100, Christoph Hellwig wrote:
> On Fri, Feb 26, 2021 at 10:32:50AM +1100, Stephen Rothwell wrote:
> > > >  -  return filemap_read(iocb, to, ret);
> > > >  +  if (iocb->ki_flags & IOCB_NOWAIT)
> > > >  +  iocb->ki_flags |= IOCB_NOIO;
> > > >  +
> > > > -   ret = generic_file_buffered_read(iocb, to, ret);
> > > > ++  ret = filemap_read(iocb, to, ret);
> > > >  +
> > > >  +  if (iocb->ki_flags & IOCB_NOWAIT) {
> > > >  +  iocb->ki_flags &= ~IOCB_NOIO;
> > > >  +  if (ret == 0)
> > > >  +  ret = -EAGAIN;
> > > >  +  }
> > > >  +
> > > >  +  return ret;
> > > >   }
> 
> I think the above code looks completely bogus.  Instead whatever code
> in btrfs hecks for IOCB_NOIO to avoid blocking readahead should also
> check IOCB_NOWAIT.

Thanks for the comment, I've removed the patch from for-next and
notified the authors.


Re: [GIT PULL] Kmap conversions for 5.12

2021-02-26 Thread David Sterba
On Thu, Feb 25, 2021 at 08:32:34AM -0800, Ira Weiny wrote:
> On Thu, Feb 25, 2021 at 02:12:52PM +0100, David Sterba wrote:
> > On Wed, Feb 24, 2021 at 09:59:12AM -0800, Ira Weiny wrote:
> > > On Wed, Feb 24, 2021 at 01:30:49PM +0100, David Sterba wrote:
> > > > On Tue, Feb 23, 2021 at 11:25:06AM -0800, Ira Weiny wrote:
> > > > > On Tue, Feb 23, 2021 at 09:13:42AM -0800, Linus Torvalds wrote:
> > > > > > On Tue, Feb 23, 2021 at 7:03 AM David Sterba  
> > > > > > wrote:
> > > > [...]
> > > > 
> > > > > Sorry.  I will change it.
> > > > 
> > > > Let me know how you want to proceed with the patchset/pull request.
> > > 
> > > To be clear I'd like to just drop the 2 patches which use zero_user() for 
> > > this
> > > merge window.
> > > 
> > > I've already submitted some additional btrfs changes for 5.13[1].  I can 
> > > rework
> > > these zero_user() patches and submit them through Andrew for 5.13 as 
> > > separate
> > > set.  That is what I meant by 'I will change it'.
> > > 
> > > > I
> > > > can play the messenger again but now it seems a round of review is
> > > > needed and with some testing it'll be possible in some -rc. At that
> > > > point you may take the patches via the mm tree, unless Linus is ok with
> > > > a late pull.
> > > 
> > > I'm ok with delaying the memzero_page() change to 5.13.  There are a lot 
> > > of
> > > kmap changes to come.  But I'm trying to do them as smaller series just 
> > > for
> > > this reason.  I don't want valid changes to be denied due to my messing 
> > > up just
> > > a few patches...  :-(  Hopefully you and Linus can forgive me on this one.
> > > 
> > > Is ok to just drop them and merge the rest of this series in 5.12?
> > 
> > Ok, no problem. Please let me know exactly which patches to drop, I'll
> > respin the branch. Thanks.
> 
> Drop These 2:
> 
> [PATCH V2 5/8] iov_iter: Remove memzero_page() in favor of zero_user()
> https://lore.kernel.org/lkml/20210210062221.3023586-6-ira.we...@intel.com/
> 
> [PATCH V2 8/8] btrfs: convert to zero_user()
> https://lore.kernel.org/lkml/20210210062221.3023586-9-ira.we...@intel.com/
> 
> 
> Keep:
> 
> [PATCH V2 1/8] mm/highmem: Lift memcpy_[to|from]_page to core 
> [PATCH V2 2/8] mm/highmem: Convert memcpy_[to|from]_page() to 
> kmap_local_page()
> [PATCH V2 3/8] mm/highmem: Introduce memcpy_page(), memmove_page(), and 
> memset_page()
> [PATCH V2 4/8] mm/highmem: Add VM_BUG_ON() to mem*_page() calls
>   ...
> [PATCH V2 6/8] btrfs: use memcpy_[to|from]_page() and kmap_local_page()
> [PATCH V2 7/8] btrfs: use copy_highpage() instead of 2 kmaps()
>   ...
> 
> I would resend but I'd rather keep the exact commits you had in your testing
> rather than potentially messing up the rebase this late.

Got it, thanks. It's easier for me to delete the patches once I have
them in the branch, that's been updated and now pushed to kernel org
again 
(https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/log/?h=kmap-conversion-for-5.12)

I'll add it to testing branches and let it test over the weekend,
sending the pull request next week.


Re: [GIT PULL] Kmap conversions for 5.12

2021-02-25 Thread David Sterba
On Wed, Feb 24, 2021 at 09:59:12AM -0800, Ira Weiny wrote:
> On Wed, Feb 24, 2021 at 01:30:49PM +0100, David Sterba wrote:
> > On Tue, Feb 23, 2021 at 11:25:06AM -0800, Ira Weiny wrote:
> > > On Tue, Feb 23, 2021 at 09:13:42AM -0800, Linus Torvalds wrote:
> > > > On Tue, Feb 23, 2021 at 7:03 AM David Sterba  wrote:
> > [...]
> > 
> > > Sorry.  I will change it.
> > 
> > Let me know how you want to proceed with the patchset/pull request.
> 
> To be clear I'd like to just drop the 2 patches which use zero_user() for this
> merge window.
> 
> I've already submitted some additional btrfs changes for 5.13[1].  I can 
> rework
> these zero_user() patches and submit them through Andrew for 5.13 as separate
> set.  That is what I meant by 'I will change it'.
> 
> > I
> > can play the messenger again but now it seems a round of review is
> > needed and with some testing it'll be possible in some -rc. At that
> > point you may take the patches via the mm tree, unless Linus is ok with
> > a late pull.
> 
> I'm ok with delaying the memzero_page() change to 5.13.  There are a lot of
> kmap changes to come.  But I'm trying to do them as smaller series just for
> this reason.  I don't want valid changes to be denied due to my messing up 
> just
> a few patches...  :-(  Hopefully you and Linus can forgive me on this one.
> 
> Is ok to just drop them and merge the rest of this series in 5.12?

Ok, no problem. Please let me know exactly which patches to drop, I'll
respin the branch. Thanks.


Re: [PATCH] btrfs: Fixed a brace coding style issue

2021-02-24 Thread David Sterba
On Mon, Feb 15, 2021 at 08:38:20PM +0530, Maheep Kumar Kathuria wrote:
> Fixed a coding style issue in thresh_exec_hook()
> 
> Signed-off-by: Maheep Kumar Kathuria 
> ---
>  fs/btrfs/async-thread.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
> index 309516e6a968..38abeff7af69 100644
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -212,9 +212,8 @@ static inline void thresh_exec_hook(struct 
> __btrfs_workqueue *wq)
>  out:
>   spin_unlock(&wq->thres_lock);
>  
> - if (need_change) {
> + if (need_change)
>   workqueue_set_max_active(wq->normal_wq, wq->current_active);
> - }

This is really a trivial change, have you checked if there are more?
Fixing them in a larger batch would be better than one by one.


Re: [PATCH AUTOSEL 5.11 54/67] btrfs: make btrfs_start_delalloc_root's nr argument a long

2021-02-24 Thread David Sterba
On Wed, Feb 24, 2021 at 07:50:12AM -0500, Sasha Levin wrote:
> From: Nikolay Borisov 
> 
> [ Upstream commit 9db4dc241e87fccd8301357d5ef908f40b50f2e3 ]
> 
> It's currently u64 which gets instantly translated either to LONG_MAX
> (if U64_MAX is passed) or cast to an unsigned long (which is in fact,
> wrong because writeback_control::nr_to_write is a signed, long type).
> 
> Just convert the function's argument to be long time which obviates the
> need to manually convert u64 value to a long. Adjust all call sites
> which pass U64_MAX to pass LONG_MAX. Finally ensure that in
> shrink_delalloc the u64 is converted to a long without overflowing,
> resulting in a negative number.

This patch is a cleanup and I don't see any other patch depend on it, so
please drop it from autosel.


Re: [PATCH AUTOSEL 5.11 55/67] btrfs: only let one thread pre-flush delayed refs in commit

2021-02-24 Thread David Sterba
On Wed, Feb 24, 2021 at 07:50:13AM -0500, Sasha Levin wrote:
> From: Josef Bacik 
> 
> [ Upstream commit e19eb11f4f3d3b0463cd897016064a79cb6d8c6d ]
> 
> I've been running a stress test that runs 20 workers in their own
> subvolume, which are running an fsstress instance with 4 threads per
> worker, which is 80 total fsstress threads.  In addition to this I'm
> running balance in the background as well as creating and deleting
> snapshots.  This test takes around 12 hours to run normally, going
> slower and slower as the test goes on.
> 
> The reason for this is because fsstress is running fsync sometimes, and
> because we're messing with block groups we often fall through to
> btrfs_commit_transaction, so will often have 20-30 threads all calling
> btrfs_commit_transaction at the same time.
> 
> These all get stuck contending on the extent tree while they try to run
> delayed refs during the initial part of the commit.
> 
> This is suboptimal, really because the extent tree is a single point of
> failure we only want one thread acting on that tree at once to reduce
> lock contention.
> 
> Fix this by making the flushing mechanism a bit operation, to make it
> easy to use test_and_set_bit() in order to make sure only one task does
> this initial flush.
> 
> Once we're into the transaction commit we only have one thread doing
> delayed ref running, it's just this initial pre-flush that is
> problematic.  With this patch my stress test takes around 90 minutes to
> run, instead of 12 hours.
> 
> The memory barrier is not necessary for the flushing bit as it's
> ordered, unlike plain int. The transaction state accessed in
> btrfs_should_end_transaction could be affected by that too as it's not
> always used under transaction lock. Upon Nikolay's analysis in [1]
> it's not necessary:
> 
>   In should_end_transaction it's read without holding any locks. (U)
> 
>   It's modified in btrfs_cleanup_transaction without holding the
>   fs_info->trans_lock (U), but the STATE_ERROR flag is going to be set.
> 
>   set in cleanup_transaction under fs_info->trans_lock (L)
>   set in btrfs_commit_trans to COMMIT_START under fs_info->trans_lock.(L)
>   set in btrfs_commit_trans to COMMIT_DOING under fs_info->trans_lock.(L)
>   set in btrfs_commit_trans to COMMIT_UNBLOCK under
>   fs_info->trans_lock.(L)
> 
>   set in btrfs_commit_trans to COMMIT_COMPLETED without locks but at this
>   point the transaction is finished and fs_info->running_trans is NULL (U
>   but irrelevant).
> 
>   So by the looks of it we can have a concurrent READ race with a WRITE,
>   due to reads not taking a lock. In this case what we want to ensure is
>   we either see new or old state. I consulted with Will Deacon and he said
>   that in such a case we'd want to annotate the accesses to ->state with
>   (READ|WRITE)_ONCE so as to avoid a theoretical tear, in this case I
>   don't think this could happen but I imagine at some point KCSAN would
>   flag such an access as racy (which it is).
> 
> [1] 
> https://lore.kernel.org/linux-btrfs/e1fd5cc1-0f28-f670-69f4-e9958b496...@suse.com
> 
> Reviewed-by: Nikolay Borisov 
> Signed-off-by: Josef Bacik 
> [ add comments regarding memory barrier ]
> Signed-off-by: David Sterba 
> Signed-off-by: Sasha Levin 

Please drop this patch from autosel queue, it's part of a larger series
that reworks flushing and is not a standalone fix.


Re: [GIT PULL] Kmap conversions for 5.12

2021-02-24 Thread David Sterba
On Tue, Feb 23, 2021 at 11:25:06AM -0800, Ira Weiny wrote:
> On Tue, Feb 23, 2021 at 09:13:42AM -0800, Linus Torvalds wrote:
> > On Tue, Feb 23, 2021 at 7:03 AM David Sterba  wrote:
[...]

> Sorry.  I will change it.

Let me know how you want to proceed with the patchset/pull request. I
can play the messenger again but now it seems a round of review is
needed and with some testing it'll be possible in some -rc. At that
point you may take the patches via the mm tree, unless Linus is ok with
a late pull.


[GIT PULL] Kmap conversions for 5.12

2021-02-23 Thread David Sterba
From: David Sterba 

Hi,

this pull request contains changes regarding kmap API use and eg.
conversion from kmap_atomic to kmap_local_page.

The API belongs to memory management but to save cross-tree dependency
headaches we've agreed to take it through the btrfs tree because there
are some trivial conversions possible, while the rest will need some
time and getting the easy cases out of the way would be convenient.

The final patchset arrived shortly before merge window, which is not
perfect, but given that it's straightforward I don't think it's too
risky.

I've added it to my for-next branch and it's been in linux-next for more
than a week.  Meanwhile I've been testing it among my regular branches
with additional MM related debugging options.

The changes can be grouped:

- function exports, new helpers

- new VM_BUG_ON for additional verification; it's been discussed if it
  should be VM_BUG_ON or BUG_ON, the former was chosen due to
  performance reasons

- code replaced by relevant helpers

Please pull, thanks.


The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:

  Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git 
kmap-conversion-for-5.12-tag

for you to fetch changes up to bbc24c42f2c0ea037db3c7f319c860fd790aeb28:

  btrfs: convert to zero_user() (2021-02-11 20:18:25 +0100)


Ira Weiny (8):
  mm/highmem: Lift memcpy_[to|from]_page to core
  mm/highmem: Convert memcpy_[to|from]_page() to kmap_local_page()
  mm/highmem: Introduce memcpy_page(), memmove_page(), and memset_page()
  mm/highmem: Add VM_BUG_ON() to mem*_page() calls
  iov_iter: Remove memzero_page() in favor of zero_user()
  btrfs: use memcpy_[to|from]_page() and kmap_local_page()
  btrfs: use copy_highpage() instead of 2 kmaps()
  btrfs: convert to zero_user()

 fs/btrfs/compression.c  | 11 +++---
 fs/btrfs/extent_io.c| 22 ---
 fs/btrfs/inode.c| 32 
 fs/btrfs/lzo.c  |  4 ++--
 fs/btrfs/raid56.c   | 10 +
 fs/btrfs/reflink.c  | 12 ++-
 fs/btrfs/send.c |  7 ++-
 fs/btrfs/zlib.c | 10 +++--
 fs/btrfs/zstd.c | 11 +++---
 include/linux/highmem.h | 56 +
 lib/iov_iter.c  | 26 +++
 11 files changed, 88 insertions(+), 113 deletions(-)


Re: [PATCH] btrfs: ref-verify: use 'inline void' keyword ordering

2021-02-22 Thread David Sterba
On Thu, Feb 18, 2021 at 10:54:17PM -0800, Randy Dunlap wrote:
> Fix build warnings of function signature when CONFIG_STACKTRACE is not
> enabled by reordering the 'inline' and 'void' keywords.
> 
> ../fs/btrfs/ref-verify.c:221:1: warning: ‘inline’ is not at beginning of 
> declaration [-Wold-style-declaration]
>  static void inline __save_stack_trace(struct ref_action *ra)
> ../fs/btrfs/ref-verify.c:225:1: warning: ‘inline’ is not at beginning of 
> declaration [-Wold-style-declaration]
>  static void inline __print_stack_trace(struct btrfs_fs_info *fs_info,
> 
> Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool")
> Signed-off-by: Randy Dunlap 
> Cc: Josef Bacik 
> Cc: David Sterba 
> Cc: Chris Mason 
> Cc: linux-bt...@vger.kernel.org
> Cc: Andrew Morton 

Added to misc-next thanks.


Re: [PATCH 4/7] fsdax: Replace mmap entry in case of CoW

2021-02-16 Thread David Sterba
On Mon, Feb 08, 2021 at 01:09:21AM +0800, Shiyang Ruan wrote:
> We replace the existing entry to the newly allocated one
> in case of CoW. Also, we mark the entry as PAGECACHE_TAG_TOWRITE
> so writeback marks this entry as writeprotected. This
> helps us snapshots so new write pagefaults after snapshots
> trigger a CoW.
> 
> Signed-off-by: Goldwyn Rodrigues 
> Signed-off-by: Shiyang Ruan 
> ---
>  fs/dax.c | 31 +++
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index b2195cbdf2dc..29698a3d2e37 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -722,6 +722,9 @@ static int copy_cow_page_dax(struct block_device *bdev, 
> struct dax_device *dax_d
>   return 0;
>  }
>  
> +#define DAX_IF_DIRTY (1ULL << 0)
> +#define DAX_IF_COW   (1ULL << 1)

The constants are ULL, but I see other flags only 'unsigned long'

> +
>  /*
>   * By this point grab_mapping_entry() has ensured that we have a locked entry
>   * of the appropriate size so we don't have to worry about downgrading PMDs 
> to
> @@ -731,14 +734,16 @@ static int copy_cow_page_dax(struct block_device *bdev, 
> struct dax_device *dax_d
>   */
>  static void *dax_insert_entry(struct xa_state *xas,
>   struct address_space *mapping, struct vm_fault *vmf,
> - void *entry, pfn_t pfn, unsigned long flags, bool dirty)
> + void *entry, pfn_t pfn, unsigned long flags, bool insert_flags)

insert_flags is bool

>  {
>   void *new_entry = dax_make_entry(pfn, flags);
> + bool dirty = insert_flags & DAX_IF_DIRTY;

"insert_flags & DAX_IF_DIRTY" is "bool & ULL", this can't be right

> + bool cow = insert_flags & DAX_IF_COW;

Same

>  
>   if (dirty)
>   __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
>  
> - if (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE)) {
> + if (cow || (dax_is_zero_entry(entry) && !(flags & DAX_ZERO_PAGE))) {
>   unsigned long index = xas->xa_index;
>   /* we are replacing a zero page with block mapping */
>   if (dax_is_pmd_entry(entry))
> @@ -750,7 +755,7 @@ static void *dax_insert_entry(struct xa_state *xas,
>  
>   xas_reset(xas);
>   xas_lock_irq(xas);
> - if (dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
> + if (cow || dax_is_zero_entry(entry) || dax_is_empty_entry(entry)) {
>   void *old;
>  
>   dax_disassociate_entry(entry, mapping, false);
> @@ -774,6 +779,9 @@ static void *dax_insert_entry(struct xa_state *xas,
>   if (dirty)
>   xas_set_mark(xas, PAGECACHE_TAG_DIRTY);
>  
> + if (cow)
> + xas_set_mark(xas, PAGECACHE_TAG_TOWRITE);
> +
>   xas_unlock_irq(xas);
>   return entry;
>  }
> @@ -1319,6 +1327,7 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault 
> *vmf, pfn_t *pfnp,
>   void *entry;
>   pfn_t pfn;
>   void *kaddr;
> + unsigned long insert_flags = 0;
>  
>   trace_dax_pte_fault(inode, vmf, ret);
>   /*
> @@ -1444,8 +1453,10 @@ static vm_fault_t dax_iomap_pte_fault(struct vm_fault 
> *vmf, pfn_t *pfnp,
>  
>   goto finish_iomap;
>   case IOMAP_UNWRITTEN:
> - if (write && iomap.flags & IOMAP_F_SHARED)
> + if (write && (iomap.flags & IOMAP_F_SHARED)) {
> + insert_flags |= DAX_IF_COW;

Here's an example of 'unsigned long = unsigned long long', though it'll
work, it would be better to unify all the types.

>   goto cow;
> + }
>   fallthrough;
>   case IOMAP_HOLE:
>   if (!write) {
> @@ -1555,6 +1566,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault 
> *vmf, pfn_t *pfnp,
>   int error;
>   pfn_t pfn;
>   void *kaddr;
> + unsigned long insert_flags = 0;
>  
>   /*
>* Check whether offset isn't beyond end of file now. Caller is
> @@ -1670,14 +1682,17 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault 
> *vmf, pfn_t *pfnp,
>   result = vmf_insert_pfn_pmd(vmf, pfn, write);
>   break;
>   case IOMAP_UNWRITTEN:
> - if (write && iomap.flags & IOMAP_F_SHARED)
> + if (write && (iomap.flags & IOMAP_F_SHARED)) {
> + insert_flags |= DAX_IF_COW;
>   goto cow;
> + }
>   fallthrough;
>   case IOMAP_HOLE:
> - if (WARN_ON_ONCE(write))
> + if (!write) {
> + result = dax_pmd_load_hole(&xas, vmf, &iomap, &entry);
>   break;
> - result = dax_pmd_load_hole(&xas, vmf, &iomap, &entry);
> - break;
> + }
> + fallthrough;
>   default:
>   WARN_ON_ONCE(1);
>   break;
> -- 
> 2.30.0
> 
> 


[GIT PULL] Btrfs updates for 5.12

2021-02-16 Thread David Sterba
Hi,

this update brings updates of space handling, performance improvements
or bug fixes. The subpage block size and zoned mode features have
reached state where they're usable but with limitations.

The branch merges cleanly on top of current master, there are some minor
conflicts reported by linux-next in iov_iter or kmap conversion.

Please pull, thanks.

- Performance or related

  - do not block on deleted block group mutex in the cleaner, avoids
some long stalls

  - improved flushing: make it work better with ticket space
reservations and avoid excessive transaction commits in some
scenarios, slightly improves throughput for random write load

  - preemptive background flushing: separate the logic from ticket
reservations, improve the accounting and decisions when to flush in
low space conditions

  - less lock contention related to running delayed refs, let just one
thread do the flushing when there are many inside transaction commit

  - dbench workload improvements: avoid unnecessary work when logging
inodes, fewer fallbacks to transaction commit and thus less waiting
for it (+7% throughput, -20% latency)

- Core

  - subpage block size
- currently read-only support
- refactor and generalize code where sectorsize is assumed to be
  page size, add the subpage handling everywhere
- the read-write support is on the way, page sizes are still limited
  to 4K or 64K

  - zoned mode, first working version but with limitations
- SMR/ZBC/ZNS friendly allocation mode, utilizing the "no fixed
  location for structures" and chunked allocation
- superblock as the only fixed data structure needs special
  handling, uses 2 consecutive zones as a ring buffer
- tree-log support with a dedicated block group to avoid unordered
  writes
- emulated zones on non-zoned devices
- not yet working
  - all non-single block group profiles, requires more
zone write pointer synchronization between the multiple block
groups
  - fitrim due to dependency on space cache, can be implemented

- Fixes

  - ref-verify: proper tree owner and node level tracking
  - fix pinned byte accounting, causing some early ENOSPC now more
likely due to other changes in delayed refs

- Other

  - error handling fixes and improvements
  - more error injection points
  - more function documentation
  - more and updated tracepoints
  - subset of W=1 checked by default
  - update comments to allow more automatic kdoc parameter checks


The following changes since commit e0756cfc7d7cd08c98a53b6009c091a3f6a50be6:

  Merge tag 'trace-v5.11-rc7' of 
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace (2021-02-08 
11:32:39 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.12-tag

for you to fetch changes up to 9d294a685fbcb256ce8c5f7fd88a7596d0f52a8a:

  btrfs: zoned: enable to mount ZONED incompat flag (2021-02-09 02:52:24 +0100)


Abaci Team (1):
  btrfs: simplify condition in __btrfs_run_delayed_items

Filipe Manana (10):
  btrfs: send: remove stale code when checking for shared extents
  btrfs: remove wrong comment for can_nocow_extent()
  btrfs: remove unnecessary directory inode item update when deleting dir 
entry
  btrfs: stop setting nbytes when filling inode item for logging
  btrfs: avoid logging new ancestor inodes when logging new inode
  btrfs: skip logging directories already logged when logging all parents
  btrfs: skip logging inodes already logged when logging new entries
  btrfs: remove unnecessary check_parent_dirs_for_sync()
  btrfs: make concurrent fsyncs wait less when waiting for a transaction 
commit
  btrfs: fix extent buffer leak on failure to copy root

Johannes Thumshirn (7):
  block: add bio_add_zone_append_page
  btrfs: release path before calling to btrfs_load_block_group_zone_info
  btrfs: zoned: do not load fs_info::zoned from incompat flag
  btrfs: zoned: allow zoned filesystems on non-zoned block devices
  btrfs: zoned: check if bio spans across an ordered extent
  btrfs: zoned: cache if block group is on a sequential zone
  btrfs: save irq flags when looking up an ordered extent

Josef Bacik (34):
  btrfs: fix error handling in commit_fs_roots
  btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block
  btrfs: noinline btrfs_should_cancel_balance
  btrfs: ref-verify: pass down tree block level when building refs
  btrfs: ref-verify: make sure owner is set for all refs
  btrfs: keep track of the root owner for relocation reads
  btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node
  btrfs: handle space_info::total_bytes_pinned inside the delayed ref itself
  btrfs: account 

[GIT PULL] AFFS fix for 5.12

2021-02-15 Thread David Sterba
Hi,

there's one minor fix for error handling in rename exchange. Please
pull, thanks.


The following changes since commit 92bf22614b21a2706f4993b278017e437f7785b3:

  Linux 5.11-rc7 (2021-02-07 13:57:38 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git 
affs-for-5.12-tag

for you to fetch changes up to 70779b897395b330ba5a47bed84f94178da599f9:

  fs/affs: release old buffer head on error path (2021-02-09 17:11:03 +0100)


Pan Bian (1):
  fs/affs: release old buffer head on error path

 fs/affs/namei.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


[GIT PULL] Btrfs fix for 5.11-rc8

2021-02-13 Thread David Sterba
Hi,

a regression fix caused by a refactoring in 5.11. A corrupted superblock
wouldn't be detected by checksum verification due to wrongly placed
initialization of the checksum length, thus making memcmp always work.

I've verified it manually and ran other test suites before sending this.
Please pull, thanks.


The following changes since commit 9ad6d91f056b99dbe59a262810cb342519ea8d39:

  btrfs: fix log replay failure due to race with space cache rebuild 
(2021-01-25 18:44:53 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-rc7-tag

for you to fetch changes up to 83c68bbcb6ac2dbbcaf12e2281a29a9f73b97d0f:

  btrfs: initialize fs_info::csum_size earlier in open_ctree (2021-02-12 
14:48:24 +0100)


Su Yue (1):
  btrfs: initialize fs_info::csum_size earlier in open_ctree

 fs/btrfs/disk-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


Re: [PATCH V2 0/8] btrfs: convert kmaps to core page calls

2021-02-11 Thread David Sterba
On Tue, Feb 09, 2021 at 10:22:13PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Changes from V1:
>   Rework commit messages because they were very weak
>   Change 'fs/btrfs: X' to 'btrfs: x'
>   https://lore.kernel.org/lkml/20210209151442.gu1...@suse.cz/
>   Per Andrew
>   Split out changes to highmem.h
>   The addition memcpy, memmove, and memset page functions
>   The change from kmap_atomic to kmap_local_page
>   The addition of BUG_ON
>   The conversion of memzero_page to zero_user in iov_iter
>   Change BUG_ON to VM_BUG_ON
>   While we are refactoring adjust the line length down per Chaitany
> 
> 
> There are many places where kmap//kunmap patterns occur.  We lift a
> couple of these patterns to the core common functions and use them as well as
> existing core functions in the btrfs file system.  At the same time we convert
> those core functions to use kmap_local_page() which is more efficient in those
> calls.
> 
> Per the conversation on V1 it looks like Andrew would like this to go through
> the btrfs tree.  I think that is fine.  The other users of
> memcpy_[to|from]_page are probably not ready and I believe could be taken in 
> an
> early rc after David submits.
> 
> Is that ok with you David?

Yes.

The branch is now in
https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/log/?h=kmap-conversion
let me know if I've missed acked-by or reviewed-by, I added those sent
to the mailinglist and added mine to the btrfs ones and to the iov_iter
patch.

I'll add the patchset to my for-next so it gets picked by linux-next and
will keep testing it for at least a week.

Though this is less than the expected time before merge window, the
reasoning is that it's exporting helpers that are going to be used in
various subsystems. The changes in btrfs are simple and would allow to
focus on the other less trivial conversions. ETA for the pull request is
mid of the 2nd week of the merge window or after rc1.


Re: [PATCH V2 4/8] mm/highmem: Add VM_BUG_ON() to mem*_page() calls

2021-02-11 Thread David Sterba
On Wed, Feb 10, 2021 at 01:22:28PM -0800, Ira Weiny wrote:
> On Wed, Feb 10, 2021 at 06:56:06PM +, Matthew Wilcox wrote:
> > On Wed, Feb 10, 2021 at 08:29:01AM -0800, Ira Weiny wrote:
> > > And I thought it was a good idea.  Any file system development should have
> > > tests with DEBUG_VM which should cover Matthew's concern while not having 
> > > the
> > > overhead in production.  Seemed like a decent compromise?
> > 
> > Why do you think these paths are only used during file system development?
> 
> I can't guarantee it but right now most of the conversions I have worked on 
> are
> in FS's.
> 
> > They're definitely used by networking, by device drivers of all kinds
> > and they're probably even used by the graphics system.
> > 
> > While developers *should* turn on DEBUG_VM during development, a
> > shockingly high percentage don't even turn on lockdep.
> 
> Honestly, I don't feel strongly enough to argue it.

I checked my devel config and I don't have DEBUG_VM enabled, while I
have a bunch of other debugging options related to locking or other
fine-grained sanity checks. The help text is not very specific what
exactly is being checked other that it hurts performance, so I read it
as that it's for MM developers that change the MM code, while in
filesystem we use the APIs.

However, for the this patchset I'll turn it on all testing instances of
course.

> Andrew?  David?  David this is going through your tree so would you feel more
> comfortable with 1 or the other?

I think it's a question for MM people, for now I assume it's supposed to
be VM_BUG_ON.


Re: [PATCH] fs/affs: release old buffer head on error path

2021-02-09 Thread David Sterba
On Wed, Jan 20, 2021 at 12:51:13AM -0800, Pan Bian wrote:
> The reference count of the old buffer head should be decremented on path
> that fails to get the new buffer head.
> 
> Fixes: 6b4657667ba0 ("fs/affs: add rename exchange")
> Signed-off-by: Pan Bian 

Thanks, added to affs branch. The fix is not that urgent for 5.11 so
I'll send it for the 5.12 merge window. I've added the stable tag so
it'll propagate to 4.14+.


Re: [PATCH 2/4] fs/btrfs: Use memcpy_[to|from]_page()

2021-02-09 Thread David Sterba
On Fri, Feb 05, 2021 at 03:23:02PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 

There should be a short explanation what the patch does, eg.
"use helper for open coded kmap_atomic/memcpy/kunmap_atomic",
although I see there are conversions kmap_atomic -> kmap_local not in
the coccinelle script, probably done manually. This should be mentioned
too.

Also please use "btrfs:" followed by lowercase in the subject.


Re: [PATCH 0/4] btrfs: Convert kmaps to core page calls

2021-02-09 Thread David Sterba
On Fri, Feb 05, 2021 at 03:23:00PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> There are many places where kmap//kunmap patterns occur.  We lift
> these various patterns to core common functions and use them in the btrfs file
> system.  At the same time we convert those core functions to use
> kmap_local_page() which is more efficient in those calls.
> 
> I think this is best accepted through Andrew's tree as it has the mem*_page
> functions in it.  But I'd like to get an ack from David or one of the other
> btrfs maintainers before the btrfs patches go through.

I'd rather take the non-mm patches through my tree so it gets tested
the same way as other btrfs changes, straightforward cleanups or not.

This brings the question how to do that as the first patch should go
through the MM tree. One option is to posptpone the actual cleanups
after the 1st patch is merged but this could take a long delay.

I'd suggest to take the 1st patch within MM tree in the upcoming merge
window and then I can prepare a separate pull with just the cleanups.
Removing an inter-tree patch dependency was a sufficient reason for
Linus in the past for such pull requests.

> There are a lot more kmap->kmap_local_page() conversions but kmap_local_page()
> requires some care with the unmapping order and so I'm still reviewing those
> changes because btrfs uses a lot of loops for it's kmaps.

It sounds to me that converting the kmaps will take some time anyway so
exporting the helpers first and then converting the subsystems might be
a good option. In case you'd like to get rid of the simple cases in
btrfs code now we can do the 2 pull requests.


Re: [PATCH 0/3] fs/efs: Follow kernel style guide

2021-02-07 Thread David Sterba
On Fri, Feb 05, 2021 at 11:37:46PM +0100, Richard Weinberger wrote:
> On Fri, Feb 5, 2021 at 11:26 PM Amy Parker  wrote:
> >
> > On Fri, Feb 5, 2021 at 5:1 AM David Sterba  wrote:
> > >
> > > On Thu, Feb 04, 2021 at 08:52:14PM -0800, Amy Parker wrote:
> > > > As the EFS driver is old and non-maintained,
> > >
> > > Is anybody using EFS on current kernels? There's not much point updating
> > > it to current coding style, deleting fs/efs is probably the best option.
> > >
> >
> > Wouldn't be surprised if there's a few systems out there that haven't
> > migrated at all.
> 
> Before ripping it from the kernel source you could do a FUSE port of EFS.
> That way old filesystems can still get used on Linux.

Agreed, replacing the obsolete filesystems by FUSE implementations would
be great. Regarding EFS I got pointed to
https://github.com/sophaskins/efs2tar that allows to access the old IRIX
CDs (can be found on archive.org), so there's something.


Re: [PATCH] ia64: Fix style guide breakage

2021-02-07 Thread David Sterba
On Fri, Feb 05, 2021 at 02:06:18PM -0800, Amy Parker wrote:
> Some statements do not have proper spacing between their C
> keywords (commonly if and for) throughout files in the ia64 tree.
> This patch corrects this to follow the kernel code style guide.
> 
> Signed-off-by: Amy Parker 
> ---
>  arch/ia64/hp/common/sba_iommu.c  | 6 +++---
>  arch/ia64/kernel/machine_kexec.c | 2 +-
>  arch/ia64/kernel/palinfo.c   | 6 +++---

ia64 got orphaned and not maintained in 96ec72a3425d1515b6, it's just
not really worth the time to spend the time cleaning up the code base.


Re: [PATCH] fs/btrfs: Fix raid6 qstripe kmap'ing

2021-02-05 Thread David Sterba
On Thu, Feb 04, 2021 at 07:52:36PM -0800, Ira Weiny wrote:
> On Thu, Feb 04, 2021 at 04:26:08PM +0100, David Sterba wrote:
> > On Wed, Feb 03, 2021 at 04:56:48PM +0100, David Sterba wrote:
> > > On Wed, Jan 27, 2021 at 10:15:03PM -0800, ira.we...@intel.com wrote:
> > > > From: Ira Weiny 
> > > Changelog is good, thanks. I've added stable tags as the missing unmap
> > > is a potential problem.
> > 
> > There are lots of tests faling, stack traces like below. I haven't seen
> > anything obvious in the patch so that needs a closer look and for the
> > time being I can't add the patch to for-next.
> 
> :-(
> 
> I think I may have been off by 1 on the raid6 kmap...
> 
> Something like this should fix it...
> 
> diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
> index b8a39dad0f00..dbf52f1a379d 100644
> --- a/fs/btrfs/raid56.c
> +++ b/fs/btrfs/raid56.c
> @@ -2370,7 +2370,7 @@ static noinline void finish_parity_scrub(struct 
> btrfs_raid_bio *rbio,
> goto cleanup;
> }
> SetPageUptodate(q_page);
> -   pointers[rbio->real_stripes] = kmap(q_page);
> +   pointers[rbio->real_stripes - 1] = kmap(q_page);

Oh right and tests agree it works.

> }
>  
> atomic_set(&rbio->error, 0);
> 
> Let me roll a new version.

No need to, I'll fold the fixup. Thanks.


Re: [PATCH 0/3] fs/efs: Follow kernel style guide

2021-02-05 Thread David Sterba
On Thu, Feb 04, 2021 at 08:52:14PM -0800, Amy Parker wrote:
> As the EFS driver is old and non-maintained,

Is anybody using EFS on current kernels? There's not much point updating
it to current coding style, deleting fs/efs is probably the best option.

The EFS name is common for several filesystems, not to be confused with
eg.  Encrypted File System. In linux it's the IRIX version, check
Kconfig, and you could hardly find the utilities to create such
filesystem.


Re: [PATCH] fs/btrfs: Fix raid6 qstripe kmap'ing

2021-02-04 Thread David Sterba
On Wed, Feb 03, 2021 at 04:56:48PM +0100, David Sterba wrote:
> On Wed, Jan 27, 2021 at 10:15:03PM -0800, ira.we...@intel.com wrote:
> > From: Ira Weiny 
> > 
> > When a qstripe is required an extra page is allocated and mapped.  There
> > were 3 problems.
> > 
> > 1) There is no reason to map the qstripe page more than 1 time if the
> >number of bits set in rbio->dbitmap is greater than one.
> > 2) There is no reason to map the parity page and unmap it each time
> >through the loop.
> > 3) There is no corresponding call of kunmap() for the qstripe page.
> > 
> > The page memory can continue to be reused with a single mapping on each
> > iteration by raid6_call.gen_syndrome() without remapping.  So map the
> > page for the duration of the loop.
> > 
> > Similarly, improve the algorithm by mapping the parity page just 1 time.
> > 
> > Fixes: 5a6ac9eacb49 ("Btrfs, raid56: support parity scrub on raid56")
> > To: Chris Mason 
> > To: Josef Bacik 
> > To: David Sterba 
> > Cc: Miao Xie 
> > Signed-off-by: Ira Weiny 
> > 
> > ---
> > This was found while replacing kmap() with kmap_local_page().  After
> > this patch unwinding all the mappings becomes pretty straight forward.
> > 
> > I'm not exactly sure I've worded this commit message intelligently.
> > Please forgive me if there is a better way to word it.
> 
> Changelog is good, thanks. I've added stable tags as the missing unmap
> is a potential problem.

There are lots of tests faling, stack traces like below. I haven't seen
anything obvious in the patch so that needs a closer look and for the
time being I can't add the patch to for-next.

 BUG: kernel NULL pointer dereference, address:
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 2 PID: 17173 Comm: kworker/u8:5 Not tainted5.11.0-rc6-default+ #1422
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS 
rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
 Workqueue: btrfs-rmw btrfs_work_helper [btrfs]
 RIP: 0010:raid6_avx22_gen_syndrome+0x103/0x140 [raid6_pq]
 RSP: 0018:a090042cfcf8 EFLAGS: 00010246
 RAX: 9e98e1848e80 RBX: 9e98d5849000 RCX:0020
 RDX: 9e98e32be000 RSI:  RDI:9e98e1848e80
 RBP:  R08:  R09:0001
 R10: 9e98e1848e90 R11: 9e98e1848e98 R12:1000
 R13: 9e98e1848e88 R14: 0005 R15:0002
 FS:  () GS:9e993da0()knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2:  CR3: 23143003 CR4:00170ea0
 Call Trace:
  finish_parity_scrub+0x47b/0x7a0 [btrfs]
  raid56_parity_scrub_stripe+0x24e/0x260 [btrfs]
  btrfs_work_helper+0xd5/0x1d0 [btrfs]
  process_one_work+0x262/0x5f0
  worker_thread+0x4e/0x300
  ? process_one_work+0x5f0/0x5f0
  kthread+0x151/0x170
  ? __kthread_bind_mask+0x60/0x60


Re: [PATCH] fs/btrfs: Fix raid6 qstripe kmap'ing

2021-02-03 Thread David Sterba
On Wed, Jan 27, 2021 at 10:15:03PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> When a qstripe is required an extra page is allocated and mapped.  There
> were 3 problems.
> 
> 1) There is no reason to map the qstripe page more than 1 time if the
>number of bits set in rbio->dbitmap is greater than one.
> 2) There is no reason to map the parity page and unmap it each time
>through the loop.
> 3) There is no corresponding call of kunmap() for the qstripe page.
> 
> The page memory can continue to be reused with a single mapping on each
> iteration by raid6_call.gen_syndrome() without remapping.  So map the
> page for the duration of the loop.
> 
> Similarly, improve the algorithm by mapping the parity page just 1 time.
> 
> Fixes: 5a6ac9eacb49 ("Btrfs, raid56: support parity scrub on raid56")
> To: Chris Mason 
> To: Josef Bacik 
> To: David Sterba 
> Cc: Miao Xie 
> Signed-off-by: Ira Weiny 
> 
> ---
> This was found while replacing kmap() with kmap_local_page().  After
> this patch unwinding all the mappings becomes pretty straight forward.
> 
> I'm not exactly sure I've worded this commit message intelligently.
> Please forgive me if there is a better way to word it.

Changelog is good, thanks. I've added stable tags as the missing unmap
is a potential problem.


[GIT PULL] Btrfs fixes for 5.11-rc6

2021-01-29 Thread David Sterba
Hi,

I'm not sure the first post of this pull request made it through so
sending again.

A few more fixes for a late rc:

- fix lockdep complaint on 32bit arches and also remove an unsafe memory
  use due to device vs filesystem lifetime

- two fixes for free space tree
  - race during log replay and cache rebuild, now more likely to happen
due to changes in this dev cycle
  - possible free space tree corruption with online conversion during
initial tree population

Please pull, thanks.


The following changes since commit 34d1eb0e599875064955a74712f08ff14c8e3d5f:

  btrfs: don't clear ret in btrfs_start_dirty_block_groups (2021-01-18 16:00:11 
+0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-rc5-tag

for you to fetch changes up to 9ad6d91f056b99dbe59a262810cb342519ea8d39:

  btrfs: fix log replay failure due to race with space cache rebuild 
(2021-01-25 18:44:53 +0100)


Filipe Manana (1):
  btrfs: fix log replay failure due to race with space cache rebuild

Josef Bacik (1):
  btrfs: fix possible free space tree corruption with online conversion

Su Yue (1):
  btrfs: fix lockdep warning due to seqcount_mutex on 32bit arch

 fs/btrfs/block-group.c | 10 +++-
 fs/btrfs/ctree.h   |  3 +++
 fs/btrfs/extent-tree.c | 61 ++
 fs/btrfs/free-space-tree.c | 10 +++-
 fs/btrfs/volumes.c |  2 +-
 fs/btrfs/volumes.h | 11 +
 6 files changed, 46 insertions(+), 51 deletions(-)


Re: [PATCH v2] btrfs: Avoid calling btrfs_get_chunk_map() twice

2021-01-29 Thread David Sterba
On Fri, Jan 29, 2021 at 07:02:41PM +, Michal Rostecki wrote:
> On Fri, Jan 29, 2021 at 06:47:53PM +0100, David Sterba wrote:
> > On Fri, Jan 29, 2021 at 05:15:21PM +, Michal Rostecki wrote:
> > > On Fri, Jan 29, 2021 at 11:22:48AM -0500, Josef Bacik wrote:
> > > > On 1/27/21 8:57 AM, Michal Rostecki wrote:
> > > > it happens when you run btrfs/060.  Please make sure to run xfstests 
> > > > against
> > > > patches before you submit them upstream.  Thanks,
> > > > 
> > > > Josef
> > > 
> > > Umm... I ran the xftests against v1 patch and didn't get that panic.
> > 
> > It could have been caused by my fixups to v2 and I can reproduce the
> > crash now too. I can't see any difference besides the u64/int switch and
> > the 'goto out' removal in btrfs_bio_fits_in_stripe.
> 
> I was able to fix the issue by the following diff. I will send it as the
> patch after confirming that all fstests are passing.

Thanks, can't reproduce the crash with that at least on test btrfs/060.


Re: [PATCH v2] btrfs: Avoid calling btrfs_get_chunk_map() twice

2021-01-29 Thread David Sterba
On Fri, Jan 29, 2021 at 05:15:21PM +, Michal Rostecki wrote:
> On Fri, Jan 29, 2021 at 11:22:48AM -0500, Josef Bacik wrote:
> > On 1/27/21 8:57 AM, Michal Rostecki wrote:
> > > From: Michal Rostecki 
> > > 
> > > Before this change, the btrfs_get_io_geometry() function was calling
> > > btrfs_get_chunk_map() to get the extent mapping, necessary for
> > > calculating the I/O geometry. It was using that extent mapping only
> > > internally and freeing the pointer after its execution.
> > > 
> > > That resulted in calling btrfs_get_chunk_map() de facto twice by the
> > > __btrfs_map_block() function. It was calling btrfs_get_io_geometry()
> > > first and then calling btrfs_get_chunk_map() directly to get the extent
> > > mapping, used by the rest of the function.
> > > 
> > > This change fixes that by passing the extent mapping to the
> > > btrfs_get_io_geometry() function as an argument.
> > > 
> > > v2:
> > > When btrfs_get_chunk_map() returns an error in btrfs_submit_direct():
> > > - Use errno_to_blk_status(PTR_ERR(em)) as the status
> > > - Set em to NULL
> > > 
> > > Signed-off-by: Michal Rostecki 
> > 
> > This panic'ed all of my test vms in their overnight xfstests runs, the 
> > panic is this
> > 
> > [ 2449.936502] BTRFS critical (device dm-7): mapping failed logical
> > 1113825280 bio len 40960 len 24576
> > [ 2449.937073] [ cut here ]
> > [ 2449.937329] kernel BUG at fs/btrfs/volumes.c:6450!
> > [ 2449.937604] invalid opcode:  [#1] SMP NOPTI
> > [ 2449.937855] CPU: 0 PID: 259045 Comm: kworker/u5:0 Not tainted 
> > 5.11.0-rc5+ #122
> > [ 2449.938252] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > 1.13.0-2.fc32 04/01/2014
> > [ 2449.938713] Workqueue: btrfs-worker-high btrfs_work_helper
> > [ 2449.939016] RIP: 0010:btrfs_map_bio.cold+0x5a/0x5c
> > [ 2449.939392] Code: 37 87 ff ff e8 ed d4 8a ff 48 83 c4 18 e9 b5 52 8b ff
> > 49 89 c8 4c 89 fa 4c 89 f1 48 c7 c6 b0 c0 61 8b 48 89 ef e8 11 87 ff ff <0f>
> > 0b 4c 89 e7 e8 42 09 86 ff e9 fd 59 8b ff 49 8b 7a 50 44 89 f2
> > [ 2449.940402] RSP: :9f24c1637d90 EFLAGS: 00010282
> > [ 2449.940689] RAX: 0057 RBX: 90c78ff716b8 RCX: 
> > 
> > [ 2449.941080] RDX: 90c7fbc27ae0 RSI: 90c7fbc19110 RDI: 
> > 90c7fbc19110
> > [ 2449.941467] RBP: 90c7911d4000 R08:  R09: 
> > 
> > [ 2449.941853] R10: 9f24c1637b48 R11: 8b9723e8 R12: 
> > 
> > [ 2449.942243] R13:  R14: a000 R15: 
> > 4263a000
> > [ 2449.942632] FS:  () GS:90c7fbc0()
> > knlGS:
> > [ 2449.943072] CS:  0010 DS:  ES:  CR0: 80050033
> > [ 2449.943386] CR2: 5575163c3080 CR3: 00010ad6c004 CR4: 
> > 00370ef0
> > [ 2449.943772] Call Trace:
> > [ 2449.943915]  ? lock_release+0x1c3/0x290
> > [ 2449.944135]  run_one_async_done+0x3a/0x60
> > [ 2449.944360]  btrfs_work_helper+0x136/0x520
> > [ 2449.944588]  process_one_work+0x26e/0x570
> > [ 2449.944812]  worker_thread+0x55/0x3c0
> > [ 2449.945016]  ? process_one_work+0x570/0x570
> > [ 2449.945250]  kthread+0x137/0x150
> > [ 2449.945430]  ? __kthread_bind_mask+0x60/0x60
> > [ 2449.945666]  ret_from_fork+0x1f/0x30
> > 
> > it happens when you run btrfs/060.  Please make sure to run xfstests against
> > patches before you submit them upstream.  Thanks,
> > 
> > Josef
> 
> Umm... I ran the xftests against v1 patch and didn't get that panic.

It could have been caused by my fixups to v2 and I can reproduce the
crash now too. I can't see any difference besides the u64/int switch and
the 'goto out' removal in btrfs_bio_fits_in_stripe.


Re: [PATCH v2] btrfs: Avoid calling btrfs_get_chunk_map() twice

2021-01-28 Thread David Sterba
On Wed, Jan 27, 2021 at 02:57:27PM +0100, Michal Rostecki wrote:
> From: Michal Rostecki 
> 
> Before this change, the btrfs_get_io_geometry() function was calling
> btrfs_get_chunk_map() to get the extent mapping, necessary for
> calculating the I/O geometry. It was using that extent mapping only
> internally and freeing the pointer after its execution.
> 
> That resulted in calling btrfs_get_chunk_map() de facto twice by the
> __btrfs_map_block() function. It was calling btrfs_get_io_geometry()
> first and then calling btrfs_get_chunk_map() directly to get the extent
> mapping, used by the rest of the function.
> 
> This change fixes that by passing the extent mapping to the
> btrfs_get_io_geometry() function as an argument.
> 
> v2:
> When btrfs_get_chunk_map() returns an error in btrfs_submit_direct():
> - Use errno_to_blk_status(PTR_ERR(em)) as the status
> - Set em to NULL

The version-to-version changelog belongs under the -- line. If there's
something relevant in v2 it should be put into the proper changelog but
normal fixups like 'set em to NULL' do not have the long-term value that
we want to record in the changelog.

> Signed-off-by: Michal Rostecki 
> ---
>  fs/btrfs/inode.c   | 38 +-
>  fs/btrfs/volumes.c | 39 ---
>  fs/btrfs/volumes.h |  5 +++--
>  3 files changed, 48 insertions(+), 34 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 0dbe1aaa0b71..e2ee3a9c1140 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2183,9 +2183,10 @@ int btrfs_bio_fits_in_stripe(struct page *page, size_t 
> size, struct bio *bio,
>   struct inode *inode = page->mapping->host;
>   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>   u64 logical = bio->bi_iter.bi_sector << 9;
> + struct extent_map *em;
>   u64 length = 0;
>   u64 map_length;
> - int ret;
> + int ret = 0;
>   struct btrfs_io_geometry geom;
>  
>   if (bio_flags & EXTENT_BIO_COMPRESSED)
> @@ -2193,14 +2194,21 @@ int btrfs_bio_fits_in_stripe(struct page *page, 
> size_t size, struct bio *bio,
>  
>   length = bio->bi_iter.bi_size;
>   map_length = length;
> - ret = btrfs_get_io_geometry(fs_info, btrfs_op(bio), logical, map_length,
> - &geom);
> + em = btrfs_get_chunk_map(fs_info, logical, map_length);
> + if (IS_ERR(em))
> + return PTR_ERR(em);
> + ret = btrfs_get_io_geometry(fs_info, em, btrfs_op(bio), logical,
> + map_length, &geom);
>   if (ret < 0)
> - return ret;
> + goto out;
>  
> - if (geom.len < length + size)
> - return 1;
> - return 0;
> + if (geom.len < length + size) {
> + ret = 1;
> + goto out;
> + }
> +out:
> + free_extent_map(em);
> + return ret;
>  }
>  
>  /*
> @@ -7941,10 +7949,12 @@ static blk_qc_t btrfs_submit_direct(struct inode 
> *inode, struct iomap *iomap,
>   u64 submit_len;
>   int clone_offset = 0;
>   int clone_len;
> + int logical;

This needs to be u64

>   int ret;
>   blk_status_t status;
>   struct btrfs_io_geometry geom;
>   struct btrfs_dio_data *dio_data = iomap->private;
> + struct extent_map *em;
>  
>   dip = btrfs_create_dio_private(dio_bio, inode, file_offset);
>   if (!dip) {
> @@ -7970,11 +7980,18 @@ static blk_qc_t btrfs_submit_direct(struct inode 
> *inode, struct iomap *iomap,
>   }
>  
>   start_sector = dio_bio->bi_iter.bi_sector;
> + logical = start_sector << 9;

Otherwise this overflows on logical address larger than 2^23 which is 8G.


Re: [PATCH v2] btrfs: Avoid calling btrfs_get_chunk_map() twice

2021-01-28 Thread David Sterba
On Thu, Jan 28, 2021 at 10:38:45AM +, Filipe Manana wrote:
> On Thu, Jan 28, 2021 at 12:01 AM Michal Rostecki  wrote:
> >
> > From: Michal Rostecki 
> >
> > Before this change, the btrfs_get_io_geometry() function was calling
> > btrfs_get_chunk_map() to get the extent mapping, necessary for
> > calculating the I/O geometry. It was using that extent mapping only
> > internally and freeing the pointer after its execution.
> >
> > That resulted in calling btrfs_get_chunk_map() de facto twice by the
> > __btrfs_map_block() function. It was calling btrfs_get_io_geometry()
> > first and then calling btrfs_get_chunk_map() directly to get the extent
> > mapping, used by the rest of the function.
> >
> > This change fixes that by passing the extent mapping to the
> > btrfs_get_io_geometry() function as an argument.
> >
> > v2:
> > When btrfs_get_chunk_map() returns an error in btrfs_submit_direct():
> > - Use errno_to_blk_status(PTR_ERR(em)) as the status
> > - Set em to NULL
> >
> > Signed-off-by: Michal Rostecki 
> 
> Reviewed-by: Filipe Manana 
> 
> I think this is a fine optimization.
> For very large filesystems, i.e. several thousands of allocated
> chunks, not only this avoids searching two times the rbtree,
> saving time, it may also help reducing contention on the lock that
> protects the tree - thinking of writeback starting for multiple
> inodes,
> other tasks allocating or removing chunks, and anything else that
> requires access to the rbtree.

This should answer Nikolay's concerns if shifting around the parameters
is worth it. Caching values that could be expensive to read makes sense
to me and it's not that intrusive in the code. I'll add your analysis to
the changelog and incorporate the fixups that Nikolay suggested in v1.
Thanks.


Re: [PATCH] btrfs: Simplify the calculation of variables

2021-01-28 Thread David Sterba
On Wed, Jan 27, 2021 at 04:11:37PM +0800, Abaci Team wrote:
> Fix the following coccicheck warnings:
> 
> ./fs/btrfs/delayed-inode.c:1157:39-41: WARNING !A || A && B is
> equivalent to !A || B.
> 
> Reported-by: Abaci Robot 
> Suggested-by: Jiapeng Zhong 
> Signed-off-by: Abaci Team 

Added to misc-next, thanks.


Re: linux-next: build warning after merge of the btrfs tree

2021-01-27 Thread David Sterba
On Wed, Jan 27, 2021 at 10:18:31AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the btrfs tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
> 
> fs/btrfs/space-info.c:1373: warning: Function parameter or member 'start_ns' 
> not described in 'handle_reserve_ticket'
> fs/btrfs/space-info.c:1373: warning: Function parameter or member 
> 'orig_bytes' not described in 'handle_reserve_ticket'
> 
> Introduced by commit
> 
>   cf61ceb78394 ("btrfs: add a trace point for reserve tickets")

Will be fixed in the next snapshot, thanks.


Re: [PATCH] btrfs: remove redundant NULL check

2021-01-21 Thread David Sterba
On Thu, Jan 21, 2021 at 04:19:47PM +0800, Yang Li wrote:
> Fix below warnings reported by coccicheck:
> ./fs/btrfs/raid56.c:237:2-8: WARNING: NULL check before some freeing
> functions is not needed.
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Yang Li 

Added to misc-next, thanks.


[GIT PULL] Btrfs fixes for 5.11-rc5

2021-01-20 Thread David Sterba
Hi,

a few more one line fixes for various bugs, stable material.

- fix send when emitting clone operation from the same file and root

- fix double free on error when cleaning backrefs

- lockdep fix during relocation

- handle potential error during reloc when starting transaction

- skip running delayed refs during commit (leftover from code removal in
  this dev cycle)

Please pull thanks.


The following changes since commit e076ab2a2ca70a0270232067cd49f76cd92efe64:

  btrfs: shrink delalloc pages instead of full inodes (2021-01-08 16:36:44 
+0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-rc4-tag

for you to fetch changes up to 34d1eb0e599875064955a74712f08ff14c8e3d5f:

  btrfs: don't clear ret in btrfs_start_dirty_block_groups (2021-01-18 16:00:11 
+0100)


David Sterba (1):
  btrfs: no need to run delayed refs after commit_fs_roots during commit

Filipe Manana (1):
  btrfs: send: fix invalid clone operations when cloning from the same file 
and root

Josef Bacik (4):
  btrfs: don't get an EINTR during drop_snapshot for reloc
  btrfs: do not double free backref nodes on error
  btrfs: fix lockdep splat in btrfs_recover_relocation
  btrfs: don't clear ret in btrfs_start_dirty_block_groups

 fs/btrfs/backref.c |  2 +-
 fs/btrfs/block-group.c |  3 ++-
 fs/btrfs/extent-tree.c | 10 +-
 fs/btrfs/send.c| 15 +++
 fs/btrfs/transaction.c |  8 
 fs/btrfs/volumes.c |  2 ++
 6 files changed, 29 insertions(+), 11 deletions(-)


Re: [LKP] Re: [btrfs] e076ab2a2c: fio.write_iops -18.3% regression

2021-01-14 Thread David Sterba
On Wed, Jan 13, 2021 at 01:58:18PM +0800, Xing Zhengjun wrote:
> 
> 
> On 1/12/2021 11:45 PM, David Sterba wrote:
> > On Tue, Jan 12, 2021 at 11:36:14PM +0800, kernel test robot wrote:
> >> Greeting,
> >>
> >> FYI, we noticed a -18.3% regression of fio.write_iops due to commit:
> >>
> >>
> >> commit: e076ab2a2ca70a0270232067cd49f76cd92efe64 ("btrfs: shrink delalloc 
> >> pages instead of full inodes")
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >>
> >>
> >> in testcase: fio-basic
> >> on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G 
> >> memory
> >> with following parameters:
> >>
> >>disk: 1SSD
> >>fs: btrfs
> >>runtime: 300s
> >>nr_task: 8
> >>rw: randwrite
> >>bs: 4k
> >>ioengine: sync
> >>test_size: 256g
> > Though I do a similar test (emulating bit torrent workload), it's a bit
> > extreme as it's 4k synchronous on a huge file. It always takes a lot of
> > time but could point out some concurrency issues namely on faster
> > devices. There are 8 threads possibly competing for the same inode lock
> > or other locks related to it.
> >
> > The mentioned commit fixed another perf regression on a much more common
> > workload (untgrring files), so at this point drop in this fio workload
> > is inevitable.
> 
> Do you have a plan to fix it? Thanks.

My plan is to find somebody who will fix it. Thanks.


Re: [btrfs] e076ab2a2c: fio.write_iops -18.3% regression

2021-01-12 Thread David Sterba
On Tue, Jan 12, 2021 at 11:36:14PM +0800, kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed a -18.3% regression of fio.write_iops due to commit:
> 
> 
> commit: e076ab2a2ca70a0270232067cd49f76cd92efe64 ("btrfs: shrink delalloc 
> pages instead of full inodes")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> 
> in testcase: fio-basic
> on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
> with following parameters:
> 
>   disk: 1SSD
>   fs: btrfs
>   runtime: 300s
>   nr_task: 8
>   rw: randwrite
>   bs: 4k
>   ioengine: sync
>   test_size: 256g

Though I do a similar test (emulating bit torrent workload), it's a bit
extreme as it's 4k synchronous on a huge file. It always takes a lot of
time but could point out some concurrency issues namely on faster
devices. There are 8 threads possibly competing for the same inode lock
or other locks related to it.

The mentioned commit fixed another perf regression on a much more common
workload (untgrring files), so at this point drop in this fio workload
is inevitable.


[GIT PULL] Btrfs fixes for 5.11-rc4

2021-01-11 Thread David Sterba
Hi,

more material for stable trees. Please pull, thanks.

- tree-checker: check item end overflow

- fix false warning during relocation regarding extent type

- fix inode flushing logic, caused notable performance regression (since
  5.10)

- debugging fixups:
  - print correct offset for reloc tree key
  - pass reliable fs_info pointer to error reporting helper


The following changes since commit a8cc263eb58ca133617662a5a5e07131d0ebf299:

  btrfs: run delayed iputs when remounting RO to avoid leaking them (2020-12-18 
15:00:08 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-rc3-tag

for you to fetch changes up to e076ab2a2ca70a0270232067cd49f76cd92efe64:

  btrfs: shrink delalloc pages instead of full inodes (2021-01-08 16:36:44 
+0100)


Josef Bacik (2):
  btrfs: print the actual offset in btrfs_root_name
  btrfs: shrink delalloc pages instead of full inodes

Qu Wenruo (1):
  btrfs: reloc: fix wrong file extent type check to avoid false ENOENT

Su Yue (2):
  btrfs: prevent NULL pointer dereference in extent_io_tree_panic
  btrfs: tree-checker: check if chunk item end overflows

 fs/btrfs/disk-io.c  |  2 +-
 fs/btrfs/extent_io.c|  4 +---
 fs/btrfs/inode.c| 60 +++--
 fs/btrfs/print-tree.c   | 10 -
 fs/btrfs/print-tree.h   |  2 +-
 fs/btrfs/relocation.c   |  7 +-
 fs/btrfs/space-info.c   |  4 +++-
 fs/btrfs/tree-checker.c |  7 ++
 8 files changed, 67 insertions(+), 29 deletions(-)


Re: KASAN: null-ptr-deref Write in start_transaction

2021-01-08 Thread David Sterba
On Fri, Jan 08, 2021 at 02:22:00PM +, Filipe Manana wrote:
> On Thu, Jan 7, 2021 at 1:13 PM syzbot
>  wrote:
> >
> > syzbot suspects this issue was fixed by commit:
> >
> > commit f30bed83426c5cb9fce6cabb3f7cc5a9d5428fcc
> > Author: Filipe Manana 
> > Date:   Fri Nov 13 11:24:17 2020 +
> >
> > btrfs: remove unnecessary attempt to drop extent maps after adding 
> > inline extent
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ddc30b50
> > start commit:   521b619a Merge tag 'linux-kselftest-kunit-fixes-5.10-rc3' ..
> > git tree:   upstream
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=61033507391c77ff
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6700bca07dff187809c4
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14a07ab250
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10fe69c650
> >
> > If the result looks correct, please mark the issue as fixed by replying 
> > with:
> >
> > #syz fix: btrfs: remove unnecessary attempt to drop extent maps after 
> > adding inline extent
> 
> Nop, it can't be this change.
> 
> What should fix it should be the following commit:
> 
> commit ecfdc08b8cc65d737eebc26a1ee1875a097fd6a0
> Author: Goldwyn Rodrigues 
> Date:   Thu Sep 24 11:39:21 2020 -0500
> 
> btrfs: remove dio iomap DSYNC workaround

#syz fix: btrfs: remove dio iomap DSYNC workaround


Re: KASAN: null-ptr-deref Write in start_transaction

2021-01-08 Thread David Sterba
On Fri, Jan 08, 2021 at 02:22:00PM +, Filipe Manana wrote:
> On Thu, Jan 7, 2021 at 1:13 PM syzbot
>  wrote:
> >
> > syzbot suspects this issue was fixed by commit:
> >
> > commit f30bed83426c5cb9fce6cabb3f7cc5a9d5428fcc
> > Author: Filipe Manana 
> > Date:   Fri Nov 13 11:24:17 2020 +
> >
> > btrfs: remove unnecessary attempt to drop extent maps after adding 
> > inline extent
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ddc30b50
> > start commit:   521b619a Merge tag 'linux-kselftest-kunit-fixes-5.10-rc3' ..
> > git tree:   upstream
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=61033507391c77ff
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6700bca07dff187809c4
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14a07ab250
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10fe69c650
> >
> > If the result looks correct, please mark the issue as fixed by replying 
> > with:
> >
> > #syz fix: btrfs: remove unnecessary attempt to drop extent maps after 
> > adding inline extent
> 
> Nop, it can't be this change.
> 
> What should fix it should be the following commit:
> 
> commit ecfdc08b8cc65d737eebc26a1ee1875a097fd6a0
> Author: Goldwyn Rodrigues 
> Date:   Thu Sep 24 11:39:21 2020 -0500
> 
> btrfs: remove dio iomap DSYNC workaround

Thanks!

#syz unfix


Re: KASAN: null-ptr-deref Write in start_transaction

2021-01-08 Thread David Sterba
On Fri, Jan 08, 2021 at 10:17:25AM +0100, Dmitry Vyukov wrote:
> On Thu, Jan 7, 2021 at 2:11 PM syzbot
>  wrote:
> >
> > syzbot suspects this issue was fixed by commit:
> >
> > commit f30bed83426c5cb9fce6cabb3f7cc5a9d5428fcc
> > Author: Filipe Manana 
> > Date:   Fri Nov 13 11:24:17 2020 +
> >
> > btrfs: remove unnecessary attempt to drop extent maps after adding 
> > inline extent
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13ddc30b50
> > start commit:   521b619a Merge tag 'linux-kselftest-kunit-fixes-5.10-rc3' ..
> > git tree:   upstream
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=61033507391c77ff
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6700bca07dff187809c4
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14a07ab250
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10fe69c650
> >
> > If the result looks correct, please mark the issue as fixed by replying 
> > with:
> >
> > #syz fix: btrfs: remove unnecessary attempt to drop extent maps after 
> > adding inline extent
> >
> > For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> 
> #syz fix: btrfs: remove unnecessary attempt to drop extent maps after
> adding inline extent

I have looked at the report and suspected fix yestereday and was not
sure that it's really the right fix.  The commit removes some call so it
all looks like an accidental fix and something still might be going on.
So I'm a bit surprised that you mark it as fixed. It will make the
syzbot report go away so from that POV ok and we'll know if it happens
again, but I'd expect at least some analysis before closing the report.


[GIT PULL] Btrfs fixes for 5.11-rc3

2021-01-06 Thread David Sterba
Hi,

a few more fixes that arrived before the end of the year.  Please pull,
thanks.

- a bunch of fixes related to transaction handle lifetime wrt various
  operations (umount, remount, qgroup scan, orphan cleanup)

- async discard scheduling fixes

- fix item size calculation when item keys collide for extend refs
  (hardlinks)

- fix qgroup flushing from running transaction

- fix send, wrong file path when there is an inode with a pending rmdir

- fix deadlock when cloning inline extent and low on free metadata space


The following changes since commit b42fe98c92698d2a10094997e5f4d2dd968fd44f:

  btrfs: scrub: allow scrub to work with subpage sectorsize (2020-12-09 
19:16:11 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.11-rc2-tag

for you to fetch changes up to a8cc263eb58ca133617662a5a5e07131d0ebf299:

  btrfs: run delayed iputs when remounting RO to avoid leaking them (2020-12-18 
15:00:08 +0100)


Filipe Manana (7):
  btrfs: fix deadlock when cloning inline extent and low on free metadata 
space
  btrfs: send: fix wrong file path when there is an inode with a pending 
rmdir
  btrfs: fix transaction leak and crash after RO remount caused by qgroup 
rescan
  btrfs: fix transaction leak and crash after cleaning up orphans on RO 
mount
  btrfs: fix race between RO remount and the cleaner task
  btrfs: add assertion for empty list of transactions at late stage of 
umount
  btrfs: run delayed iputs when remounting RO to avoid leaking them

Josef Bacik (1):
  btrfs: tests: initialize test inodes location

Pavel Begunkov (3):
  btrfs: fix async discard stall
  btrfs: fix racy access to discard_ctl data
  btrfs: merge critical sections of discard lock in workfn

Qu Wenruo (1):
  btrfs: qgroup: don't try to wait flushing if we're already holding a 
transaction

ethanwu (1):
  btrfs: correctly calculate item size used when item key collision happens

 fs/btrfs/btrfs_inode.h   |  9 ++
 fs/btrfs/ctree.c | 24 +--
 fs/btrfs/ctree.h | 29 --
 fs/btrfs/dev-replace.c   |  2 +-
 fs/btrfs/discard.c   | 70 +++-
 fs/btrfs/disk-io.c   | 13 
 fs/btrfs/extent-tree.c   |  2 ++
 fs/btrfs/file-item.c |  2 ++
 fs/btrfs/inode.c | 15 +++---
 fs/btrfs/ioctl.c |  2 +-
 fs/btrfs/qgroup.c| 43 +++
 fs/btrfs/reflink.c   | 15 ++
 fs/btrfs/send.c  | 49 +++
 fs/btrfs/space-info.c|  2 +-
 fs/btrfs/super.c | 40 +++--
 fs/btrfs/tests/btrfs-tests.c | 10 +--
 fs/btrfs/tests/inode-tests.c |  9 --
 fs/btrfs/volumes.c   |  4 +--
 18 files changed, 243 insertions(+), 97 deletions(-)


Re: [PATCH v2] fs/btrfs: avoid null pointer dereference if reloc control has not been initialized

2021-01-05 Thread David Sterba
I'm keeping the rest of the mail as I received it, it's completely
garbled and would need manual reformatting. The changelog and even the
diff are completely on one line(!).

On Tue, Jan 05, 2021 at 11:08:42AM +0800, bodefang wrote:
> Similar to commmit<389305b2aa68>(“btrfs: relocation: Only remove reloc 
> rb_trees if reloc control has been initialized”).it turns out that 
> fs_info::reloc_ctl can be NULL in btrfs_recover_relocation() as we allocate 
> relocation control after allreloc roots have been verified.,so there should 
> be a check for rc to prevent null pointer dereference.
> 
> Signed-off-by: Defang Bo Reviewed-by: David Sterba 
> ---

Oh and don't add reviewed-by unless it's explicitly stated by the
person.

> Changes since v1:
> - More accurate description for this patch to describe how the NULL can get 
> there.
> ---
> --- fs/btrfs/relocation.c | 6 ++ 1 file changed, 6 insertions(+) diff 
> --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 3602806..ca03571 
> 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -626,6 
> +626,9 @@ static int __must_check __add_reloc_root(struct btrfs_root *root) 
> struct mapping_node *node; struct reloc_control *rc = fs_info->reloc_ctl; + 
> if (!rc) + return 0; + node = kmalloc(sizeof(*node), GFP_NOFS); if (!node) 
> return -ENOMEM; @@ -703,6 +706,9 @@ static int __update_reloc_root(struct 
> btrfs_root *root) struct rb_node *rb_node; struct mapping_node *node = NULL; 
> struct reloc_control *rc = fs_info->reloc_ctl; + + if (!rc) + return 0; 
> spin_lock(&rc->reloc_root_tree.lock); rb_node = 
> rb_simple_search(&rc->reloc_root_tree.rb_root, -- 2.7.4
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> At 2021-01-04 23:11:13, "David Sterba"  wrote:
> >On Sun, Dec 27, 2020 at 10:55:31PM +0800, Defang Bo wrote:
> >> Similar to commmit<389305b2>,
> >
> >Please use full commit reference like 389305b2aa68 ("btrfs: relocation:
> >Only remove reloc rb_trees if reloc control has been initialized")
> >
> >> it turns out that fs_info::reloc_ctl can be NULL ,
> >
> >Please describe how the NULL can get there.


Re: [PATCH 3/3] fs/btrfs: avoid null pointer dereference if reloc control has not been initialized

2021-01-04 Thread David Sterba
On Sun, Dec 27, 2020 at 10:55:31PM +0800, Defang Bo wrote:
> Similar to commmit<389305b2>,

Please use full commit reference like 389305b2aa68 ("btrfs: relocation:
Only remove reloc rb_trees if reloc control has been initialized")

> it turns out that fs_info::reloc_ctl can be NULL ,

Please describe how the NULL can get there.


Re: [PATCH v2 -next] btrfs: use DEFINE_MUTEX() for mutex lock

2021-01-04 Thread David Sterba
On Thu, Dec 24, 2020 at 09:22:17PM +0800, Zheng Yongjun wrote:
> mutex lock can be initialized automatically with DEFINE_MUTEX()
> rather than explicitly calling mutex_init().

And is there some reason why it should be done that way?


Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-16 Thread David Sterba
On Wed, Dec 16, 2020 at 11:58:07AM +1100, Herbert Xu wrote:
> On Wed, Dec 16, 2020 at 12:48:51AM +, Nick Terrell wrote:
> >
> > Thanks for the advice! The first zstd patches went through Herbert’s tree, 
> > which is
> > why I’ve sent them this way.
> 
> Sorry, but I'm not touch these patches as Christoph's objections
> don't seem to have been addressed.

I have objections to the current patchset as well, the build bot has
found that some of the function frames are overly large (up to 3800
bytes) [1], besides the original complaint that the patch 3/3 is 1.5MiB.

[1] https://lore.kernel.org/lkml/20201204140314.gs6...@twin.jikos.cz/


Re: [PATCH] btrfs: fix boolreturn.cocci warnings

2020-12-15 Thread David Sterba
On Sun, Nov 01, 2020 at 01:20:51PM +0800, kernel test robot wrote:
> From: kernel test robot 
> 
> fs/btrfs/space-info.c:810:9-10: WARNING: return of 0/1 in function 
> 'need_preemptive_reclaim' with return type bool
> 
>  Return statements in functions returning bool should use
>  true/false instead of 1/0.
> Generated by: scripts/coccinelle/misc/boolreturn.cocci
> 
> Fixes: fc96d3794eb2 ("btrfs: rename need_do_async_reclaim")
> CC: Josef Bacik 
> Signed-off-by: kernel test robot 

The patchset is still in a topic branch so I folded the change. There
were more int/bool mismatches in other patches, that got fixed too.


Re: [PATCH v2] btrfs: free-space-cache: Fix error return code in __load_free_space_cache

2020-12-15 Thread David Sterba
On Mon, Dec 07, 2020 at 09:56:12PM +0800, Zhihao Cheng wrote:
> Fix to return the error code(instead always 0) when memory allocating
> failed in __load_free_space_cache().
> 
> This lacks the analysis of consequences, so there's only one caller and

By "This lacks the analysis of consequences" I was referring to
changelog in your patch and I did not expect you copy&paste it verbatim
to the next patch :)

Anyway, updated patch added to misc-next, thanks.


Re: linux-next: build failure after merge of the block tree

2020-12-14 Thread David Sterba
On Tue, Dec 15, 2020 at 08:43:00AM +1100, Stephen Rothwell wrote:
> Hi David,
> 
> On Mon, 14 Dec 2020 22:36:12 +0100 David Sterba  wrote:
> >
> > On Mon, Dec 14, 2020 at 01:12:46PM -0700, Jens Axboe wrote:
> > > On 12/14/20 1:09 PM, Stephen Rothwell wrote:  
> > > > Just a reminder that I am still applying the above merge fix.  
> > > 
> > > I sent in my core changes, but they haven't been pulled yet. So I guess
> > > we're dealing with a timing situation... David, did you send in the btrfs
> > > pull yet?  
> > 
> > Yes
> > https://lore.kernel.org/lkml/cover.1607955523.git.dste...@suse.com/
> 
> I would expect you *both* to at least mention this conflict to Linus ...

2nd paragraph in the mail

"There are no merge conflicts against current master branch, in past
 weeks some conflicts emerged in linux-next but IIRC were trivial."


Re: linux-next: build failure after merge of the block tree

2020-12-14 Thread David Sterba
On Mon, Dec 14, 2020 at 01:12:46PM -0700, Jens Axboe wrote:
> On 12/14/20 1:09 PM, Stephen Rothwell wrote:
> > Just a reminder that I am still applying the above merge fix.
> 
> I sent in my core changes, but they haven't been pulled yet. So I guess
> we're dealing with a timing situation... David, did you send in the btrfs
> pull yet?

Yes
https://lore.kernel.org/lkml/cover.1607955523.git.dste...@suse.com/


[GIT PULL] Btrfs updates for 5.11

2020-12-14 Thread David Sterba
  btrfs: fix lockdep warning when creating free space tree

David Sterba (22):
  btrfs: use the right number of levels for lockdep keysets
  btrfs: generate lockdep keyset names at compile time
  btrfs: send: use helpers to access root_item::ctransid
  btrfs: check-integrity: use proper helper to access btrfs_header
  btrfs: use root_item helpers for limit and flags in btrfs_create_tree
  btrfs: add set/get accessors for root_item::drop_level
  btrfs: remove unnecessary casts in printk
  btrfs: use precalculated sectorsize_bits from fs_info
  btrfs: replace div_u64 by shift in free_space_bitmap_size
  btrfs: replace s_blocksize_bits with fs_info::sectorsize_bits
  btrfs: store precalculated csum_size in fs_info
  btrfs: precalculate checksums per leaf once
  btrfs: use cached value of fs_info::csum_size everywhere
  btrfs: switch cached fs_info::csum_size from u16 to u32
  btrfs: remove unnecessary local variables for checksum size
  btrfs: check integrity: remove local copy of csum_size
  btrfs: scrub: remove local copy of csum_size from context
  btrfs: reorder extent buffer members for better packing
  btrfs: remove stub device info from messages when we have no fs_info
  btrfs: tree-checker: annotate all error branches as unlikely
  btrfs: drop casts of bio bi_sector
  btrfs: remove recalc_thresholds from free space ops

Filipe Manana (16):
  btrfs: assert we are holding the reada_lock when releasing a readahead 
zone
  btrfs: do not start readahead for csum tree when scrubbing non-data block 
groups
  btrfs: do not start and wait for delalloc on snapshot roots on 
transaction commit
  btrfs: refactor btrfs_drop_extents() to make it easier to extend
  btrfs: fix race when defragmenting leads to unnecessary IO
  btrfs: update the number of bytes used by an inode atomically
  btrfs: skip unnecessary searches for xattrs when logging an inode
  btrfs: stop incrementing log batch when joining log transaction
  btrfs: remove unnecessary attempt to drop extent maps after adding inline 
extent
  btrfs: unlock path before checking if extent is shared during nocow 
writeback
  btrfs: fix race causing unnecessary inode logging during link and rename
  btrfs: fix race that results in logging old extents during a fast fsync
  btrfs: fix race that causes unnecessary logging of ancestor inodes
  btrfs: fix race that makes inode logging fallback to transaction commit
  btrfs: fix race leading to unnecessary transaction commit when logging 
inode
  btrfs: do not block inode logging for so long during transaction commit

Goldwyn Rodrigues (14):
  btrfs: calculate num_pages, reserve_bytes once in btrfs_buffered_write
  btrfs: use iosize while reading compressed pages
  btrfs: use round_down while calculating start position in 
btrfs_dirty_pages()
  btrfs: set EXTENT_NORESERVE bits side btrfs_dirty_pages()
  btrfs: split btrfs_direct_IO to read and write
  btrfs: move pos increment and pagecache extension to btrfs_buffered_write
  btrfs: check FS error state bit early during write
  btrfs: introduce btrfs_write_check()
  btrfs: introduce btrfs_inode_lock()/unlock()
  btrfs: push inode locking and unlocking into buffered/direct write
  btrfs: use shared lock for direct writes within EOF
  btrfs: remove btrfs_inode::dio_sem
  btrfs: call iomap_dio_complete() without inode_lock
  btrfs: remove dio iomap DSYNC workaround

Josef Bacik (41):
  btrfs: unify the ro checking for mount options
  btrfs: push the NODATASUM check into btrfs_lookup_bio_sums
  btrfs: sysfs: export supported rescue= mount options
  btrfs: add a helper to print out rescue= options
  btrfs: show rescue=usebackuproot in /proc/mounts
  btrfs: introduce mount option rescue=ignorebadroots
  btrfs: introduce mount option rescue=ignoredatacsums
  btrfs: introduce mount option rescue=all
  btrfs: switch extent buffer tree lock to rw_semaphore
  btrfs: locking: remove all the blocking helpers
  btrfs: locking: rip out path->leave_spinning
  btrfs: do not shorten unpin len for caching block groups
  btrfs: update last_byte_to_unpin in switch_commit_roots
  btrfs: explicitly protect ->last_byte_to_unpin in unpin_extent_range
  btrfs: cleanup btrfs_discard_update_discardable usage
  btrfs: load free space cache into a temporary ctl
  btrfs: load the free space cache inode extents from commit root
  btrfs: load free space cache asynchronously
  btrfs: protect fs_info->caching_block_groups by block_group_cache_lock
  btrfs: remove lockdep classes for the fs tree
  btrfs: cleanup extent buffer readahead
  btrfs: use btrfs_read_node_slot in btrfs_realloc_node
  btrfs: use btrfs_read_node_slot in walk_down_reloc_tree
  btrfs: use btrfs_read_node_slot in do_reloc

Re: [PATCH] btrfs: free-space-cache: Fix error return code in __load_free_space_cache

2020-12-04 Thread David Sterba
On Fri, Nov 20, 2020 at 09:08:04AM +0800, Zhihao Cheng wrote:
> Fix to return the error code(instead always 0) when memory allocating
> failed in __load_free_space_cache().

This lacks the analysis of consequences, so there's only one caller and
that will treat values <=0 as 'cache not loaded'. There's no functional
change but otherwise the error values should be there for clarity.
Changelog updated.

> Fixes: a67509c30079f4c50 ("Btrfs: add a io_ctl struct and helpers ...")

BTW, please don't trim the patch subject in the Fixes line.


Re: [PATCH] btrfs: free-space-cache: Fix error return code in __load_free_space_cache

2020-12-04 Thread David Sterba
On Fri, Nov 20, 2020 at 09:08:04AM +0800, Zhihao Cheng wrote:
> Fix to return the error code(instead always 0) when memory allocating
> failed in __load_free_space_cache().

Hm right the error handling flow in that function is a mess and the
error values are not set. Your patch is a minimal fix so I'll add it,
the function could use some cleanups though. Thanks.


Re: [PATCH v6 3/3] lib: zstd: Upgrade to latest upstream zstd version 1.4.6

2020-12-04 Thread David Sterba
On Thu, Dec 03, 2020 at 07:58:16AM +0800, kernel test robot wrote:
> All warnings (new ones prefixed by >>):
> 
>lib/zstd/compress/zstd_double_fast.c: In function 
> 'ZSTD_compressBlock_doubleFast_extDict_generic':
> >> lib/zstd/compress/zstd_double_fast.c:501:1: warning: the frame size of 
> >> 3724 bytes is larger than 1280 bytes [-Wframe-larger-than=]

Frame size 3724?

>lib/zstd/compress/zstd_double_fast.c: In function 
> 'ZSTD_compressBlock_doubleFast':
>lib/zstd/compress/zstd_double_fast.c:336:1: warning: the frame size of 
> 3792 bytes is larger than 1280 bytes [-Wframe-larger-than=]

3792

>lib/zstd/compress/zstd_double_fast.c: In function 
> 'ZSTD_compressBlock_doubleFast_dictMatchState':
>lib/zstd/compress/zstd_double_fast.c:356:1: warning: the frame size of 
> 3808 bytes is larger than 1280 bytes [-Wframe-larger-than=]

3808

>lib/zstd/compress/zstd_fast.c: In function 
> 'ZSTD_compressBlock_fast_extDict_generic':
> >> lib/zstd/compress/zstd_fast.c:476:1: warning: the frame size of 2736 bytes 
> >> is larger than 1280 bytes [-Wframe-larger-than=]

2736

>lib/zstd/compress/zstd_fast.c: In function 'ZSTD_compressBlock_fast':
>lib/zstd/compress/zstd_fast.c:204:1: warning: the frame size of 1508 bytes 
> is larger than 1280 bytes [-Wframe-larger-than=]

1508

>lib/zstd/compress/zstd_fast.c: In function 
> 'ZSTD_compressBlock_fast_dictMatchState':
>lib/zstd/compress/zstd_fast.c:372:1: warning: the frame size of 1540 bytes 
> is larger than 1280 bytes [-Wframe-larger-than=]

1540

For userspace code it's nothing but in kernel it's a lot for a single
function. The largest number is almost one page, there were days where
this would be one half of the whole stack space. We can't waste precious
resources like that. Taking the userspace code as-is does not seem to
work.


Re: [PATCH][next] btrfs: fix spelling mistake "inititialize" -> "initialize"

2020-11-28 Thread David Sterba
On Tue, Nov 17, 2020 at 01:07:04PM +, Colin King wrote:
> From: Colin Ian King 
> 
> There is a spelling mistake in a btrfs_err error message. Fix it.
> 
> Signed-off-by: Colin Ian King 

Folded to the patch, thanks.


Re: [PATCH -next] btrfs: remove unused variable

2020-11-28 Thread David Sterba
On Tue, Nov 17, 2020 at 05:47:43PM +0800, Zou Wei wrote:
> Fix variable set but not used compilation warnings:
> 
> fs/btrfs/ctree.c:1581:6: warning: variable ‘parent_level’ set but not used 
> [-Wunused-but-set-variable]
>   int parent_level;
>   ^~~~
> 
> fs/btrfs/zoned.c:503:6: warning: variable ‘zone_size’ set but not used 
> [-Wunused-but-set-variable]
>   u64 zone_size;
>   ^
> 
> Signed-off-by: Zou Wei 

> ---
>  fs/btrfs/ctree.c | 3 ---
>  fs/btrfs/zoned.c | 2 --
>  2 files changed, 5 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 32a57a7..e5a0941 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -1578,13 +1578,10 @@ int btrfs_realloc_node(struct btrfs_trans_handle 
> *trans,
>   int end_slot;
>   int i;
>   int err = 0;
> - int parent_level;
>   u32 blocksize;
>   int progress_passed = 0;
>   struct btrfs_disk_key disk_key;
>  
> - parent_level = btrfs_header_level(parent);

That one folded to the patch, thanks.

> -
>   WARN_ON(trans->transaction != fs_info->running_transaction);
>   WARN_ON(trans->transid != fs_info->generation);
>  
> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
> index fa9cc61..742f088 100644
> --- a/fs/btrfs/zoned.c
> +++ b/fs/btrfs/zoned.c
> @@ -500,7 +500,6 @@ int btrfs_sb_log_location_bdev(struct block_device *bdev, 
> int mirror, int rw,
>   unsigned int zone_sectors;
>   u32 sb_zone;
>   int ret;
> - u64 zone_size;
>   u8 zone_sectors_shift;
>   sector_t nr_sectors = bdev->bd_part->nr_sects;
>   u32 nr_zones;
> @@ -515,7 +514,6 @@ int btrfs_sb_log_location_bdev(struct block_device *bdev, 
> int mirror, int rw,
>   zone_sectors = bdev_zone_sectors(bdev);
>   if (!is_power_of_2(zone_sectors))
>   return -EINVAL;
> - zone_size = zone_sectors << SECTOR_SHIFT;

That was intended to be used a few lines below, so that's not for
removal.

>   zone_sectors_shift = ilog2(zone_sectors);
>   nr_zones = nr_sectors >> zone_sectors_shift;
>  
> -- 
> 2.6.2


[GIT PULL] Btrfs fixes for 5.10-rc6

2020-11-27 Thread David Sterba
Hi,

a few fixes for various warnings that accumulated over past two weeks.

- tree-checker: add missing return values for some errors

- lockdep fixes
  - when reading qgroup config and starting quota rescan
  - reverse order of quota ioctl lock and VFS freeze lock

- avoid accessing potentially stale fs info during device scan,
  reported by syzbot

- add scope NOFS protection around qgroup relation changes

- check for running transaction before flushing qgroups

- fix tracking of new delalloc ranges for some cases

Please pull, thanks.


The following changes since commit 468600c6ec28613b756193c5f780aac062f1acdf:

  btrfs: ref-verify: fix memory leak in btrfs_ref_tree_mod (2020-11-05 13:03:39 
+0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.10-rc5-tag

for you to fetch changes up to a855fbe69229078cd8aecd8974fb996a5ca651e6:

  btrfs: fix lockdep splat when enabling and disabling qgroups (2020-11-23 
21:16:43 +0100)


Daniel Xu (1):
  btrfs: tree-checker: add missing return after error in root_item

David Sterba (1):
  btrfs: tree-checker: add missing returns after data_ref alignment checks

Filipe Manana (4):
  btrfs: fix missing delalloc new bit for new delalloc ranges
  btrfs: fix lockdep splat when reading qgroup config on mount
  btrfs: do nofs allocations when adding and removing qgroup relations
  btrfs: fix lockdep splat when enabling and disabling qgroups

Johannes Thumshirn (1):
  btrfs: don't access possibly stale fs_info data for printing duplicate 
device

Qu Wenruo (1):
  btrfs: qgroup: don't commit transaction when we already hold the handle

 fs/btrfs/ctree.h |  5 ++-
 fs/btrfs/file.c  | 57 
 fs/btrfs/inode.c | 58 +
 fs/btrfs/qgroup.c| 88 +++-
 fs/btrfs/tests/inode-tests.c | 12 --
 fs/btrfs/tree-checker.c  |  3 ++
 fs/btrfs/volumes.c   |  8 +++-
 7 files changed, 158 insertions(+), 73 deletions(-)


Re: [PATCH 05/17] fs/btrfs: Convert to memzero_page()

2020-11-24 Thread David Sterba
On Mon, Nov 23, 2020 at 10:07:43PM -0800, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> Remove the kmap/memset()/kunmap pattern and use the new memzero_page()
> call where possible.
> 
> Cc: Chris Mason 
> Cc: Josef Bacik 
> Cc: David Sterba 
> Signed-off-by: Ira Weiny 
> ---
>  fs/btrfs/inode.c | 21 +

The patch converts the pattern only in inode.c, but there's more in
compression.c, extent_io.c, zlib.c,d zstd.c (kmap_atomic) and reflink.c,
send.c (kmap).


Re: [PATCH v2] btrfs: return EAGAIN when receiving a pending signal in the defrag loops

2020-11-23 Thread David Sterba
On Wed, Nov 18, 2020 at 12:02:36PM +0800, xiakaixu1...@gmail.com wrote:
> From: Kaixu Xia 
> 
> The variable ret is overwritten by the following variable defrag_count.
> Actually the code should return EAGAIN when receiving a pending signal
> in the defrag loops.

This lacks explanation why is EAGAIN supposed to be the right return
value. This is about semantics of the FITRIM ioctl, changing that would
affect userspace applications.


Re: [PATCH] btrfs: remove the useless value assignment in block_rsv_release_bytes

2020-11-23 Thread David Sterba
On Tue, Nov 17, 2020 at 11:17:17AM +0800, kaixuxia wrote:
> 
> 
> On 2020/11/16 23:15, David Sterba wrote:
> > On Sun, Nov 15, 2020 at 02:39:23PM +0800, xiakaixu1...@gmail.com wrote:
> >> From: Kaixu Xia 
> >>
> >> The variable qgroup_to_release is overwritten by the following if/else
> >> statement before it is used, so this assignment is useless. Remove it.
> > 
> > Again this lacks explanation why removing it is correct.
> > 
> Actually this assignment is redundant because the variable qgroup_to_release
> has been overwritten before it is used. The logic like this,

That's obvious and I did not mean that. Have you checked in which commit
the variable became unused and why? It's possible that it was indeed
just an oversight, but if not it could point to a bug.


Re: [PATCH] btrfs: sysfs: remove unneeded semicolon

2020-11-16 Thread David Sterba
On Sun, Nov 01, 2020 at 07:30:08AM -0800, t...@redhat.com wrote:
> From: Tom Rix 
> 
> A semicolon is not needed after a switch statement.
> 
> Signed-off-by: Tom Rix 

Added to misc-next, thanks.


Re: [PATCH] btrfs: remove the useless value assignment in btrfs_defrag_file

2020-11-16 Thread David Sterba
On Sat, Nov 14, 2020 at 05:06:21PM +0800, xiakaixu1...@gmail.com wrote:
> From: Kaixu Xia 
> 
> The variable ret is overwritten by the following variable defrag_count
> and this assignment is useless, so remove it.

This could be actually pointing to a bug, please explain why you think
it's correct to remove it and not to return EAGAIN.


Re: [PATCH] btrfs: remove the useless value assignment in block_rsv_release_bytes

2020-11-16 Thread David Sterba
On Sun, Nov 15, 2020 at 02:39:23PM +0800, xiakaixu1...@gmail.com wrote:
> From: Kaixu Xia 
> 
> The variable qgroup_to_release is overwritten by the following if/else
> statement before it is used, so this assignment is useless. Remove it.

Again this lacks explanation why removing it is correct.


Re: [RFC] fs: Avoid to use lockdep information if it's turned off

2020-11-13 Thread David Sterba
On Thu, Nov 12, 2020 at 11:22:12AM +0800, Boqun Feng wrote:
> For the "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" warning, do you see
> that every time when you run xfstests and don't see other lockdep
> splats? If so, that means we reach the limitation of number of lockdep
> hlock chains, and we should fix that.

It's not every time and depends on the release, eg. I found no reports
in a sample log for 5.7..5.9, while there are many for 5.2..5.6 and
5.10, every 2nd or 3rd run.

[0.185150] ... MAX_LOCKDEP_SUBCLASSES:  8
[0.186202] ... MAX_LOCK_DEPTH:  48
[0.187286] ... MAX_LOCKDEP_KEYS:8192
[0.188404] ... CLASSHASH_SIZE:  4096
[0.189519] ... MAX_LOCKDEP_ENTRIES: 32768
[0.190672] ... MAX_LOCKDEP_CHAINS:  65536
[0.191814] ... CHAINHASH_SIZE:  32768


Re: [RFC] fs: Avoid to use lockdep information if it's turned off

2020-11-11 Thread David Sterba
On Tue, Nov 10, 2020 at 04:33:27PM +0100, David Sterba wrote:
> On Tue, Nov 10, 2020 at 09:37:37AM +0800, Boqun Feng wrote:
> 
> I'll run another test on top of the development branch in case there are
> unrelated lockdep warning bugs that have been fixed meanwhile.

Similar reports but earlier test and probably completely valid due to
"BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!"

btrfs/057   [16:01:29][ 1580.146087] run fstests btrfs/057 at 
2020-11-10 16:01:29
[ 1580.787867] BTRFS info (device vda): disk space caching is enabled
[ 1580.789366] BTRFS info (device vda): has skinny extents
[ 1581.052542] BTRFS: device fsid 84018822-2e45-4341-80be-da6d2b4e033a devid 1 
transid 5 /dev/vdb scanned by mkfs.btrfs (18739)
[ 1581.105177] BTRFS info (device vdb): turning on sync discard
[ 1581.106834] BTRFS info (device vdb): disk space caching is enabled
[ 1581.108423] BTRFS info (device vdb): has skinny extents
[ 1581.109799] BTRFS info (device vdb): flagging fs with big metadata feature
[ 1581.120343] BTRFS info (device vdb): checking UUID tree
[ 1586.942699] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[ 1586.945725] turning off the locking correctness validator.
[ 1586.948823] Please attach the output of /proc/lock_stat to the bug report
[ 1586.952153] CPU: 0 PID: 18771 Comm: fsstress Not tainted 5.10.0-rc3-default+ 
#1355
[ 1586.954919] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 1586.958630] Call Trace:
[ 1586.959214]  dump_stack+0x77/0x97
[ 1586.960030]  add_chain_cache.cold+0x29/0x30
[ 1586.961028]  validate_chain+0x278/0x780
[ 1586.961979]  __lock_acquire+0x3fb/0x730
[ 1586.962880]  lock_acquire.part.0+0xac/0x1a0
[ 1586.963895]  ? try_to_wake_up+0x59/0x450
[ 1586.965153]  ? rcu_read_lock_sched_held+0x3f/0x70
[ 1586.966569]  ? lock_acquire+0xc4/0x150
[ 1586.967699]  ? try_to_wake_up+0x59/0x450
[ 1586.968882]  _raw_spin_lock_irqsave+0x43/0x90
[ 1586.970207]  ? try_to_wake_up+0x59/0x450
[ 1586.971404]  try_to_wake_up+0x59/0x450
[ 1586.973149]  wake_up_q+0x60/0xb0
[ 1586.974620]  __up_write+0x117/0x1d0
[ 1586.975080] [ cut here ]
[ 1586.976039]  btrfs_release_path+0xc8/0x180 [btrfs]
[ 1586.977718] WARNING: CPU: 2 PID: 18772 at fs/super.c:1676 
__sb_start_write+0x113/0x2a0
[ 1586.979478]  __btrfs_update_delayed_inode+0x1c1/0x2c0 [btrfs]
[ 1586.979506]  btrfs_commit_inode_delayed_inode+0x115/0x120 [btrfs]
[ 1586.982484] Modules linked in:
[ 1586.984080]  btrfs_evict_inode+0x1e2/0x370 [btrfs]
[ 1586.985557]  dm_flakey
[ 1586.986419]  ? evict+0xc3/0x220
[ 1586.986421]  evict+0xd5/0x220
[ 1586.986423]  vfs_rmdir.part.0+0x10c/0x180
[ 1586.986426]  do_rmdir+0x14b/0x1b0
[ 1586.987504]  dm_mod
[ 1586.988244]  do_syscall_64+0x2d/0x70
[ 1586.988947]  xxhash_generic btrfs
[ 1586.989779]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1586.990906]  blake2b_generic
[ 1586.991808] RIP: 0033:0x7f0ad919b5d7
[ 1586.992451]  libcrc32c
[ 1586.993427] Code: 73 01 c3 48 8b 0d 99 f8 0c 00 f7 d8 64 89 01 48 83 c8 ff 
c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 54 00 00 00 0f 05 <48> 3d 01 
f0 ff ff 73 01 c3 48 8b 0d 69 f8 0c 00 f7 d8 64 89 01 48
[ 1586.994380]  crc32c_intel
[ 1586.995546] RSP: 002b:7ffc152bf368 EFLAGS: 0202 ORIG_RAX: 
0054
[ 1586.996034]  xor
[ 1586.996613] RAX: ffda RBX: 01f4 RCX: 7f0ad919b5d7
[ 1586.996615] RDX: 00316264 RSI:  RDI: 0045eff0
[ 1586.997033]  zstd_decompress
[ 1587.001060] RBP: 7ffc152bf4b0 R08: 0045eff0 R09: 7ffc152bf4a4
[ 1587.001061] R10: 00b1 R11: 0202 R12: 030e
[ 1587.001062] R13:  R14:  R15: 
[ 1587.013639]  zstd_compress xxhash lzo_compress lzo_decompress raid6_pq loop
[ 1587.015763] CPU: 2 PID: 18772 Comm: fsstress Not tainted 5.10.0-rc3-default+ 
#1355
[ 1587.017719] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 1587.020878] RIP: 0010:__sb_start_write+0x113/0x2a0
[ 1587.022233] Code: f3 f8 da 58 85 c0 0f 84 95 01 00 00 40 84 ed 0f 85 4c 01 
00 00 45 84 e4 0f 85 75 01 00 00 5b 44 89 e8 5d 41 5c 41 5d 41 5e c3 <0f> 0b 48 
89 e8 31 d2 be 31 00 00 00 48 c7 c7 ca 98 e4 a7 48 c1 e0
[ 1587.027309] RSP: 0018:afbac3c2fdd0 EFLAGS: 00010246
[ 1587.028998] RAX: 0001 RBX: a320cc6b4478 RCX: 
[ 1587.031038] RDX: 0001 RSI:  RDI: a320cc6b4478
[ 1587.032905] RBP: 0003 R08: 01b8 R09: a320dbfa9b88
[ 1587.035782] R10: 0001 R11: ff9bcd99 R12: 0001
[ 1587.037974] R13: a320cc6b4398 R14: a320cc6b4698 R15: a320e06f4000
[ 1587.040473] FS:  7f0ad90a7b80() GS:a3213da0() 
knlGS:
[ 1587.043694] CS:  0010 DS:  ES:  CR0: 80050033
[

  1   2   3   4   5   6   7   >