Re: [f2fs-dev] [PATCH v7 6/9] scsi: ufs: Add inline encryption support to UFS

2020-02-24 Thread Stanley Chu
Hi Christoph,

On Mon, 2020-02-24 at 15:37 -0800, Christoph Hellwig wrote:
> On Sun, Feb 23, 2020 at 09:47:36PM +0800, Stanley Chu wrote:
> > Yes, MediaTek is working closely with the inline encryption patch sets.
> > Currently the v6 version works well (without the
> > UFSHCD_QUIRK_BROKEN_CRYPTO quirk), at least on our MT6779 SoC platform,
> > for which basic SoC support and some other peripheral drivers are being
> > upstreamed; see the link below:
> > 
> > https://patchwork.kernel.org/project/linux-mediatek/list/?state=%
> > 2A&q=6779&series=&submitter=&delegate=&archive=both
> > 
> > Integrating with the inline encryption patch set requires patching
> > ufs-mediatek, and those patches are ready downstream. We plan to upstream
> > them soon after the inline encryption patch set gets merged.
> 
> What amount of support do you need in ufs-mediatek?  It seems like
> pretty much every ufs low-level driver needs some kind of specific
> support now, right?  I wonder if we should instead opt into the support
> instead of all the quirking here.

The patch in ufs-mediatek is aimed at issuing vendor-specific SMC calls
for host initialization and configuration. This is because the MediaTek UFS
host has some "secure-protected" registers/features which need to be
accessed/switched in the secure world.

Such protection is not mentioned by the UFSHCI specification, so the inline
encryption patch set assumes that every register in UFSHCI can be
accessed normally from the kernel. This makes sense, and the patch set
surely works fine on any "standard" UFS host. However, if the host has a
special design, it can work normally only if some vendor-specific treatment
is applied.

I think one of the reasons to apply the UFSHCD_QUIRK_BROKEN_CRYPTO quirk in
the ufs-qcom host is similar to the above case.
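
For illustration, such a vendor hook usually boils down to an SMC into the
secure world, along the lines of the sketch below. The SiP function ID and
the helper name are hypothetical, not the actual ufs-mediatek code:

#include <linux/arm-smccc.h>
#include <linux/errno.h>

/*
 * Sketch only: the kernel cannot touch the protected registers directly,
 * so it asks the secure world (SiP firmware) to do it on its behalf.
 */
static int example_ufs_crypto_enable(void)
{
	struct arm_smccc_res res;

	/* 0x82000000 stands in for a vendor-specific SiP function ID */
	arm_smccc_smc(0x82000000, 1 /* enable host crypto */,
		      0, 0, 0, 0, 0, 0, &res);

	return res.a0 ? -EIO : 0;
}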

Thanks,
Stanley Chu




Re: [f2fs-dev] [PATCH] f2fs: use kmem_cache pool during inline xattr lookups

2020-02-24 Thread Ju Hyung Park
Hi Chao.

Sorry for the late reply; I've been busy with my uni stuff.

It's nice to see this issue getting some attention again, but
unfortunately I can't test it at the moment.

Here's the patch we used to see the alloc statistics:
https://pastebin.com/bhBh2dgs by Sultan Alsawaf.

Hope it helps.

On Tue, Feb 25, 2020 at 10:43 AM Chao Yu  wrote:
>
> It's been observed that kzalloc() in lookup_all_xattrs() is called millions
> of times on Android, quickly becoming the top abuser of the slub memory
> allocator.
>
> Use a dedicated kmem cache pool for xattr lookups to mitigate this.
>
> Signed-off-by: Park Ju Hyung 
> Signed-off-by: Chao Yu 
> ---
>
> Hi, Ju Hyung,
>
> Let me know if you have any objection on this patch. :)
>
>  fs/f2fs/f2fs.h  |  3 +++
>  fs/f2fs/super.c | 10 -
>  fs/f2fs/xattr.c | 54 -
>  fs/f2fs/xattr.h |  4 
>  4 files changed, 65 insertions(+), 6 deletions(-)
>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 12a197e89a3e..23b93a116c73 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1486,6 +1486,9 @@ struct f2fs_sb_info {
> __u32 s_chksum_seed;
>
> struct workqueue_struct *post_read_wq;  /* post read workqueue */
> +
> +   struct kmem_cache *inline_xattr_slab;   /* inline xattr entry */
> +   unsigned int inline_xattr_slab_size;/* default inline xattr slab 
> size */
>  };
>
>  struct f2fs_private_dio {
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index d241e07c0bfa..0b16204d3b7d 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1201,6 +1201,7 @@ static void f2fs_put_super(struct super_block *sb)
> kvfree(sbi->raw_super);
>
> destroy_device_list(sbi);
> +   f2fs_destroy_xattr_caches(sbi);
> mempool_destroy(sbi->write_io_dummy);
>  #ifdef CONFIG_QUOTA
> for (i = 0; i < MAXQUOTAS; i++)
> @@ -3457,12 +3458,17 @@ static int f2fs_fill_super(struct super_block *sb, 
> void *data, int silent)
> }
> }
>
> +   /* init per sbi slab cache */
> +   err = f2fs_init_xattr_caches(sbi);
> +   if (err)
> +   goto free_io_dummy;
> +
> /* get an inode for meta space */
> sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi));
> if (IS_ERR(sbi->meta_inode)) {
> f2fs_err(sbi, "Failed to read F2FS meta data inode");
> err = PTR_ERR(sbi->meta_inode);
> -   goto free_io_dummy;
> +   goto free_xattr_cache;
> }
>
> err = f2fs_get_valid_checkpoint(sbi);
> @@ -3735,6 +3741,8 @@ static int f2fs_fill_super(struct super_block *sb, void 
> *data, int silent)
> make_bad_inode(sbi->meta_inode);
> iput(sbi->meta_inode);
> sbi->meta_inode = NULL;
> +free_xattr_cache:
> +   f2fs_destroy_xattr_caches(sbi);
>  free_io_dummy:
> mempool_destroy(sbi->write_io_dummy);
>  free_percpu:
> diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
> index a3360a97e624..e46a10eb0e42 100644
> --- a/fs/f2fs/xattr.c
> +++ b/fs/f2fs/xattr.c
> @@ -23,6 +23,25 @@
>  #include "xattr.h"
>  #include "segment.h"
>
> +static void *xattr_alloc(struct f2fs_sb_info *sbi, int size, bool *is_inline)
> +{
> +   *is_inline = (size == sbi->inline_xattr_slab_size);
> +
> +   if (*is_inline)
> +   return kmem_cache_zalloc(sbi->inline_xattr_slab, GFP_NOFS);
> +
> +   return f2fs_kzalloc(sbi, size, GFP_NOFS);
> +}
> +
> +static void xattr_free(struct f2fs_sb_info *sbi, void *xattr_addr,
> +   bool is_inline)
> +{
> +   if (is_inline)
> +   kmem_cache_free(sbi->inline_xattr_slab, xattr_addr);
> +   else
> +   kvfree(xattr_addr);
> +}
> +
>  static int f2fs_xattr_generic_get(const struct xattr_handler *handler,
> struct dentry *unused, struct inode *inode,
> const char *name, void *buffer, size_t size)
> @@ -301,7 +320,8 @@ static int read_xattr_block(struct inode *inode, void 
> *txattr_addr)
>  static int lookup_all_xattrs(struct inode *inode, struct page *ipage,
> unsigned int index, unsigned int len,
> const char *name, struct f2fs_xattr_entry 
> **xe,
> -   void **base_addr, int *base_size)
> +   void **base_addr, int *base_size,
> +   bool *is_inline)
>  {
> void *cur_addr, *txattr_addr, *last_txattr_addr;
> void *last_addr = NULL;
> @@ -313,7 +333,7 @@ static int lookup_all_xattrs(struct inode *inode, struct 
> page *ipage,
> return -ENODATA;
>
> *base_size = XATTR_SIZE(inode) + XATTR_PADDING_SIZE;
> -   txattr_addr = f2fs_kzalloc(F2FS_I_SB(inode), *base_size, GFP_NOFS);
> +   txattr_addr = xattr_alloc(F2FS_I_SB(inode), *base_size, is_inline);
> if (!txattr_addr)
> return -ENOMEM;
>
> 

Re: [f2fs-dev] [PATCH 3/3] f2fs: avoid unneeded barrier in do_checkpoint()

2020-02-24 Thread Chao Yu
On 2020/2/25 6:00, Jaegeuk Kim wrote:
> On 02/19, Chao Yu wrote:
>> On 2020/2/19 10:51, Jaegeuk Kim wrote:
>>> On 02/18, Chao Yu wrote:
 We don't need to wait for all dirty pages to be submitted for IO twice;
 remove the unneeded wait step.
>>>
>>> What happens if checkpoint and other meta writes are reordered?
>>
>> The checkpoint can be done as follows:
>>
>> 1. All meta except the last cp pack of the checkpoint area.
>> 2. The last cp pack of the checkpoint area.
>>
>> So we only need to keep a barrier between steps 1 and 2; we don't need
>> to care about the write order of the meta in step 1, right?
> 
> Ah, let me integrate this patch into Sahitya's patch.
> f2fs: fix the panic in do_checkpoint()

No problem.

Thanks,

> 
>>
>> Thanks,
>>
>>>

 Signed-off-by: Chao Yu 
 ---
  fs/f2fs/checkpoint.c | 2 --
  1 file changed, 2 deletions(-)

 diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
 index 751815cb4c2b..9c88fb3d255a 100644
 --- a/fs/f2fs/checkpoint.c
 +++ b/fs/f2fs/checkpoint.c
 @@ -1384,8 +1384,6 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, 
 struct cp_control *cpc)
  
/* Flush all the NAT/SIT pages */
f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
 -  /* Wait for all dirty meta pages to be submitted for IO */
 -  f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META);
  
/*
 * modify checkpoint
 -- 
 2.18.0.rc1
>>> .
>>>
> .
> 
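
For readers following the thread, the ordering being discussed is roughly
this (a simplified sketch of do_checkpoint(), not the literal code):

	/* step 1: flush all NAT/SIT and other meta pages; their relative
	 * write order does not matter */
	f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);

	/* the one barrier that matters: all of step 1 must be done before
	 * the final cp pack can make the new checkpoint valid */
	f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META);

	/* step 2: write the last cp pack of the checkpoint area */

The patch above removes the extra wait that used to sit immediately after
the flush, since only the barrier before step 2 is required.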




Re: [f2fs-dev] [PATCH 3/3] f2fs: skip migration only when BG_GC is called

2020-02-24 Thread Chao Yu
On 2020/2/25 5:53, Jaegeuk Kim wrote:
> On 02/19, Chao Yu wrote:
>> On 2020/2/19 11:04, Jaegeuk Kim wrote:
>>> On 02/19, Chao Yu wrote:
 On 2020/2/19 7:27, Jaegeuk Kim wrote:
> On 02/17, Chao Yu wrote:
>> On 2020/2/15 2:58, Jaegeuk Kim wrote:
>>> FG_GC needs to move entire section more quickly.
>>>
>>> Signed-off-by: Jaegeuk Kim 
>>> ---
>>>  fs/f2fs/gc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index bbf4db3f6bb4..1676eebc8c8b 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -1203,7 +1203,7 @@ static int do_garbage_collect(struct f2fs_sb_info 
>>> *sbi,
>>>  
>>> if (get_valid_blocks(sbi, segno, false) == 0)
>>> goto freed;
>>> -   if (__is_large_section(sbi) &&
>>> +   if (gc_type == BG_GC && __is_large_section(sbi) &&
>>> migrated >= sbi->migration_granularity)
>>
>> I know migrating one large section is a more efficient way, but this can
>> occasionally increase the long-tail latency of f2fs_balance_fs(),
>> especially in extremely fragmented space.
>
> FG_GC requires waiting for whole-section migration, which exposes the
> entire latency.

 That will cause long-tail latency for a single f2fs_balance_fs() procedure,
 which looks like a very long hang when userspace calls an f2fs syscall, so
 why not split the total elapsed time across several f2fs_balance_fs() calls
 to avoid that?
>>>
>>> Then, other ops can easily make more dirty segments. The intention of FG_GC 
>>> is
>>
>> Yup, that's a problem: if more dirty data is being made, reserved segments
>> may run out during FG_GC.
>>
>>> to block everything and make min. free segments as a best shot.
>>
>> I'm just trying to simulate the write GC logic in an FTL to mitigate a single
>> op's max latency; otherwise the benchmark looks hung during FG_GC (in a
>> 500MB+ section).

Oh, if we want FG_GC to only migrate one segment at a time, it would also need
a change to the has_not_enough_free_secs() condition, so the previous logic
would be broken.

Nvm, please add:

Reviewed-by: Chao Yu 

> 
> Hmm, I think we may need to think of another way, like doing BG_GC more
> aggressively.

Agreed. I guess the SMR scenario may need such a policy: a higher BG_GC
trigger frequency as free space decreases, but less block migration in each
BG_GC cycle.
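
To make that concrete, such a policy could be sketched as below. This is not
an actual f2fs patch: free_section_ratio() is a made-up helper and the 10%
threshold is arbitrary; only the gc_thread sleep-time fields,
migration_granularity, segs_per_sec and the DEF_GC_THREAD_* constants are
existing f2fs names.

static void tune_bggc_policy(struct f2fs_sb_info *sbi)
{
	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
	unsigned int free_pct = free_section_ratio(sbi);	/* hypothetical */

	if (free_pct < 10) {
		/* low free space: wake BG_GC often, migrate little per cycle */
		gc_th->min_sleep_time = DEF_GC_THREAD_URGENT_SLEEP_TIME;
		sbi->migration_granularity = 1;
	} else {
		/* plenty of space: relax the wakeups, allow whole sections */
		gc_th->min_sleep_time = DEF_GC_THREAD_MIN_SLEEP_TIME;
		sbi->migration_granularity = sbi->segs_per_sec;
	}
}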

Thanks,

> 
>>
>> Thanks,
>>
>>>

 Thanks,

>
>>
>> Thanks,
>>
>>> goto skip;
>>> if (!PageUptodate(sum_page) || 
>>> unlikely(f2fs_cp_error(sbi)))
>>>
> .
>
>>> .
>>>
> .
> 




Re: [f2fs-dev] [RFC PATCH 2/2] f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS

2020-02-24 Thread Chao Yu
On 2020/2/25 5:46, Jaegeuk Kim wrote:
> On 02/21, Chao Yu wrote:
>> There are still reserved blocks on a compressed inode; this patch
>> introduces a new ioctl to help release reserved blocks back to the
>> filesystem, so that userspace can reuse the freed space.
> 
> I thought we could add compressed_blocks into free space after converting
> the file to an immutable one. Otherwise, this patch can corrupt the

Yes, I remember that.

So any suggestions on this patch? We can:
- check the immutable flag before releasing space, or
- set the immutable flag on the compressed inode in this ioctl interface
  (see the sketch below).
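
A rough sketch of the second option, using the existing f2fs flag helpers
(error handling and the actual block-release walk omitted; this is the shape
of the idea, not a final patch):

	inode_lock(inode);

	/* make the file immutable so nothing can dirty it and re-reserve
	 * blocks while (and after) the saved blocks are released */
	F2FS_I(inode)->i_flags |= F2FS_IMMUTABLE_FL;
	f2fs_set_inode_flags(inode);
	f2fs_mark_inode_dirty_sync(inode, true);

	/* ... walk the clusters and release the reserved blocks here ... */

	inode_unlock(inode);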

Thanks,

> filesystem very easily.
> 
>>
>> Signed-off-by: Chao Yu 
>> ---
>>  fs/f2fs/f2fs.h |   6 +++
>>  fs/f2fs/file.c | 129 -
>>  2 files changed, 134 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> index 15199df5d40a..468f807fd917 100644
>> --- a/fs/f2fs/f2fs.h
>> +++ b/fs/f2fs/f2fs.h
>> @@ -427,6 +427,8 @@ static inline bool __has_cursum_space(struct 
>> f2fs_journal *journal,
>>  #define F2FS_IOC_PRECACHE_EXTENTS   _IO(F2FS_IOCTL_MAGIC, 15)
>>  #define F2FS_IOC_RESIZE_FS  _IOW(F2FS_IOCTL_MAGIC, 16, __u64)
>>  #define F2FS_IOC_GET_COMPRESS_BLOCKS_IOR(F2FS_IOCTL_MAGIC, 17, 
>> __u64)
>> +#define F2FS_IOC_RELEASE_COMPRESS_BLOCKS\
>> +_IOR(F2FS_IOCTL_MAGIC, 18, __u64)
>>  
>>  #define F2FS_IOC_GET_VOLUME_NAMEFS_IOC_GETFSLABEL
>>  #define F2FS_IOC_SET_VOLUME_NAMEFS_IOC_SETFSLABEL
>> @@ -3957,6 +3959,10 @@ static inline void f2fs_i_compr_blocks_update(struct 
>> inode *inode,
>>  {
>>  int diff = F2FS_I(inode)->i_cluster_size - blocks;
>>  
>> +/* don't update i_compr_blocks if saved blocks were released */
>> +if (!add && !F2FS_I(inode)->i_compr_blocks)
>> +return;
>> +
>>  if (add) {
>>  F2FS_I(inode)->i_compr_blocks += diff;
>>  stat_add_compr_blocks(inode, diff);
>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>> index 235708c892af..613f87151d90 100644
>> --- a/fs/f2fs/file.c
>> +++ b/fs/f2fs/file.c
>> @@ -557,6 +557,7 @@ void f2fs_truncate_data_blocks_range(struct 
>> dnode_of_data *dn, int count)
>>  bool compressed_cluster = false;
>>  int cluster_index = 0, valid_blocks = 0;
>>  int cluster_size = F2FS_I(dn->inode)->i_cluster_size;
>> +bool released = !F2FS_I(dn->inode)->i_compr_blocks;
>>  
>>  if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode))
>>  base = get_extra_isize(dn->inode);
>> @@ -595,7 +596,9 @@ void f2fs_truncate_data_blocks_range(struct 
>> dnode_of_data *dn, int count)
>>  clear_inode_flag(dn->inode, FI_FIRST_BLOCK_WRITTEN);
>>  
>>  f2fs_invalidate_blocks(sbi, blkaddr);
>> -nr_free++;
>> +
>> +if (released && blkaddr != COMPRESS_ADDR)
>> +nr_free++;
>>  }
>>  
>>  if (compressed_cluster)
>> @@ -3416,6 +3419,127 @@ static int f2fs_get_compress_blocks(struct file 
>> *filp, unsigned long arg)
>>  return put_user(blocks, (u64 __user *)arg);
>>  }
>>  
>> +static int release_compress_blocks(struct dnode_of_data *dn, pgoff_t count)
>> +{
>> +struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
>> +unsigned int released_blocks = 0;
>> +int cluster_size = F2FS_I(dn->inode)->i_cluster_size;
>> +
>> +while (count) {
>> +int compr_blocks = 0;
>> +block_t blkaddr = f2fs_data_blkaddr(dn);
>> +int i;
>> +
>> +if (blkaddr != COMPRESS_ADDR) {
>> +dn->ofs_in_node += cluster_size;
>> +goto next;
>> +}
>> +
>> +for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) {
>> +blkaddr = f2fs_data_blkaddr(dn);
>> +
>> +if (__is_valid_data_blkaddr(blkaddr)) {
>> +compr_blocks++;
>> +if (unlikely(!f2fs_is_valid_blkaddr(sbi, 
>> blkaddr,
>> +DATA_GENERIC_ENHANCE)))
>> +return -EFSCORRUPTED;
>> +}
>> +
>> +if (blkaddr != NEW_ADDR)
>> +continue;
>> +
>> +dn->data_blkaddr = NULL_ADDR;
>> +f2fs_set_data_blkaddr(dn);
>> +}
>> +
>> +f2fs_i_compr_blocks_update(dn->inode, compr_blocks, false);
>> +dec_valid_block_count(sbi, dn->inode,
>> +cluster_size - compr_blocks);
>> +
>> +released_blocks += cluster_size - compr_blocks;
>> +next:
>> +count -= cluster_size;
>> +}
>> +
>> +return released_blocks;
>> +}
>> +
>> +static int f2fs_release_compress_blocks(struct file *filp, unsigned long 
>> arg)
>> +{
>> +struct inode *inode = file_inode(filp);
>> +struct f2fs_sb_in

Re: [f2fs-dev] [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor

2020-02-24 Thread Matthew Wilcox
On Mon, Feb 24, 2020 at 02:17:49PM -0800, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 08:24:04AM -0800, Matthew Wilcox wrote:
> > On Thu, Feb 20, 2020 at 07:47:41AM -0800, Christoph Hellwig wrote:
> > > On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> > > > From: "Matthew Wilcox (Oracle)" 
> > > > 
> > > > By putting the 'have we reached the end of the page' condition at the 
> > > > end
> > > > of the loop instead of the beginning, we can remove the 'submit the last
> > > > page' code from iomap_readpages().  Also check that 
> > > > iomap_readpage_actor()
> > > > didn't return 0, which would lead to an endless loop.
> > > 
> > > I'm obviously biased as I wrote the original code, but I find the new one
> > > much harder to understand (not that the previous one was easy, this
> > > is tricky code..).
> > 
> > Agreed, I found the original code hard to understand.  I think this is
> > easier because now cur_page doesn't leak outside this loop, so it has
> > an obvious lifecycle.
> 
> I really don't like this patch, and would prefer if the series goes
> ahead without it, as the current structure works just fine even
> with the readahead changes.

Dave Chinner specifically asked me to do it this way, so please fight
amongst yourselves.




[f2fs-dev] [PATCH] f2fs: use kmem_cache pool during inline xattr lookups

2020-02-24 Thread Chao Yu
It's been observed that kzalloc() in lookup_all_xattrs() is called millions
of times on Android, quickly becoming the top abuser of the slub memory
allocator.

Use a dedicated kmem cache pool for xattr lookups to mitigate this.

Signed-off-by: Park Ju Hyung 
Signed-off-by: Chao Yu 
---

Hi, Ju Hyung,

Let me know if you have any objection on this patch. :)

 fs/f2fs/f2fs.h  |  3 +++
 fs/f2fs/super.c | 10 -
 fs/f2fs/xattr.c | 54 -
 fs/f2fs/xattr.h |  4 
 4 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 12a197e89a3e..23b93a116c73 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1486,6 +1486,9 @@ struct f2fs_sb_info {
__u32 s_chksum_seed;
 
struct workqueue_struct *post_read_wq;  /* post read workqueue */
+
+   struct kmem_cache *inline_xattr_slab;   /* inline xattr entry */
+   unsigned int inline_xattr_slab_size;/* default inline xattr slab 
size */
 };
 
 struct f2fs_private_dio {
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index d241e07c0bfa..0b16204d3b7d 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1201,6 +1201,7 @@ static void f2fs_put_super(struct super_block *sb)
kvfree(sbi->raw_super);
 
destroy_device_list(sbi);
+   f2fs_destroy_xattr_caches(sbi);
mempool_destroy(sbi->write_io_dummy);
 #ifdef CONFIG_QUOTA
for (i = 0; i < MAXQUOTAS; i++)
@@ -3457,12 +3458,17 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
}
}
 
+   /* init per sbi slab cache */
+   err = f2fs_init_xattr_caches(sbi);
+   if (err)
+   goto free_io_dummy;
+
/* get an inode for meta space */
sbi->meta_inode = f2fs_iget(sb, F2FS_META_INO(sbi));
if (IS_ERR(sbi->meta_inode)) {
f2fs_err(sbi, "Failed to read F2FS meta data inode");
err = PTR_ERR(sbi->meta_inode);
-   goto free_io_dummy;
+   goto free_xattr_cache;
}
 
err = f2fs_get_valid_checkpoint(sbi);
@@ -3735,6 +3741,8 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
make_bad_inode(sbi->meta_inode);
iput(sbi->meta_inode);
sbi->meta_inode = NULL;
+free_xattr_cache:
+   f2fs_destroy_xattr_caches(sbi);
 free_io_dummy:
mempool_destroy(sbi->write_io_dummy);
 free_percpu:
diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
index a3360a97e624..e46a10eb0e42 100644
--- a/fs/f2fs/xattr.c
+++ b/fs/f2fs/xattr.c
@@ -23,6 +23,25 @@
 #include "xattr.h"
 #include "segment.h"
 
+static void *xattr_alloc(struct f2fs_sb_info *sbi, int size, bool *is_inline)
+{
+   *is_inline = (size == sbi->inline_xattr_slab_size);
+
+   if (*is_inline)
+   return kmem_cache_zalloc(sbi->inline_xattr_slab, GFP_NOFS);
+
+   return f2fs_kzalloc(sbi, size, GFP_NOFS);
+}
+
+static void xattr_free(struct f2fs_sb_info *sbi, void *xattr_addr,
+   bool is_inline)
+{
+   if (is_inline)
+   kmem_cache_free(sbi->inline_xattr_slab, xattr_addr);
+   else
+   kvfree(xattr_addr);
+}
+
 static int f2fs_xattr_generic_get(const struct xattr_handler *handler,
struct dentry *unused, struct inode *inode,
const char *name, void *buffer, size_t size)
@@ -301,7 +320,8 @@ static int read_xattr_block(struct inode *inode, void 
*txattr_addr)
 static int lookup_all_xattrs(struct inode *inode, struct page *ipage,
unsigned int index, unsigned int len,
const char *name, struct f2fs_xattr_entry **xe,
-   void **base_addr, int *base_size)
+   void **base_addr, int *base_size,
+   bool *is_inline)
 {
void *cur_addr, *txattr_addr, *last_txattr_addr;
void *last_addr = NULL;
@@ -313,7 +333,7 @@ static int lookup_all_xattrs(struct inode *inode, struct 
page *ipage,
return -ENODATA;
 
*base_size = XATTR_SIZE(inode) + XATTR_PADDING_SIZE;
-   txattr_addr = f2fs_kzalloc(F2FS_I_SB(inode), *base_size, GFP_NOFS);
+   txattr_addr = xattr_alloc(F2FS_I_SB(inode), *base_size, is_inline);
if (!txattr_addr)
return -ENOMEM;
 
@@ -362,7 +382,7 @@ static int lookup_all_xattrs(struct inode *inode, struct 
page *ipage,
*base_addr = txattr_addr;
return 0;
 out:
-   kvfree(txattr_addr);
+   xattr_free(F2FS_I_SB(inode), txattr_addr, *is_inline);
return err;
 }
 
@@ -499,6 +519,7 @@ int f2fs_getxattr(struct inode *inode, int index, const 
char *name,
unsigned int size, len;
void *base_addr = NULL;
int base_size;
+   bool is_inline;
 
if (name == NULL)
return -EINVAL;
@@ -509,7 +530,7 @@ int f2fs_getxattr(struct inode *inode, int 

Re: [f2fs-dev] [PATCH v7 6/9] scsi: ufs: Add inline encryption support to UFS

2020-02-24 Thread Christoph Hellwig
On Sun, Feb 23, 2020 at 09:47:36PM +0800, Stanley Chu wrote:
> Yes, MediaTek is working closely with the inline encryption patch sets.
> Currently the v6 version works well (without the
> UFSHCD_QUIRK_BROKEN_CRYPTO quirk), at least on our MT6779 SoC platform,
> for which basic SoC support and some other peripheral drivers are being
> upstreamed; see the link below:
> 
> https://patchwork.kernel.org/project/linux-mediatek/list/?state=%
> 2A&q=6779&series=&submitter=&delegate=&archive=both
> 
> Integrating with the inline encryption patch set requires patching
> ufs-mediatek, and those patches are ready downstream. We plan to upstream
> them soon after the inline encryption patch set gets merged.

What amount of support do you need in ufs-mediatek?  It seems like
pretty much every ufs low-level driver needs some kind of specific
support now, right?  I wonder if we should instead opt into the support
instead of all the quirking here.




Re: [f2fs-dev] [PATCH v7 3/9] block: blk-crypto-fallback for Inline Encryption

2020-02-24 Thread Christoph Hellwig
On Fri, Feb 21, 2020 at 10:34:37AM -0800, Eric Biggers wrote:
> On Fri, Feb 21, 2020 at 09:35:39AM -0800, Christoph Hellwig wrote:
> > High-level question:  Does the whole keyslot manager concept even make
> > sense for the fallback?  With the work-queue we have one item that executes
> > at a time per CPU.  So just allocate a per-cpu crypto_skcipher for
> > each encryption mode and there should never be a slot limitation.  Or
> > am I missing something?
> 
> It does make sense because if blk-crypto-fallback didn't use a keyslot 
> manager,
> it would have to call crypto_skcipher_setkey() on the I/O path for every bio 
> to
> ensure that the CPU's crypto_skcipher has the correct key.  That's 
> undesirable,
> because setting a new key can be expensive with some encryption algorithms, 
> and
> also it can require a memory allocation which can fail.  For example, with the
> Adiantum algorithm, setting a key requires encrypting ~1100 bytes of data in
> order to generate subkeys.  It's better to set a key once and use it many 
> times.

I didn't think of such expensive operations when setting the key.
Note that you would not have to do it on every I/O, as chances are high
you'll get I/O from the same submitter and thus the same key, and we
can optimize for that case pretty easily.

But if you think the keyslot manager is better I accept that; this was
just a thought when looking over the code.
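
For reference, the optimization hinted at above could look roughly like this.
It is only a sketch, not the blk-crypto-fallback code; the structure and
helper names are made up (and a real version would have to weigh the security
implications of keeping a raw copy of the key around):

#include <crypto/skcipher.h>
#include <crypto/algapi.h>

struct fallback_tfm {
	struct crypto_skcipher *tfm;
	u8 last_key[BLK_CRYPTO_MAX_KEY_SIZE];	/* size constant from the series */
	unsigned int last_key_len;
};

static int fallback_set_key(struct fallback_tfm *f,
			    const u8 *key, unsigned int len)
{
	int err;

	/* skip the expensive setkey when the submitter reuses the same key */
	if (len == f->last_key_len && !crypto_memneq(key, f->last_key, len))
		return 0;

	err = crypto_skcipher_setkey(f->tfm, key, len);
	if (!err) {
		memcpy(f->last_key, key, len);
		f->last_key_len = len;
	}
	return err;
}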




Re: [f2fs-dev] [PATCH v7 2/9] block: Inline encryption support for blk-mq

2020-02-24 Thread Christoph Hellwig
On Fri, Feb 21, 2020 at 04:52:33PM -0800, Satya Tangirala wrote:
> > What is the rationale for this limitation?  Restricting unrelated
> > features from being used together is a pretty bad design pattern and
> > should be avoided where possible.  If it can't it needs to be documented
> > very clearly.
> > 
> My understanding of blk-integrity is that for writes, blk-integrity
> generates some integrity info for a bio and sends it along with the bio,
> and the device on the other end verifies that the data it received to
> write matches up with the integrity info provided with the bio, and
> saves the integrity info along with the data. As for reads, the device
> sends the data along with the saved integrity info and blk-integrity
> verifies that the data received matches up with the integrity info.

Yes, a device supporting inline encryption and integrity will have to
update the guard tag to match the encrypted data as well.  That alone
is a good enough reason to reject the combination for now until it
is fully supported.  It needs to be properly documented, and I think
we should also do it at probe time if possible, not when submitting
I/O.




Re: [f2fs-dev] [PATCH v7 21/24] iomap: Restructure iomap_readpages_actor

2020-02-24 Thread Christoph Hellwig
On Thu, Feb 20, 2020 at 08:24:04AM -0800, Matthew Wilcox wrote:
> On Thu, Feb 20, 2020 at 07:47:41AM -0800, Christoph Hellwig wrote:
> > On Wed, Feb 19, 2020 at 01:01:00PM -0800, Matthew Wilcox wrote:
> > > From: "Matthew Wilcox (Oracle)" 
> > > 
> > > By putting the 'have we reached the end of the page' condition at the end
> > > of the loop instead of the beginning, we can remove the 'submit the last
> > > page' code from iomap_readpages().  Also check that iomap_readpage_actor()
> > > didn't return 0, which would lead to an endless loop.
> > 
> > I'm obviously biased as I wrote the original code, but I find the new one
> > much harder to understand (not that the previous one was easy, this
> > is tricky code..).
> 
> Agreed, I found the original code hard to understand.  I think this is
> easier because now cur_page doesn't leak outside this loop, so it has
> an obvious lifecycle.

I really don't like this patch, and would prefer if the series goes
ahead without it, as the current structure works just fine even
with the readahead changes.




Re: [f2fs-dev] [PATCH v2] f2fs: introduce DEFAULT_IO_TIMEOUT

2020-02-24 Thread Jaegeuk Kim
On 02/20, Chao Yu wrote:
> Hi Jaegeuk,
> 
> Could you help to adapt this change for f2fs_wait_on_all_pages() manually
> in your tree?
> 
> - io_schedule_timeout(HZ/50);
> + io_schedule_timeout(DEFAULT_IO_TIMEOUT);

Done.

> 
> Thanks,
> 
> On 2020/2/17 17:45, Chao Yu wrote:
> > As Geert Uytterhoeven reported:
> > 
> > for parameter HZ/50 in congestion_wait(BLK_RW_ASYNC, HZ/50);
> > 
> > On some platforms, HZ can be less than 50, so an unexpected 0-jiffy
> > timeout will be passed to congestion_wait().
> > 
> > This patch introduces a macro DEFAULT_IO_TIMEOUT wrapping
> > msecs_to_jiffies(20) to replace HZ/50 and avoid such an issue.
> > 
> > Quoted from Geert Uytterhoeven:
> > 
> > "A timeout of HZ means 1 second.
> > HZ/50 means 20 ms, but has the risk of being zero, if HZ < 50.
> > 
> > If you want to use a timeout of 20 ms, you best use msecs_to_jiffies(20),
> > as that takes care of the special cases, and never returns 0."
> > 
> > Signed-off-by: Chao Yu 
> > ---
> > v2:
> > - use msecs_to_jiffies(20) instead.
> >  fs/f2fs/compress.c |  3 ++-
> >  fs/f2fs/data.c |  4 ++--
> >  fs/f2fs/f2fs.h |  3 +++
> >  fs/f2fs/gc.c   |  3 ++-
> >  fs/f2fs/inode.c|  2 +-
> >  fs/f2fs/node.c |  2 +-
> >  fs/f2fs/recovery.c |  5 +++--
> >  fs/f2fs/segment.c  | 10 ++
> >  fs/f2fs/super.c|  6 --
> >  9 files changed, 24 insertions(+), 14 deletions(-)
> > 
> > diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
> > index 0282149aa4c8..b9ea531ffbb6 100644
> > --- a/fs/f2fs/compress.c
> > +++ b/fs/f2fs/compress.c
> > @@ -985,7 +985,8 @@ static int f2fs_write_raw_pages(struct compress_ctx *cc,
> > } else if (ret == -EAGAIN) {
> > ret = 0;
> > cond_resched();
> > -   congestion_wait(BLK_RW_ASYNC, HZ/50);
> > +   congestion_wait(BLK_RW_ASYNC,
> > +   DEFAULT_IO_TIMEOUT);
> > lock_page(cc->rpages[i]);
> > clear_page_dirty_for_io(cc->rpages[i]);
> > goto retry_write;
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 019c91f7b301..a877cc50a4c3 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -2332,7 +2332,7 @@ int f2fs_encrypt_one_page(struct f2fs_io_info *fio)
> > /* flush pending IOs and wait for a while in the ENOMEM case */
> > if (PTR_ERR(fio->encrypted_page) == -ENOMEM) {
> > f2fs_flush_merged_writes(fio->sbi);
> > -   congestion_wait(BLK_RW_ASYNC, HZ/50);
> > +   congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT);
> > gfp_flags |= __GFP_NOFAIL;
> > goto retry_encrypt;
> > }
> > @@ -2923,7 +2923,7 @@ static int f2fs_write_cache_pages(struct 
> > address_space *mapping,
> > if (wbc->sync_mode == WB_SYNC_ALL) {
> > cond_resched();
> > congestion_wait(BLK_RW_ASYNC,
> > -   HZ/50);
> > +   DEFAULT_IO_TIMEOUT);
> > goto retry_write;
> > }
> > goto next;
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 816a5adb83a4..72c91a6dd549 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -558,6 +558,9 @@ enum {
> >  
> >  #define DEFAULT_RETRY_IO_COUNT 8   /* maximum retry read IO count 
> > */
> >  
> > +/* congestion wait timeout value, default: 20ms */
> > +#defineDEFAULT_IO_TIMEOUT  (msecs_to_jiffies(20))
> > +
> >  /* maximum retry quota flush count */
> >  #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT8
> >  
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 897de003e423..3db11d5a50d5 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -974,7 +974,8 @@ static int move_data_page(struct inode *inode, block_t 
> > bidx, int gc_type,
> > if (err) {
> > clear_cold_data(page);
> > if (err == -ENOMEM) {
> > -   congestion_wait(BLK_RW_ASYNC, HZ/50);
> > +   congestion_wait(BLK_RW_ASYNC,
> > +   DEFAULT_IO_TIMEOUT);
> > goto retry;
> > }
> > if (is_dirty)
> > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > index 78c3f1d70f1d..156cc5ef3044 100644
> > --- a/fs/f2fs/inode.c
> > +++ b/fs/f2fs/inode.c
> > @@ -518,7 +518,7 @@ struct inode *f2fs_iget_retry(struct super_block *sb, 
> > unsigned long ino)
> > inode = f2fs_iget(sb, ino);
> > if (IS_ERR(inode)) {
> > 

Re: [f2fs-dev] [PATCH 3/3] f2fs: avoid unneeded barrier in do_checkpoint()

2020-02-24 Thread Jaegeuk Kim
On 02/19, Chao Yu wrote:
> On 2020/2/19 10:51, Jaegeuk Kim wrote:
> > On 02/18, Chao Yu wrote:
> >> We don't need to wait for all dirty pages to be submitted for IO twice;
> >> remove the unneeded wait step.
> > 
> > What happens if checkpoint and other meta writes are reordered?
> 
> The checkpoint can be done as follows:
> 
> 1. All meta except the last cp pack of the checkpoint area.
> 2. The last cp pack of the checkpoint area.
> 
> So we only need to keep a barrier between steps 1 and 2; we don't need
> to care about the write order of the meta in step 1, right?

Ah, let me integrate this patch into Sahitya's patch.

f2fs: fix the panic in do_checkpoint()

> 
> Thanks,
> 
> > 
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/checkpoint.c | 2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >> index 751815cb4c2b..9c88fb3d255a 100644
> >> --- a/fs/f2fs/checkpoint.c
> >> +++ b/fs/f2fs/checkpoint.c
> >> @@ -1384,8 +1384,6 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, 
> >> struct cp_control *cpc)
> >>  
> >>/* Flush all the NAT/SIT pages */
> >>f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
> >> -  /* Wait for all dirty meta pages to be submitted for IO */
> >> -  f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META);
> >>  
> >>/*
> >> * modify checkpoint
> >> -- 
> >> 2.18.0.rc1
> > .
> > 




Re: [f2fs-dev] [PATCH v7 14/24] btrfs: Convert from readpages to readahead

2020-02-24 Thread Christoph Hellwig
On Mon, Feb 24, 2020 at 01:54:14PM -0800, Matthew Wilcox wrote:
> > First I think the implicit ARRAY_SIZE in readahead_page_batch is highly
> > dangerous, as it will do the wrong thing when passing a pointer or
> > function argument.
> 
> somebody already thought of that ;-)
> 
> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
> __must_be_array(arr))

Ok.  Still, I find it pretty weird to design a primary interface that
only works with an array type.




Re: [f2fs-dev] [PATCH v7 14/24] btrfs: Convert from readpages to readahead

2020-02-24 Thread Matthew Wilcox
On Mon, Feb 24, 2020 at 01:43:47PM -0800, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 07:57:27AM -0800, Christoph Hellwig wrote:
> > On Thu, Feb 20, 2020 at 07:54:52AM -0800, Matthew Wilcox wrote:
> > > On Thu, Feb 20, 2020 at 07:46:58AM -0800, Christoph Hellwig wrote:
> > > > On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> > > > > btrfs: Convert from readpages to readahead
> > > > >   
> > > > > Implement the new readahead method in btrfs.  Add a 
> > > > > readahead_page_batch()
> > > > > to optimise fetching a batch of pages at once.
> > > > 
> > > > Shouldn't this readahead_page_batch helper go into a separate patch so
> > > > that it clearly stands out?
> > > 
> > > I'll move it into 'Put readahead pages in cache earlier' for v8 (the
> > > same patch where we add readahead_page())
> > 
> > One argument for keeping it in a patch of its own is that btrfs appears
> > to be the only user, and Goldwyn has a WIP conversion of btrfs to iomap,
> > so it might go away pretty soon and we could just revert the commit.
> > 
> > But this starts to get into really minor details, so I'll shut up now :)
> 
> So looking at this again I have another comment and a question.
> 
> First I think the implicit ARRAY_SIZE in readahead_page_batch is highly
> dangerous, as it will do the wrong thing when passing a pointer or
> function argument.

somebody already thought of that ;-)

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))

> Second I wonder if it would be worth also switching to a batched
> operation in iomap if the xarray overhead is high enough.  That should
> be pretty trivial, but we don't really need to do it in this series.

I've also considered keeping a small array of pointers inside the
readahead_control so nobody needs to have a readahead_page_batch()
operation.  Even keeping 10 pointers in there will reduce the XArray
overhead by 90%.  But this fits the current btrfs model well, and it
lets us play with different approaches by abstracting everything away.
I'm sure this won't be the last patch that touches the readahead code ;-)
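
For the curious, that idea would amount to something like the sketch below
(not part of this series; the array size and the extra fields are
hypothetical additions to the structure from the patches):

struct readahead_control {
	struct file *file;
	struct address_space *mapping;
/* private: use the readahead_* accessors instead */
	pgoff_t _index;
	unsigned int _nr_pages;
	unsigned int _batch_count;
	/* hypothetical: cache a few page pointers so readahead_page()
	 * only has to walk the XArray once per ARRAY_SIZE(_batch) pages */
	struct page *_batch[10];
	unsigned int _batch_used;
};

readahead_page() would then refill _batch from the XArray only when it runs
dry, instead of doing one xa_load() per page.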




Re: [f2fs-dev] [Cluster-devel] [PATCH v7 13/24] fs: Convert mpage_readpages to mpage_readahead

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:52PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> Implement the new readahead aop and convert all callers (block_dev,
> exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
> reiserfs & udf).  The callers are all trivial except for GFS2 & OCFS2.

Looks sensible:

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH 3/3] f2fs: skip migration only when BG_GC is called

2020-02-24 Thread Jaegeuk Kim
On 02/19, Chao Yu wrote:
> On 2020/2/19 11:04, Jaegeuk Kim wrote:
> > On 02/19, Chao Yu wrote:
> >> On 2020/2/19 7:27, Jaegeuk Kim wrote:
> >>> On 02/17, Chao Yu wrote:
>  On 2020/2/15 2:58, Jaegeuk Kim wrote:
> > FG_GC needs to move entire section more quickly.
> >
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/gc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index bbf4db3f6bb4..1676eebc8c8b 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -1203,7 +1203,7 @@ static int do_garbage_collect(struct f2fs_sb_info 
> > *sbi,
> >  
> > if (get_valid_blocks(sbi, segno, false) == 0)
> > goto freed;
> > -   if (__is_large_section(sbi) &&
> > +   if (gc_type == BG_GC && __is_large_section(sbi) &&
> > migrated >= sbi->migration_granularity)
> 
>  I know migrating one large section is a more efficient way, but this can
>  occasionally increase the long-tail latency of f2fs_balance_fs(),
>  especially in extremely fragmented space.
> >>>
> >>> FG_GC requires waiting for whole-section migration, which exposes the
> >>> entire latency.
> >>
> >> That will cause long-tail latency for a single f2fs_balance_fs() procedure,
> >> which looks like a very long hang when userspace calls an f2fs syscall, so
> >> why not split the total elapsed time across several f2fs_balance_fs() calls
> >> to avoid that?
> > 
> > Then, other ops can easily make more dirty segments. The intention of FG_GC 
> > is
> 
> Yup, that's a problem: if more dirty data is being made, reserved segments
> may run out during FG_GC.
> 
> > to block everything and make min. free segments as a best shot.
> 
> I'm just trying to simulate the write GC logic in an FTL to mitigate a single
> op's max latency; otherwise the benchmark looks hung during FG_GC (in a
> 500MB+ section).

Hmm, I think we may need to think of another way, like doing BG_GC more
aggressively.

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> 
>  Thanks,
> 
> > goto skip;
> > if (!PageUptodate(sum_page) || 
> > unlikely(f2fs_cp_error(sbi)))
> >
> >>> .
> >>>
> > .
> > 




Re: [f2fs-dev] [Cluster-devel] [PATCH v7 12/24] mm: Add page_cache_readahead_unbounded

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:51PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> ext4 and f2fs have duplicated the guts of the readahead code so
> they can read past i_size.  Instead, separate out the guts of the
> readahead code so they can call it directly.

I don't like this, but then I like the horrible open-coded versions
even less...  Can you add a "do not use for new code" comment to the
function as well?

Otherwise looks good:

Reviewed-by: Christoph Hellwig 
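
For illustration, the kind of comment being asked for might read something
like this (the wording is only a suggestion):

/*
 * Do not use this in new callers.  It only exists because ext4 and f2fs
 * need to read pages beyond i_size for their verity support; everything
 * else should use the normal readahead entry points.
 */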




Re: [f2fs-dev] [RFC PATCH 2/2] f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS

2020-02-24 Thread Jaegeuk Kim
On 02/21, Chao Yu wrote:
> There are still reserved blocks on a compressed inode; this patch
> introduces a new ioctl to help release reserved blocks back to the
> filesystem, so that userspace can reuse the freed space.

I thought we could add compressed_blocks into free space after converting
the file to an immutable one. Otherwise, this patch can corrupt the
filesystem very easily.

> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/f2fs.h |   6 +++
>  fs/f2fs/file.c | 129 -
>  2 files changed, 134 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 15199df5d40a..468f807fd917 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -427,6 +427,8 @@ static inline bool __has_cursum_space(struct f2fs_journal 
> *journal,
>  #define F2FS_IOC_PRECACHE_EXTENTS_IO(F2FS_IOCTL_MAGIC, 15)
>  #define F2FS_IOC_RESIZE_FS   _IOW(F2FS_IOCTL_MAGIC, 16, __u64)
>  #define F2FS_IOC_GET_COMPRESS_BLOCKS _IOR(F2FS_IOCTL_MAGIC, 17, __u64)
> +#define F2FS_IOC_RELEASE_COMPRESS_BLOCKS \
> + _IOR(F2FS_IOCTL_MAGIC, 18, __u64)
>  
>  #define F2FS_IOC_GET_VOLUME_NAME FS_IOC_GETFSLABEL
>  #define F2FS_IOC_SET_VOLUME_NAME FS_IOC_SETFSLABEL
> @@ -3957,6 +3959,10 @@ static inline void f2fs_i_compr_blocks_update(struct 
> inode *inode,
>  {
>   int diff = F2FS_I(inode)->i_cluster_size - blocks;
>  
> + /* don't update i_compr_blocks if saved blocks were released */
> + if (!add && !F2FS_I(inode)->i_compr_blocks)
> + return;
> +
>   if (add) {
>   F2FS_I(inode)->i_compr_blocks += diff;
>   stat_add_compr_blocks(inode, diff);
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 235708c892af..613f87151d90 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -557,6 +557,7 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data 
> *dn, int count)
>   bool compressed_cluster = false;
>   int cluster_index = 0, valid_blocks = 0;
>   int cluster_size = F2FS_I(dn->inode)->i_cluster_size;
> + bool released = !F2FS_I(dn->inode)->i_compr_blocks;
>  
>   if (IS_INODE(dn->node_page) && f2fs_has_extra_attr(dn->inode))
>   base = get_extra_isize(dn->inode);
> @@ -595,7 +596,9 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data 
> *dn, int count)
>   clear_inode_flag(dn->inode, FI_FIRST_BLOCK_WRITTEN);
>  
>   f2fs_invalidate_blocks(sbi, blkaddr);
> - nr_free++;
> +
> + if (released && blkaddr != COMPRESS_ADDR)
> + nr_free++;
>   }
>  
>   if (compressed_cluster)
> @@ -3416,6 +3419,127 @@ static int f2fs_get_compress_blocks(struct file 
> *filp, unsigned long arg)
>   return put_user(blocks, (u64 __user *)arg);
>  }
>  
> +static int release_compress_blocks(struct dnode_of_data *dn, pgoff_t count)
> +{
> + struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
> + unsigned int released_blocks = 0;
> + int cluster_size = F2FS_I(dn->inode)->i_cluster_size;
> +
> + while (count) {
> + int compr_blocks = 0;
> + block_t blkaddr = f2fs_data_blkaddr(dn);
> + int i;
> +
> + if (blkaddr != COMPRESS_ADDR) {
> + dn->ofs_in_node += cluster_size;
> + goto next;
> + }
> +
> + for (i = 0; i < cluster_size; i++, dn->ofs_in_node++) {
> + blkaddr = f2fs_data_blkaddr(dn);
> +
> + if (__is_valid_data_blkaddr(blkaddr)) {
> + compr_blocks++;
> + if (unlikely(!f2fs_is_valid_blkaddr(sbi, 
> blkaddr,
> + DATA_GENERIC_ENHANCE)))
> + return -EFSCORRUPTED;
> + }
> +
> + if (blkaddr != NEW_ADDR)
> + continue;
> +
> + dn->data_blkaddr = NULL_ADDR;
> + f2fs_set_data_blkaddr(dn);
> + }
> +
> + f2fs_i_compr_blocks_update(dn->inode, compr_blocks, false);
> + dec_valid_block_count(sbi, dn->inode,
> + cluster_size - compr_blocks);
> +
> + released_blocks += cluster_size - compr_blocks;
> +next:
> + count -= cluster_size;
> + }
> +
> + return released_blocks;
> +}
> +
> +static int f2fs_release_compress_blocks(struct file *filp, unsigned long arg)
> +{
> + struct inode *inode = file_inode(filp);
> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> + pgoff_t page_idx = 0, last_idx;
> + unsigned int released_blocks = 0;
> + int ret;
> +
> + if (!f2fs_sb_has_compression(F2FS_I_SB(inode)))
> + return -EOPNOTSUPP;
> +
> + if (!f2fs_compressed_file(inode))
> + return -EINVAL;
> +
> + if (

Re: [f2fs-dev] [PATCH v7 14/24] btrfs: Convert from readpages to readahead

2020-02-24 Thread Christoph Hellwig
On Thu, Feb 20, 2020 at 07:57:27AM -0800, Christoph Hellwig wrote:
> On Thu, Feb 20, 2020 at 07:54:52AM -0800, Matthew Wilcox wrote:
> > On Thu, Feb 20, 2020 at 07:46:58AM -0800, Christoph Hellwig wrote:
> > > On Thu, Feb 20, 2020 at 05:48:49AM -0800, Matthew Wilcox wrote:
> > > > btrfs: Convert from readpages to readahead
> > > >   
> > > > Implement the new readahead method in btrfs.  Add a 
> > > > readahead_page_batch()
> > > > to optimise fetching a batch of pages at once.
> > > 
> > > Shouldn't this readahead_page_batch helper go into a separate patch so
> > > that it clearly stands out?
> > 
> > I'll move it into 'Put readahead pages in cache earlier' for v8 (the
> > same patch where we add readahead_page())
> 
> One argument for keeping it in a patch of its own is that btrfs appears
> to be the only user, and Goldwyn has a WIP conversion of btrfs to iomap,
> so it might go away pretty soon and we could just revert the commit.
> 
> But this starts to get into really minor details, so I'll shut up now :)

So looking at this again I have another comment and a question.

First I think the implicit ARRAY_SIZE in readahead_page_batch is highly
dangerous, as it will do the wrong thing when passing a pointer or
function argument.

Second I wonder if it would be worth also switching to a batched
operation in iomap if the xarray overhead is high enough.  That should
be pretty trivial, but we don't really need to do it in this series.




Re: [f2fs-dev] [PATCH v7 10/24] mm: Add readahead address space operation

2020-02-24 Thread Christoph Hellwig
Looks fine modulo the little data type issue:

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 09/24] mm: Put readahead pages in cache earlier

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:48PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> When populating the page cache for readahead, mappings that use
> ->readpages must populate the page cache themselves as the pages are
> passed on a linked list which would normally be used for the page cache's
> LRU.  For mappings that use ->readpage or the upcoming ->readahead method,
> we can put the pages into the page cache as soon as they're allocated,
> which solves a race between readahead and direct IO.  It also lets us
> remove the gfp argument from read_pages().
> 
> Use the new readahead_page() API to implement the repeated calls to
> ->readpage(), just like most filesystems will.  This iterator also
> supports huge pages, even though none of the filesystems have been
> converted to use them yet.
> 
> Signed-off-by: Matthew Wilcox (Oracle) 
> ---
>  include/linux/pagemap.h | 20 +
>  mm/readahead.c  | 48 +
>  2 files changed, 49 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 55fcea0249e6..4989d330fada 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -647,8 +647,28 @@ struct readahead_control {
>  /* private: use the readahead_* accessors instead */
>   pgoff_t _index;
>   unsigned int _nr_pages;
> + unsigned int _batch_count;
>  };
>  
> +static inline struct page *readahead_page(struct readahead_control *rac)
> +{
> + struct page *page;
> +
> + BUG_ON(rac->_batch_count > rac->_nr_pages);
> + rac->_nr_pages -= rac->_batch_count;
> + rac->_index += rac->_batch_count;
> + rac->_batch_count = 0;
> +
> + if (!rac->_nr_pages)
> + return NULL;
> +
> + page = xa_load(&rac->mapping->i_pages, rac->_index);
> + VM_BUG_ON_PAGE(!PageLocked(page), page);
> + rac->_batch_count = hpage_nr_pages(page);
> +
> + return page;
> +}
> +
>  /* The number of pages in this readahead block */
>  static inline unsigned int readahead_count(struct readahead_control *rac)
>  {
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 83df5c061d33..aaa209559ba2 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -113,15 +113,14 @@ int read_cache_pages(struct address_space *mapping, 
> struct list_head *pages,
>  
>  EXPORT_SYMBOL(read_cache_pages);
>  
> -static void read_pages(struct readahead_control *rac, struct list_head 
> *pages,
> - gfp_t gfp)
> +static void read_pages(struct readahead_control *rac, struct list_head 
> *pages)
>  {
>   const struct address_space_operations *aops = rac->mapping->a_ops;
> + struct page *page;
>   struct blk_plug plug;
> - unsigned page_idx;
>  
>   if (!readahead_count(rac))
> - return;
> + goto out;
>  
>   blk_start_plug(&plug);
>  
> @@ -130,23 +129,23 @@ static void read_pages(struct readahead_control *rac, 
> struct list_head *pages,
>   readahead_count(rac));
>   /* Clean up the remaining pages */
>   put_pages_list(pages);
> - goto out;
> - }
> -
> - for (page_idx = 0; page_idx < readahead_count(rac); page_idx++) {
> - struct page *page = lru_to_page(pages);
> - list_del(&page->lru);
> - if (!add_to_page_cache_lru(page, rac->mapping, page->index,
> - gfp))
> + rac->_index += rac->_nr_pages;
> + rac->_nr_pages = 0;
> + } else {
> + while ((page = readahead_page(rac))) {
>   aops->readpage(rac->file, page);
> - put_page(page);
> + put_page(page);
> + }
>   }
>  
> -out:
>   blk_finish_plug(&plug);
>  
>   BUG_ON(!list_empty(pages));
> - rac->_nr_pages = 0;
> + BUG_ON(readahead_count(rac));
> +
> +out:
> + /* If we were called due to a conflicting page, skip over it */
> + rac->_index++;
>  }
>  
>  /*
> @@ -165,9 +164,11 @@ void __do_page_cache_readahead(struct address_space 
> *mapping,
>   LIST_HEAD(page_pool);
>   loff_t isize = i_size_read(inode);
>   gfp_t gfp_mask = readahead_gfp_mask(mapping);
> + bool use_list = mapping->a_ops->readpages;

I find this single-use variable a little weird.  Not a dealbreaker,
but just checking the methods would seem a little more obvious to me.

Except for this and the other nitpick the patch looks good to me:

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 08/24] mm: Remove 'page_offset' from readahead loop

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:47PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> Replace the page_offset variable with 'index + i'.
> 
> Signed-off-by: Matthew Wilcox (Oracle) 

Looks good,

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 05/24] mm: Use readahead_control to pass arguments

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:44PM -0800, Matthew Wilcox wrote:
> @@ -160,9 +164,13 @@ void __do_page_cache_readahead(struct address_space 
> *mapping,
>   unsigned long end_index;/* The last page we want to read */
>   LIST_HEAD(page_pool);
>   int page_idx;
> - unsigned int nr_pages = 0;
>   loff_t isize = i_size_read(inode);
>   gfp_t gfp_mask = readahead_gfp_mask(mapping);
> + struct readahead_control rac = {
> + .mapping = mapping,
> + .file = filp,
> + ._nr_pages = 0,

There is no real need to initialize fields to zero if we initialize
the structure at all.  And while I'd normally not care that much,
having as few references as possible to the _-prefixed internal
variables helps make it clear how internal they are.

Otherwise looks good:

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 04/24] mm: Move readahead nr_pages check into read_pages

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:43PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> Simplify the callers by moving the check for nr_pages and the BUG_ON
> into read_pages().
> 
> Signed-off-by: Matthew Wilcox (Oracle) 

Looks good,

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 02/24] mm: Return void from various readahead functions

2020-02-24 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [PATCH v7 01/24] mm: Move readahead prototypes from mm.h

2020-02-24 Thread Christoph Hellwig
On Wed, Feb 19, 2020 at 01:00:40PM -0800, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" 
> 
> The readahead code is part of the page cache so should be found in the
> pagemap.h file.  force_page_cache_readahead is only used within mm,
> so move it to mm/internal.h instead.  Remove the parameter names where
> they add no value, and rename the ones which were actively misleading.

Looks good,

Reviewed-by: Christoph Hellwig 




Re: [f2fs-dev] [man-pages PATCH v2] statx.2: document STATX_ATTR_VERITY

2020-02-24 Thread Eric Biggers
On Tue, Jan 28, 2020 at 11:24:49AM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> Document the verity attribute for statx(), which was added in
> Linux 5.5.
> 
> For more context, see the fs-verity documentation:
> https://www.kernel.org/doc/html/latest/filesystems/fsverity.html
> 
> Signed-off-by: Eric Biggers 
> ---
>  man2/statx.2 | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/man2/statx.2 b/man2/statx.2
> index d2f1b07b8..d015ee73d 100644
> --- a/man2/statx.2
> +++ b/man2/statx.2
> @@ -461,6 +461,11 @@ See
>  .TP
>  .B STATX_ATTR_ENCRYPTED
>  A key is required for the file to be encrypted by the filesystem.
> +.TP
> +.B STATX_ATTR_VERITY
> +Since Linux 5.5: the file has fs-verity enabled.  It cannot be written to, and
> +all reads from it will be verified against a cryptographic hash that covers the
> +entire file, e.g. via a Merkle tree.
>  .SH RETURN VALUE
>  On success, zero is returned.
>  On error, \-1 is returned, and
> -- 
> 2.25.0.341.g760bfbb309-goog
> 

Ping?  Michael, can you apply this man-pages patch?

- Eric
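
For completeness, a minimal userspace check of the attribute documented above
looks like this (assuming glibc >= 2.28 for the statx() wrapper and Linux 5.5
UAPI headers for STATX_ATTR_VERITY, which is 0x00100000):

#define _GNU_SOURCE
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* statx(), struct statx */

/* returns 1 if verity is enabled, 0 if not, -1 if unknown or on error */
static int file_has_verity(const char *path)
{
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;
	if (!(stx.stx_attributes_mask & STATX_ATTR_VERITY))
		return -1;	/* kernel/filesystem doesn't report this bit */
	return !!(stx.stx_attributes & STATX_ATTR_VERITY);
}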




Re: [f2fs-dev] [PATCH v7 22/24] iomap: Convert from readpages to readahead

2020-02-24 Thread Darrick J. Wong
On Sun, Feb 23, 2020 at 08:33:55PM -0800, Matthew Wilcox wrote:
> On Fri, Feb 21, 2020 at 05:00:13PM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 20, 2020 at 08:57:34AM -0800, Matthew Wilcox wrote:
> > > On Thu, Feb 20, 2020 at 07:49:12AM -0800, Christoph Hellwig wrote:
> > > +/**
> > > + * iomap_readahead - Attempt to read pages from a file.
> > > + * @rac: Describes the pages to be read.
> > > + * @ops: The operations vector for the filesystem.
> > > + *
> > > + * This function is for filesystems to call to implement their readahead
> > > + * address_space operation.
> > > + *
> > > + * Context: The file is pinned by the caller, and the pages to be read 
> > > are
> > > + * all locked and have an elevated refcount.  This function will unlock
> > > + * the pages (once I/O has completed on them, or I/O has been determined 
> > > to
> > > + * not be necessary).  It will also decrease the refcount once the pages
> > > + * have been submitted for I/O.  After this point, the page may be 
> > > removed
> > > + * from the page cache, and should not be referenced.
> > > + */
> > > 
> > > > Isn't the context documentation something that belongs into the aop
> > > > documentation?  I've never really seen the value of duplicating this
> > > > information in method instances, as it is just bound to be out of date
> > > > rather sooner than later.
> > > 
> > > I'm in two minds about it as well.  There's definitely no value in
> > > providing kernel-doc for implementations of a common interface ... so
> > > rather than fixing the nilfs2 kernel-doc, I just deleted it.  But this
> > > isn't just the implementation, like nilfs2_readahead() is, it's a library
> > > function for filesystems to call, so it deserves documentation.  On the
> > > other hand, there's no real thought to this on the part of the filesystem;
> > > the implementation just calls this with the appropriate ops pointer.
> > > 
> > > Then again, I kind of feel like we need more documentation of iomap to
> > > help filesystems convert to using it.  But maybe kernel-doc isn't the
> > > mechanism to provide that.
> > 
> > I think we need more documentation of the parts of iomap where it can
> > call back into the filesystem (looking at you, iomap_dio_ops).
> > 
> > I'm not opposed to letting this comment stay, though I don't see it as
> > all that necessary since iomap_readahead implements a callout that's
> > documented in vfs.rst and is thus subject to all the constraints listed
> > in the (*readahead) documentation.
> 
> Right.  And that's not currently in kernel-doc format, but should be.
> Something for a different patchset, IMO.
> 
> What we need documenting _here_ is the conditions under which the
> iomap_ops are called so the filesystem author doesn't need to piece them
> together from three different places.  Here's what I currently have:
> 
>  * Context: The @ops callbacks may submit I/O (eg to read the addresses of
>  * blocks from disc), and may wait for it.  The caller may be trying to
>  * access a different page, and so sleeping excessively should be avoided.
>  * It may allocate memory, but should avoid large allocations.  This
>  * function is called with memalloc_nofs set, so allocations will not cause
>  * the filesystem to be reentered.

How large? :)

--D
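
For anyone following the conversion from the filesystem side: once this lands,
the per-filesystem code is tiny.  A minimal sketch against the API proposed in
this series (the myfs_* names and the read iomap_ops instance are placeholders,
not taken from any real driver):

static void myfs_readahead(struct readahead_control *rac)
{
	/* iomap_readahead() owns the pages: it unlocks them and drops the
	 * extra references once I/O completes or is found unnecessary. */
	iomap_readahead(rac, &myfs_read_iomap_ops);
}

static const struct address_space_operations myfs_aops = {
	.readahead	= myfs_readahead,
	/* ... */
};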


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Ondřej Jirman
On Mon, Feb 24, 2020 at 03:03:49PM +0100, megi xff wrote:
> On Mon, Feb 24, 2020 at 02:58:37PM +0100, megi xff wrote:
> > Hello,
> > 
> > On Mon, Feb 24, 2020 at 06:41:03PM +0800, Chao Yu wrote:
> > > On 2020/2/24 18:37, Chao Yu wrote:
> > > > Hi,
> > > > 
> > > > Thanks for the report.
> > > > 
> > > > Could you dump all other task stack info via "echo "t" > 
> > > > /proc/sysrq-trigger"?
> > > > 
> > > >>
> > > >> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 
> > > >> seconds.
> > > >> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
> > > >> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> > > >> disables this message.
> > > >> [  246.758052] kworker/u16:1   D058  2 0x
> > > >> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
> > > >> [  246.758099] Backtrace:
> > > >> [  246.758121] [] (__schedule) from [] 
> > > >> (schedule+0x78/0xf4)
> > > >> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 
> > > >> r6:0002 r5:da644000
> > > >> [  246.758132]  r4:da4d3600
> > > >> [  246.758148] [] (schedule) from [] 
> > > >> (rwsem_down_write_slowpath+0x24c/0x4c0)
> > > >> [  246.758152]  r5:0001 r4:da283e00
> > > >> [  246.758161] [] (rwsem_down_write_slowpath) from 
> > > >> [] (down_write+0x6c/0x70)
> > > >> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 
> > > >> r6: r5:eff213b0
> > > >> [  246.758169]  r4:da283e00
> > > >> [  246.758187] [] (down_write) from [] 
> > > >> (f2fs_write_single_data_page+0x608/0x7ac)
> > > > 
> > > > I'm not sure what is this semaphore, I suspect this is 
> > > > F2FS_I(inode)->i_sem, in order to make
> > > > sure of this, can you help to add below function, and use them to 
> > > > replace
> > > > all {down,up}_{write,read}(&.i_sem) invoking? then reproduce this issue 
> > > > and catch the log.
> > > 
> > > Sorry, just forgot attaching below function.
> > > 
> > > void inode_down_write(struct inode *inode)
> > > {
> > >   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > >   down_write(&F2FS_I(inode)->i_sem);
> > > }
> > > 
> > > void inode_up_write(struct inode *inode)
> > > {
> > >   up_write(&F2FS_I(inode)->i_sem);
> > >   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > > }
> > > 
> > > void inode_down_read(struct inode *inode)
> > > {
> > >   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > >   down_read(&F2FS_I(inode)->i_sem);
> > > }
> > > 
> > > void inode_up_read(struct inode *inode)
> > > {
> > >   up_read(&F2FS_I(inode)->i_sem);
> > >   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > > }
> > > 
> > 
> > Here's the log and vmlinux file that may help mapping the code addresses 
> > back to
> > code, hope it helps:
> > 
> > https://megous.com/dl/tmp/f2fs-dmesg-log
> > https://megous.com/dl/tmp/f2fs-log-build-artifacts.tar.gz
> 
> Just by the looks of it:
> 
> root@tbs2[/proc/sys/kernel] # dmesg | grep up_write | wc -l
> 324
> root@tbs2[/proc/sys/kernel] # dmesg | grep down_write | wc -l
> 347
> 
> there seems to be a mismatch of lock/unlock counts.
 
Sorry, that was a wrong grep expression.

root@tbs2[~] # dmesg | grep inode_down_write | wc -l
357
root@tbs2[~] # dmesg | grep inode_up_write | wc -l
357
root@tbs2[~] # dmesg | grep inode_up_read | wc -l
16
root@tbs2[~] # dmesg | grep inode_down_read | wc -l
16

So it's probably not inode locking.

> root@tbs2[/proc/sys/kernel] # dmesg | grep down_read | wc -l
> 16
> root@tbs2[/proc/sys/kernel] # dmesg | grep up_read | wc -l
> 16
> 
> regards,
>   o.
> 
> > thank you,
> > o.
> > 
> > > > Thanks,
> > > > 
> > > >> [  246.758190]  r5:eff213b0 r4:da283c60
> > > >> [  246.758198] [] (f2fs_write_single_data_page) from 
> > > >> [] (f2fs_write_cache_pages+0x2b4/0x7c4)
> > > >> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f 
> > > >> r6:da645d80 r5:0001
> > > >> [  246.758206]  r4:eff213b0
> > > >> [  246.758214] [] (f2fs_write_cache_pages) from [] 
> > > >> (f2fs_write_data_pages+0x344/0x35c)
> > > >> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 
> > > >> r6:da283d60 r5:da283c60
> > > >> [  246.758223]  r4:da645d80
> > > >> [  246.758238] [] (f2fs_write_data_pages) from [] 
> > > >> (do_writepages+0x3c/0xd4)
> > > >> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc 
> > > >> r6:da645d80 r5:da283d60
> > > >> [  246.758246]  r4:da283c60
> > > >> [  246.758254] [] (do_writepages) from [] 
> > > >> (__writeback_single_inode+0x44/0x454)
> > > >> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
> > > >> [  246.758266] [] (__writeback_single_inode) from 
> > > >> [] (writeback_sb_inodes+0x204/0x4b0)
> > > >> [  246.758272]  r10:000a r9:c0e03d00 r8:da283cc8 r7:da283c60 
> > > >> r6:da645eac r5:da283d08
> > > >> [  246.758274]  r4:d9dc9848
> > > >> [  246.758281] [] (writeback_sb_inodes) from [] 
> > > >> (__writeback_inodes_wb+0x50/0xe4)
> > > >> [ 

Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Ondřej Jirman
On Mon, Feb 24, 2020 at 06:41:03PM +0800, Chao Yu wrote:
> On 2020/2/24 18:37, Chao Yu wrote:
> > Hi,
> > 
> > Thanks for the report.
> > 
> > On 2020/2/23 2:17, Ondřej Jirman wrote:
> >> Hello,
> >>
> >> I observe hung background f2fs task on 5.6-rc2+ shortly after boot, 
> >> leading to
> >> panics. I use f2fs as rootfs. See below for stack trace. The reads 
> >> continue to
> >> work (if I disable panic on hung task). But sync blocks, writes block, etc.
> >>
> >> It does not happen on all my f2fs filesystems, so it may depend on workload
> >> or my particular filesystem state. It happens on two separate devices 
> >> though,
> >> both 32-bit, and doesn't happen on a 64-bit device. (might be a false lead,
> >> though)
> >>
> >> I went through the f2fs-for-5.6 tag/branch and reverted each patch right
> >> down to:
> >>
> >>   4c8ff7095bef64fc47e996a938f7d57f9e077da3 f2fs: support data compression
> >>
> >> I could still reproduce the issue.
> >>
> >> After reverting the data compression too, I could no longer reproduce the
> >> issue. Perhaps not very helpful, since that patch is huge.
> >>
> >> I tried to collect some information. Some ftrace logs, etc. Not sure what 
> >> would
> >> help debug this. Let me know if I can help debug this more.
> >>
> >> Symptoms look the same as this issue:
> >>
> >>   
> >> https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg15298.html
> >>
> >> ftrace: https://megous.com/dl/tmp/f2fs-debug-info.txt
> >>
> >> longer ftrace: https://megous.com/dl/tmp/f2fs-debug-info-full.txt
> > 
> > I checked the full trace log, but I didn't find any clue to troubleshoot
> > this issue.
> > 
> > [snip]
> > 
> >> dmesg:
> > 
> > Could you dump all other task stack info via "echo "t" > 
> > /proc/sysrq-trigger"?
> > 
> >>
> >> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 
> >> seconds.
> >> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
> >> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> >> this message.
> >> [  246.758052] kworker/u16:1   D058  2 0x
> >> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
> >> [  246.758099] Backtrace:
> >> [  246.758121] [] (__schedule) from [] 
> >> (schedule+0x78/0xf4)
> >> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 
> >> r6:0002 r5:da644000
> >> [  246.758132]  r4:da4d3600
> >> [  246.758148] [] (schedule) from [] 
> >> (rwsem_down_write_slowpath+0x24c/0x4c0)
> >> [  246.758152]  r5:0001 r4:da283e00
> >> [  246.758161] [] (rwsem_down_write_slowpath) from [] 
> >> (down_write+0x6c/0x70)
> >> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 
> >> r6: r5:eff213b0
> >> [  246.758169]  r4:da283e00
> >> [  246.758187] [] (down_write) from [] 
> >> (f2fs_write_single_data_page+0x608/0x7ac)
> > 
> > I'm not sure what is this semaphore, I suspect this is 
> > F2FS_I(inode)->i_sem, in order to make
> > sure of this, can you help to add below function, and use them to replace
> > all {down,up}_{write,read}(&.i_sem) invoking? then reproduce this issue and 
> > catch the log.
> 
> Sorry, just forgot attaching below function.
> 
> void inode_down_write(struct inode *inode)
> {
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
>   down_write(&F2FS_I(inode)->i_sem);
> }
> 
> void inode_up_write(struct inode *inode)
> {
>   up_write(&F2FS_I(inode)->i_sem);
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> }
> 
> void inode_down_read(struct inode *inode)
> {
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
>   down_read(&F2FS_I(inode)->i_sem);
> }
> 
> void inode_up_read(struct inode *inode)
> {
>   up_read(&F2FS_I(inode)->i_sem);
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> }

The actual debug code:

https://megous.com/git/linux/commit/?h=f2fs-debug-5.6&id=2f152bf17c9c5fe1f7c8449cb690ad0739d3ad44
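
The conversion in a debug patch like that is mechanical, roughly (a sketch):

	/* before */
	down_write(&F2FS_I(inode)->i_sem);

	/* after: instrumented wrapper that logs the caller via %pS */
	inode_down_write(inode);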

> > 
> > Thanks,
> > 
> >> [  246.758190]  r5:eff213b0 r4:da283c60
> >> [  246.758198] [] (f2fs_write_single_data_page) from 
> >> [] (f2fs_write_cache_pages+0x2b4/0x7c4)
> >> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f 
> >> r6:da645d80 r5:0001
> >> [  246.758206]  r4:eff213b0
> >> [  246.758214] [] (f2fs_write_cache_pages) from [] 
> >> (f2fs_write_data_pages+0x344/0x35c)
> >> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 
> >> r6:da283d60 r5:da283c60
> >> [  246.758223]  r4:da645d80
> >> [  246.758238] [] (f2fs_write_data_pages) from [] 
> >> (do_writepages+0x3c/0xd4)
> >> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc 
> >> r6:da645d80 r5:da283d60
> >> [  246.758246]  r4:da283c60
> >> [  246.758254] [] (do_writepages) from [] 
> >> (__writeback_single_inode+0x44/0x454)
> >> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
> >> [  246.758266] [] (__writeback_single_inode) from [] 
> >> (writeback_sb_inodes+0

Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Ondřej Jirman
On Mon, Feb 24, 2020 at 02:58:37PM +0100, megi xff wrote:
> Hello,
> 
> On Mon, Feb 24, 2020 at 06:41:03PM +0800, Chao Yu wrote:
> > On 2020/2/24 18:37, Chao Yu wrote:
> > > Hi,
> > > 
> > > Thanks for the report.
> > > 
> > > Could you dump all other task stack info via "echo "t" > 
> > > /proc/sysrq-trigger"?
> > > 
> > >>
> > >> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 
> > >> seconds.
> > >> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
> > >> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> > >> disables this message.
> > >> [  246.758052] kworker/u16:1   D058  2 0x
> > >> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
> > >> [  246.758099] Backtrace:
> > >> [  246.758121] [] (__schedule) from [] 
> > >> (schedule+0x78/0xf4)
> > >> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 
> > >> r6:0002 r5:da644000
> > >> [  246.758132]  r4:da4d3600
> > >> [  246.758148] [] (schedule) from [] 
> > >> (rwsem_down_write_slowpath+0x24c/0x4c0)
> > >> [  246.758152]  r5:0001 r4:da283e00
> > >> [  246.758161] [] (rwsem_down_write_slowpath) from 
> > >> [] (down_write+0x6c/0x70)
> > >> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 
> > >> r6: r5:eff213b0
> > >> [  246.758169]  r4:da283e00
> > >> [  246.758187] [] (down_write) from [] 
> > >> (f2fs_write_single_data_page+0x608/0x7ac)
> > > 
> > > I'm not sure what is this semaphore, I suspect this is 
> > > F2FS_I(inode)->i_sem, in order to make
> > > sure of this, can you help to add below function, and use them to replace
> > > all {down,up}_{write,read}(&.i_sem) invoking? then reproduce this issue 
> > > and catch the log.
> > 
> > Sorry, just forgot attaching below function.
> > 
> > void inode_down_write(struct inode *inode)
> > {
> > printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > down_write(&F2FS_I(inode)->i_sem);
> > }
> > 
> > void inode_up_write(struct inode *inode)
> > {
> > up_write(&F2FS_I(inode)->i_sem);
> > printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > }
> > 
> > void inode_down_read(struct inode *inode)
> > {
> > printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > down_read(&F2FS_I(inode)->i_sem);
> > }
> > 
> > void inode_up_read(struct inode *inode)
> > {
> > up_read(&F2FS_I(inode)->i_sem);
> > printk("%s from %pS\n", __func__, __builtin_return_address(0));
> > }
> > 
> 
> Here's the log and vmlinux file that may help mapping the code addresses back 
> to
> code, hope it helps:
> 
> https://megous.com/dl/tmp/f2fs-dmesg-log
> https://megous.com/dl/tmp/f2fs-log-build-artifacts.tar.gz

Just by the looks of it:

root@tbs2[/proc/sys/kernel] # dmesg | grep up_write | wc -l
324
root@tbs2[/proc/sys/kernel] # dmesg | grep down_write | wc -l
347

there seems to be a mismatch of lock/unlock counts.

root@tbs2[/proc/sys/kernel] # dmesg | grep down_read | wc -l
16
root@tbs2[/proc/sys/kernel] # dmesg | grep up_read | wc -l
16

regards,
o.

> thank you,
>   o.
> 
> > > Thanks,
> > > 
> > >> [  246.758190]  r5:eff213b0 r4:da283c60
> > >> [  246.758198] [] (f2fs_write_single_data_page) from 
> > >> [] (f2fs_write_cache_pages+0x2b4/0x7c4)
> > >> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f 
> > >> r6:da645d80 r5:0001
> > >> [  246.758206]  r4:eff213b0
> > >> [  246.758214] [] (f2fs_write_cache_pages) from [] 
> > >> (f2fs_write_data_pages+0x344/0x35c)
> > >> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 
> > >> r6:da283d60 r5:da283c60
> > >> [  246.758223]  r4:da645d80
> > >> [  246.758238] [] (f2fs_write_data_pages) from [] 
> > >> (do_writepages+0x3c/0xd4)
> > >> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc 
> > >> r6:da645d80 r5:da283d60
> > >> [  246.758246]  r4:da283c60
> > >> [  246.758254] [] (do_writepages) from [] 
> > >> (__writeback_single_inode+0x44/0x454)
> > >> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
> > >> [  246.758266] [] (__writeback_single_inode) from [] 
> > >> (writeback_sb_inodes+0x204/0x4b0)
> > >> [  246.758272]  r10:000a r9:c0e03d00 r8:da283cc8 r7:da283c60 
> > >> r6:da645eac r5:da283d08
> > >> [  246.758274]  r4:d9dc9848
> > >> [  246.758281] [] (writeback_sb_inodes) from [] 
> > >> (__writeback_inodes_wb+0x50/0xe4)
> > >> [  246.758287]  r10:da3797a8 r9:c0e03d00 r8:d9dc985c r7:da645eac 
> > >> r6: r5:d9dc9848
> > >> [  246.758289]  r4:da5a8800
> > >> [  246.758296] [] (__writeback_inodes_wb) from [] 
> > >> (wb_writeback+0x294/0x338)
> > >> [  246.758302]  r10:fffbf200 r9:da644000 r8:c0e04e64 r7:d9dc9848 
> > >> r6:d9dc9874 r5:da645eac
> > >> [  246.758305]  r4:d9dc9848
> > >> [  246.758312] [] (wb_writeback) from [] 
> > >> (wb_workfn+0x35c/0x54c)
> > >> [  246.758318]  r10:da5f2005 r9:d9dc984c r8:d9dc9948 r7:d9dc9848 
> > >> r6: r5:d9dc9954
> > >> [  246.758321]  r4:31e6
> > >> [ 

Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Ondřej Jirman
Hello,

On Mon, Feb 24, 2020 at 06:41:03PM +0800, Chao Yu wrote:
> On 2020/2/24 18:37, Chao Yu wrote:
> > Hi,
> > 
> > Thanks for the report.
> > 
> > Could you dump all other task stack info via "echo "t" > 
> > /proc/sysrq-trigger"?
> > 
> >>
> >> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 
> >> seconds.
> >> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
> >> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> >> this message.
> >> [  246.758052] kworker/u16:1   D058  2 0x
> >> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
> >> [  246.758099] Backtrace:
> >> [  246.758121] [] (__schedule) from [] 
> >> (schedule+0x78/0xf4)
> >> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 
> >> r6:0002 r5:da644000
> >> [  246.758132]  r4:da4d3600
> >> [  246.758148] [] (schedule) from [] 
> >> (rwsem_down_write_slowpath+0x24c/0x4c0)
> >> [  246.758152]  r5:0001 r4:da283e00
> >> [  246.758161] [] (rwsem_down_write_slowpath) from [] 
> >> (down_write+0x6c/0x70)
> >> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 
> >> r6: r5:eff213b0
> >> [  246.758169]  r4:da283e00
> >> [  246.758187] [] (down_write) from [] 
> >> (f2fs_write_single_data_page+0x608/0x7ac)
> > 
> > I'm not sure what is this semaphore, I suspect this is 
> > F2FS_I(inode)->i_sem, in order to make
> > sure of this, can you help to add below function, and use them to replace
> > all {down,up}_{write,read}(&.i_sem) invoking? then reproduce this issue and 
> > catch the log.
> 
> Sorry, just forgot attaching below function.
> 
> void inode_down_write(struct inode *inode)
> {
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
>   down_write(&F2FS_I(inode)->i_sem);
> }
> 
> void inode_up_write(struct inode *inode)
> {
>   up_write(&F2FS_I(inode)->i_sem);
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> }
> 
> void inode_down_read(struct inode *inode)
> {
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
>   down_read(&F2FS_I(inode)->i_sem);
> }
> 
> void inode_up_read(struct inode *inode)
> {
>   up_read(&F2FS_I(inode)->i_sem);
>   printk("%s from %pS\n", __func__, __builtin_return_address(0));
> }
> 

Here's the log and vmlinux file that may help mapping the code addresses back to
code, hope it helps:

https://megous.com/dl/tmp/f2fs-dmesg-log
https://megous.com/dl/tmp/f2fs-log-build-artifacts.tar.gz

thank you,
o.

> > Thanks,
> > 
> >> [  246.758190]  r5:eff213b0 r4:da283c60
> >> [  246.758198] [] (f2fs_write_single_data_page) from 
> >> [] (f2fs_write_cache_pages+0x2b4/0x7c4)
> >> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f 
> >> r6:da645d80 r5:0001
> >> [  246.758206]  r4:eff213b0
> >> [  246.758214] [] (f2fs_write_cache_pages) from [] 
> >> (f2fs_write_data_pages+0x344/0x35c)
> >> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 
> >> r6:da283d60 r5:da283c60
> >> [  246.758223]  r4:da645d80
> >> [  246.758238] [] (f2fs_write_data_pages) from [] 
> >> (do_writepages+0x3c/0xd4)
> >> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc 
> >> r6:da645d80 r5:da283d60
> >> [  246.758246]  r4:da283c60
> >> [  246.758254] [] (do_writepages) from [] 
> >> (__writeback_single_inode+0x44/0x454)
> >> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
> >> [  246.758266] [] (__writeback_single_inode) from [] 
> >> (writeback_sb_inodes+0x204/0x4b0)
> >> [  246.758272]  r10:000a r9:c0e03d00 r8:da283cc8 r7:da283c60 
> >> r6:da645eac r5:da283d08
> >> [  246.758274]  r4:d9dc9848
> >> [  246.758281] [] (writeback_sb_inodes) from [] 
> >> (__writeback_inodes_wb+0x50/0xe4)
> >> [  246.758287]  r10:da3797a8 r9:c0e03d00 r8:d9dc985c r7:da645eac 
> >> r6: r5:d9dc9848
> >> [  246.758289]  r4:da5a8800
> >> [  246.758296] [] (__writeback_inodes_wb) from [] 
> >> (wb_writeback+0x294/0x338)
> >> [  246.758302]  r10:fffbf200 r9:da644000 r8:c0e04e64 r7:d9dc9848 
> >> r6:d9dc9874 r5:da645eac
> >> [  246.758305]  r4:d9dc9848
> >> [  246.758312] [] (wb_writeback) from [] 
> >> (wb_workfn+0x35c/0x54c)
> >> [  246.758318]  r10:da5f2005 r9:d9dc984c r8:d9dc9948 r7:d9dc9848 
> >> r6: r5:d9dc9954
> >> [  246.758321]  r4:31e6
> >> [  246.758334] [] (wb_workfn) from [] 
> >> (process_one_work+0x214/0x544)
> >> [  246.758340]  r10:da5f2005 r9:0200 r8: r7:da5f2000 
> >> r6:ef044400 r5:da5eb000
> >> [  246.758343]  r4:d9dc9954
> >> [  246.758350] [] (process_one_work) from [] 
> >> (worker_thread+0x4c/0x574)
> >> [  246.758357]  r10:ef044400 r9:c0e03d00 r8:ef044418 r7:0088 
> >> r6:ef044400 r5:da5eb014
> >> [  246.758359]  r4:da5eb000
> >> [  246.758368] [] (worker_thread) from [] 
> >> (kthread+0x144/0x170)
> >> [  246.758374]  r10:ec9e5e90 r9:dabf325c r8:da5eb000 r7:da644000 
> >> r6: r5:da5fe000
> >> [  246.758377]  r4:dabf3240
> >> [ 

Re: [f2fs-dev] f2fs.fsck should allow dry-run on RO mounted device

2020-02-24 Thread Chao Yu
On 2020/2/23 2:28, Ondřej Jirman wrote:
> Hello,
> 
> I was trying to run: fsck.f2fs --dry-run /dev/mmcblk0p2 on a RO mounted 
> device,
> and fsck refuses to run. Strace shows that it tries to open the block device
> with O_EXCL even in RO mode, which will always fail if the block device
> is mounted.

Thanks for your report; I've prepared a patch to fix this.

Thanks,

> 
> openat(AT_FDCWD, "/proc/mounts", O_RDONLY|O_CLOEXEC) = 3
> fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
> read(3, "/dev/root / f2fs ro,lazytime,rel"..., 1024) = 843
> close(3)= 0
> write(1, "Info: Mounted device!\n", 22Info: Mounted device!
> ) = 22
> write(1, "Info: Check FS only on RO mounte"..., 41Info: Check FS only on RO 
> mounted device
> ) = 41
> stat64("/dev/mmcblk0p2", {st_mode=S_IFBLK|0600, st_rdev=makedev(0xb3, 0x2), 
> ...}) = 0
> openat(AT_FDCWD, "/dev/mmcblk0p2", O_RDWR|O_EXCL|O_LARGEFILE) = -1 EBUSY 
> (Device or resource busy)
> openat(AT_FDCWD, "/dev/mmcblk0p2", O_RDONLY|O_EXCL|O_LARGEFILE) = -1 EBUSY 
> (Device or resource busy)
> write(1, "\tError: Failed to open the devic"..., 35   Error: Failed to open 
> the device!
> ) = 35
> exit_group(-1)  = ?
> +++ exited with 255 +++
> 
> 
> fsck.f2fs --dry-run /dev/mmcblk0p2
> Info: Dry run
> Info: Mounted device!
> Info: Check FS only on RO mounted device
>   Error: Failed to open the device!
> 
> I suggest not using O_EXCL for --dry-run check.
> 
> regards,
>   Ondrej
> 
> 
> ___
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
> 


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH] fsck.f2fs: allow --dry-run to check readonly mounted fs

2020-02-24 Thread Chao Yu
As Ondřej Jirman  reported:

I was trying to run: fsck.f2fs --dry-run /dev/mmcblk0p2 on a RO mounted device,
and fsck refuses to run. Strace shows that it tries to open the block device
with O_EXCL even in RO mode, which will always fail if the block device
is mounted.

fsck.f2fs --dry-run /dev/mmcblk0p2
Info: Dry run
Info: Mounted device!
Info: Check FS only on RO mounted device
Error: Failed to open the device!

I suggest not using O_EXCL for --dry-run check.

Let's change this to allow --dry-run to check a read-only mounted fs.

Reported-by: Ondřej Jirman 
Signed-off-by: Chao Yu 
---
 lib/libf2fs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index d527d68..5c064c8 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -907,7 +907,8 @@ int get_device_info(int i)
return -1;
}
 
-   if (S_ISBLK(stat_buf->st_mode) && !c.force && c.func != DUMP) {
+   if (S_ISBLK(stat_buf->st_mode) &&
+   !c.force && c.func != DUMP && !c.dry_run) {
fd = open(dev->path, O_RDWR | O_EXCL);
if (fd < 0)
fd = open_check_fs(dev->path, O_EXCL);
-- 
2.18.0.rc1
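
The EBUSY here is generic block-device behaviour rather than anything
f2fs-specific: O_EXCL on a block device asks the kernel to refuse the open
while the device is in use, and a mounted filesystem counts as in use even
for a read-only open.  A small stand-alone illustration (hypothetical device
path, plain C):

#include <fcntl.h>
#include <stdio.h>

int main(void)
{
	/* Fails with EBUSY whenever /dev/mmcblk0p2 is mounted, RO or RW. */
	int excl = open("/dev/mmcblk0p2", O_RDONLY | O_EXCL);
	if (excl < 0)
		perror("open O_RDONLY|O_EXCL");

	/* A plain read-only open is all a --dry-run style check needs. */
	int ro = open("/dev/mmcblk0p2", O_RDONLY);
	if (ro < 0)
		perror("open O_RDONLY");

	return 0;
}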



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH 2/5] f2fs: fix to avoid potential deadlock

2020-02-24 Thread Chao Yu
Use f2fs_trylock_op() in f2fs_write_compressed_pages() to avoid a potential
deadlock, as we already do in f2fs_write_single_data_page().

Signed-off-by: Chao Yu 
---
 fs/f2fs/compress.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index b1bc057ee266..4ba65a3b0456 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -772,7 +772,6 @@ static int f2fs_write_compressed_pages(struct compress_ctx 
*cc,
.encrypted_page = NULL,
.compressed_page = NULL,
.submitted = false,
-   .need_lock = LOCK_RETRY,
.io_type = io_type,
.io_wbc = wbc,
.encrypted = f2fs_encrypted_file(cc->inode),
@@ -785,9 +784,10 @@ static int f2fs_write_compressed_pages(struct compress_ctx 
*cc,
loff_t psize;
int i, err;
 
-   set_new_dnode(&dn, cc->inode, NULL, NULL, 0);
+   if (!f2fs_trylock_op(sbi))
+   return -EAGAIN;
 
-   f2fs_lock_op(sbi);
+   set_new_dnode(&dn, cc->inode, NULL, NULL, 0);
 
err = f2fs_get_dnode_of_data(&dn, start_idx, LOOKUP_NODE);
if (err)
-- 
2.18.0.rc1
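
For context, f2fs_trylock_op() is the non-blocking variant of f2fs_lock_op();
paraphrased from f2fs.h (a sketch, details may differ slightly between
versions):

static inline void f2fs_lock_op(struct f2fs_sb_info *sbi)
{
	down_read(&sbi->cp_rwsem);	/* may sleep until checkpoint finishes */
}

static inline int f2fs_trylock_op(struct f2fs_sb_info *sbi)
{
	return down_read_trylock(&sbi->cp_rwsem);	/* 1 on success, 0 if contended */
}

Returning -EAGAIN when the trylock fails lets writeback back off and retry the
cluster later, instead of sleeping on cp_rwsem while holding locked pages,
which is the kind of inversion the single-page path already avoids.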



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH 3/5] f2fs: fix to avoid NULL pointer dereference

2020-02-24 Thread Chao Yu
Unable to handle kernel NULL pointer dereference at virtual address 
PC is at f2fs_free_dic+0x60/0x2c8
LR is at f2fs_decompress_pages+0x3c4/0x3e8
f2fs_free_dic+0x60/0x2c8
f2fs_decompress_pages+0x3c4/0x3e8
__read_end_io+0x78/0x19c
f2fs_post_read_work+0x6c/0x94
process_one_work+0x210/0x48c
worker_thread+0x2e8/0x44c
kthread+0x110/0x120
ret_from_fork+0x10/0x18

In f2fs_free_dic(), we cannot use f2fs_put_page(page, 1) to release dic->tpages[i],
because the page's mapping is NULL and the helper dereferences it.

Signed-off-by: Chao Yu 
---
 fs/f2fs/compress.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 4ba65a3b0456..864035549329 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1133,7 +1133,8 @@ void f2fs_free_dic(struct decompress_io_ctx *dic)
for (i = 0; i < dic->cluster_size; i++) {
if (dic->rpages[i])
continue;
-   f2fs_put_page(dic->tpages[i], 1);
+   unlock_page(dic->tpages[i]);
+   put_page(dic->tpages[i]);
}
kfree(dic->tpages);
}
-- 
2.18.0.rc1
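
For context on the oops: the temporary tpages are not attached to any mapping,
and f2fs_put_page() reaches page->mapping through its sanity check before
unlocking.  Paraphrased from f2fs.h (a sketch, treat details as approximate):

static inline void f2fs_put_page(struct page *page, int unlock)
{
	if (!page)
		return;
	if (unlock) {
		/* F2FS_P_SB() goes through page->mapping->host, which is a
		 * NULL dereference for the mapping-less tpages. */
		f2fs_bug_on(F2FS_P_SB(page), !PageLocked(page));
		unlock_page(page);
	}
	put_page(page);
}

Calling unlock_page() and put_page() directly, as the patch does, skips that
dereference.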



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH 5/5] f2fs: add missing function name in kernel message

2020-02-24 Thread Chao Yu
Otherwise, we cannot distinguish the exact location of a message when more
than one place prints the same text.

Signed-off-by: Chao Yu 
---
 fs/f2fs/f2fs.h | 2 +-
 fs/f2fs/node.c | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index bd264d8cddaf..4a02edc2454b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -2231,7 +2231,7 @@ static inline void dec_valid_node_count(struct 
f2fs_sb_info *sbi,
dquot_free_inode(inode);
} else {
if (unlikely(inode->i_blocks == 0)) {
-   f2fs_warn(sbi, "Inconsistent i_blocks, ino:%lu, 
iblocks:%llu",
+   f2fs_warn(sbi, "dec_valid_node_count: inconsistent 
i_blocks, ino:%lu, iblocks:%llu",
  inode->i_ino,
  (unsigned long long)inode->i_blocks);
set_sbi_flag(sbi, SBI_NEED_FSCK);
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 040ffb09a126..951c66d63350 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1187,8 +1187,9 @@ int f2fs_remove_inode_page(struct inode *inode)
}
 
if (unlikely(inode->i_blocks != 0 && inode->i_blocks != 8)) {
-   f2fs_warn(F2FS_I_SB(inode), "Inconsistent i_blocks, ino:%lu, 
iblocks:%llu",
- inode->i_ino, (unsigned long long)inode->i_blocks);
+   f2fs_warn(F2FS_I_SB(inode),
+   "f2fs_remove_inode_page: inconsistent i_blocks, 
ino:%lu, iblocks:%llu",
+   inode->i_ino, (unsigned long long)inode->i_blocks);
set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK);
}
 
-- 
2.18.0.rc1
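
If duplicating function names in format strings ever becomes a burden, a common
alternative is to let the compiler supply the name; a sketch mirroring the call
in the hunk above, not what this patch does:

	f2fs_warn(sbi, "%s: inconsistent i_blocks, ino:%lu, iblocks:%llu",
		  __func__, inode->i_ino,
		  (unsigned long long)inode->i_blocks);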



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH 1/5] f2fs: fix to show tracepoint correctly

2020-02-24 Thread Chao Yu
The early return for an unready compression backend skipped the common exit
path, so the f2fs_file_write_iter tracepoint never recorded the failure; route
the error through the out label instead.

Signed-off-by: Chao Yu 
---
 fs/f2fs/file.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 613f87151d90..13f3454ceb92 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3649,8 +3649,10 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, 
struct iov_iter *from)
goto out;
}
 
-   if (!f2fs_is_compress_backend_ready(inode))
-   return -EOPNOTSUPP;
+   if (!f2fs_is_compress_backend_ready(inode)) {
+   ret = -EOPNOTSUPP;
+   goto out;
+   }
 
if (iocb->ki_flags & IOCB_NOWAIT) {
if (!inode_trylock(inode)) {
-- 
2.18.0.rc1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH 4/5] f2fs: recycle unused compress_data.chksum field

2020-02-24 Thread Chao Yu
In struct compress_data, the chksum field was never used; remove it.

Signed-off-by: Chao Yu 
---
 fs/f2fs/compress.c | 1 -
 fs/f2fs/f2fs.h | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 864035549329..e8b88f7bb7a5 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -380,7 +380,6 @@ static int f2fs_compress_pages(struct compress_ctx *cc)
}
 
cc->cbuf->clen = cpu_to_le32(cc->clen);
-   cc->cbuf->chksum = cpu_to_le32(0);
 
for (i = 0; i < COMPRESS_DATA_RESERVED_SIZE; i++)
cc->cbuf->reserved[i] = cpu_to_le32(0);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 468f807fd917..bd264d8cddaf 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1233,10 +1233,9 @@ enum compress_algorithm_type {
COMPRESS_MAX,
 };
 
-#define COMPRESS_DATA_RESERVED_SIZE4
+#define COMPRESS_DATA_RESERVED_SIZE5
 struct compress_data {
__le32 clen;/* compressed data size */
-   __le32 chksum;  /* checksum of compressed data */
__le32 reserved[COMPRESS_DATA_RESERVED_SIZE];   /* reserved */
u8 cdata[]; /* compressed data */
 };
-- 
2.18.0.rc1
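
Worth noting for on-disk compatibility: dropping the __le32 chksum while
growing reserved[] from 4 to 5 keeps the fixed part of the header at 24 bytes,
so cdata[] starts at the same offset as before.  A tiny stand-alone check with
stand-in types (a sketch, not the kernel definitions):

#include <stdint.h>
#include <assert.h>

struct compress_data_old {
	uint32_t clen;
	uint32_t chksum;
	uint32_t reserved[4];
	/* u8 cdata[]; */
};

struct compress_data_new {
	uint32_t clen;
	uint32_t reserved[5];
	/* u8 cdata[]; */
};

static_assert(sizeof(struct compress_data_old) == sizeof(struct compress_data_new),
	      "fixed header stays 24 bytes, cdata[] offset unchanged");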



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Chao Yu
On 2020/2/24 18:37, Chao Yu wrote:
> Hi,
> 
> Thanks for the report.
> 
> On 2020/2/23 2:17, Ondřej Jirman wrote:
>> Hello,
>>
>> I observe hung background f2fs task on 5.6-rc2+ shortly after boot, leading 
>> to
>> panics. I use f2fs as rootfs. See below for stack trace. The reads continue 
>> to
>> work (if I disable panic on hung task). But sync blocks, writes block, etc.
>>
>> It does not happen on all my f2fs filesystems, so it may depend on workload
>> or my particular filesystem state. It happens on two separate devices though,
>> both 32-bit, and doesn't happen on a 64-bit device. (might be a false lead,
>> though)
>>
>> I went through the f2fs-for-5.6 tag/branch and reverted each patch right
>> down to:
>>
>>   4c8ff7095bef64fc47e996a938f7d57f9e077da3 f2fs: support data compression
>>
>> I could still reproduce the issue.
>>
>> After reverting the data compression too, I could no longer reproduce the
>> issue. Perhaps not very helpful, since that patch is huge.
>>
>> I tried to collect some information. Some ftrace logs, etc. Not sure what 
>> would
>> help debug this. Let me know if I can help debug this more.
>>
>> Symptoms look the same as this issue:
>>
>>   
>> https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg15298.html
>>
>> ftrace: https://megous.com/dl/tmp/f2fs-debug-info.txt
>>
>> longer ftrace: https://megous.com/dl/tmp/f2fs-debug-info-full.txt
> 
> I checked the full trace log, but I didn't find any clue to troubleshoot
> this issue.
> 
> [snip]
> 
>> dmesg:
> 
> Could you dump all other task stack info via "echo "t" > /proc/sysrq-trigger"?
> 
>>
>> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 seconds.
>> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
>> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
>> this message.
>> [  246.758052] kworker/u16:1   D058  2 0x
>> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
>> [  246.758099] Backtrace:
>> [  246.758121] [] (__schedule) from [] 
>> (schedule+0x78/0xf4)
>> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 r6:0002 
>> r5:da644000
>> [  246.758132]  r4:da4d3600
>> [  246.758148] [] (schedule) from [] 
>> (rwsem_down_write_slowpath+0x24c/0x4c0)
>> [  246.758152]  r5:0001 r4:da283e00
>> [  246.758161] [] (rwsem_down_write_slowpath) from [] 
>> (down_write+0x6c/0x70)
>> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 r6: 
>> r5:eff213b0
>> [  246.758169]  r4:da283e00
>> [  246.758187] [] (down_write) from [] 
>> (f2fs_write_single_data_page+0x608/0x7ac)
> 
> I'm not sure what is this semaphore, I suspect this is F2FS_I(inode)->i_sem, 
> in order to make
> sure of this, can you help to add below function, and use them to replace
> all {down,up}_{write,read}(&.i_sem) invoking? then reproduce this issue and 
> catch the log.

Sorry, I just forgot to attach the functions below.

void inode_down_write(struct inode *inode)
{
printk("%s from %pS\n", __func__, __builtin_return_address(0));
down_write(&F2FS_I(inode)->i_sem);
}

void inode_up_write(struct inode *inode)
{
up_write(&F2FS_I(inode)->i_sem);
printk("%s from %pS\n", __func__, __builtin_return_address(0));
}

void inode_down_read(struct inode *inode)
{
printk("%s from %pS\n", __func__, __builtin_return_address(0));
down_read(&F2FS_I(inode)->i_sem);
}

void inode_up_read(struct inode *inode)
{
up_read(&F2FS_I(inode)->i_sem);
printk("%s from %pS\n", __func__, __builtin_return_address(0));
}

> 
> Thanks,
> 
>> [  246.758190]  r5:eff213b0 r4:da283c60
>> [  246.758198] [] (f2fs_write_single_data_page) from [] 
>> (f2fs_write_cache_pages+0x2b4/0x7c4)
>> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f r6:da645d80 
>> r5:0001
>> [  246.758206]  r4:eff213b0
>> [  246.758214] [] (f2fs_write_cache_pages) from [] 
>> (f2fs_write_data_pages+0x344/0x35c)
>> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 r6:da283d60 
>> r5:da283c60
>> [  246.758223]  r4:da645d80
>> [  246.758238] [] (f2fs_write_data_pages) from [] 
>> (do_writepages+0x3c/0xd4)
>> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc r6:da645d80 
>> r5:da283d60
>> [  246.758246]  r4:da283c60
>> [  246.758254] [] (do_writepages) from [] 
>> (__writeback_single_inode+0x44/0x454)
>> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
>> [  246.758266] [] (__writeback_single_inode) from [] 
>> (writeback_sb_inodes+0x204/0x4b0)
>> [  246.758272]  r10:000a r9:c0e03d00 r8:da283cc8 r7:da283c60 r6:da645eac 
>> r5:da283d08
>> [  246.758274]  r4:d9dc9848
>> [  246.758281] [] (writeback_sb_inodes) from [] 
>> (__writeback_inodes_wb+0x50/0xe4)
>> [  246.758287]  r10:da3797a8 r9:c0e03d00 r8:d9dc985c r7:da645eac r6: 
>> r5:d9dc9848
>> [  246.758289]  r4:da5a8800
>> [  246.758296] [] (__writeback_inodes_wb) from [] 
>> (wb_writeback+0x294/0x338)
>>

Re: [f2fs-dev] Writes stopped working on f2fs after the compression support was added

2020-02-24 Thread Chao Yu
Hi,

Thanks for the report.

On 2020/2/23 2:17, Ondřej Jirman wrote:
> Hello,
> 
> I observe hung background f2fs task on 5.6-rc2+ shortly after boot, leading to
> panics. I use f2fs as rootfs. See below for stack trace. The reads continue to
> work (if I disable panic on hung task). But sync blocks, writes block, etc.
> 
> It does not happen on all my f2fs filesystems, so it may depend on workload
> or my particular filesystem state. It happens on two separate devices though,
> both 32-bit, and doesn't happen on a 64-bit device. (might be a false lead,
> though)
> 
> I went through the f2fs-for-5.6 tag/branch and reverted each patch right
> down to:
> 
>   4c8ff7095bef64fc47e996a938f7d57f9e077da3 f2fs: support data compression
> 
> I could still reproduce the issue.
> 
> After reverting the data compression too, I could no longer reproduce the
> issue. Perhaps not very helpful, since that patch is huge.
> 
> I tried to collect some information. Some ftrace logs, etc. Not sure what 
> would
> help debug this. Let me know if I can help debug this more.
> 
> Symptoms look the same as this issue:
> 
>   
> https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg15298.html
> 
> ftrace: https://megous.com/dl/tmp/f2fs-debug-info.txt
> 
> longer ftrace: https://megous.com/dl/tmp/f2fs-debug-info-full.txt

I checked the full trace log, but I didn't find any clue to troubleshoot
this issue.

[snip]

> dmesg:

Could you dump all the other tasks' stack info via 'echo t > /proc/sysrq-trigger'?

> 
> [  246.758021] INFO: task kworker/u16:1:58 blocked for more than 122 seconds.
> [  246.758040]   Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
> [  246.758044] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [  246.758052] kworker/u16:1   D058  2 0x
> [  246.758090] Workqueue: writeback wb_workfn (flush-179:0)
> [  246.758099] Backtrace:
> [  246.758121] [] (__schedule) from [] 
> (schedule+0x78/0xf4)
> [  246.758130]  r10:da644000 r9: r8:da645a60 r7:da283e10 r6:0002 
> r5:da644000
> [  246.758132]  r4:da4d3600
> [  246.758148] [] (schedule) from [] 
> (rwsem_down_write_slowpath+0x24c/0x4c0)
> [  246.758152]  r5:0001 r4:da283e00
> [  246.758161] [] (rwsem_down_write_slowpath) from [] 
> (down_write+0x6c/0x70)
> [  246.758167]  r10:da283e00 r9:da645d80 r8:d9ed r7:0001 r6: 
> r5:eff213b0
> [  246.758169]  r4:da283e00
> [  246.758187] [] (down_write) from [] 
> (f2fs_write_single_data_page+0x608/0x7ac)

I'm not sure which semaphore this is; I suspect it is F2FS_I(inode)->i_sem. To
make sure, could you add the functions below and use them to replace all
{down,up}_{write,read}(&.i_sem) calls, then reproduce the issue and capture
the log?

Thanks,

> [  246.758190]  r5:eff213b0 r4:da283c60
> [  246.758198] [] (f2fs_write_single_data_page) from [] 
> (f2fs_write_cache_pages+0x2b4/0x7c4)
> [  246.758204]  r10:da645c28 r9:da283d60 r8:da283c60 r7:000f r6:da645d80 
> r5:0001
> [  246.758206]  r4:eff213b0
> [  246.758214] [] (f2fs_write_cache_pages) from [] 
> (f2fs_write_data_pages+0x344/0x35c)
> [  246.758220]  r10: r9:d9ed002c r8:d9ed r7:0004 r6:da283d60 
> r5:da283c60
> [  246.758223]  r4:da645d80
> [  246.758238] [] (f2fs_write_data_pages) from [] 
> (do_writepages+0x3c/0xd4)
> [  246.758244]  r10:000a r9:c0e03d00 r8:0c00 r7:c0264ddc r6:da645d80 
> r5:da283d60
> [  246.758246]  r4:da283c60
> [  246.758254] [] (do_writepages) from [] 
> (__writeback_single_inode+0x44/0x454)
> [  246.758259]  r7:da283d60 r6:da645eac r5:da645d80 r4:da283c60
> [  246.758266] [] (__writeback_single_inode) from [] 
> (writeback_sb_inodes+0x204/0x4b0)
> [  246.758272]  r10:000a r9:c0e03d00 r8:da283cc8 r7:da283c60 r6:da645eac 
> r5:da283d08
> [  246.758274]  r4:d9dc9848
> [  246.758281] [] (writeback_sb_inodes) from [] 
> (__writeback_inodes_wb+0x50/0xe4)
> [  246.758287]  r10:da3797a8 r9:c0e03d00 r8:d9dc985c r7:da645eac r6: 
> r5:d9dc9848
> [  246.758289]  r4:da5a8800
> [  246.758296] [] (__writeback_inodes_wb) from [] 
> (wb_writeback+0x294/0x338)
> [  246.758302]  r10:fffbf200 r9:da644000 r8:c0e04e64 r7:d9dc9848 r6:d9dc9874 
> r5:da645eac
> [  246.758305]  r4:d9dc9848
> [  246.758312] [] (wb_writeback) from [] 
> (wb_workfn+0x35c/0x54c)
> [  246.758318]  r10:da5f2005 r9:d9dc984c r8:d9dc9948 r7:d9dc9848 r6: 
> r5:d9dc9954
> [  246.758321]  r4:31e6
> [  246.758334] [] (wb_workfn) from [] 
> (process_one_work+0x214/0x544)
> [  246.758340]  r10:da5f2005 r9:0200 r8: r7:da5f2000 r6:ef044400 
> r5:da5eb000
> [  246.758343]  r4:d9dc9954
> [  246.758350] [] (process_one_work) from [] 
> (worker_thread+0x4c/0x574)
> [  246.758357]  r10:ef044400 r9:c0e03d00 r8:ef044418 r7:0088 r6:ef044400 
> r5:da5eb014
> [  246.758359]  r4:da5eb000
> [  246.758368] [] (worker_thread) from [] 
> (kthread+0x144/0x170)
> [  246.758374]  r10:ec9e5e90 r9:dabf325c r8:da5eb000 r7:da644000 r