Re: [PATCH] staging: erofs: removing an extra call to iloc() in fill_inode()

2019-08-14 Thread Gao Xiang
On Wed, Aug 14, 2019 at 09:22:53AM +0530, Pratik Shinde wrote:
> Yes, since we already have a function with the same name (and we are using it in
> the same context).
> 'inode_loc' was the most meaningful name I could come up with :)
> 
> --Pratik.

And one more small suggestion... see the following,
https://lore.kernel.org/lkml/20190805044225.ga14...@kroah.com/

Happy hacking! :)

Thanks,
Gao Xiang

> 
> On Wed, Aug 14, 2019 at 7:37 AM Gao Xiang  wrote:
> 
> > On Wed, Aug 14, 2019 at 09:56:09AM +0800, Chao Yu wrote:
> > > On 2019/8/14 9:59, Gao Xiang wrote:
> > > > Hi Pratik,
> > > >
> > > > On Wed, Aug 14, 2019 at 02:08:40AM +0530, Pratik Shinde wrote:
> > > > >> In fill_inode() we call iloc() twice. Avoid the extra call by
> > > > >> storing the result.
> > > >>
> > > >> Signed-off-by: Pratik Shinde 
> > > >
> > > > > I have no objection to this patch, but I'd like to
> > > > > hear Chao/Greg's thoughts about this...
> > >
> > > > It looks cleaner. :)
> > > >
> > > > Nitpick: changing 'inode_loc' to the shorter 'iloc' might be better.
> >
> > > iloc is the name of a static inline helper function in internal.h
> > > that exists to keep lines shorter...
> >
> > Thanks,
> > Gao Xiang
> >
> > >
> > > Reviewed-by: Chao Yu 
> > >
> > > Thanks,
> > >
> > > >
> > > > Thanks,
> > > > Gao Xiang
> > > >
> > > >> ---
> > > > >>  drivers/staging/erofs/inode.c | 7 ++++---
> > > >>  1 file changed, 4 insertions(+), 3 deletions(-)
> > > >>
> > > >> diff --git a/drivers/staging/erofs/inode.c
> > b/drivers/staging/erofs/inode.c
> > > >> index 4c3d8bf..d82ba6c 100644
> > > >> --- a/drivers/staging/erofs/inode.c
> > > >> +++ b/drivers/staging/erofs/inode.c
> > > >> @@ -167,11 +167,12 @@ static int fill_inode(struct inode *inode, int
> > isdir)
> > > >>int err;
> > > >>erofs_blk_t blkaddr;
> > > >>unsigned int ofs;
> > > >> +  erofs_off_t inode_loc;
> > > >>
> > > >>trace_erofs_fill_inode(inode, isdir);
> > > >> -
> > > >> -  blkaddr = erofs_blknr(iloc(sbi, vi->nid));
> > > >> -  ofs = erofs_blkoff(iloc(sbi, vi->nid));
> > > >> +  inode_loc = iloc(sbi, vi->nid);
> > > >> +  blkaddr = erofs_blknr(inode_loc);
> > > >> +  ofs = erofs_blkoff(inode_loc);
> > > >>
> > > >>debugln("%s, reading inode nid %llu at %u of blkaddr %u",
> > > >>__func__, vi->nid, ofs, blkaddr);
> > > >> --
> > > >> 2.9.3
> > > >>
> > > > .
> > > >
> >
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
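
The change discussed in this thread (compute the inode location once, then derive the block number and in-block offset from the stored value) can be sketched in plain userspace C. This is only an illustration: the `sketch_*` names, the 4096-byte block size and the 32-byte inode slot are assumptions made for the example, not the definitions from internal.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EROFS helpers; both constants are assumptions. */
#define SKETCH_BLKSIZ    4096u  /* assumed block size in bytes */
#define SKETCH_ISLOTBITS 5u     /* assumed log2 of the inode slot size */

typedef uint64_t sketch_off_t;

struct sketch_sb {
	uint32_t meta_blkaddr;  /* first block of the metadata area */
};

/* Byte address of inode `nid`, playing the role of iloc(). */
static inline sketch_off_t sketch_iloc(const struct sketch_sb *sbi, uint64_t nid)
{
	return (sketch_off_t)sbi->meta_blkaddr * SKETCH_BLKSIZ +
	       (nid << SKETCH_ISLOTBITS);
}

/* Block number that contains byte address `addr`. */
static inline uint32_t sketch_blknr(sketch_off_t addr)
{
	return (uint32_t)(addr / SKETCH_BLKSIZ);
}

/* Offset of byte address `addr` inside its block. */
static inline uint32_t sketch_blkoff(sketch_off_t addr)
{
	return (uint32_t)(addr % SKETCH_BLKSIZ);
}

/* Compute the location once and derive both results from the stored value. */
static void sketch_locate(const struct sketch_sb *sbi, uint64_t nid,
			  uint32_t *blkaddr, uint32_t *ofs)
{
	assert(sbi && blkaddr && ofs);

	const sketch_off_t inode_loc = sketch_iloc(sbi, nid);

	*blkaddr = sketch_blknr(inode_loc);
	*ofs = sketch_blkoff(inode_loc);
}
```

Since the real helper is a static inline, a compiler may well fold the two calls anyway; the stored variable mainly helps readability, which matches the "looks cleaner" remark above.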


Re: [PATCH RESEND 1/2] staging: erofs: introduce EFSCORRUPTED and more logs

2019-08-14 Thread Chao Yu
On 2019/8/14 12:32, Gao Xiang wrote:
> Previously, EROFS used EIO to indicate that the filesystem is
> corrupted as well, but other filesystems tend to use
> EUCLEAN instead. Let's follow what others do right now.
> 
> Also, add some more prints to the syslog.
> 
> Suggested-by: Pavel Machek 
> Signed-off-by: Gao Xiang 

Reviewed-by: Chao Yu 

Thanks,
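
As a rough illustration of the distinction this patch draws, the sketch below returns a dedicated "corrupted" error for an on-disk consistency violation instead of a generic I/O error. It is a userspace model under stated assumptions: it aliases `EFSCORRUPTED` to `EUCLEAN` (a Linux errno), and `sketch_check_inline()` and the 4096-byte block size are hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed aliasing, following the EUCLEAN(EFSCORRUPTED) convention. */
#ifndef EFSCORRUPTED
#define EFSCORRUPTED EUCLEAN
#endif

#define SKETCH_BLOCK_SIZE 4096u  /* assumed block size in bytes */

/*
 * Hypothetical consistency check: inline data that starts at `blkoff`
 * and spans `len` bytes must stay inside a single block. A violation
 * means the on-disk metadata is inconsistent, which is reported as
 * corruption rather than as a device I/O failure.
 */
static int sketch_check_inline(size_t blkoff, size_t len)
{
	if (blkoff >= SKETCH_BLOCK_SIZE || len > SKETCH_BLOCK_SIZE - blkoff)
		return -EFSCORRUPTED;
	return 0;
}

/* Small self-check showing how a caller might tell the cases apart. */
static void sketch_selfcheck(void)
{
	assert(sketch_check_inline(0, 100) == 0);
	assert(sketch_check_inline(4000, 200) == -EFSCORRUPTED);
	assert(-EFSCORRUPTED != -EIO);
}
```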


[BUG] removing and reinserting imx-media causes kernel to explode

2019-08-14 Thread Russell King - ARM Linux admin
I just did this:

rmmod imx-media
modprobe imx-media

and was greeted by the below kernel messages.  I don't think this has
been the first issue I found with the iMX media stuff involving a module
unload/reload cycle - may I suggest that this is added to the testing
regime for this code?  Thanks.

imx-media: Removing imx-media
ipu1_vdic: Removing
ipu1_ic_prp: Removing
ipu1_ic_prpenc: Removing
ipu1_ic_prpvf: Removing
ipu2_vdic: Removing
ipu2_ic_prp: Removing
ipu2_ic_prpenc: Removing
ipu2_ic_prpvf: Removing
imx_media: module is from the staging directory, the quality is unknown, you 
have been warned.
ipu2_ic_prpvf: Registered ipu2_ic_prpvf capture as /dev/video2
imx-media: subdev ipu2_ic_prpvf bound
ipu2_ic_prpenc: Registered ipu2_ic_prpenc capture as /dev/video3
imx-media: subdev ipu2_ic_prpenc bound
imx-media: subdev ipu2_ic_prp bound
imx-media: subdev ipu2_vdic bound
ipu1_ic_prpvf: Registered ipu1_ic_prpvf capture as /dev/video4
imx-media: subdev ipu1_ic_prpvf bound
ipu1_ic_prpenc: Registered ipu1_ic_prpenc capture as /dev/video5
imx-media: subdev ipu1_ic_prpenc bound
imx-media: subdev ipu1_ic_prp bound
imx-media: subdev ipu1_vdic bound
kobject (ddca68f0): tried to init an initialized object, something is seriously 
wrong.
CPU: 1 PID: 31521 Comm: modprobe Tainted: G C5.2.0+ #325
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0x9c/0xd4)
[] (dump_stack) from [] (kobject_init+0x74/0x94)
[] (kobject_init) from [] (device_initialize+0x1c/0xec)
[] (device_initialize) from [] (device_register+0xc/0x18)
[] (device_register) from [] 
(__video_register_device+0x9b4/0x1228)
[] (__video_register_device) from [] 
(imx_media_capture_device_register+0x44/0x1f4 [imx_media_capture])
[] (imx_media_capture_device_register [imx_media_capture]) from 
[] (csi_registered+0x154/0x19c [imx_media_csi])
[] (csi_registered [imx_media_csi]) from [] 
(v4l2_device_register_subdev+0xd0/0x164)
[] (v4l2_device_register_subdev) from [] 
(v4l2_async_match_notify+0x1c/0x130)
[] (v4l2_async_match_notify) from [] 
(v4l2_async_notifier_try_all_subdevs+0x48/0x94)
[] (v4l2_async_notifier_try_all_subdevs) from [] 
(__v4l2_async_notifier_register+0xa8/0x110)
[] (__v4l2_async_notifier_register) from [] 
(v4l2_async_notifier_register+0x3c/0x54)
[] (v4l2_async_notifier_register) from [] 
(imx_media_dev_notifier_register+0x2c/0x70 [imx_media])
[] (imx_media_dev_notifier_register [imx_media]) from [] 
(imx_media_probe+0x3c/0x8c [imx_media])
[] (imx_media_probe [imx_media]) from [] 
(platform_drv_probe+0x48/0x98)
[] (platform_drv_probe) from [] (really_probe+0x1d8/0x2c0)
[] (really_probe) from [] (driver_probe_device+0x5c/0x174)
[] (driver_probe_device) from [] 
(device_driver_attach+0x58/0x60)
[] (device_driver_attach) from [] 
(__driver_attach+0x84/0xc0)
[] (__driver_attach) from [] (bus_for_each_dev+0x58/0x7c)
[] (bus_for_each_dev) from [] (bus_add_driver+0xd0/0x1cc)
[] (bus_add_driver) from [] (driver_register+0x7c/0x110)
[] (driver_register) from [] (do_one_initcall+0x74/0x308)
[] (do_one_initcall) from [] (do_init_module+0x5c/0x1f4)
[] (do_init_module) from [] (load_module+0x19a4/0x2020)
[] (load_module) from [] (sys_finit_module+0x8c/0x98)
[] (sys_finit_module) from [] (ret_fast_syscall+0x0/0x28)
Exception stack(0xdb677fa8 to 0xdb677ff0)
7fa0:   00b04170  0003 007bd84c  00b05cb8
7fc0: 00b04170  1ee84500 017b 0004  00b04eb8 
7fe0: be958178 be958168 007b54bb b6c28712
ipu1_csi0: Registered ipu1_csi0 capture as /dev/video6
imx-media: subdev ipu1_csi0 bound
kobject (dcd780f0): tried to init an initialized object, something is seriously 
wrong.
CPU: 1 PID: 31521 Comm: modprobe Tainted: G C5.2.0+ #325
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0x9c/0xd4)
[] (dump_stack) from [] (kobject_init+0x74/0x94)
[] (kobject_init) from [] (device_initialize+0x1c/0xec)
[] (device_initialize) from [] (device_register+0xc/0x18)
[] (device_register) from [] 
(__video_register_device+0x9b4/0x1228)
[] (__video_register_device) from [] 
(imx_media_capture_device_register+0x44/0x1f4 [imx_media_capture])
[] (imx_media_capture_device_register [imx_media_capture]) from 
[] (csi_registered+0x154/0x19c [imx_media_csi])
[] (csi_registered [imx_media_csi]) from [] 
(v4l2_device_register_subdev+0xd0/0x164)
[] (v4l2_device_register_subdev) from [] 
(v4l2_async_match_notify+0x1c/0x130)
[] (v4l2_async_match_notify) from [] 
(v4l2_async_notifier_try_all_subdevs+0x48/0x94)
[] (v4l2_async_notifier_try_all_subdevs) from [] 
(__v4l2_async_notifier_register+0xa8/0x110)
[] (__v4l2_async_notifier_register) from [] 
(v4l2_async_notifier_register+0x3c/0x54)
[] (v4l2_async_notifier_register) from [] 
(imx_media_dev_notifier_register+0x2c/0x70 [imx_media])
[] (imx_media_dev_notifier_register [i

Re: [PATCH RESEND 2/2] staging: erofs: differentiate unsupported on-disk format

2019-08-14 Thread Chao Yu
On 2019/8/14 12:32, Gao Xiang wrote:
> For some specific fields, use ENOTSUPP instead of EIO
> for values which look sane but aren't supported right now.
> 
> Signed-off-by: Gao Xiang 

Reviewed-by: Chao Yu 

> + return -ENOTSUPP;

I'm a little confused about when we should use ENOTSUPP versus EOPNOTSUPP. I
checked several syscall manual pages, and it looks like EOPNOTSUPP is the one widely used.

Thanks,


Re: [PATCH RESEND 2/2] staging: erofs: differentiate unsupported on-disk format

2019-08-14 Thread Gao Xiang
Hi Chao,

On Wed, Aug 14, 2019 at 05:25:51PM +0800, Chao Yu wrote:
> On 2019/8/14 12:32, Gao Xiang wrote:
> > For some specific fields, use ENOTSUPP instead of EIO
> > for values which look sane but aren't supported right now.
> > 
> > Signed-off-by: Gao Xiang 
> 
> Reviewed-by: Chao Yu 
> 
> > +   return -ENOTSUPP;
> 
> I'm a little confused about when we should use ENOTSUPP versus EOPNOTSUPP. I
> checked several syscall manual pages, and it looks like EOPNOTSUPP is the one widely used.

It seems that you are right, I didn't notice this.
Let me resend this patchset to fix them all...

Thanks,
Gao Xiang

> 
> Thanks,
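
For readers following the ENOTSUPP versus EOPNOTSUPP point: ENOTSUPP (524) lives in the kernel's include/linux/errno.h and has no definition in the userspace <errno.h>, whereas EOPNOTSUPP is a regular errno that userspace can name. The sketch below is a userspace illustration only; `SKETCH_KERNEL_ENOTSUPP` and `sketch_to_user_errno()` are hypothetical names, and the mapping function is not something the patchset adds.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal value, repeated here because userspace <errno.h> lacks it. */
#define SKETCH_KERNEL_ENOTSUPP 524

/*
 * Hypothetical helper: translate a negative error code so that the
 * kernel-internal "not supported" value never reaches a caller that
 * only understands the errno values visible to userspace.
 */
static int sketch_to_user_errno(int err)
{
	return err == -SKETCH_KERNEL_ENOTSUPP ? -EOPNOTSUPP : err;
}

/* Small self-check of the mapping. */
static void sketch_selfcheck_errno(void)
{
	assert(sketch_to_user_errno(-SKETCH_KERNEL_ENOTSUPP) == -EOPNOTSUPP);
	assert(sketch_to_user_errno(-EIO) == -EIO);
	assert(sketch_to_user_errno(0) == 0);
}
```

Returning -EOPNOTSUPP at the source, as the resend proposes, avoids the need for any such translation.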


Re: [PATCH v3 1/3] ARM: dts: imx6ul: Add csi node

2019-08-14 Thread Sakari Ailus
Hi Sébastien,

On Wed, Jul 31, 2019 at 06:32:57PM +0200, Sébastien Szymanski wrote:
> Add csi node for i.MX6UL SoC.
> 
> Reviewed-by: Fabio Estevam 
> Signed-off-by: Sébastien Szymanski 

This should probably be merged through the ARM tree.

I can take the other two.

> ---
> 
> Changes for v3:
>  - none
> 
> Changes for v2:
>  - only "mclk" clock is required now.
> 
>  arch/arm/boot/dts/imx6ul.dtsi | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
> index 81d4b4925127..56cfcf0e5084 100644
> --- a/arch/arm/boot/dts/imx6ul.dtsi
> +++ b/arch/arm/boot/dts/imx6ul.dtsi
> @@ -957,6 +957,15 @@
>   };
>   };
>  
> + csi: csi@21c4000 {
> + compatible = "fsl,imx6ul-csi", "fsl,imx7-csi";
> + reg = <0x021c4000 0x4000>;
> + interrupts = ;
> + clocks = <&clks IMX6UL_CLK_CSI>;
> + clock-names = "mclk";
> + status = "disabled";
> + };
> +
>   lcdif: lcdif@21c8000 {
>   compatible = "fsl,imx6ul-lcdif", 
> "fsl,imx28-lcdif";
>   reg = <0x021c8000 0x4000>;

-- 
Sakari Ailus


Re: [PATCH 04/22] media: Move v4l2_fwnode_parse_link from v4l2 to driver base

2019-08-14 Thread Russell King - ARM Linux admin
On Tue, Aug 06, 2019 at 09:53:41AM -0700, Steve Longerbeam wrote:
> The full patchset doesn't seem to be up yet, but see [1] for the cover
> letter.

Was the entire series copied to the mailing lists, or just selected
patches?  I only saw 4, 9, 11 and 13-22 via lakml.

In the absence of the other patches, will this solve imx-media binding
the internal subdevs of sensor devices to the CSI2 interface?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


[PATCH v2 3/3] staging: erofs: correct all misused ENOTSUPP

2019-08-14 Thread Gao Xiang
As Chao pointed out [1], ENOTSUPP is used for NFS
protocol only, we should use EOPNOTSUPP instead...

[1] 
https://lore.kernel.org/lkml/108ee2f9-75dd-b8ab-8da7-b81c17baf...@huawei.com/

Reported-by: Chao Yu 
Signed-off-by: Gao Xiang 
---
 drivers/staging/erofs/decompressor.c | 2 +-
 drivers/staging/erofs/internal.h | 6 +++---
 drivers/staging/erofs/xattr.c| 2 +-
 drivers/staging/erofs/xattr.h| 4 ++--
 drivers/staging/erofs/zmap.c | 8 
 5 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/erofs/decompressor.c 
b/drivers/staging/erofs/decompressor.c
index 5361a2bbedb6..32a811ac704a 100644
--- a/drivers/staging/erofs/decompressor.c
+++ b/drivers/staging/erofs/decompressor.c
@@ -124,7 +124,7 @@ static int lz4_decompress(struct z_erofs_decompress_req 
*rq, u8 *out)
int ret;
 
if (rq->inputsize > PAGE_SIZE)
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
src = kmap_atomic(*rq->in);
inputmargin = 0;
diff --git a/drivers/staging/erofs/internal.h b/drivers/staging/erofs/internal.h
index 12f737cbc0c0..0e8d58546c52 100644
--- a/drivers/staging/erofs/internal.h
+++ b/drivers/staging/erofs/internal.h
@@ -403,12 +403,12 @@ int z_erofs_map_blocks_iter(struct inode *inode,
struct erofs_map_blocks *map,
int flags);
 #else
-static inline int z_erofs_fill_inode(struct inode *inode) { return -ENOTSUPP; }
+static inline int z_erofs_fill_inode(struct inode *inode) { return 
-EOPNOTSUPP; }
 static inline int z_erofs_map_blocks_iter(struct inode *inode,
  struct erofs_map_blocks *map,
  int flags)
 {
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 }
 #endif /* !CONFIG_EROFS_FS_ZIP */
 
@@ -516,7 +516,7 @@ void *erofs_get_pcpubuf(unsigned int pagenr);
 #else
 static inline void *erofs_get_pcpubuf(unsigned int pagenr)
 {
-   return ERR_PTR(-ENOTSUPP);
+   return ERR_PTR(-EOPNOTSUPP);
 }
 
 #define erofs_put_pcpubuf(buf) do {} while (0)
diff --git a/drivers/staging/erofs/xattr.c b/drivers/staging/erofs/xattr.c
index c5bfc9be412f..e7e5840e3f9d 100644
--- a/drivers/staging/erofs/xattr.c
+++ b/drivers/staging/erofs/xattr.c
@@ -71,7 +71,7 @@ static int init_inode_xattrs(struct inode *inode)
if (vi->xattr_isize == sizeof(struct erofs_xattr_ibody_header)) {
errln("xattr_isize %d of nid %llu is not supported yet",
  vi->xattr_isize, vi->nid);
-   ret = -ENOTSUPP;
+   ret = -EOPNOTSUPP;
goto out_unlock;
} else if (vi->xattr_isize < sizeof(struct erofs_xattr_ibody_header)) {
if (unlikely(vi->xattr_isize)) {
diff --git a/drivers/staging/erofs/xattr.h b/drivers/staging/erofs/xattr.h
index 63cc87e3d3f4..e20249647541 100644
--- a/drivers/staging/erofs/xattr.h
+++ b/drivers/staging/erofs/xattr.h
@@ -74,13 +74,13 @@ static inline int erofs_getxattr(struct inode *inode, int 
index,
 const char *name, void *buffer,
 size_t buffer_size)
 {
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 }
 
 static inline ssize_t erofs_listxattr(struct dentry *dentry,
  char *buffer, size_t buffer_size)
 {
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 }
 #endif /* !CONFIG_EROFS_FS_XATTR */
 
diff --git a/drivers/staging/erofs/zmap.c b/drivers/staging/erofs/zmap.c
index 5551e615e8ea..b61b9b5950ac 100644
--- a/drivers/staging/erofs/zmap.c
+++ b/drivers/staging/erofs/zmap.c
@@ -68,7 +68,7 @@ static int fill_inode_lazy(struct inode *inode)
if (vi->z_algorithmtype[0] >= Z_EROFS_COMPRESSION_MAX) {
errln("unknown compression format %u for nid %llu, please 
upgrade kernel",
  vi->z_algorithmtype[0], vi->nid);
-   err = -ENOTSUPP;
+   err = -EOPNOTSUPP;
goto unmap_done;
}
 
@@ -79,7 +79,7 @@ static int fill_inode_lazy(struct inode *inode)
if (vi->z_physical_clusterbits[0] != LOG_BLOCK_SIZE) {
errln("unsupported physical clusterbits %u for nid %llu, please 
upgrade kernel",
  vi->z_physical_clusterbits[0], vi->nid);
-   err = -ENOTSUPP;
+   err = -EOPNOTSUPP;
goto unmap_done;
}
 
@@ -211,7 +211,7 @@ static int unpack_compacted_index(struct 
z_erofs_maprecorder *m,
else if (1 << amortizedshift == 2 && lclusterbits == 12)
vcnt = 16;
else
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
encodebits = ((vcnt << amortizedshift) - sizeof(__le32)) * 8 / vcnt;
base = round_down(eofs, vcnt << amortizedshift);
@@ -275,7 +275,7 @@ static int compacted_load_cluster_from_disk(struct 
z_erofs_maprecorder *m,
int err;
 
if (lclusterbits != 1

[PATCH v2 1/3] staging: erofs: introduce EFSCORRUPTED and more logs

2019-08-14 Thread Gao Xiang
Previously, EROFS used EIO to indicate that the filesystem
is corrupted as well. However, as Pavel said [1], other
filesystems tend to use EUCLEAN(EFSCORRUPTED) instead.
Let's follow what others do right now.

Also, add some more prints to the syslog.

[1] https://lore.kernel.org/lkml/20190813114821.GB11559@amd/

Suggested-by: Pavel Machek 
Reviewed-by: Chao Yu 
Signed-off-by: Gao Xiang 
---
change log from v1:
 - update the commit message;

This patchset has dependency on the previous patchset yesterday
 https://lore.kernel.org/lkml/20190813023054.73126-1-gaoxian...@huawei.com/

Thanks,
Gao Xiang

 drivers/staging/erofs/data.c |  6 --
 drivers/staging/erofs/dir.c  | 15 ---
 drivers/staging/erofs/inode.c| 17 -
 drivers/staging/erofs/internal.h |  2 ++
 drivers/staging/erofs/namei.c|  6 --
 drivers/staging/erofs/xattr.c|  5 +++--
 drivers/staging/erofs/zmap.c |  5 +++--
 7 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
index 4cdb743c8b8d..72c4b4c5296b 100644
--- a/drivers/staging/erofs/data.c
+++ b/drivers/staging/erofs/data.c
@@ -143,10 +143,12 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
vi->xattr_isize + erofs_blkoff(map->m_la);
map->m_plen = inode->i_size - offset;
 
-   /* inline data should locate in one meta block */
+   /* inline data should be located in one meta block */
if (erofs_blkoff(map->m_pa) + map->m_plen > PAGE_SIZE) {
+   errln("inline data cross block boundary @ nid %llu",
+ vi->nid);
DBG_BUGON(1);
-   err = -EIO;
+   err = -EFSCORRUPTED;
goto err_out;
}
 
diff --git a/drivers/staging/erofs/dir.c b/drivers/staging/erofs/dir.c
index 2fbfc4935077..01efc96e1212 100644
--- a/drivers/staging/erofs/dir.c
+++ b/drivers/staging/erofs/dir.c
@@ -34,7 +34,7 @@ static void debug_one_dentry(unsigned char d_type, const char 
*de_name,
 #endif
 }
 
-static int erofs_fill_dentries(struct dir_context *ctx,
+static int erofs_fill_dentries(struct inode *dir, struct dir_context *ctx,
   void *dentry_blk, unsigned int *ofs,
   unsigned int nameoff, unsigned int maxsize)
 {
@@ -63,8 +63,9 @@ static int erofs_fill_dentries(struct dir_context *ctx,
/* a corrupted entry is found */
if (unlikely(nameoff + de_namelen > maxsize ||
 de_namelen > EROFS_NAME_LEN)) {
+   errln("bogus dirent @ nid %llu", EROFS_V(dir)->nid);
DBG_BUGON(1);
-   return -EIO;
+   return -EFSCORRUPTED;
}
 
debug_one_dentry(d_type, de_name, de_namelen);
@@ -104,10 +105,9 @@ static int erofs_readdir(struct file *f, struct 
dir_context *ctx)
 
if (unlikely(nameoff < sizeof(struct erofs_dirent) ||
 nameoff >= PAGE_SIZE)) {
-   errln("%s, invalid de[0].nameoff %u",
- __func__, nameoff);
-
-   err = -EIO;
+   errln("%s, invalid de[0].nameoff %u @ nid %llu",
+ __func__, nameoff, EROFS_V(dir)->nid);
+   err = -EFSCORRUPTED;
goto skip_this;
}
 
@@ -123,7 +123,8 @@ static int erofs_readdir(struct file *f, struct dir_context 
*ctx)
goto skip_this;
}
 
-   err = erofs_fill_dentries(ctx, de, &ofs, nameoff, maxsize);
+   err = erofs_fill_dentries(dir, ctx, de, &ofs,
+ nameoff, maxsize);
 skip_this:
kunmap(dentry_page);
 
diff --git a/drivers/staging/erofs/inode.c b/drivers/staging/erofs/inode.c
index 286729143365..461fd4213ce7 100644
--- a/drivers/staging/erofs/inode.c
+++ b/drivers/staging/erofs/inode.c
@@ -43,7 +43,7 @@ static int read_inode(struct inode *inode, void *data)
else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode))
inode->i_rdev = 0;
else
-   return -EIO;
+   goto bogusimode;
 
i_uid_write(inode, le32_to_cpu(v2->i_uid));
i_gid_write(inode, le32_to_cpu(v2->i_gid));
@@ -76,7 +76,7 @@ static int read_inode(struct inode *inode, void *data)
else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode))
inode->i_rdev = 0;
else
-   return -EIO;
+   goto bogusimode;
 
i_uid_write(inode, le16_to_cpu(v1->i_uid));
i_gid_write(inode, le16_to_cpu(v1->i_gid))

[PATCH v2 2/3] staging: erofs: differentiate unsupported on-disk format

2019-08-14 Thread Gao Xiang
For some specific fields, use EOPNOTSUPP instead of EIO
for values which look sane but aren't supported right now.

Reviewed-by: Chao Yu 
Signed-off-by: Gao Xiang 
---
change log from v1:
 - use EOPNOTSUPP rather than ENOTSUPP, as pointed out by Chao;

 drivers/staging/erofs/inode.c | 4 ++--
 drivers/staging/erofs/zmap.c  | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/erofs/inode.c b/drivers/staging/erofs/inode.c
index 461fd4213ce7..c8f3ded17583 100644
--- a/drivers/staging/erofs/inode.c
+++ b/drivers/staging/erofs/inode.c
@@ -24,7 +24,7 @@ static int read_inode(struct inode *inode, void *data)
errln("unsupported data mapping %u of nid %llu",
  vi->datamode, vi->nid);
DBG_BUGON(1);
-   return -EIO;
+   return -EOPNOTSUPP;
}
 
if (__inode_version(advise) == EROFS_INODE_LAYOUT_V2) {
@@ -95,7 +95,7 @@ static int read_inode(struct inode *inode, void *data)
errln("unsupported on-disk inode version %u of nid %llu",
  __inode_version(advise), vi->nid);
DBG_BUGON(1);
-   return -EIO;
+   return -EOPNOTSUPP;
}
 
if (!nblks)
diff --git a/drivers/staging/erofs/zmap.c b/drivers/staging/erofs/zmap.c
index 16b3625604f4..5551e615e8ea 100644
--- a/drivers/staging/erofs/zmap.c
+++ b/drivers/staging/erofs/zmap.c
@@ -178,7 +178,7 @@ static int vle_legacy_load_cluster_from_disk(struct 
z_erofs_maprecorder *m,
break;
default:
DBG_BUGON(1);
-   return -EIO;
+   return -EOPNOTSUPP;
}
m->type = type;
return 0;
@@ -362,7 +362,7 @@ static int vle_extent_lookback(struct z_erofs_maprecorder 
*m,
errln("unknown type %u at lcn %lu of nid %llu",
  m->type, lcn, vi->nid);
DBG_BUGON(1);
-   return -EIO;
+   return -EOPNOTSUPP;
}
return 0;
 }
@@ -436,7 +436,7 @@ int z_erofs_map_blocks_iter(struct inode *inode,
default:
errln("unknown type %u at offset %llu of nid %llu",
  m.type, ofs, vi->nid);
-   err = -EIO;
+   err = -EOPNOTSUPP;
goto unmap_out;
}
 
-- 
2.17.1



Re: [PATCH 04/22] media: Move v4l2_fwnode_parse_link from v4l2 to driver base

2019-08-14 Thread Steve Longerbeam
On 8/14/19 3:30 AM, Russell King - ARM Linux admin wrote:
> On Tue, Aug 06, 2019 at 09:53:41AM -0700, Steve Longerbeam wrote:
> > The full patchset doesn't seem to be up yet, but see [1] for the cover
> > letter.
> Was the entire series copied to the mailing lists, or just selected
> patches?  I only saw 4, 9, 11 and 13-22 via lakml.

The whole series was posted to the linux-media ML, see [1]. At the time,
none of the linux-media ML archives had the whole series.

> In the absence of the other patches, will this solve imx-media binding
> the internal subdevs of sensor devices to the CSI2 interface?

"internal subdevs of sensor devices" ?? That doesn't make any sense.

Sensors are external to the SoC, there are no "internal" sensor devices.

Not sure what you mean by "binding" either in this context, but external
sensors can connect via fwnode endpoint, and later translated to media
link, to the receiver CSI-2 sink.

Steve

[1] https://www.spinics.net/lists/linux-media/msg155160.html


Re: [PATCH 04/22] media: Move v4l2_fwnode_parse_link from v4l2 to driver base

2019-08-14 Thread Russell King - ARM Linux admin
On Wed, Aug 14, 2019 at 12:04:41PM -0700, Steve Longerbeam wrote:
> 
> 
> On 8/14/19 3:30 AM, Russell King - ARM Linux admin wrote:
> > On Tue, Aug 06, 2019 at 09:53:41AM -0700, Steve Longerbeam wrote:
> > > The full patchset doesn't seem to be up yet, but see [1] for the cover
> > > letter.
> > Was the entire series copied to the mailing lists, or just selected
> > patches?  I only saw 4, 9, 11 and 13-22 via lakml.
> 
> The whole series was posted to the linux-media ML, see [1]. At the time,
> none of the linux-media ML archives had the whole series.
> 
> > In the absence of the other patches, will this solve imx-media binding
> > the internal subdevs of sensor devices to the CSI2 interface?
> 
> "internal subdevs of sensor devices" ?? That doesn't make any sense.

Sorry, but it makes complete sense when you consider that sensor
devices may have more than one subdev, but there should be only one
that is the "output" to whatever the camera is attached to.  The
other subdevs are internal to the sensor.

subdevs are not purely the remit of SoC drivers.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 04/22] media: Move v4l2_fwnode_parse_link from v4l2 to driver base

2019-08-14 Thread Steve Longerbeam
On 8/14/19 3:04 PM, Russell King - ARM Linux admin wrote:
> On Wed, Aug 14, 2019 at 12:04:41PM -0700, Steve Longerbeam wrote:
> >
> > On 8/14/19 3:30 AM, Russell King - ARM Linux admin wrote:
> > > On Tue, Aug 06, 2019 at 09:53:41AM -0700, Steve Longerbeam wrote:
> > > > The full patchset doesn't seem to be up yet, but see [1] for the cover
> > > > letter.
> > > Was the entire series copied to the mailing lists, or just selected
> > > patches?  I only saw 4, 9, 11 and 13-22 via lakml.
> > The whole series was posted to the linux-media ML, see [1]. At the time,
> > none of the linux-media ML archives had the whole series.
> >
> > > In the absence of the other patches, will this solve imx-media binding
> > > the internal subdevs of sensor devices to the CSI2 interface?
> > "internal subdevs of sensor devices" ?? That doesn't make any sense.
> Sorry, but it makes complete sense when you consider that sensor
> devices may have more than one subdev, but there should be only one
> that is the "output" to whatever the camera is attached to.  The
> other subdevs are internal to the sensor.

Ah, thanks for the clarification. Yes, by "internal subdevs" I
understand what you mean now. The adv748x and smiapp are examples.

> subdevs are not purely the remit of SoC drivers.

So there is no binding of internal subdevs to the receiver CSI-2. The
receiver CSI-2 subdev will create media links to the subdev that has an
externally exposed fwnode endpoint that connects with the CSI-2 sink pad.

Steve



Re: [PATCH 04/22] media: Move v4l2_fwnode_parse_link from v4l2 to driver base

2019-08-14 Thread Russell King - ARM Linux admin
On Wed, Aug 14, 2019 at 04:00:30PM -0700, Steve Longerbeam wrote:
> 
> 
> On 8/14/19 3:04 PM, Russell King - ARM Linux admin wrote:
> > On Wed, Aug 14, 2019 at 12:04:41PM -0700, Steve Longerbeam wrote:
> > > 
> > > On 8/14/19 3:30 AM, Russell King - ARM Linux admin wrote:
> > > > On Tue, Aug 06, 2019 at 09:53:41AM -0700, Steve Longerbeam wrote:
> > > > > The full patchset doesn't seem to be up yet, but see [1] for the cover
> > > > > letter.
> > > > Was the entire series copied to the mailing lists, or just selected
> > > > patches?  I only saw 4, 9, 11 and 13-22 via lakml.
> > > The whole series was posted to the linux-media ML, see [1]. At the time,
> > > none of the linux-media ML archives had the whole series.
> > > 
> > > > In the absence of the other patches, will this solve imx-media binding
> > > > the internal subdevs of sensor devices to the CSI2 interface?
> > > "internal subdevs of sensor devices" ?? That doesn't make any sense.
> > Sorry, but it makes complete sense when you consider that sensor
> > devices may have more than one subdev, but there should be only one
> > that is the "output" to whatever the camera is attached to.  The
> > other subdevs are internal to the sensor.
> 
> Ah, thanks for the clarification. Yes, by "internal subdevs" I understand
> what you mean now. The adv748x and smiapp are examples.
> 
> > 
> > subdevs are not purely the remit of SoC drivers.
> 
> So there is no binding of internal subdevs to the receiver CSI-2. The
> receiver CSI-2 subdev will create media links to the subdev that has an
> externally exposed fwnode endpoint that connects with the CSI-2 sink pad.

Maybe - with 5.2, I get:

- entity 15: imx6-mipi-csi2 (5 pads, 6 links)
 type V4L2 subdev subtype Unknown flags 0
 device node name /dev/v4l-subdev2
pad0: Sink
...
<- "imx219 0-0010":0 []
<- "imx219 pixel 0-0010":0 []

Adding some debug in gives:

[   11.963362] imx-media: imx_media_create_of_links() for imx6-mipi-csi2
[   11.963396] imx-media: create_of_link(): 
/soc/aips-bus@200/iomuxc-gpr@20e/ipu1_csi0_mux
[   11.963422] imx-media: create_of_link(): /soc/ipu@240
[   11.963450] imx-media: create_of_link(): /soc/ipu@280
[   11.963478] imx-media: create_of_link(): 
/soc/aips-bus@200/iomuxc-gpr@20e/ipu2_csi1_mux
[   11.963489] imx-media: imx6-mipi-csi2:4 -> ipu2_csi1_mux:0
[   11.963522] imx-media: create_of_link(): 
/soc/aips-bus@210/i2c@21a/camera@10
[   11.963533] imx-media: imx219 0-0010:0 -> imx6-mipi-csi2:0
[   11.963549] imx-media: imx_media_create_of_links() for imx219 pixel 0-0010
[   11.963577] imx-media: create_of_link(): /soc/aips-bus@210/mipi@21dc000
[   11.963587] imx-media: imx219 pixel 0-0010:0 -> imx6-mipi-csi2:0
[   11.963602] imx-media: imx_media_create_of_links() for imx219 0-0010

Note that it's not created by imx6-mipi-csi2, but by imx-media delving
around in the imx219 subdevs.

From what I can see, smiapp does the same thing that I do in imx219 -
sets the subdev->dev member to point at the struct device, which then
means that v4l2_device_register_subdev() will associate the same fwnode
with both "imx219 pixel 0-0010" and "imx219 0-0010".
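
The effect described above can be modelled with a toy userspace sketch: if registration fills in a missing fwnode from the owning device, two subdevs that point at the same device end up matching the same fwnode. Every name here (`sketch_fwnode`, `sketch_register_subdev()`, and so on) is hypothetical, and the inheritance rule is an assumption taken from the description in this mail, not a copy of the V4L2 code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's own types. */
struct sketch_fwnode { const char *path; };
struct sketch_device { struct sketch_fwnode *fwnode; };

struct sketch_subdev {
	const char *name;
	struct sketch_device *dev;     /* owning device, may be shared */
	struct sketch_fwnode *fwnode;  /* NULL until set or inherited */
};

/* Assumed rule: a subdev without its own fwnode inherits its device's. */
static void sketch_register_subdev(struct sketch_subdev *sd)
{
	assert(sd != NULL);
	if (!sd->fwnode && sd->dev)
		sd->fwnode = sd->dev->fwnode;
}

/* Count how many subdevs would match a lookup by fwnode. */
static size_t sketch_count_matches(const struct sketch_subdev *sds, size_t n,
				   const struct sketch_fwnode *fw)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++)
		if (sds[i].fwnode == fw)
			hits++;
	return hits;
}
```

With two subdevs sharing one device, a lookup by that device's fwnode returns two matches, which is consistent with both "imx219 0-0010" and "imx219 pixel 0-0010" being linked to the CSI-2 sink pad in the log above.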

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 3/3] staging: erofs: correct all misused ENOTSUPP

2019-08-14 Thread Chao Yu
On 2019/8/14 18:37, Gao Xiang wrote:
> As Chao pointed out [1], ENOTSUPP is used for NFS
> protocol only, we should use EOPNOTSUPP instead...
> 
> [1] 
> https://lore.kernel.org/lkml/108ee2f9-75dd-b8ab-8da7-b81c17baf...@huawei.com/
> 
> Reported-by: Chao Yu 
> Signed-off-by: Gao Xiang 

Reviewed-by: Chao Yu 

Thanks,


[PATCH v8 11/24] erofs: introduce xattr & posixacl support

2019-08-14 Thread Gao Xiang
This implements xattr and posixacl functionalities.

1) Inline and shared xattrs are introduced for flexibility.

   Specifically, if the same xattr is used for a large number of
   inodes, or an xattr is too large to be inlined, a shared
   xattr will be used instead in xattr meta.

2) Add .get_acl() to read the file's acl from its xattrs to
   make POSIX ACL usable.

   Here is the on-disk detail,
   fullname: system.posix_acl_access
   struct erofs_xattr_entry:
   .e_name_len = 0
   .e_name_index = EROFS_XATTR_INDEX_POSIX_ACL_ACCESS (2)

   fullname: system.posix_acl_default
   struct erofs_xattr_entry:
   .e_name_len = 0
   .e_name_index = EROFS_XATTR_INDEX_POSIX_ACL_DEFAULT (3)
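
The on-disk detail above (a name index selects a prefix, and a zero `e_name_len` means the prefix alone is the full name) can be sketched as follows. This is a simplified userspace model: the struct keeps only the two fields listed above, the `SKETCH_*` constants reuse the index values 2 and 3 from this message, and `sketch_fullname()` is a hypothetical helper, not a function from xattr.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Index values as listed in the commit message. */
#define SKETCH_XATTR_INDEX_POSIX_ACL_ACCESS  2
#define SKETCH_XATTR_INDEX_POSIX_ACL_DEFAULT 3

/* Simplified entry: only the fields mentioned above are modelled. */
struct sketch_xattr_entry {
	uint8_t e_name_len;    /* length of the name suffix, 0 for ACLs */
	uint8_t e_name_index;  /* selects the name prefix */
};

/* Map a name index to its prefix; NULL for indexes not modelled here. */
static const char *sketch_prefix(uint8_t index)
{
	switch (index) {
	case SKETCH_XATTR_INDEX_POSIX_ACL_ACCESS:
		return "system.posix_acl_access";
	case SKETCH_XATTR_INDEX_POSIX_ACL_DEFAULT:
		return "system.posix_acl_default";
	default:
		return NULL;
	}
}

/*
 * Build "prefix + suffix" into buf. Only e_name_len bytes of `suffix`
 * are used, so a zero length yields the prefix alone.
 * Returns the name length, or -1 on an unknown index or a short buffer.
 */
static int sketch_fullname(const struct sketch_xattr_entry *e,
			   const char *suffix, char *buf, size_t buflen)
{
	assert(e != NULL && buf != NULL);

	const char *prefix = sketch_prefix(e->e_name_index);
	int n;

	if (!prefix)
		return -1;
	n = snprintf(buf, buflen, "%s%.*s", prefix,
		     (int)e->e_name_len, suffix ? suffix : "");
	return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```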

Signed-off-by: Gao Xiang 
---
 fs/erofs/Kconfig|  38 +++
 fs/erofs/Makefile   |   1 +
 fs/erofs/inode.c|  14 +-
 fs/erofs/internal.h |  16 +
 fs/erofs/namei.c|   6 +-
 fs/erofs/super.c|  77 -
 fs/erofs/xattr.c| 705 
 fs/erofs/xattr.h|  94 ++
 8 files changed, 948 insertions(+), 3 deletions(-)
 create mode 100644 fs/erofs/xattr.c
 create mode 100644 fs/erofs/xattr.h

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index 98f05043448a..c5e7c5ae026e 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -34,3 +34,41 @@ config EROFS_FAULT_INJECTION
  Test EROFS to inject faults such as ENOMEM, EIO, and so on.
  If unsure, say N.
 
+config EROFS_FS_XATTR
+   bool "EROFS extended attributes"
+   depends on EROFS_FS
+   default y
+   help
+ Extended attributes are name:value pairs associated with inodes by
+ the kernel or by users (see the attr(5) manual page, or visit
+  for details).
+
+ If unsure, say N.
+
+config EROFS_FS_POSIX_ACL
+   bool "EROFS Access Control Lists"
+   depends on EROFS_FS_XATTR
+   select FS_POSIX_ACL
+   default y
+   help
+ Posix Access Control Lists (ACLs) support permissions for users and
+ groups beyond the owner/group/world scheme.
+
+ To learn more about Access Control Lists, visit the POSIX ACLs for
+ Linux website .
+
+ If you don't know what Access Control Lists are, say N.
+
+config EROFS_FS_SECURITY
+   bool "EROFS Security Labels"
+   depends on EROFS_FS_XATTR
+   default y
+   help
+ Security labels provide an access control facility to support Linux
+ Security Models (LSMs) accepted by AppArmor, SELinux, Smack and TOMOYO
+ Linux. This option enables an extended attribute handler for file
+ security labels in the erofs filesystem, so that it requires enabling
+ the extended attribute support in advance.
+
+ If you are not using a security module, say N.
+
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index c3f4e549ef90..190a73083f23 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -6,4 +6,5 @@ ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
 
 obj-$(CONFIG_EROFS_FS) += erofs.o
 erofs-objs := super.o inode.o data.o namei.o dir.o
+erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
 
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index f55193856359..d4e5de383435 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -6,7 +6,7 @@
  * http://www.huawei.com/
  * Created by Gao Xiang 
  */
-#include "internal.h"
+#include "xattr.h"
 
 #include 
 
@@ -307,15 +307,27 @@ int erofs_getattr(const struct path *path, struct kstat *stat,
 
 const struct inode_operations erofs_generic_iops = {
.getattr = erofs_getattr,
+#ifdef CONFIG_EROFS_FS_XATTR
+   .listxattr = erofs_listxattr,
+#endif
+   .get_acl = erofs_get_acl,
 };
 
 const struct inode_operations erofs_symlink_iops = {
.get_link = page_get_link,
.getattr = erofs_getattr,
+#ifdef CONFIG_EROFS_FS_XATTR
+   .listxattr = erofs_listxattr,
+#endif
+   .get_acl = erofs_get_acl,
 };
 
 const struct inode_operations erofs_fast_symlink_iops = {
.get_link = simple_get_link,
.getattr = erofs_getattr,
+#ifdef CONFIG_EROFS_FS_XATTR
+   .listxattr = erofs_listxattr,
+#endif
+   .get_acl = erofs_get_acl,
 };
 
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 5b946eabc239..f42cdda2eebc 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -62,6 +62,9 @@ typedef u32 erofs_blk_t;
 struct erofs_sb_info {
u32 blocks;
u32 meta_blkaddr;
+#ifdef CONFIG_EROFS_FS_XATTR
+   u32 xattr_blkaddr;
+#endif
 
/* inode slot unit size in bit shift */
unsigned char islotbits;
@@ -132,6 +135,8 @@ static inline void *erofs_kmalloc(struct erofs_sb_info *sbi,
 #define EROFS_I_SB(inode) ((struct erofs_sb_info *)(inode)->i_sb->s_fs_info)
 
 /* Mount flags set via mount options or defaults */
+#define EROFS_MOUNT_XATTR_USER 0x0010
+#define EROFS_MOUNT_POSIX_ACL  0x0020
 

[PATCH v8 04/24] erofs: add raw address_space operations

2019-08-14 Thread Gao Xiang
This commit adds functions for meta and raw data, and also
provides address_space_operations for raw data access.
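
(Editorial sketch, not part of the patch.) For a plain flat inode with
no inline tail, the logical-to-physical mapping done by
erofs_map_blocks_flatmode() in the diff below is simple arithmetic.
The userspace illustration here assumes 4 KiB blocks and deliberately
leaves out the inline-tail case; struct flat_map and flat_map_plain()
are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLKSIZ 4096u	/* assumption: block size == 4 KiB page size */

static_assert((BLKSIZ & (BLKSIZ - 1)) == 0, "block size must be a power of two");

struct flat_map {
	bool     mapped;	/* false for out-of-bound offsets */
	uint64_t pa;		/* physical byte address */
	uint64_t plen;		/* contiguous mapped length in bytes */
};

/* Map logical offset la of a plain (non-inline) flat inode. */
static struct flat_map flat_map_plain(uint64_t i_size, uint32_t raw_blkaddr,
				      uint64_t la)
{
	struct flat_map m = { false, 0, 0 };
	uint64_t nblocks = (i_size + BLKSIZ - 1) / BLKSIZ;

	if (la >= i_size)
		return m;	/* leave out-of-bound access unmapped */

	/* there is no hole in flat mode: data is physically contiguous */
	m.mapped = true;
	m.pa = (uint64_t)raw_blkaddr * BLKSIZ + la;
	m.plen = nblocks * BLKSIZ - la;
	return m;
}
```

For a 10000-byte file starting at block 10, offset 5000 maps to byte
address 45960 with 7288 bytes remaining up to the end of the last block.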

Signed-off-by: Gao Xiang 
---
 fs/erofs/data.c | 419 
 1 file changed, 419 insertions(+)
 create mode 100644 fs/erofs/data.c

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
new file mode 100644
index 000000000000..3d8f1511cacb
--- /dev/null
+++ b/fs/erofs/data.c
@@ -0,0 +1,419 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/data.c
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "internal.h"
+#include 
+
+#include 
+
+static inline void read_endio(struct bio *bio)
+{
+   struct super_block *const sb = bio->bi_private;
+   struct bio_vec *bvec;
+   blk_status_t err = bio->bi_status;
+   struct bvec_iter_all iter_all;
+
+   if (time_to_inject(EROFS_SB(sb), FAULT_READ_IO)) {
+   erofs_show_injection_info(FAULT_READ_IO);
+   err = BLK_STS_IOERR;
+   }
+
+   bio_for_each_segment_all(bvec, bio, iter_all) {
+   struct page *page = bvec->bv_page;
+
+   /* page is already locked */
+   DBG_BUGON(PageUptodate(page));
+
+   if (unlikely(err))
+   SetPageError(page);
+   else
+   SetPageUptodate(page);
+
+   unlock_page(page);
+   /* page could be reclaimed now */
+   }
+   bio_put(bio);
+}
+
+/* prio -- true is used for dir */
+struct page *__erofs_get_meta_page(struct super_block *sb,
+  erofs_blk_t blkaddr, bool prio, bool nofail)
+{
+   struct inode *const bd_inode = sb->s_bdev->bd_inode;
+   struct address_space *const mapping = bd_inode->i_mapping;
+   /* prefer retrying in the allocator to blindly looping below */
+   const gfp_t gfp = mapping_gfp_constraint(mapping, ~__GFP_FS) |
+   (nofail ? __GFP_NOFAIL : 0);
+   unsigned int io_retries = nofail ? EROFS_IO_MAX_RETRIES_NOFAIL : 0;
+   struct page *page;
+   int err;
+
+repeat:
+   page = find_or_create_page(mapping, blkaddr, gfp);
+   if (unlikely(!page)) {
+   DBG_BUGON(nofail);
+   return ERR_PTR(-ENOMEM);
+   }
+   DBG_BUGON(!PageLocked(page));
+
+   if (!PageUptodate(page)) {
+   struct bio *bio;
+
+   bio = erofs_grab_bio(sb, blkaddr, 1, sb, read_endio, nofail);
+   if (IS_ERR(bio)) {
+   DBG_BUGON(nofail);
+   err = PTR_ERR(bio);
+   goto err_out;
+   }
+
+   err = bio_add_page(bio, page, PAGE_SIZE, 0);
+   if (unlikely(err != PAGE_SIZE)) {
+   err = -EFAULT;
+   goto err_out;
+   }
+
+   __submit_bio(bio, REQ_OP_READ,
+REQ_META | (prio ? REQ_PRIO : 0));
+
+   lock_page(page);
+
+   /* this page has been truncated by others */
+   if (unlikely(page->mapping != mapping)) {
+unlock_repeat:
+   unlock_page(page);
+   put_page(page);
+   goto repeat;
+   }
+
+   /* more likely a read error */
+   if (unlikely(!PageUptodate(page))) {
+   if (io_retries) {
+   --io_retries;
+   goto unlock_repeat;
+   }
+   err = -EIO;
+   goto err_out;
+   }
+   }
+   return page;
+
+err_out:
+   unlock_page(page);
+   put_page(page);
+   return ERR_PTR(err);
+}
+
+static int erofs_map_blocks_flatmode(struct inode *inode,
+struct erofs_map_blocks *map,
+int flags)
+{
+   int err = 0;
+   erofs_blk_t nblocks, lastblk;
+   u64 offset = map->m_la;
+   struct erofs_vnode *vi = EROFS_V(inode);
+
+   trace_erofs_map_blocks_flatmode_enter(inode, map, flags);
+
+   nblocks = DIV_ROUND_UP(inode->i_size, PAGE_SIZE);
+   lastblk = nblocks - is_inode_flat_inline(inode);
+
+   if (unlikely(offset >= inode->i_size)) {
+   /* leave out-of-bound access unmapped */
+   map->m_flags = 0;
+   map->m_plen = 0;
+   goto out;
+   }
+
+   /* there is no hole in flatmode */
+   map->m_flags = EROFS_MAP_MAPPED;
+
+   if (offset < blknr_to_addr(lastblk)) {
+   map->m_pa = blknr_to_addr(vi->raw_blkaddr) + map->m_la;
+   map->m_plen = blknr_to_addr(lastblk) - offset;
+   } else if (is_inode_flat_inline(inode)) {
+   /* 2 - inode inline B: inode, [xattrs], inline last blk... */
+   struct erofs_sb_info *sbi = EROFS_SB(inode->

[PATCH v8 07/24] erofs: add directory operations

2019-08-14 Thread Gao Xiang
This adds functions for directories, mainly readdir.
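
(Editorial sketch, not part of the patch.) erofs_fill_dentries() in the
diff below derives each name length from the gap between consecutive
name offsets, and falls back to a bounded string length for the last
entry in the block. The userspace illustration uses a simplified,
host-endian header that holds only nameoff; struct dirent_hdr and
dirent_namelen() are hypothetical and do not match the real on-disk
struct erofs_dirent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, host-endian stand-in for the on-disk dirent (assumption). */
struct dirent_hdr {
	uint16_t nameoff;	/* byte offset of the name within the block */
};

/* Bounded strlen, like strnlen(), written out to stay within ISO C. */
static size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		++n;
	return n;
}

/*
 * Length of the name of entry idx in a block of maxsize bytes.
 * The first entry's nameoff also marks the end of the header array.
 * Returns -1 if idx is out of range or the offsets are inconsistent.
 */
static int dirent_namelen(const unsigned char *blk, size_t maxsize,
			  unsigned int idx)
{
	struct dirent_hdr first, cur, next;
	unsigned int ndirents;

	memcpy(&first, blk, sizeof(first));
	if (first.nameoff < sizeof(first) || first.nameoff >= maxsize)
		return -1;
	ndirents = first.nameoff / sizeof(first);
	if (idx >= ndirents)
		return -1;

	memcpy(&cur, blk + idx * sizeof(cur), sizeof(cur));
	if (cur.nameoff < first.nameoff || cur.nameoff > maxsize)
		return -1;

	if (idx + 1 < ndirents) {
		/* not the last dirent: the next name starts where this ends */
		memcpy(&next, blk + (idx + 1) * sizeof(next), sizeof(next));
		if (next.nameoff < cur.nameoff || next.nameoff > maxsize)
			return -1;
		return next.nameoff - cur.nameoff;
	}
	/* the last dirent: the name may lack a trailing '\0' */
	return (int)bounded_len((const char *)blk + cur.nameoff,
				maxsize - cur.nameoff);
}
```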

Signed-off-by: Gao Xiang 
---
 fs/erofs/dir.c | 148 +
 1 file changed, 148 insertions(+)
 create mode 100644 fs/erofs/dir.c

diff --git a/fs/erofs/dir.c b/fs/erofs/dir.c
new file mode 100644
index 000000000000..c52d27bedff4
--- /dev/null
+++ b/fs/erofs/dir.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/dir.c
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "internal.h"
+
+static const unsigned char erofs_filetype_table[EROFS_FT_MAX] = {
+   [EROFS_FT_UNKNOWN]  = DT_UNKNOWN,
+   [EROFS_FT_REG_FILE] = DT_REG,
+   [EROFS_FT_DIR]  = DT_DIR,
+   [EROFS_FT_CHRDEV]   = DT_CHR,
+   [EROFS_FT_BLKDEV]   = DT_BLK,
+   [EROFS_FT_FIFO] = DT_FIFO,
+   [EROFS_FT_SOCK] = DT_SOCK,
+   [EROFS_FT_SYMLINK]  = DT_LNK,
+};
+
+static void debug_one_dentry(unsigned char d_type, const char *de_name,
+unsigned int de_namelen)
+{
+#ifdef CONFIG_EROFS_FS_DEBUG
+   /* since the on-disk name could not have the trailing '\0' */
+   unsigned char dbg_namebuf[EROFS_NAME_LEN + 1];
+
+   memcpy(dbg_namebuf, de_name, de_namelen);
+   dbg_namebuf[de_namelen] = '\0';
+
+   debugln("found dirent %s de_len %u d_type %d", dbg_namebuf,
+   de_namelen, d_type);
+#endif
+}
+
+static int erofs_fill_dentries(struct inode *dir, struct dir_context *ctx,
+  void *dentry_blk, unsigned int *ofs,
+  unsigned int nameoff, unsigned int maxsize)
+{
+   struct erofs_dirent *de = dentry_blk + *ofs;
+   const struct erofs_dirent *end = dentry_blk + nameoff;
+
+   while (de < end) {
+   const char *de_name;
+   unsigned int de_namelen;
+   unsigned char d_type;
+
+   if (de->file_type < EROFS_FT_MAX)
+   d_type = erofs_filetype_table[de->file_type];
+   else
+   d_type = DT_UNKNOWN;
+
+   nameoff = le16_to_cpu(de->nameoff);
+   de_name = (char *)dentry_blk + nameoff;
+
+   /* the last dirent in the block? */
+   if (de + 1 >= end)
+   de_namelen = strnlen(de_name, maxsize - nameoff);
+   else
+   de_namelen = le16_to_cpu(de[1].nameoff) - nameoff;
+
+   /* a corrupted entry is found */
+   if (unlikely(nameoff + de_namelen > maxsize ||
+de_namelen > EROFS_NAME_LEN)) {
+   errln("bogus dirent @ nid %llu", EROFS_V(dir)->nid);
+   DBG_BUGON(1);
+   return -EFSCORRUPTED;
+   }
+
+   debug_one_dentry(d_type, de_name, de_namelen);
+   if (!dir_emit(ctx, de_name, de_namelen,
+ le64_to_cpu(de->nid), d_type))
+   /* stopped by some reason */
+   return 1;
+   ++de;
+   *ofs += sizeof(struct erofs_dirent);
+   }
+   *ofs = maxsize;
+   return 0;
+}
+
+static int erofs_readdir(struct file *f, struct dir_context *ctx)
+{
+   struct inode *dir = file_inode(f);
+   struct address_space *mapping = dir->i_mapping;
+   const size_t dirsize = i_size_read(dir);
+   unsigned int i = ctx->pos / EROFS_BLKSIZ;
+   unsigned int ofs = ctx->pos % EROFS_BLKSIZ;
+   int err = 0;
+   bool initial = true;
+
+   while (ctx->pos < dirsize) {
+   struct page *dentry_page;
+   struct erofs_dirent *de;
+   unsigned int nameoff, maxsize;
+
+   dentry_page = read_mapping_page(mapping, i, NULL);
+   if (IS_ERR(dentry_page))
+   continue;
+
+   de = (struct erofs_dirent *)kmap(dentry_page);
+
+   nameoff = le16_to_cpu(de->nameoff);
+
+   if (unlikely(nameoff < sizeof(struct erofs_dirent) ||
+nameoff >= PAGE_SIZE)) {
+   errln("%s, invalid de[0].nameoff %u @ nid %llu",
+ __func__, nameoff, EROFS_V(dir)->nid);
+   err = -EFSCORRUPTED;
+   goto skip_this;
+   }
+
+   maxsize = min_t(unsigned int,
+   dirsize - ctx->pos + ofs, PAGE_SIZE);
+
+   /* search dirents at the arbitrary position */
+   if (unlikely(initial)) {
+   initial = false;
+
+   ofs = roundup(ofs, sizeof(struct erofs_dirent));
+   if (unlikely(ofs >= nameoff))
+   goto skip_this;
+   }
+
+   err = erofs_fill_dentries(dir, ctx, de, &ofs,
+  

[PATCH v8 08/24] erofs: add namei functions

2019-08-14 Thread Gao Xiang
This commit adds functions that translate names to inodes.
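
(Editorial sketch, not part of the patch.) find_target_dirent() in the
diff below is a binary search that skips the prefix already known to
match at both ends of the current range. The same idea is shown here
over a sorted array of NUL-terminated strings, assuming byte-wise
(strcmp) ordering; cmp_from() and find_name() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare a with b starting at index *matched, which must not exceed
 * their common prefix length. On return, *matched holds the length of
 * the common prefix actually found.
 */
static int cmp_from(const char *a, const char *b, size_t *matched)
{
	size_t i = *matched;

	while (a[i] != '\0' && a[i] == b[i])
		++i;
	*matched = i;
	return (unsigned char)a[i] - (unsigned char)b[i];
}

/* Return the index of key in the sorted array names[0..n-1], or -1. */
static int find_name(const char *const *names, int n, const char *key)
{
	int head = 0, back = n - 1;
	size_t startprfx = 0, endprfx = 0;

	while (head <= back) {
		const int mid = head + (back - head) / 2;
		/*
		 * key shares startprfx bytes with the lower bound and
		 * endprfx bytes with the upper bound, so every name in
		 * between shares at least the smaller of the two.
		 */
		size_t matched = startprfx < endprfx ? startprfx : endprfx;
		const int ret = cmp_from(key, names[mid], &matched);

		if (ret == 0)
			return mid;
		if (ret > 0) {
			head = mid + 1;
			startprfx = matched;
		} else {
			back = mid - 1;
			endprfx = matched;
		}
	}
	return -1;
}
```

The saving matters most for directories whose entries share long common
prefixes, where a plain comparison would rescan the same bytes at every
step.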

Signed-off-by: Gao Xiang 
---
 fs/erofs/namei.c | 249 +++
 1 file changed, 249 insertions(+)
 create mode 100644 fs/erofs/namei.c

diff --git a/fs/erofs/namei.c b/fs/erofs/namei.c
new file mode 100644
index 000000000000..98b9e8cebca5
--- /dev/null
+++ b/fs/erofs/namei.c
@@ -0,0 +1,249 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/namei.c
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "internal.h"
+
+#include 
+
+struct erofs_qstr {
+   const unsigned char *name;
+   const unsigned char *end;
+};
+
+/* based on the end of qn is accurate and it must have the trailing '\0' */
+static inline int dirnamecmp(const struct erofs_qstr *qn,
+const struct erofs_qstr *qd,
+unsigned int *matched)
+{
+   unsigned int i = *matched;
+
+   /*
+* on-disk error, let's only BUG_ON in the debugging mode.
+* otherwise, it will return 1 to just skip the invalid name
+* and go on (in consideration of the lookup performance).
+*/
+   DBG_BUGON(qd->name > qd->end);
+
+   /* qd could not have trailing '\0' */
+   /* However it is absolutely safe if < qd->end */
+   while (qd->name + i < qd->end && qd->name[i] != '\0') {
+   if (qn->name[i] != qd->name[i]) {
+   *matched = i;
+   return qn->name[i] > qd->name[i] ? 1 : -1;
+   }
+   ++i;
+   }
+   *matched = i;
+   /* See comments in __d_alloc on the terminating NUL character */
+   return qn->name[i] == '\0' ? 0 : 1;
+}
+
+#define nameoff_from_disk(off, sz) (le16_to_cpu(off) & ((sz) - 1))
+
+static struct erofs_dirent *find_target_dirent(struct erofs_qstr *name,
+  u8 *data,
+  unsigned int dirblksize,
+  const int ndirents)
+{
+   int head, back;
+   unsigned int startprfx, endprfx;
+   struct erofs_dirent *const de = (struct erofs_dirent *)data;
+
+   /* since the 1st dirent has been evaluated previously */
+   head = 1;
+   back = ndirents - 1;
+   startprfx = endprfx = 0;
+
+   while (head <= back) {
+   const int mid = head + (back - head) / 2;
+   const int nameoff = nameoff_from_disk(de[mid].nameoff,
+ dirblksize);
+   unsigned int matched = min(startprfx, endprfx);
+   struct erofs_qstr dname = {
+   .name = data + nameoff,
+   .end = unlikely(mid >= ndirents - 1) ?
+   data + dirblksize :
+   data + nameoff_from_disk(de[mid + 1].nameoff,
+dirblksize)
+   };
+
+   /* string comparison without already matched prefix */
+   int ret = dirnamecmp(name, &dname, &matched);
+
+   if (unlikely(!ret)) {
+   return de + mid;
+   } else if (ret > 0) {
+   head = mid + 1;
+   startprfx = matched;
+   } else {
+   back = mid - 1;
+   endprfx = matched;
+   }
+   }
+
+   return ERR_PTR(-ENOENT);
+}
+
+static struct page *find_target_block_classic(struct inode *dir,
+ struct erofs_qstr *name,
+ int *_ndirents)
+{
+   unsigned int startprfx, endprfx;
+   int head, back;
+   struct address_space *const mapping = dir->i_mapping;
+   struct page *candidate = ERR_PTR(-ENOENT);
+
+   startprfx = endprfx = 0;
+   head = 0;
+   back = inode_datablocks(dir) - 1;
+
+   while (head <= back) {
+   const int mid = head + (back - head) / 2;
+   struct page *page = read_mapping_page(mapping, mid, NULL);
+
+   if (!IS_ERR(page)) {
+   struct erofs_dirent *de = kmap_atomic(page);
+   const int nameoff = nameoff_from_disk(de->nameoff,
+ EROFS_BLKSIZ);
+   const int ndirents = nameoff / sizeof(*de);
+   int diff;
+   unsigned int matched;
+   struct erofs_qstr dname;
+
+   if (unlikely(!ndirents)) {
+   kunmap_atomic(de);
+   put_page(page);
+   errln("corrupted dir block %d @ nid %llu",
+ mid, EROFS_V(dir)->nid);
+

[PATCH v8 01/24] erofs: add on-disk layout

2019-08-14 Thread Gao Xiang
This commit adds the on-disk layout header file of erofs.
On-disk format is compatible with erofs-staging added in 4.19.

In addition, add EROFS_SUPER_MAGIC_V1 to magic.h.
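
(Editorial sketch, not part of the patch.) The i_advise field packs the
inode version and the data mapping mode into one 16-bit word, using the
EROFS_I_*_BIT and EROFS_I_*_BITS values defined in the diff below. A
userspace illustration of the decoding follows; the helper names are
hypothetical, and the kernel's own accessors live elsewhere in the
series.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of i_advise, copied from the header in this patch. */
#define I_VERSION_BITS		1
#define I_DATA_MAPPING_BITS	3
#define I_VERSION_BIT		0
#define I_DATA_MAPPING_BIT	1

/* Extract a bits-wide field that starts at bit position bit. */
static unsigned int advise_field(uint16_t advise, unsigned int bit,
				 unsigned int bits)
{
	assert(bit + bits <= 16);
	return (advise >> bit) & ((1u << bits) - 1);
}

/* 0 selects the 32-byte v1 inode, 1 the 64-byte v2 inode. */
static unsigned int inode_version(uint16_t advise)
{
	return advise_field(advise, I_VERSION_BIT, I_VERSION_BITS);
}

/* 0..3 are the data mapping modes listed in the header; 4..7 are reserved. */
static unsigned int inode_data_mapping(uint16_t advise)
{
	return advise_field(advise, I_DATA_MAPPING_BIT, I_DATA_MAPPING_BITS);
}
```

For example, an i_advise of 0x5 decodes to version 1 (v2 inode) with
data mapping 2 (plain with inline data).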

Signed-off-by: Gao Xiang 
---
 fs/erofs/erofs_fs.h        | 316 +
 include/uapi/linux/magic.h |   1 +
 2 files changed, 317 insertions(+)
 create mode 100644 fs/erofs/erofs_fs.h

diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
new file mode 100644
index 000000000000..230fcba1099d
--- /dev/null
+++ b/fs/erofs/erofs_fs.h
@@ -0,0 +1,316 @@
+/* SPDX-License-Identifier: GPL-2.0-only OR Apache-2.0 */
+/*
+ * linux/fs/erofs/erofs_fs.h
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#ifndef __EROFS_FS_H
+#define __EROFS_FS_H
+
+/* Enhanced(Extended) ROM File System */
+#define EROFS_SUPER_OFFSET  1024
+
+/*
+ * Any bits that aren't in EROFS_ALL_REQUIREMENTS should be
+ * incompatible with this kernel version.
+ */
+#define EROFS_REQUIREMENT_LZ4_0PADDING 0x0001
+#define EROFS_ALL_REQUIREMENTS 0
+
+struct erofs_super_block {
+/*  0 */__le32 magic;   /* in the little endian */
+/*  4 */__le32 checksum;/* crc32c(super_block) */
+/*  8 */__le32 features;/* (aka. feature_compat) */
+/* 12 */__u8 blkszbits; /* support block_size == PAGE_SIZE only */
+/* 13 */__u8 reserved;
+
+/* 14 */__le16 root_nid;
+/* 16 */__le64 inos;/* total valid ino # (== f_files - f_favail) */
+
+/* 24 */__le64 build_time;  /* inode v1 time derivation */
+/* 32 */__le32 build_time_nsec;
+/* 36 */__le32 blocks;  /* used for statfs */
+/* 40 */__le32 meta_blkaddr;
+/* 44 */__le32 xattr_blkaddr;
+/* 48 */__u8 uuid[16];  /* 128-bit uuid for volume */
+/* 64 */__u8 volume_name[16];   /* volume name */
+/* 80 */__le32 requirements;/* (aka. feature_incompat) */
+
+/* 84 */__u8 reserved2[44];
+} __packed; /* 128 bytes */
+
+/*
+ * erofs inode data mapping:
+ * 0 - inode plain without inline data A:
+ * inode, [xattrs], ... | ... | no-holed data
+ * 1 - inode VLE compression B (legacy):
+ * inode, [xattrs], extents ... | ...
+ * 2 - inode plain with inline data C:
+ * inode, [xattrs], last_inline_data, ... | ... | no-holed data
+ * 3 - inode compression D:
+ * inode, [xattrs], map_header, extents ... | ...
+ * 4~7 - reserved
+ */
+enum {
+   EROFS_INODE_FLAT_PLAIN,
+   EROFS_INODE_FLAT_COMPRESSION_LEGACY,
+   EROFS_INODE_FLAT_INLINE,
+   EROFS_INODE_FLAT_COMPRESSION,
+   EROFS_INODE_LAYOUT_MAX
+};
+
+static inline bool erofs_inode_is_data_compressed(unsigned int datamode)
+{
+   if (datamode == EROFS_INODE_FLAT_COMPRESSION)
+   return true;
+   return datamode == EROFS_INODE_FLAT_COMPRESSION_LEGACY;
+}
+
+/* bit definitions of inode i_advise */
+#define EROFS_I_VERSION_BITS            1
+#define EROFS_I_DATA_MAPPING_BITS       3
+
+#define EROFS_I_VERSION_BIT             0
+#define EROFS_I_DATA_MAPPING_BIT        1
+
+struct erofs_inode_v1 {
+/*  0 */__le16 i_advise;
+
+/* 1 header + n-1 * 4 bytes inline xattr to keep continuity */
+/*  2 */__le16 i_xattr_icount;
+/*  4 */__le16 i_mode;
+/*  6 */__le16 i_nlink;
+/*  8 */__le32 i_size;
+/* 12 */__le32 i_reserved;
+/* 16 */union {
+   /* file total compressed blocks for data mapping 1 */
+   __le32 compressed_blocks;
+   __le32 raw_blkaddr;
+
+   /* for device files, used to indicate old/new device # */
+   __le32 rdev;
+   } i_u __packed;
+/* 20 */__le32 i_ino;   /* only used for 32-bit stat compatibility */
+/* 24 */__le16 i_uid;
+/* 26 */__le16 i_gid;
+/* 28 */__le32 i_reserved2;
+} __packed;
+
+/* 32 bytes on-disk inode */
+#define EROFS_INODE_LAYOUT_V1   0
+/* 64 bytes on-disk inode */
+#define EROFS_INODE_LAYOUT_V2   1
+
+struct erofs_inode_v2 {
+/*  0 */__le16 i_advise;
+
+/* 1 header + n-1 * 4 bytes inline xattr to keep continuity */
+/*  2 */__le16 i_xattr_icount;
+/*  4 */__le16 i_mode;
+/*  6 */__le16 i_reserved;
+/*  8 */__le64 i_size;
+/* 16 */union {
+   /* file total compressed blocks for data mapping 1 */
+   __le32 compressed_blocks;
+   __le32 raw_blkaddr;
+
+   /* for device files, used to indicate old/new device # */
+   __le32 rdev;
+   } i_u __packed;
+
+   /* only used for 32-bit stat compatibility */
+/* 20 */__le32 i_ino;
+
+/* 24 */__le32 i_uid;
+/* 28 */__le32 i_gid;
+/* 32 */__le64 i_ctime;
+/* 40 */__le32 i_ctime_nsec;
+/* 44 */__le32 i_nlink;
+/* 48 */__u8   i_reserved2[16];
+} __packed; /* 64 bytes */
+
+#define EROFS_MAX_SHARED_XATTRS (128)
+/* h_shared_count between 129 ... 255 are special # */
+#define EROFS_SHARED_XATTR_EXTENT   (255)
+
+/*
+ * inline xattrs (n == i_xattr_icount):
+ * erofs_xattr_ibody_header(1) + (n - 1) * 4 bytes
+ *  12 bytes   /   \

[PATCH v8 14/24] erofs: introduce superblock registration

2019-08-14 Thread Gao Xiang
In order to introduce a shrinker solution for erofs,
let's first manage all mounted erofs instances.
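
(Editorial sketch, not part of the patch.) The registration scheme is a
global list of mounted instances guarded by one lock. The userspace
illustration below substitutes a pthread mutex for the kernel spinlock
and a hand-rolled doubly linked list for struct list_head; all names in
it are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct list_node {
	struct list_node *prev, *next;
};

/* Stand-in for struct erofs_sb_info; only the list linkage matters. */
struct sb_info {
	struct list_node list;
	int id;
};

/* protects sb_list */
static pthread_mutex_t sb_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct list_node sb_list = { &sb_list, &sb_list };

/* Add a mounted instance at the head of the global list. */
static void sb_register(struct sb_info *sbi)
{
	pthread_mutex_lock(&sb_list_lock);
	sbi->list.next = sb_list.next;
	sbi->list.prev = &sb_list;
	sb_list.next->prev = &sbi->list;
	sb_list.next = &sbi->list;
	pthread_mutex_unlock(&sb_list_lock);
}

/* Remove a previously registered instance from the global list. */
static void sb_unregister(struct sb_info *sbi)
{
	pthread_mutex_lock(&sb_list_lock);
	sbi->list.prev->next = sbi->list.next;
	sbi->list.next->prev = sbi->list.prev;
	sbi->list.prev = sbi->list.next = NULL;
	pthread_mutex_unlock(&sb_list_lock);
}

/* Count registered instances, as a shrinker would walk them. */
static size_t sb_count(void)
{
	const struct list_node *p;
	size_t n = 0;

	pthread_mutex_lock(&sb_list_lock);
	for (p = sb_list.next; p != &sb_list; p = p->next)
		++n;
	pthread_mutex_unlock(&sb_list_lock);
	return n;
}
```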

Signed-off-by: Gao Xiang 
---
 fs/erofs/Makefile   |  2 +-
 fs/erofs/internal.h | 13 +
 fs/erofs/super.c    |  9 +
 fs/erofs/utils.c    | 32
 4 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 fs/erofs/utils.c

diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 481a966caf06..930770be124f 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -5,7 +5,7 @@ EROFS_VERSION = "1.0"
 ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
 
 obj-$(CONFIG_EROFS_FS) += erofs.o
-erofs-objs := super.o inode.o data.o namei.o dir.o
+erofs-objs := super.o inode.o data.o namei.o dir.o utils.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
 erofs-$(CONFIG_EROFS_FS_ZIP) += zmap.o
 
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 8432f488409d..62f1e3ffe0a2 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -60,6 +60,10 @@ typedef u64 erofs_off_t;
 typedef u32 erofs_blk_t;
 
 struct erofs_sb_info {
+#ifdef CONFIG_EROFS_FS_ZIP
+   /* list for all registered superblocks, mainly for shrinker */
+   struct list_head list;
+#endif /* CONFIG_EROFS_FS_ZIP */
u32 blocks;
u32 meta_blkaddr;
 #ifdef CONFIG_EROFS_FS_XATTR
@@ -400,6 +404,15 @@ int erofs_namei(struct inode *dir, struct qstr *name,
 /* dir.c */
 extern const struct file_operations erofs_dir_fops;
 
+/* utils.c */
+#ifdef CONFIG_EROFS_FS_ZIP
+void erofs_shrinker_register(struct super_block *sb);
+void erofs_shrinker_unregister(struct super_block *sb);
+#else
+static inline void erofs_shrinker_register(struct super_block *sb) {}
+static inline void erofs_shrinker_unregister(struct super_block *sb) {}
+#endif /* !CONFIG_EROFS_FS_ZIP */
+
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
 
 #endif /* __EROFS_INTERNAL_H */
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 561ae6f7fe13..2eca3b25db75 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -354,6 +354,8 @@ static int erofs_fill_super(struct super_block *sb, void *data, int silent)
if (unlikely(!sb->s_root))
return -ENOMEM;
 
+   erofs_shrinker_register(sb);
+
if (!silent)
infoln("mounted on %s with opts: %s.", sb->s_id, (char *)data);
return 0;
@@ -385,6 +387,12 @@ static void erofs_kill_sb(struct super_block *sb)
sb->s_fs_info = NULL;
 }
 
+/* called when ->s_root is non-NULL */
+static void erofs_put_super(struct super_block *sb)
+{
+   erofs_shrinker_unregister(sb);
+}
+
 static struct file_system_type erofs_fs_type = {
.owner  = THIS_MODULE,
.name   = "erofs",
@@ -496,6 +504,7 @@ static int erofs_remount(struct super_block *sb, int *flags, char *data)
 }
 
 const struct super_operations erofs_sops = {
+   .put_super = erofs_put_super,
.alloc_inode = alloc_inode,
.free_inode = free_inode,
.statfs = erofs_statfs,
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
new file mode 100644
index 000000000000..791b2df1f761
--- /dev/null
+++ b/fs/erofs/utils.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/utils.c
+ *
+ * Copyright (C) 2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "internal.h"
+
+#ifdef CONFIG_EROFS_FS_ZIP
+/* protects the mounted 'erofs_sb_list' */
+static DEFINE_SPINLOCK(erofs_sb_list_lock);
+static LIST_HEAD(erofs_sb_list);
+
+void erofs_shrinker_register(struct super_block *sb)
+{
+   struct erofs_sb_info *sbi = EROFS_SB(sb);
+
+   spin_lock(&erofs_sb_list_lock);
+   list_add(&sbi->list, &erofs_sb_list);
+   spin_unlock(&erofs_sb_list_lock);
+}
+
+void erofs_shrinker_unregister(struct super_block *sb)
+{
+   spin_lock(&erofs_sb_list_lock);
+   list_del(&EROFS_SB(sb)->list);
+   spin_unlock(&erofs_sb_list_lock);
+}
+#endif /* !CONFIG_EROFS_FS_ZIP */
+
-- 
2.17.1



[PATCH v8 13/24] erofs: add compression indexes support

2019-08-14 Thread Gao Xiang
This patch adds EROFS compression indexes support,
including legacy and compacted 2/4B indexes.

In addition, it introduces an iterable L2P mapping
operation called 'z_erofs_map_blocks_iter'.

Compared with 'erofs_map_blocks', it avoids a number
of redundant 'release and regrab' processes if they
request the same meta page.
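
(Editorial sketch, not part of the patch.) The benefit of an iterable
mapping operation is that the caller's map structure keeps holding the
last meta page, so consecutive lookups that land in the same page do
not release and regrab it. The userspace simulation below counts page
grabs only; the table contents, ENTRIES_PER_PAGE and every name in it
are made up for illustration and do not reflect the zmap.c index
format.

```c
#include <assert.h>
#include <stddef.h>

#define ENTRIES_PER_PAGE 4	/* assumption, for illustration only */
#define NR_ENTRIES 8

/* Made-up index: physical block number for each logical cluster. */
static const int meta_table[NR_ENTRIES] = {
	100, 101, 102, 103, 200, 201, 202, 203
};

static int page_loads;	/* counts simulated meta page grabs */

struct map_iter {
	long cached_page;	/* index of the held meta page, -1 if none */
};

/* Grab the meta page only if the iterator does not already hold it. */
static void hold_meta_page(struct map_iter *it, long page)
{
	if (it->cached_page == page)
		return;		/* same meta page: no release and regrab */
	it->cached_page = page;
	++page_loads;
}

/* Look up one logical cluster, reusing the held meta page if possible. */
static int map_lookup(struct map_iter *it, unsigned int lcn)
{
	assert(lcn < NR_ENTRIES);
	hold_meta_page(it, lcn / ENTRIES_PER_PAGE);
	return meta_table[lcn];
}

/* Drop the held page once the whole iteration is finished. */
static void map_iter_put(struct map_iter *it)
{
	it->cached_page = -1;
}
```

Walking all eight clusters in order grabs two pages in this sketch,
whereas a lookup that dropped its page after every call would grab
eight.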

Signed-off-by: Gao Xiang 
---
 fs/erofs/Kconfig |   9 +
 fs/erofs/Makefile|   1 +
 fs/erofs/data.c  |  10 +-
 fs/erofs/inode.c |   2 +-
 fs/erofs/internal.h  |  35 ++-
 fs/erofs/zmap.c  | 461 +++
 include/trace/events/erofs.h |  17 +-
 7 files changed, 530 insertions(+), 5 deletions(-)
 create mode 100644 fs/erofs/zmap.c

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index c5e7c5ae026e..a475fbebb831 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -72,3 +72,12 @@ config EROFS_FS_SECURITY
 
  If you are not using a security module, say N.
 
+config EROFS_FS_ZIP
+   bool "EROFS Data Compression Support"
+   depends on EROFS_FS
+   select LZ4_DECOMPRESS
+   help
+ Enable fixed-sized output compression for EROFS.
+
+ If you don't want to enable compression feature, say N.
+
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 190a73083f23..481a966caf06 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -7,4 +7,5 @@ ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
 obj-$(CONFIG_EROFS_FS) += erofs.o
 erofs-objs := super.o inode.o data.o namei.o dir.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
+erofs-$(CONFIG_EROFS_FS_ZIP) += zmap.o
 
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 3d8f1511cacb..0850b4725196 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -172,9 +172,15 @@ static int erofs_map_blocks_flatmode(struct inode *inode,
 int erofs_map_blocks(struct inode *inode,
 struct erofs_map_blocks *map, int flags)
 {
-   if (is_inode_layout_compression(inode))
-   return -EOPNOTSUPP;
+   if (unlikely(is_inode_layout_compression(inode))) {
+   int err = z_erofs_map_blocks_iter(inode, map, flags);
 
+   if (map->mpage) {
+   put_page(map->mpage);
+   map->mpage = NULL;
+   }
+   return err;
+   }
return erofs_map_blocks_flatmode(inode, map, flags);
 }
 
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index d4e5de383435..7a9383a2b0a5 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -212,7 +212,7 @@ static int fill_inode(struct inode *inode, int isdir)
}
 
if (is_inode_layout_compression(inode)) {
-   err = -EOPNOTSUPP;
+   err = z_erofs_fill_inode(inode);
goto out_unlock;
}
 
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f42cdda2eebc..8432f488409d 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -173,9 +173,12 @@ static inline erofs_off_t iloc(struct erofs_sb_info *sbi, erofs_nid_t nid)
 
 /* atomic flag definitions */
 #define EROFS_V_EA_INITED_BIT  0
+#define EROFS_V_Z_INITED_BIT   1
 
 /* bitlock definitions (arranged in reverse order) */
 #define EROFS_V_BL_XATTR_BIT   (BITS_PER_LONG - 1)
+#define EROFS_V_BL_Z_BIT   (BITS_PER_LONG - 2)
+
 struct erofs_vnode {
erofs_nid_t nid;
 
@@ -189,7 +192,17 @@ struct erofs_vnode {
unsigned int xattr_shared_count;
unsigned int *xattr_shared_xattrs;
 
-   erofs_blk_t raw_blkaddr;
+   union {
+   erofs_blk_t raw_blkaddr;
+#ifdef CONFIG_EROFS_FS_ZIP
+   struct {
+   unsigned short z_advise;
+   unsigned char  z_algorithmtype[2];
+   unsigned char  z_logical_clusterbits;
+   unsigned char  z_physical_clusterbits[2];
+   };
+#endif /* CONFIG_EROFS_FS_ZIP */
+   };
/* the corresponding vfs inode */
struct inode vfs_inode;
 };
@@ -262,6 +275,10 @@ enum {
 #define EROFS_MAP_MAPPED   (1 << BH_Mapped)
 /* Located in metadata (could be copied from bd_inode) */
 #define EROFS_MAP_META (1 << BH_Meta)
+/* The extent has been compressed */
+#define EROFS_MAP_ZIPPED   (1 << BH_Zipped)
+/* The length of extent is full */
+#define EROFS_MAP_FULL_MAPPED  (1 << BH_FullMapped)
 
 struct erofs_map_blocks {
erofs_off_t m_pa, m_la;
@@ -275,6 +292,22 @@ struct erofs_map_blocks {
 /* Flags used by erofs_map_blocks() */
 #define EROFS_GET_BLOCKS_RAW    0x0001
 
+/* zmap.c */
+#ifdef CONFIG_EROFS_FS_ZIP
+int z_erofs_fill_inode(struct inode *inode);
+int z_erofs_map_blocks_iter(struct inode *inode,
+   struct erofs_map_blocks *map,
+   int flags);
+#else
+static inline int z_erofs_fill_inode(struct inode *inode) { return -EOPNOTSUPP; }
+static inline int z_erofs_map_blocks_iter(struct inode

[PATCH v8 05/24] erofs: add inode operations

2019-08-14 Thread Gao Xiang
This adds core functions to get and read an inode.
It adds statx support as well.
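
(Editorial sketch, not part of the patch.) At the end of read_inode()
in the diff below, i_blocks is reported in 512-byte sectors: derived
from the file size for uncompressed inodes, or from the on-disk
compressed block count otherwise. The userspace illustration assumes
4 KiB blocks, hence a LOG_SECTORS_PER_BLOCK of 3; inode_sectors() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define BLKSIZ			4096u	/* assumption: 4 KiB blocks */
#define LOG_SECTORS_PER_BLOCK	3	/* 4096 / 512 == 1 << 3 */

static_assert((512u << LOG_SECTORS_PER_BLOCK) == BLKSIZ,
	      "sector shift must match the block size");

/* Number of 512-byte sectors to report for an inode. */
static uint64_t inode_sectors(uint64_t i_size, uint32_t nblks)
{
	if (!nblks)
		/* uncompressed: round the size up to whole blocks */
		return ((i_size + BLKSIZ - 1) / BLKSIZ * BLKSIZ) >> 9;
	/* compressed: use the block count recorded on disk */
	return (uint64_t)nblks << LOG_SECTORS_PER_BLOCK;
}
```

A 5000-byte uncompressed file occupies two blocks and is reported as 16
sectors; the same file compressed into one block is reported as 8.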

Signed-off-by: Gao Xiang 
---
 fs/erofs/inode.c | 293 +++
 1 file changed, 293 insertions(+)
 create mode 100644 fs/erofs/inode.c

diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
new file mode 100644
index 000000000000..9960edaf6f7a
--- /dev/null
+++ b/fs/erofs/inode.c
@@ -0,0 +1,293 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/inode.c
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "internal.h"
+
+#include 
+
+/* no locking */
+static int read_inode(struct inode *inode, void *data)
+{
+   struct erofs_vnode *vi = EROFS_V(inode);
+   struct erofs_inode_v1 *v1 = data;
+   const unsigned int advise = le16_to_cpu(v1->i_advise);
+   erofs_blk_t nblks = 0;
+
+   vi->datamode = __inode_data_mapping(advise);
+
+   if (unlikely(vi->datamode >= EROFS_INODE_LAYOUT_MAX)) {
+   errln("unsupported data mapping %u of nid %llu",
+ vi->datamode, vi->nid);
+   DBG_BUGON(1);
+   return -EOPNOTSUPP;
+   }
+
+   if (__inode_version(advise) == EROFS_INODE_LAYOUT_V2) {
+   struct erofs_inode_v2 *v2 = data;
+
+   vi->inode_isize = sizeof(struct erofs_inode_v2);
+   vi->xattr_isize = ondisk_xattr_ibody_size(v2->i_xattr_icount);
+
+   inode->i_mode = le16_to_cpu(v2->i_mode);
+   vi->raw_blkaddr = le32_to_cpu(v2->i_u.raw_blkaddr);
+
+   i_uid_write(inode, le32_to_cpu(v2->i_uid));
+   i_gid_write(inode, le32_to_cpu(v2->i_gid));
+   set_nlink(inode, le32_to_cpu(v2->i_nlink));
+
+   /* ns timestamp */
+   inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
+   le64_to_cpu(v2->i_ctime);
+   inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
+   le32_to_cpu(v2->i_ctime_nsec);
+
+   inode->i_size = le64_to_cpu(v2->i_size);
+
+   /* total blocks for compressed files */
+   if (is_inode_layout_compression(inode))
+   nblks = le32_to_cpu(v2->i_u.compressed_blocks);
+   } else if (__inode_version(advise) == EROFS_INODE_LAYOUT_V1) {
+   struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
+
+   vi->inode_isize = sizeof(struct erofs_inode_v1);
+   vi->xattr_isize = ondisk_xattr_ibody_size(v1->i_xattr_icount);
+
+   inode->i_mode = le16_to_cpu(v1->i_mode);
+   vi->raw_blkaddr = le32_to_cpu(v1->i_u.raw_blkaddr);
+
+   i_uid_write(inode, le16_to_cpu(v1->i_uid));
+   i_gid_write(inode, le16_to_cpu(v1->i_gid));
+   set_nlink(inode, le16_to_cpu(v1->i_nlink));
+
+   /* use build time to derive all file time */
+   inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
+   sbi->build_time;
+   inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
+   sbi->build_time_nsec;
+
+   inode->i_size = le32_to_cpu(v1->i_size);
+   if (is_inode_layout_compression(inode))
+   nblks = le32_to_cpu(v1->i_u.compressed_blocks);
+   } else {
+   errln("unsupported on-disk inode version %u of nid %llu",
+ __inode_version(advise), vi->nid);
+   DBG_BUGON(1);
+   return -EOPNOTSUPP;
+   }
+
+   if (!nblks)
+   /* measure inode.i_blocks as generic filesystems */
+   inode->i_blocks = roundup(inode->i_size, EROFS_BLKSIZ) >> 9;
+   else
+   inode->i_blocks = nblks << LOG_SECTORS_PER_BLOCK;
+   return 0;
+}
+
+/*
+ * try_lock can be required since locking order is:
+ *   file data(fs_inode)
+ *meta(bd_inode)
+ * but the majority of the callers is "iget",
+ * in that case we are pretty sure no deadlock since
+ * no data operations exist. However I tend to
+ * try_lock since it takes no much overhead and
+ * will success immediately.
+ */
+static int fill_inline_data(struct inode *inode, void *data,
+   unsigned int m_pofs)
+{
+   struct erofs_vnode *vi = EROFS_V(inode);
+   struct erofs_sb_info *sbi = EROFS_I_SB(inode);
+
+   /* should be inode inline C */
+   if (!is_inode_flat_inline(inode))
+   return 0;
+
+   /* fast symlink (following ext4) */
+   if (S_ISLNK(inode->i_mode) && inode->i_size < PAGE_SIZE) {
+   char *lnk = erofs_kmalloc(sbi, inode->i_size + 1, GFP_KERNEL);
+
+   if (unlikely(!lnk))
+   return -ENOMEM;
+
+   m_pofs += vi->inode_isize + vi->xattr_isize;
+
+   /* inline symlink data shouldn't across page boundary as well */
+   if (unlikely(m_pofs + inode->i_size > PAGE_SIZE)

[PATCH v8 10/24] erofs: update Kconfig and Makefile

2019-08-14 Thread Gao Xiang
This commit adds Makefile and Kconfig for erofs, and
updates Makefile and Kconfig files in the fs directory.

Signed-off-by: Gao Xiang 
---
 fs/Kconfig|  1 +
 fs/Makefile   |  1 +
 fs/erofs/Kconfig  | 36 
 fs/erofs/Makefile |  9 +
 4 files changed, 47 insertions(+)
 create mode 100644 fs/erofs/Kconfig
 create mode 100644 fs/erofs/Makefile

diff --git a/fs/Kconfig b/fs/Kconfig
index bfb1c6095c7a..669d46550e6d 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -261,6 +261,7 @@ source "fs/romfs/Kconfig"
 source "fs/pstore/Kconfig"
 source "fs/sysv/Kconfig"
 source "fs/ufs/Kconfig"
+source "fs/erofs/Kconfig"
 
 endif # MISC_FILESYSTEMS
 
diff --git a/fs/Makefile b/fs/Makefile
index d60089fd689b..b2e4973a0bea 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -130,3 +130,4 @@ obj-$(CONFIG_F2FS_FS)   += f2fs/
 obj-$(CONFIG_CEPH_FS)  += ceph/
 obj-$(CONFIG_PSTORE)   += pstore/
 obj-$(CONFIG_EFIVAR_FS)+= efivarfs/
+obj-$(CONFIG_EROFS_FS) += erofs/
diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
new file mode 100644
index ..98f05043448a
--- /dev/null
+++ b/fs/erofs/Kconfig
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config EROFS_FS
+   tristate "EROFS filesystem support"
+   depends on BLOCK
+   help
+ EROFS (Enhanced Read-Only File System) is a lightweight
+ read-only file system with modern designs (e.g. page-sized
+ blocks, inline xattrs/data, etc.) for scenarios which have
+ high-performance read-only requirements, e.g. Android OS
+ for mobile phones and LIVECDs.
+
+ It also provides fixed-sized output compression support,
+ which improves storage density while keeping relatively high
+ compression ratios, and which helps achieve high
+ performance on embedded devices with limited memory.
+
+ If unsure, say N.
+
+config EROFS_FS_DEBUG
+   bool "EROFS debugging feature"
+   depends on EROFS_FS
+   help
+ Print debugging messages and enable more BUG_ONs which check
+ filesystem consistency and find potential issues aggressively,
+ which can be used for Android eng build, for example.
+
+ For daily use, say N.
+
+config EROFS_FAULT_INJECTION
+   bool "EROFS fault injection facility"
+   depends on EROFS_FS
+   help
+ Test EROFS by injecting faults such as ENOMEM, EIO, and so on.
+ If unsure, say N.
+
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
new file mode 100644
index ..c3f4e549ef90
--- /dev/null
+++ b/fs/erofs/Makefile
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+EROFS_VERSION = "1.0"
+
+ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
+
+obj-$(CONFIG_EROFS_FS) += erofs.o
+erofs-objs := super.o inode.o data.o namei.o dir.o
+
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v8 17/24] erofs: introduce per-CPU buffers implementation

2019-08-14 Thread Gao Xiang
This patch introduces per-CPU buffers for use by
the upcoming generic decompression framework.

Note that I tried to use the in-kernel per-CPU buffer or
per-CPU page approaches to clean this up further; however,
a noticeable performance regression (about 2% for
sequential read) was observed.

Let's leave it as-is for now.
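
For illustration only (not part of the patch): a minimal userspace sketch of
the indexing scheme used by the per-CPU buffers. The sketch_* names, the
explicit cpu argument and the bounds checks are assumptions made here; the
patch itself uses smp_processor_id() with preemption disabled and performs no
bounds checking.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_NR_PAGES  2u	/* stands in for EROFS_PCPUBUF_NR_PAGES */
#define SKETCH_NR_CPUS   4u	/* stands in for NR_CPUS */

/* one statically sized buffer per CPU, as in the patch */
struct sketch_pcpubuf {
	uint8_t data[SKETCH_PAGE_SIZE * SKETCH_NR_PAGES];
};

static struct sketch_pcpubuf sketch_bufs[SKETCH_NR_CPUS];

/* return the start of page 'pagenr' inside the buffer owned by 'cpu' */
static void *sketch_get_pcpubuf(unsigned int cpu, unsigned int pagenr)
{
	if (cpu >= SKETCH_NR_CPUS || pagenr >= SKETCH_NR_PAGES)
		return NULL;	/* assumption: the kernel code has no such check */
	return &sketch_bufs[cpu].data[(size_t)pagenr * SKETCH_PAGE_SIZE];
}
```

In the patch, the get/put pair brackets a preempt-disabled region, so the
caller keeps running on the CPU whose buffer it was handed.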

Signed-off-by: Gao Xiang 
---
 fs/erofs/Kconfig| 14 ++
 fs/erofs/internal.h | 21 +
 fs/erofs/utils.c| 12 
 3 files changed, 47 insertions(+)

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index a475fbebb831..5f8787c0cf89 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -81,3 +81,17 @@ config EROFS_FS_ZIP
 
  If you don't want to enable compression feature, say N.
 
+config EROFS_FS_CLUSTER_PAGE_LIMIT
+   int "EROFS Cluster Pages Hard Limit"
+   depends on EROFS_FS_ZIP
+   range 1 256
+   default "1"
+   help
+ Indicates maximum # of pages of a compressed
+ physical cluster.
+
+ For example, if files in an image were compressed
+ into 8k units, the hard limit should not be configured
+ to less than 2. Otherwise, the image will be refused
+ to mount on this kernel.
+
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 6a2407fb3013..3222947c9bab 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -222,6 +222,12 @@ static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
return v;
 }
 #endif /* !CONFIG_SMP */
+
+/* hard limit of pages per compressed cluster */
+#define Z_EROFS_CLUSTER_MAX_PAGES   (CONFIG_EROFS_FS_CLUSTER_PAGE_LIMIT)
+#define EROFS_PCPUBUF_NR_PAGES  Z_EROFS_CLUSTER_MAX_PAGES
+#else
+#define EROFS_PCPUBUF_NR_PAGES  0
 #endif /* !CONFIG_EROFS_FS_ZIP */
 
 /* we strictly follow PAGE_SIZE and no buffer head yet */
@@ -482,6 +488,21 @@ int erofs_namei(struct inode *dir, struct qstr *name,
 extern const struct file_operations erofs_dir_fops;
 
 /* utils.c */
+#if (EROFS_PCPUBUF_NR_PAGES > 0)
+void *erofs_get_pcpubuf(unsigned int pagenr);
+#define erofs_put_pcpubuf(buf) do { \
+   (void)&(buf);   \
+   preempt_enable();   \
+} while (0)
+#else
+static inline void *erofs_get_pcpubuf(unsigned int pagenr)
+{
+   return ERR_PTR(-EOPNOTSUPP);
+}
+
+#define erofs_put_pcpubuf(buf) do {} while (0)
+#endif
+
 #ifdef CONFIG_EROFS_FS_ZIP
 int erofs_workgroup_put(struct erofs_workgroup *grp);
 struct erofs_workgroup *erofs_find_workgroup(struct super_block *sb,
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index 628178261056..f3eed9af24d6 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -9,6 +9,18 @@
 #include "internal.h"
 #include 
 
+#if (EROFS_PCPUBUF_NR_PAGES > 0)
+static struct {
+   u8 data[PAGE_SIZE * EROFS_PCPUBUF_NR_PAGES];
+} cacheline_aligned_in_smp erofs_pcpubuf[NR_CPUS];
+
+void *erofs_get_pcpubuf(unsigned int pagenr)
+{
+   preempt_disable();
+   return &erofs_pcpubuf[smp_processor_id()].data[pagenr * PAGE_SIZE];
+}
+#endif
+
 #ifdef CONFIG_EROFS_FS_ZIP
 /* global shrink count (for all mounted EROFS instances) */
 static atomic_long_t erofs_global_shrink_cnt;
-- 
2.17.1


[PATCH v8 06/24] erofs: support special inode

2019-08-14 Thread Gao Xiang
This patch adds support for special inodes, such as
block device, character device, socket and pipe inodes.
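
For illustration only (not part of the patch): a userspace sketch of how the
32-bit on-disk rdev value is split into major and minor numbers. It mirrors
the bit layout of the kernel's new_decode_dev(); the sketch_* names and the
struct return type are assumptions made here for readability.

```c
#include <stdint.h>

struct sketch_dev {
	unsigned int major;
	unsigned int minor;
};

/*
 * Layout of the encoded value:
 *   bits  0..7   low 8 bits of the minor number
 *   bits  8..19  major number (12 bits)
 *   bits 20..31  high 12 bits of the minor number
 */
static struct sketch_dev sketch_decode_dev(uint32_t dev)
{
	struct sketch_dev d;

	d.major = (dev & 0xfff00) >> 8;
	d.minor = (dev & 0xff) | ((dev >> 12) & 0xfff00);
	return d;
}
```

For example, the encoded value 0x801 decodes to major 8, minor 1.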

Signed-off-by: Gao Xiang 
---
 fs/erofs/inode.c | 32 ++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index 9960edaf6f7a..f55193856359 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -34,7 +34,16 @@ static int read_inode(struct inode *inode, void *data)
vi->xattr_isize = ondisk_xattr_ibody_size(v2->i_xattr_icount);
 
inode->i_mode = le16_to_cpu(v2->i_mode);
-   vi->raw_blkaddr = le32_to_cpu(v2->i_u.raw_blkaddr);
+   if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+   S_ISLNK(inode->i_mode))
+   vi->raw_blkaddr = le32_to_cpu(v2->i_u.raw_blkaddr);
+   else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode))
+   inode->i_rdev =
+   new_decode_dev(le32_to_cpu(v2->i_u.rdev));
+   else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode))
+   inode->i_rdev = 0;
+   else
+   goto bogusimode;
 
i_uid_write(inode, le32_to_cpu(v2->i_uid));
i_gid_write(inode, le32_to_cpu(v2->i_gid));
@@ -58,7 +67,16 @@ static int read_inode(struct inode *inode, void *data)
vi->xattr_isize = ondisk_xattr_ibody_size(v1->i_xattr_icount);
 
inode->i_mode = le16_to_cpu(v1->i_mode);
-   vi->raw_blkaddr = le32_to_cpu(v1->i_u.raw_blkaddr);
+   if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+   S_ISLNK(inode->i_mode))
+   vi->raw_blkaddr = le32_to_cpu(v1->i_u.raw_blkaddr);
+   else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode))
+   inode->i_rdev =
+   new_decode_dev(le32_to_cpu(v1->i_u.rdev));
+   else if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode))
+   inode->i_rdev = 0;
+   else
+   goto bogusimode;
 
i_uid_write(inode, le16_to_cpu(v1->i_uid));
i_gid_write(inode, le16_to_cpu(v1->i_gid));
@@ -86,6 +104,11 @@ static int read_inode(struct inode *inode, void *data)
else
inode->i_blocks = nblks << LOG_SECTORS_PER_BLOCK;
return 0;
+
+bogusimode:
+   errln("bogus i_mode (%o) @ nid %llu", inode->i_mode, vi->nid);
+   DBG_BUGON(1);
+   return -EFSCORRUPTED;
 }
 
 /*
@@ -178,6 +201,11 @@ static int fill_inode(struct inode *inode, int isdir)
/* by default, page_get_link is used for symlink */
inode->i_op = &erofs_symlink_iops;
inode_nohighmem(inode);
+   } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
+   S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
+   inode->i_op = &erofs_generic_iops;
+   init_special_inode(inode, inode->i_mode, inode->i_rdev);
+   goto out_unlock;
} else {
err = -EFSCORRUPTED;
goto out_unlock;
-- 
2.17.1


[PATCH v8 02/24] erofs: add erofs in-memory stuffs

2019-08-14 Thread Gao Xiang
 - erofs_sb_info:
   contains erofs-specific in-memory information.

 - erofs_vnode:
   contains vfs_inode and other fs-specific information.
   As with the super block, only one in-memory definition exists.

 - erofs_map_blocks:
   logical-to-physical block mapping, used by erofs_map_blocks().

Signed-off-by: Gao Xiang 
---
 fs/erofs/internal.h | 357 
 1 file changed, 357 insertions(+)
 create mode 100644 fs/erofs/internal.h

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
new file mode 100644
index ..5b946eabc239
--- /dev/null
+++ b/fs/erofs/internal.h
@@ -0,0 +1,357 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * linux/fs/erofs/internal.h
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#ifndef __EROFS_INTERNAL_H
+#define __EROFS_INTERNAL_H
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "erofs_fs.h"
+
+/* redefine pr_fmt "erofs: " */
+#undef pr_fmt
+#define pr_fmt(fmt) "erofs: " fmt
+
+#define errln(x, ...)   pr_err(x "\n", ##__VA_ARGS__)
+#define infoln(x, ...)  pr_info(x "\n", ##__VA_ARGS__)
+#ifdef CONFIG_EROFS_FS_DEBUG
+#define debugln(x, ...) pr_debug(x "\n", ##__VA_ARGS__)
+#define DBG_BUGON   BUG_ON
+#else
+#define debugln(x, ...) ((void)0)
+#define DBG_BUGON(x)((void)(x))
+#endif /* !CONFIG_EROFS_FS_DEBUG */
+
+enum {
+   FAULT_KMALLOC,
+   FAULT_READ_IO,
+   FAULT_MAX,
+};
+
+#ifdef CONFIG_EROFS_FAULT_INJECTION
+extern const char *erofs_fault_name[FAULT_MAX];
+#define IS_FAULT_SET(fi, type) ((fi)->inject_type & (1 << (type)))
+
+struct erofs_fault_info {
+   atomic_t inject_ops;
+   unsigned int inject_rate;
+   unsigned int inject_type;
+};
+#endif /* CONFIG_EROFS_FAULT_INJECTION */
+
+/* EROFS_SUPER_MAGIC_V1 to represent the whole file system */
+#define EROFS_SUPER_MAGIC   EROFS_SUPER_MAGIC_V1
+
+typedef u64 erofs_nid_t;
+typedef u64 erofs_off_t;
+/* data type for filesystem-wide blocks number */
+typedef u32 erofs_blk_t;
+
+struct erofs_sb_info {
+   u32 blocks;
+   u32 meta_blkaddr;
+
+   /* inode slot unit size in bit shift */
+   unsigned char islotbits;
+
+   u32 build_time_nsec;
+   u64 build_time;
+
+   /* what we really care is nid, rather than ino.. */
+   erofs_nid_t root_nid;
+   /* used for statfs, f_files - f_favail */
+   u64 inos;
+
+   u8 uuid[16];/* 128-bit uuid for volume */
+   u8 volume_name[16]; /* volume name */
+   u32 requirements;
+
+   unsigned int mount_opt;
+
+#ifdef CONFIG_EROFS_FAULT_INJECTION
+   struct erofs_fault_info fault_info; /* For fault injection */
+#endif
+};
+
+#ifdef CONFIG_EROFS_FAULT_INJECTION
+#define erofs_show_injection_info(type) \
+   infoln("inject %s in %s of %pS", erofs_fault_name[type],\
+   __func__, __builtin_return_address(0))
+
+static inline bool time_to_inject(struct erofs_sb_info *sbi, int type)
+{
+   struct erofs_fault_info *ffi = &sbi->fault_info;
+
+   if (!ffi->inject_rate)
+   return false;
+
+   if (!IS_FAULT_SET(ffi, type))
+   return false;
+
+   atomic_inc(&ffi->inject_ops);
+   if (atomic_read(&ffi->inject_ops) >= ffi->inject_rate) {
+   atomic_set(&ffi->inject_ops, 0);
+   return true;
+   }
+   return false;
+}
+#else
+static inline bool time_to_inject(struct erofs_sb_info *sbi, int type)
+{
+   return false;
+}
+
+static inline void erofs_show_injection_info(int type)
+{
+}
+#endif /* !CONFIG_EROFS_FAULT_INJECTION */
+
+static inline void *erofs_kmalloc(struct erofs_sb_info *sbi,
+   size_t size, gfp_t flags)
+{
+   if (time_to_inject(sbi, FAULT_KMALLOC)) {
+   erofs_show_injection_info(FAULT_KMALLOC);
+   return NULL;
+   }
+   return kmalloc(size, flags);
+}
+
+#define EROFS_SB(sb) ((struct erofs_sb_info *)(sb)->s_fs_info)
+#define EROFS_I_SB(inode) ((struct erofs_sb_info *)(inode)->i_sb->s_fs_info)
+
+/* Mount flags set via mount options or defaults */
+#define EROFS_MOUNT_FAULT_INJECTION0x0040
+
+#define clear_opt(sbi, option) ((sbi)->mount_opt &= ~EROFS_MOUNT_##option)
+#define set_opt(sbi, option)   ((sbi)->mount_opt |= EROFS_MOUNT_##option)
+#define test_opt(sbi, option)  ((sbi)->mount_opt & EROFS_MOUNT_##option)
+
+/* we strictly follow PAGE_SIZE and no buffer head yet */
+#define LOG_BLOCK_SIZE PAGE_SHIFT
+
+#undef LOG_SECTORS_PER_BLOCK
+#define LOG_SECTORS_PER_BLOCK  (PAGE_SHIFT - 9)
+
+#undef SECTORS_PER_BLOCK
+#define SECTORS_PER_BLOCK  (1 << LOG_SECTORS_PER_BLOCK)
+
+#define EROFS_BLKSIZ   (1 << LOG_BLOCK_SIZE)
+
+#if (EROFS_BLKSIZ % 4096 || !EROFS_BLKSIZ)
+#error erofs cannot be used in this platform
+#endif
+
+#define ERO

[PATCH v8 16/24] erofs: introduce workstation for decompression

2019-08-14 Thread Gao Xiang
This patch introduces another concept used by the decompression
subsystem called 'workstation'. It can be seen as
a sparse array that stores pointers to the data
structures related to the corresponding physical clusters.

All lookups are protected by the RCU read lock. Besides,
a reference count and a spin_lock are also introduced
to manage its lifetime and serialize all update
operations.

'workstation' is currently implemented on top of the in-kernel
radix tree for backward compatibility. As the Linux kernel
evolves, it will be migrated to the
new XArray implementation in the future.
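
For illustration only (not part of the patch): a userspace sketch, using C11
atomics, of how a workgroup reference count can be "frozen" with a
compare-and-swap and later restored. The sketch_* names are assumptions made
here, INT_MIN is only a stand-in for EROFS_LOCKED_MAGIC, and the preemption
handling and memory barrier of the kernel code are left out.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_LOCKED_MAGIC INT_MIN	/* stand-in for EROFS_LOCKED_MAGIC */

struct sketch_workgroup {
	atomic_int refcount;
};

/* freeze succeeds only if the refcount still equals the expected value */
static bool sketch_try_to_freeze(struct sketch_workgroup *grp, int val)
{
	int expected = val;

	return atomic_compare_exchange_strong(&grp->refcount, &expected,
					      SKETCH_LOCKED_MAGIC);
}

/* restore the original refcount, making the workgroup usable again */
static void sketch_unfreeze(struct sketch_workgroup *grp, int orig_val)
{
	atomic_store(&grp->refcount, orig_val);
}
```

While the magic value is in place, a second freeze attempt with the old
refcount fails, which is what serializes updaters in this sketch.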

Signed-off-by: Gao Xiang 
---
 fs/erofs/internal.h |  80 +
 fs/erofs/super.c|   4 ++
 fs/erofs/utils.c| 166 +++-
 3 files changed, 248 insertions(+), 2 deletions(-)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4bcdf32a45ad..6a2407fb3013 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -65,6 +65,9 @@ struct erofs_sb_info {
struct list_head list;
struct mutex umount_mutex;
 
+   /* the dedicated workstation for compression */
+   struct radix_tree_root workstn_tree;
+
unsigned int shrinker_run_no;
 #endif /* CONFIG_EROFS_FS_ZIP */
u32 blocks;
@@ -150,6 +153,77 @@ static inline void *erofs_kmalloc(struct erofs_sb_info *sbi,
 #define set_opt(sbi, option)   ((sbi)->mount_opt |= EROFS_MOUNT_##option)
 #define test_opt(sbi, option)  ((sbi)->mount_opt & EROFS_MOUNT_##option)
 
+#ifdef CONFIG_EROFS_FS_ZIP
+#define EROFS_LOCKED_MAGIC (INT_MIN | 0xE0F510CCL)
+
+/* basic unit of the workstation of a super_block */
+struct erofs_workgroup {
+   /* the workgroup index in the workstation */
+   pgoff_t index;
+
+   /* overall workgroup reference count */
+   atomic_t refcount;
+};
+
+#if defined(CONFIG_SMP)
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+int val)
+{
+   preempt_disable();
+   if (val != atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)) {
+   preempt_enable();
+   return false;
+   }
+   return true;
+}
+
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+   int orig_val)
+{
+   /*
+* other observers should notice all modifications
+* in the freezing period.
+*/
+   smp_mb();
+   atomic_set(&grp->refcount, orig_val);
+   preempt_enable();
+}
+
+static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
+{
+   return atomic_cond_read_relaxed(&grp->refcount,
+   VAL != EROFS_LOCKED_MAGIC);
+}
+#else
+static inline bool erofs_workgroup_try_to_freeze(struct erofs_workgroup *grp,
+int val)
+{
+   preempt_disable();
+   /* no need to spin on UP platforms, let's just disable preemption. */
+   if (val != atomic_read(&grp->refcount)) {
+   preempt_enable();
+   return false;
+   }
+   return true;
+}
+
+static inline void erofs_workgroup_unfreeze(struct erofs_workgroup *grp,
+   int orig_val)
+{
+   preempt_enable();
+}
+
+static inline int erofs_wait_on_workgroup_freezed(struct erofs_workgroup *grp)
+{
+   int v = atomic_read(&grp->refcount);
+
+   /* workgroup is never freezed on uniprocessor systems */
+   DBG_BUGON(v == EROFS_LOCKED_MAGIC);
+   return v;
+}
+#endif /* !CONFIG_SMP */
+#endif /* !CONFIG_EROFS_FS_ZIP */
+
 /* we strictly follow PAGE_SIZE and no buffer head yet */
 #define LOG_BLOCK_SIZE PAGE_SHIFT
 
@@ -409,6 +483,12 @@ extern const struct file_operations erofs_dir_fops;
 
 /* utils.c */
 #ifdef CONFIG_EROFS_FS_ZIP
+int erofs_workgroup_put(struct erofs_workgroup *grp);
+struct erofs_workgroup *erofs_find_workgroup(struct super_block *sb,
+pgoff_t index, bool *tag);
+int erofs_register_workgroup(struct super_block *sb,
+struct erofs_workgroup *grp, bool tag);
+static inline void erofs_workgroup_free_rcu(struct erofs_workgroup *grp) {}
 void erofs_shrinker_register(struct super_block *sb);
 void erofs_shrinker_unregister(struct super_block *sb);
 int __init erofs_init_shrinker(void);
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 09992cc3b2fd..ea8d065068fa 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -338,6 +338,10 @@ static int erofs_fill_super(struct super_block *sb, void *data, int silent)
else
sb->s_flags &= ~SB_POSIXACL;
 
+#ifdef CONFIG_EROFS_FS_ZIP
+   INIT_RADIX_TREE(&sbi->workstn_tree, GFP_ATOMIC);
+#endif
+
/* get the root inode */
inode = erofs_iget(sb, ROOT_NID(sbi), true);
if (IS_ERR(inode))
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index cab7d77c4e59..6281782

[PATCH v8 20/24] erofs: introduce generic decompression backend

2019-08-14 Thread Gao Xiang
This patch adds decompression backend to EROFS, which
supports uncompressed and LZ4 compressed data.

For compressed data, it uses the following strategy:
1) If outputsize is very small (less than a threshold in total),
   decompress to the per-CPU buffer and do memcpy directly
   in order to avoid vmap() overhead;
2) Otherwise, it will fill bounce pages if needed and vmap
   all output pages into a contiguous virtual memory area,
   memcpy compressed data to the per-CPU buffer for inplace
   I/O [1] and decompress.

LZ4 is an LZ77-based algorithm with a dynamically
populated ("sliding window") dictionary, and its maximum
lookback distance is 65535. Therefore, the number of bounce
pages can be limited by erofs based on this property.

[1] `LZ4 decompression inplace' will eliminate the extra memcpy
if iend - oend margin is safe enough, see the following patch.
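
For illustration only (not part of the patch): a small sketch of the
arithmetic behind the LZ4_MAX_DISTANCE_PAGES macro defined in decompressor.c,
i.e. how many pages the 65535-byte lookback window can span. The sketch_*
names and the page_size parameter are assumptions made here; the patch uses
the kernel's DIV_ROUND_UP() and PAGE_SIZE.

```c
#define SKETCH_LZ4_DISTANCE_MAX 65535u
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Pages needed to cover the LZ4 history window, plus one extra page
 * because the window generally does not start on a page boundary.
 */
static unsigned int sketch_max_distance_pages(unsigned int page_size)
{
	return SKETCH_DIV_ROUND_UP(SKETCH_LZ4_DISTANCE_MAX, page_size) + 1;
}
```

With 4096-byte pages this gives 16 + 1 = 17 pages.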

Signed-off-by: Gao Xiang 
---
 fs/erofs/Makefile   |   2 +-
 fs/erofs/compress.h |  62 
 fs/erofs/decompressor.c | 332 
 3 files changed, 395 insertions(+), 1 deletion(-)
 create mode 100644 fs/erofs/compress.h
 create mode 100644 fs/erofs/decompressor.c

diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 930770be124f..5594abca6f95 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -7,5 +7,5 @@ ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
 obj-$(CONFIG_EROFS_FS) += erofs.o
 erofs-objs := super.o inode.o data.o namei.o dir.o utils.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
-erofs-$(CONFIG_EROFS_FS_ZIP) += zmap.o
+erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o
 
diff --git a/fs/erofs/compress.h b/fs/erofs/compress.h
new file mode 100644
index ..57035b7646ef
--- /dev/null
+++ b/fs/erofs/compress.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * linux/fs/erofs/compress.h
+ *
+ * Copyright (C) 2019 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#ifndef __EROFS_FS_COMPRESS_H
+#define __EROFS_FS_COMPRESS_H
+
+#include "internal.h"
+
+enum {
+   Z_EROFS_COMPRESSION_SHIFTED = Z_EROFS_COMPRESSION_MAX,
+   Z_EROFS_COMPRESSION_RUNTIME_MAX
+};
+
+struct z_erofs_decompress_req {
+   struct super_block *sb;
+   struct page **in, **out;
+
+   unsigned short pageofs_out;
+   unsigned int inputsize, outputsize;
+
+   /* indicate the algorithm will be used for decompression */
+   unsigned int alg;
+   bool inplace_io, partial_decoding;
+};
+
+/*
+ * - 0x5A110C8D ('sallocated', Z_EROFS_MAPPING_STAGING) -
+ * used to mark temporary allocated pages from other
+ * file/cached pages and NULL mapping pages.
+ */
+#define Z_EROFS_MAPPING_STAGING ((void *)0x5A110C8D)
+
+/* check if a page is marked as staging */
+static inline bool z_erofs_page_is_staging(struct page *page)
+{
+   return page->mapping == Z_EROFS_MAPPING_STAGING;
+}
+
+static inline bool z_erofs_put_stagingpage(struct list_head *pagepool,
+  struct page *page)
+{
+   if (!z_erofs_page_is_staging(page))
+   return false;
+
+   /* staging pages should not be used by others at the same time */
+   if (page_ref_count(page) > 1)
+   put_page(page);
+   else
+   list_add(&page->lru, pagepool);
+   return true;
+}
+
+int z_erofs_decompress(struct z_erofs_decompress_req *rq,
+  struct list_head *pagepool);
+
+#endif
+
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
new file mode 100644
index ..2374dd3c967c
--- /dev/null
+++ b/fs/erofs/decompressor.c
@@ -0,0 +1,332 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/decompressor.c
+ *
+ * Copyright (C) 2019 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include "compress.h"
+#include 
+#include 
+
+#ifndef LZ4_DISTANCE_MAX   /* history window size */
+#define LZ4_DISTANCE_MAX 65535 /* set to maximum value by default */
+#endif
+
+#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
+
+struct z_erofs_decompressor {
+   /*
+* if destpages has sparse (missing) pages, fill them with bounce pages.
+* it also checks whether destpages indicate contiguous physical memory.
+*/
+   int (*prepare_destpages)(struct z_erofs_decompress_req *rq,
+struct list_head *pagepool);
+   int (*decompress)(struct z_erofs_decompress_req *rq, u8 *out);
+   char *name;
+};
+
+static bool use_vmap;
+module_param(use_vmap, bool, 0444);
+MODULE_PARM_DESC(use_vmap, "Use vmap() instead of vm_map_ram() (default 0)");
+
+static int lz4_prepare_destpages(struct z_erofs_decompress_req *rq,
+struct list_head *pagepool)
+{
+   const unsigned int nr =
+   PAGE_ALIGN(rq->pageofs_out + rq->outputsize) >> PAGE_SHIFT;
+   struct page *availables[LZ4_MAX_DISTANCE_PA

[PATCH v8 12/24] erofs: introduce tagged pointer

2019-08-14 Thread Gao Xiang
Currently the kernel has scattered tagged pointer usages
hacked by hand in plain code, without a unified and
portable set of functions to highlight the tagged pointer
itself and wrap this hand-written code in order to clean up
the meaningless magic masks spread all over.

This patch introduces simple generic methods to fold
tags into a pointer integer. Currently it supports
using the last n bits of the pointer for tags, where n can be
selected by users.

In addition, it will also be used for the upcoming EROFS
filesystem, which heavily uses the tagged pointer approach
to reduce extra memory allocation.

Link: https://en.wikipedia.org/wiki/Tagged_pointer
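
For illustration only (not part of the patch): a userspace sketch of folding
a 2-bit tag into the low bits of an aligned pointer and unfolding it again.
The sketch_* names are assumptions made here. Unlike the macros in the patch,
which reject out-of-range constant tags at compile time, this sketch simply
masks the tag.

```c
#include <stdint.h>

#define SKETCH_TAG_BITS 2u
#define SKETCH_TAG_MASK (((uintptr_t)1 << SKETCH_TAG_BITS) - 1)

/* wrapping the integer in a struct prevents mixing it with raw pointers */
typedef struct {
	uintptr_t v;
} sketch_tagptr2_t;

/* ptr must be aligned so that its low SKETCH_TAG_BITS bits are zero */
static sketch_tagptr2_t sketch_fold(void *ptr, uintptr_t tags)
{
	sketch_tagptr2_t t = { .v = (uintptr_t)ptr | (tags & SKETCH_TAG_MASK) };

	return t;
}

static void *sketch_unfold_ptr(sketch_tagptr2_t t)
{
	return (void *)(t.v & ~SKETCH_TAG_MASK);
}

static uintptr_t sketch_unfold_tags(sketch_tagptr2_t t)
{
	return t.v & SKETCH_TAG_MASK;
}
```

The scheme relies on allocator alignment: an object aligned to at least 4
bytes always leaves the two low pointer bits free for tags.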

Signed-off-by: Gao Xiang 
---
 fs/erofs/tagptr.h | 110 ++
 1 file changed, 110 insertions(+)
 create mode 100644 fs/erofs/tagptr.h

diff --git a/fs/erofs/tagptr.h b/fs/erofs/tagptr.h
new file mode 100644
index ..a72897c86744
--- /dev/null
+++ b/fs/erofs/tagptr.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A tagged pointer implementation
+ *
+ * Copyright (C) 2018 Gao Xiang 
+ */
+#ifndef __EROFS_FS_TAGPTR_H
+#define __EROFS_FS_TAGPTR_H
+
+#include 
+#include 
+
+/*
+ * the name of tagged pointer types are tagptr{1, 2, 3...}_t
+ * avoid directly using the internal structs __tagptr{1, 2, 3...}
+ */
+#define __MAKE_TAGPTR(n) \
+typedef struct __tagptr##n {   \
+   uintptr_t v;\
+} tagptr##n##_t;
+
+__MAKE_TAGPTR(1)
+__MAKE_TAGPTR(2)
+__MAKE_TAGPTR(3)
+__MAKE_TAGPTR(4)
+
+#undef __MAKE_TAGPTR
+
+extern void __compiletime_error("bad tagptr tags")
+   __bad_tagptr_tags(void);
+
+extern void __compiletime_error("bad tagptr type")
+   __bad_tagptr_type(void);
+
+/* fix the broken usage of "#define tagptr2_t tagptr3_t" by users */
+#define __tagptr_mask_1(ptr, n)\
+   __builtin_types_compatible_p(typeof(ptr), struct __tagptr##n) ? \
+   (1UL << (n)) - 1 :
+
+#define __tagptr_mask(ptr) (\
+   __tagptr_mask_1(ptr, 1) ( \
+   __tagptr_mask_1(ptr, 2) ( \
+   __tagptr_mask_1(ptr, 3) ( \
+   __tagptr_mask_1(ptr, 4) ( \
+   __bad_tagptr_type(), 0)
+
+/* generate a tagged pointer from a raw value */
+#define tagptr_init(type, val) \
+   ((typeof(type)){ .v = (uintptr_t)(val) })
+
+/*
+ * directly cast a tagged pointer to the native pointer type, which
+ * could be used for backward compatibility of existing code.
+ */
+#define tagptr_cast_ptr(tptr) ((void *)(tptr).v)
+
+/* encode tagged pointers */
+#define tagptr_fold(type, ptr, _tags) ({ \
+   const typeof(_tags) tags = (_tags); \
+   if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(type))) \
+   __bad_tagptr_tags(); \
+tagptr_init(type, (uintptr_t)(ptr) | tags); })
+
+/* decode tagged pointers */
+#define tagptr_unfold_ptr(tptr) \
+   ((void *)((tptr).v & ~__tagptr_mask(tptr)))
+
+#define tagptr_unfold_tags(tptr) \
+   ((tptr).v & __tagptr_mask(tptr))
+
+/* operations for the tagged pointer */
+#define tagptr_eq(_tptr1, _tptr2) ({ \
+   typeof(_tptr1) tptr1 = (_tptr1); \
+   typeof(_tptr2) tptr2 = (_tptr2); \
+   (void)(&tptr1 == &tptr2); \
+(tptr1).v == (tptr2).v; })
+
+/* lock-free CAS operation */
+#define tagptr_cmpxchg(_ptptr, _o, _n) ({ \
+   typeof(_ptptr) ptptr = (_ptptr); \
+   typeof(_o) o = (_o); \
+   typeof(_n) n = (_n); \
+   (void)(&o == &n); \
+   (void)(&o == ptptr); \
+tagptr_init(o, cmpxchg(&ptptr->v, o.v, n.v)); })
+
+/* wrap WRITE_ONCE if atomic update is needed */
+#define tagptr_replace_tags(_ptptr, tags) ({ \
+   typeof(_ptptr) ptptr = (_ptptr); \
+   *ptptr = tagptr_fold(*ptptr, tagptr_unfold_ptr(*ptptr), tags); \
+*ptptr; })
+
+#define tagptr_set_tags(_ptptr, _tags) ({ \
+   typeof(_ptptr) ptptr = (_ptptr); \
+   const typeof(_tags) tags = (_tags); \
+   if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(*ptptr))) \
+   __bad_tagptr_tags(); \
+   ptptr->v |= tags; \
+*ptptr; })
+
+#define tagptr_clear_tags(_ptptr, _tags) ({ \
+   typeof(_ptptr) ptptr = (_ptptr); \
+   const typeof(_tags) tags = (_tags); \
+   if (__builtin_constant_p(tags) && (tags & ~__tagptr_mask(*ptptr))) \
+   __bad_tagptr_tags(); \
+   ptptr->v &= ~tags; \
+*ptptr; })
+
+#endif /* __EROFS_FS_TAGPTR_H */
+
-- 
2.17.1


[PATCH v8 19/24] erofs: add erofs_allocpage()

2019-08-14 Thread Gao Xiang
This patch introduces a temporary _on-stack_ page
pool to reuse freed pages directly as much as
possible for better performance and to release all pages
at once. It also slightly reduces the possibility of
memory allocation failure.
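
For illustration only (not part of the patch): a userspace sketch of the
"reuse from a local pool first, allocate only when the pool is empty" idea.
The sketch_* names, the singly linked list and the calloc() fallback are
assumptions made here; the patch keeps struct page entries on a list_head and
falls back to alloc_pages().

```c
#include <stdlib.h>

struct sketch_page {
	struct sketch_page *next;
};

/* the pool head lives in the caller's stack frame */
struct sketch_pool {
	struct sketch_page *head;
};

/* take a page from the pool if possible, otherwise allocate a new one */
static struct sketch_page *sketch_allocpage(struct sketch_pool *pool)
{
	struct sketch_page *page = pool->head;

	if (page) {
		pool->head = page->next;
		page->next = NULL;
		return page;
	}
	return calloc(1, sizeof(*page));
}

/* return a no-longer-needed page to the pool for later reuse */
static void sketch_freepage(struct sketch_pool *pool, struct sketch_page *page)
{
	page->next = pool->head;
	pool->head = page;
}

/* release everything still held by the pool in one go */
static void sketch_release_pool(struct sketch_pool *pool)
{
	while (pool->head) {
		struct sketch_page *page = pool->head;

		pool->head = page->next;
		free(page);
	}
}
```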

Signed-off-by: Gao Xiang 
---
 fs/erofs/internal.h |  2 ++
 fs/erofs/utils.c| 14 ++
 2 files changed, 16 insertions(+)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 3222947c9bab..9dc3d47347db 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -488,6 +488,8 @@ int erofs_namei(struct inode *dir, struct qstr *name,
 extern const struct file_operations erofs_dir_fops;
 
 /* utils.c */
+struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp, bool nofail);
+
 #if (EROFS_PCPUBUF_NR_PAGES > 0)
 void *erofs_get_pcpubuf(unsigned int pagenr);
 #define erofs_put_pcpubuf(buf) do { \
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index f3eed9af24d6..ae6362abed67 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -9,6 +9,20 @@
 #include "internal.h"
 #include 
 
+struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp, bool nofail)
+{
+   struct page *page;
+
+   if (!list_empty(pool)) {
+   page = lru_to_page(pool);
+   DBG_BUGON(page_ref_count(page) != 1);
+   list_del(&page->lru);
+   } else {
+   page = alloc_pages(gfp | (nofail ? __GFP_NOFAIL : 0), 0);
+   }
+   return page;
+}
+
 #if (EROFS_PCPUBUF_NR_PAGES > 0)
 static struct {
u8 data[PAGE_SIZE * EROFS_PCPUBUF_NR_PAGES];
-- 
2.17.1


[PATCH v8 18/24] erofs: introduce pagevec for decompression subsystem

2019-08-14 Thread Gao Xiang
For each physical cluster, there is a straightforward
way of allocating a fixed or variable-sized array to
record the corresponding file pages for its decompression
if we decide to decompress these pages asynchronously
(e.g. the read-ahead case); however, it will take variable-sized
on-heap memory compared with traditional uncompressed
filesystems.

This patch introduces a pagevec solution that reuses some
allocated file pages in a time-sharing manner to store
parts of the array itself in order to minimize the extra
memory overhead, so that only a small constant-sized array
used for bootstrapping the whole array will be needed.

Signed-off-by: Gao Xiang 
---
 fs/erofs/zpvec.h | 159 +++
 1 file changed, 159 insertions(+)
 create mode 100644 fs/erofs/zpvec.h

diff --git a/fs/erofs/zpvec.h b/fs/erofs/zpvec.h
new file mode 100644
index ..bb7689e67836
--- /dev/null
+++ b/fs/erofs/zpvec.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * linux/fs/erofs/zpvec.h
+ *
+ * Copyright (C) 2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#ifndef __EROFS_FS_ZPVEC_H
+#define __EROFS_FS_ZPVEC_H
+
+#include "tagptr.h"
+
+/* page type in pagevec for decompress subsystem */
+enum z_erofs_page_type {
+   /* including Z_EROFS_VLE_PAGE_TAIL_EXCLUSIVE */
+   Z_EROFS_PAGE_TYPE_EXCLUSIVE,
+
+   Z_EROFS_VLE_PAGE_TYPE_TAIL_SHARED,
+
+   Z_EROFS_VLE_PAGE_TYPE_HEAD,
+   Z_EROFS_VLE_PAGE_TYPE_MAX
+};
+
+extern void __compiletime_error("Z_EROFS_PAGE_TYPE_EXCLUSIVE != 0")
+   __bad_page_type_exclusive(void);
+
+/* pagevec tagged pointer */
+typedef tagptr2_t  erofs_vtptr_t;
+
+/* pagevec collector */
+struct z_erofs_pagevec_ctor {
+   struct page *curr, *next;
+   erofs_vtptr_t *pages;
+
+   unsigned int nr, index;
+};
+
+static inline void z_erofs_pagevec_ctor_exit(struct z_erofs_pagevec_ctor *ctor,
+bool atomic)
+{
+   if (!ctor->curr)
+   return;
+
+   if (atomic)
+   kunmap_atomic(ctor->pages);
+   else
+   kunmap(ctor->curr);
+}
+
+static inline struct page *
+z_erofs_pagevec_ctor_next_page(struct z_erofs_pagevec_ctor *ctor,
+  unsigned int nr)
+{
+   unsigned int index;
+
+   /* keep away from occupied pages */
+   if (ctor->next)
+   return ctor->next;
+
+   for (index = 0; index < nr; ++index) {
+   const erofs_vtptr_t t = ctor->pages[index];
+   const unsigned int tags = tagptr_unfold_tags(t);
+
+   if (tags == Z_EROFS_PAGE_TYPE_EXCLUSIVE)
+   return tagptr_unfold_ptr(t);
+   }
+   DBG_BUGON(nr >= ctor->nr);
+   return NULL;
+}
+
+static inline void
+z_erofs_pagevec_ctor_pagedown(struct z_erofs_pagevec_ctor *ctor,
+ bool atomic)
+{
+   struct page *next = z_erofs_pagevec_ctor_next_page(ctor, ctor->nr);
+
+   z_erofs_pagevec_ctor_exit(ctor, atomic);
+
+   ctor->curr = next;
+   ctor->next = NULL;
+   ctor->pages = atomic ?
+   kmap_atomic(ctor->curr) : kmap(ctor->curr);
+
+   ctor->nr = PAGE_SIZE / sizeof(struct page *);
+   ctor->index = 0;
+}
+
+static inline void z_erofs_pagevec_ctor_init(struct z_erofs_pagevec_ctor *ctor,
+unsigned int nr,
+erofs_vtptr_t *pages,
+unsigned int i)
+{
+   ctor->nr = nr;
+   ctor->curr = ctor->next = NULL;
+   ctor->pages = pages;
+
+   if (i >= nr) {
+   i -= nr;
+   z_erofs_pagevec_ctor_pagedown(ctor, false);
+   while (i > ctor->nr) {
+   i -= ctor->nr;
+   z_erofs_pagevec_ctor_pagedown(ctor, false);
+   }
+   }
+   ctor->next = z_erofs_pagevec_ctor_next_page(ctor, i);
+   ctor->index = i;
+}
+
+static inline bool z_erofs_pagevec_enqueue(struct z_erofs_pagevec_ctor *ctor,
+  struct page *page,
+  enum z_erofs_page_type type,
+  bool *occupied)
+{
+   *occupied = false;
+   if (unlikely(!ctor->next && type))
+   if (ctor->index + 1 == ctor->nr)
+   return false;
+
+   if (unlikely(ctor->index >= ctor->nr))
+   z_erofs_pagevec_ctor_pagedown(ctor, false);
+
+   /* exclusive page type must be 0 */
+   if (Z_EROFS_PAGE_TYPE_EXCLUSIVE != (uintptr_t)NULL)
+   __bad_page_type_exclusive();
+
+   /* should remind that collector->next never equal to 1, 2 */
+   if (type == (uintptr_t)ctor->next) {
+   ctor->next = page;
+   *occupied = true;
+   }
+   ctor->pages[ctor->index++] = tagptr_fold(erofs_vtpt

[PATCH v8 03/24] erofs: add super block operations

2019-08-14 Thread Gao Xiang
This commit adds erofs super block operations, including (u)mount,
remount_fs, show_options, statfs, in addition to some private
icache management functions.

Signed-off-by: Gao Xiang 
---
 fs/erofs/super.c | 437 +++
 1 file changed, 437 insertions(+)
 create mode 100644 fs/erofs/super.c

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
new file mode 100644
index ..cd4bd6f48173
--- /dev/null
+++ b/fs/erofs/super.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/erofs/super.c
+ *
+ * Copyright (C) 2017-2018 HUAWEI, Inc.
+ * http://www.huawei.com/
+ * Created by Gao Xiang 
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "internal.h"
+
+#define CREATE_TRACE_POINTS
+#include 
+
+static struct kmem_cache *erofs_inode_cachep __read_mostly;
+
+static void init_once(void *ptr)
+{
+   struct erofs_vnode *vi = ptr;
+
+   inode_init_once(&vi->vfs_inode);
+}
+
+static int __init erofs_init_inode_cache(void)
+{
+   erofs_inode_cachep = kmem_cache_create("erofs_inode",
+  sizeof(struct erofs_vnode), 0,
+  SLAB_RECLAIM_ACCOUNT,
+  init_once);
+
+   return erofs_inode_cachep ? 0 : -ENOMEM;
+}
+
+static void erofs_exit_inode_cache(void)
+{
+   kmem_cache_destroy(erofs_inode_cachep);
+}
+
+static struct inode *alloc_inode(struct super_block *sb)
+{
+   struct erofs_vnode *vi =
+   kmem_cache_alloc(erofs_inode_cachep, GFP_KERNEL);
+
+   if (!vi)
+   return NULL;
+
+   /* zero out everything except vfs_inode */
+   memset(vi, 0, offsetof(struct erofs_vnode, vfs_inode));
+   return &vi->vfs_inode;
+}
+
+static void free_inode(struct inode *inode)
+{
+   struct erofs_vnode *vi = EROFS_V(inode);
+
+   /* be careful RCU symlink path (see ext4_inode_info->i_data)! */
+   if (is_inode_fast_symlink(inode))
+   kfree(inode->i_link);
+
+   kmem_cache_free(erofs_inode_cachep, vi);
+}
+
+static bool check_layout_compatibility(struct super_block *sb,
+  struct erofs_super_block *layout)
+{
+   const unsigned int requirements = le32_to_cpu(layout->requirements);
+
+   EROFS_SB(sb)->requirements = requirements;
+
+   /* check if current kernel meets all mandatory requirements */
+   if (requirements & (~EROFS_ALL_REQUIREMENTS)) {
+   errln("unidentified requirements %x, please upgrade kernel version",
+ requirements & ~EROFS_ALL_REQUIREMENTS);
+   return false;
+   }
+   return true;
+}
+
+static int superblock_read(struct super_block *sb)
+{
+   struct erofs_sb_info *sbi;
+   struct buffer_head *bh;
+   struct erofs_super_block *layout;
+   unsigned int blkszbits;
+   int ret;
+
+   bh = sb_bread(sb, 0);
+
+   if (!bh) {
+   errln("cannot read erofs superblock");
+   return -EIO;
+   }
+
+   sbi = EROFS_SB(sb);
+   layout = (struct erofs_super_block *)((u8 *)bh->b_data
+                + EROFS_SUPER_OFFSET);
+
+   ret = -EINVAL;
+   if (le32_to_cpu(layout->magic) != EROFS_SUPER_MAGIC_V1) {
+   errln("cannot find valid erofs superblock");
+   goto out;
+   }
+
+   blkszbits = layout->blkszbits;
+   /* 9(512 bytes) + LOG_SECTORS_PER_BLOCK == LOG_BLOCK_SIZE */
+   if (unlikely(blkszbits != LOG_BLOCK_SIZE)) {
+   errln("blksize %u isn't supported on this platform",
+ 1 << blkszbits);
+   goto out;
+   }
+
+   if (!check_layout_compatibility(sb, layout))
+   goto out;
+
+   sbi->blocks = le32_to_cpu(layout->blocks);
+   sbi->meta_blkaddr = le32_to_cpu(layout->meta_blkaddr);
+   sbi->islotbits = ffs(sizeof(struct erofs_inode_v1)) - 1;
+   sbi->root_nid = le16_to_cpu(layout->root_nid);
+   sbi->inos = le64_to_cpu(layout->inos);
+
+   sbi->build_time = le64_to_cpu(layout->build_time);
+   sbi->build_time_nsec = le32_to_cpu(layout->build_time_nsec);
+
+   memcpy(&sb->s_uuid, layout->uuid, sizeof(layout->uuid));
+   memcpy(sbi->volume_name, layout->volume_name,
+  sizeof(layout->volume_name));
+
+   ret = 0;
+out:
+   brelse(bh);
+   return ret;
+}
+
+#ifdef CONFIG_EROFS_FAULT_INJECTION
+const char *erofs_fault_name[FAULT_MAX] = {
+   [FAULT_KMALLOC] = "kmalloc",
+   [FAULT_READ_IO] = "read IO error",
+};
+
+static void __erofs_build_fault_attr(struct erofs_sb_info *sbi,
+unsigned int rate)
+{
+   struct erofs_fault_info *ffi = &sbi->fault_info;
+
+   if (rate) {
+   atomic_set(&ffi->inject_ops, 0);
+   ffi->inject_rate = rate;
+   ffi->inject_type = (1 << FAULT_MAX) - 1;
+  

[PATCH v8 00/24] erofs: promote erofs from staging v8

2019-08-14 Thread Gao Xiang
[I strip the previous cover letter, the old one can be found in v6:
 https://lore.kernel.org/r/20190802125347.166018-1-gaoxian...@huawei.com/]

We'd like to submit a formal moving patch, applied to the staging tree,
for 5.4. Before that, we'd like to hear whether there are any ACKs,
suggestions, NAKs or objections regarding EROFS, so that we can improve
it in this round or rethink the whole thing.

As the related materials [1] [2] mention, the goal of EROFS is to
save extra storage space with guaranteed end-to-end performance
for read-only files. Based on fixed-sized output compression and
inplace decompression, it performs better than existing Linux
compression filesystems. It even performs better than generic
uncompressed filesystems over a large range of compression ratios,
given a proper CPU-storage combination. We think this direction is
correct: a dedicated kernel team is continuously and actively working
on improving it, and enough testers and beta / end users are using it.

EROFS has been applied to almost all in-service HUAWEI smartphones
(yes, the number is still increasing over time) and it seems like
a success. It can be used in wider scenarios. We think it's
useful for the Linux / Android OS community and that it's time to
move it out of staging.

In order to get started, latest stable mkfs.erofs is available at

git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b dev

with README in the repository.

We are still tuning sequential read performance for ultra-fast
NVMe SSDs like the Samsung 970PRO, but at least now you can
try it on your PC with some data with a proper compression ratio,
the latest Linux kernel, a USB stick for convenience's sake and
a not very old-fashioned CPU. There are also benchmarks available
in the materials mentioned above.

EROFS is a self-contained filesystem driver. Although there are
still some TODOs to make it more generic, we will actively keep on
developing / tuning EROFS along with the evolution of the Linux
kernel, like the other in-kernel filesystems.

As I mentioned before at LSF/MM 2019, in the future we'd like
to generalize the decompression engine into a library for other
fses to use, like fscrypt, once the whole system is mature.
However, such metadata should be designed separately for
each fs, and the synchronous metadata read cost will be larger
than in EROFS because of those on-disk limitations. Therefore EROFS
is still a better choice for read-only scenarios.

EROFS is now ready for reviewing and moving, and the code is
already cleaned up as shiny floors... Please kindly take some
precious time, share your comments about EROFS and let us know
your opinion about this. It's really important for us since,
generally speaking, we also prefer to use Linux _in-tree_ stuff
rather than poorly supported out-of-tree / orphan stuff.

Thank you in advance,
Gao Xiang

[1] https://kccncosschn19eng.sched.com/event/Nru2/erofs-an-introduction-and-our-smartphone-practice-xiang-gao-huawei
[2] https://www.usenix.org/conference/atc19/presentation/gao

Changelog from v7:
 o keep up with the latest staging tree in addition to
   the latest staging patch:
   https://lore.kernel.org/r/20190814103705.60698-1-gaoxian...@huawei.com/
   - use EUCLEAN for fs corruption cases suggested by Pavel;
   - turn EIO into EOPNOTSUPP for unsupported on-disk format;
   - fix all misused ENOTSUPP into EOPNOTSUPP pointed out by Chao;
 o update cover letter

It can also be found in git at tag "erofs_2019-08-15" (will be shown later) at:
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/

and the latest fs code is available at:
 https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git/tree/fs/erofs?h=erofs-outofstaging

Changelog from v6:
 o keep up with the latest staging patchset
   https://lore.kernel.org/linux-fsdevel/20190813023054.73126-1-gaoxian...@huawei.com/
   in order to fix the following cases:
   - inline erofs_inode_is_data_compressed() in erofs_fs.h;
   - remove incomplete cleancache;
   - remove all BUG_ON in EROFS.
 o Removing the file names from the comments at the top of the files
   suggested by Stephen will be applied to the real moving patch later.

Changelog from v5:
 o keep up with "[PATCH v2] staging: erofs: updates according to
   erofs-outofstaging v4"
   https://lore.kernel.org/lkml/20190731155752.210602-1-gaoxian...@huawei.com/
   which mainly addresses review comments from Chao:
  - keep the macro EROFS_IO_MAX_RETRIES_NOFAIL in internal.h;
  - kill a redundant NULL check in "__stagingpage_alloc";
  - add some descriptions in document about "use_vmap";
  - rearrange erofs_vmap of "staging: erofs: kill
    CONFIG_EROFS_FS_USE_VM_MAP_RAM";

 o all changes have been merged into staging tree, which are under
   staging-testing:
   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/log/?h=staging-testing

Changelog from v4:
 o rebase on Linux 5.3-rc1;

 o keep up with "staging: erofs: updates according to erofs-outofstaging v4"
   in order to get main c

[PATCH v8 09/24] erofs: support tracepoint

2019-08-14 Thread Gao Xiang
Add basic tracepoints for ->readpage{,s}, ->lookup,
->destroy_inode, fill_inode and map_blocks.

Signed-off-by: Gao Xiang 
---
 include/trace/events/erofs.h | 241 +++
 1 file changed, 241 insertions(+)
 create mode 100644 include/trace/events/erofs.h

diff --git a/include/trace/events/erofs.h b/include/trace/events/erofs.h
new file mode 100644
index ..0c5847c54b60
--- /dev/null
+++ b/include/trace/events/erofs.h
@@ -0,0 +1,241 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM erofs
+
+#if !defined(_TRACE_EROFS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_EROFS_H
+
+#include 
+
+#define show_dev(dev)  MAJOR(dev), MINOR(dev)
+#define show_dev_nid(entry)show_dev(entry->dev), entry->nid
+
+#define show_file_type(type)   \
+   __print_symbolic(type,  \
+   { 0,"FILE" },   \
+   { 1,"DIR" })
+
+#define show_map_flags(flags) __print_flags(flags, "|",\
+   { EROFS_GET_BLOCKS_RAW, "RAW" })
+
+#define show_mflags(flags) __print_flags(flags, "",\
+   { EROFS_MAP_MAPPED, "M" },  \
+   { EROFS_MAP_META,   "I" })
+
+TRACE_EVENT(erofs_lookup,
+
+   TP_PROTO(struct inode *dir, struct dentry *dentry, unsigned int flags),
+
+   TP_ARGS(dir, dentry, flags),
+
+   TP_STRUCT__entry(
+   __field(dev_t,  dev )
+   __field(erofs_nid_t,nid )
+   __field(const char *,   name)
+   __field(unsigned int,   flags   )
+   ),
+
+   TP_fast_assign(
+   __entry->dev= dir->i_sb->s_dev;
+   __entry->nid= EROFS_V(dir)->nid;
+   __entry->name   = dentry->d_name.name;
+   __entry->flags  = flags;
+   ),
+
+   TP_printk("dev = (%d,%d), pnid = %llu, name:%s, flags:%x",
+   show_dev_nid(__entry),
+   __entry->name,
+   __entry->flags)
+);
+
+TRACE_EVENT(erofs_fill_inode,
+   TP_PROTO(struct inode *inode, int isdir),
+   TP_ARGS(inode, isdir),
+
+   TP_STRUCT__entry(
+   __field(dev_t,  dev )
+   __field(erofs_nid_t,nid )
+   __field(erofs_blk_t,blkaddr )
+   __field(unsigned int,   ofs )
+   __field(int,isdir   )
+   ),
+
+   TP_fast_assign(
+   __entry->dev= inode->i_sb->s_dev;
+   __entry->nid= EROFS_V(inode)->nid;
+   __entry->blkaddr = erofs_blknr(iloc(EROFS_I_SB(inode), __entry->nid));
+   __entry->ofs = erofs_blkoff(iloc(EROFS_I_SB(inode), __entry->nid));
+   __entry->isdir  = isdir;
+   ),
+
+   TP_printk("dev = (%d,%d), nid = %llu, blkaddr %u ofs %u, isdir %d",
+ show_dev_nid(__entry),
+ __entry->blkaddr, __entry->ofs,
+ __entry->isdir)
+);
+
+TRACE_EVENT(erofs_readpage,
+
+   TP_PROTO(struct page *page, bool raw),
+
+   TP_ARGS(page, raw),
+
+   TP_STRUCT__entry(
+   __field(dev_t,  dev )
+   __field(erofs_nid_t,nid )
+   __field(int,dir )
+   __field(pgoff_t,index   )
+   __field(int,uptodate)
+   __field(bool,   raw )
+   ),
+
+   TP_fast_assign(
+   __entry->dev= page->mapping->host->i_sb->s_dev;
+   __entry->nid= EROFS_V(page->mapping->host)->nid;
+   __entry->dir= S_ISDIR(page->mapping->host->i_mode);
+   __entry->index  = page->index;
+   __entry->uptodate = PageUptodate(page);
+   __entry->raw = raw;
+   ),
+
+   TP_printk("dev = (%d,%d), nid = %llu, %s, index = %lu, uptodate = %d "
+   "raw = %d",
+   show_dev_nid(__entry),
+   show_file_type(__entry->dir),
+   (unsigned long)__entry->index,
+   __entry->uptodate,
+   __entry->raw)
+);
+
+TRACE_EVENT(erofs_readpages,
+
+   TP_PROTO(struct inode *inode, struct page *page, unsigned int nrpage,
+   bool raw),
+
+   TP_ARGS(inode, page, nrpage, raw),
+
+   TP_STRUCT__entry(
+   __field(dev_t,  dev )
+   __field(erofs_nid_t,nid )
+   __field(pgoff_t,start   )
+   __field(unsigned int,   nrpage  )
+   __field(bool,   raw )
+   ),
+
+   TP_fast_assign(
+   __entry->dev= inode->i_sb->s_dev;
+   __entry->nid= EROFS_V(inode)->nid;
+   __entry->start  = page->index;
+   __entry->nrpage = nrpage;
+ 

[PATCH v8 21/24] erofs: introduce LZ4 decompression inplace

2019-08-14 Thread Gao Xiang
Compressed data is usually loaded into the last pages of
the extent (the last page for 4k) for in-place decompression
(more specifically, in-place I/O), as illustrated below,

         start of compressed logical extent
           |                          end of this logical extent
           |                           |
     ______v___________________________v________
... |  page 6  |  page 7  |  page 8  |  page 9  | ...
    |__________|__________|__________|__________|
           .                         ^          .
           .                         |compressed|
           .                         |   data   |
           .                           .        .
           |<         dstsize         >|<margin>|
                                       oend     iend
           op                        ip

Therefore, it's possible to do decompression inplace (thus no
memcpy at all) if the margin is sufficient and safe enough [1].
Note that this can only be implemented for fixed-size output
compression, not for fixed-size input compression.

With decompression inplace implemented, most in-place I/O
(about 99% for enwik9) needs no memcpy, and sequential read
improves accordingly (see the following patches for test results).

[1] https://github.com/lz4/lz4/commit/b17f578a919b7e6b078cede2d52be29dd48c8e8c
https://github.com/lz4/lz4/commit/5997e139f53169fa3a1c1b4418d2452a90b01602
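
The margin condition above can be illustrated with a small user-space
sketch. This is not the kernel code: can_decompress_inplace() and its
parameters are hypothetical names, and only the margin formula is taken
from the LZ4_DECOMPRESS_INPLACE_MARGIN fallback defined in this patch.
It assumes the compressed data ends at the end of the last page, so the
gap between the end of the output (oend) and the end of the input is
inputsize - oend. The other conditions checked by the patch (partial
decoding, 0padding support, last output page being the input page) are
not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same formula as the LZ4_DECOMPRESS_INPLACE_MARGIN fallback in the patch. */
#define INPLACE_MARGIN(srcsize)	(((srcsize) >> 8) + 32)

/*
 * Hypothetical helper (not part of EROFS).
 *
 * inputsize: bytes of the last page holding compressed data; the input
 *            is assumed to end exactly at this offset
 * oend:      offset in that page where the decompressed output ends
 * inlen:     number of compressed bytes actually fed to the decoder
 *
 * Returns true if the gap between the end of the output and the end of
 * the input is at least the required margin.
 */
static bool can_decompress_inplace(size_t inputsize, size_t oend, size_t inlen)
{
	assert(inlen <= inputsize);

	if (oend > inputsize)
		return false;	/* output would run past the input */

	return inputsize - oend >= INPLACE_MARGIN(inlen);
}
```

For example, with a 4096-byte page and 3000 compressed bytes the
required margin is (3000 >> 8) + 32 = 43 bytes, so an output ending at
offset 4000 qualifies while one ending at offset 4080 would fall back
to copying.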

Signed-off-by: Gao Xiang 
---
 fs/erofs/decompressor.c | 36 
 fs/erofs/erofs_fs.h |  2 +-
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 2374dd3c967c..9a750bf662a5 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -15,6 +15,9 @@
 #endif
 
 #define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
+#ifndef LZ4_DECOMPRESS_INPLACE_MARGIN
+#define LZ4_DECOMPRESS_INPLACE_MARGIN(srcsize)  (((srcsize) >> 8) + 32)
+#endif
 
 struct z_erofs_decompressor {
/*
@@ -117,7 +120,7 @@ static int lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
 {
unsigned int inputmargin, inlen;
u8 *src;
-   bool copied;
+   bool copied, support_0padding;
int ret;
 
if (rq->inputsize > PAGE_SIZE)
@@ -125,13 +128,38 @@ static int lz4_decompress(struct z_erofs_decompress_req *rq, u8 *out)
 
src = kmap_atomic(*rq->in);
inputmargin = 0;
+   support_0padding = false;
+
+   /* decompression inplace is only safe when 0padding is enabled */
+   if (EROFS_SB(rq->sb)->requirements & EROFS_REQUIREMENT_LZ4_0PADDING) {
+   support_0padding = true;
+
+   while (!src[inputmargin & ~PAGE_MASK])
+   if (!(++inputmargin & ~PAGE_MASK))
+   break;
+
+   if (inputmargin >= rq->inputsize) {
+   kunmap_atomic(src);
+   return -EIO;
+   }
+   }
 
copied = false;
inlen = rq->inputsize - inputmargin;
if (rq->inplace_io) {
-   src = generic_copy_inplace_data(rq, src, inputmargin);
-   inputmargin = 0;
-   copied = true;
+   const uint oend = (rq->pageofs_out +
+  rq->outputsize) & ~PAGE_MASK;
+   const uint nr = PAGE_ALIGN(rq->pageofs_out +
+  rq->outputsize) >> PAGE_SHIFT;
+
+   if (rq->partial_decoding || !support_0padding ||
+   rq->out[nr - 1] != rq->in[0] ||
+   rq->inputsize - oend <
+ LZ4_DECOMPRESS_INPLACE_MARGIN(inlen)) {
+   src = generic_copy_inplace_data(rq, src, inputmargin);
+   inputmargin = 0;
+   copied = true;
+   }
}
 
ret = LZ4_decompress_safe_partial(src + inputmargin, out,
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 230fcba1099d..c0fb7d6ebfcb 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -17,7 +17,7 @@
  * incompatible with this kernel version.
  */
 #define EROFS_REQUIREMENT_LZ4_0PADDING 0x0001
-#define EROFS_ALL_REQUIREMENTS 0
+#define EROFS_ALL_REQUIREMENTS EROFS_REQUIREMENT_LZ4_0PADDING
 
 struct erofs_super_block {
 /*  0 */__le32 magic;   /* in the little endian */
-- 
2.17.1


[PATCH v8 15/24] erofs: introduce erofs shrinker

2019-08-14 Thread Gao Xiang
This patch adds a dedicated shrinker aimed at freeing
unneeded memory consumed by a number of erofs in-memory
data structures.

Like F2FS and UBIFS, it also adds:
  - sbi->umount_mutex to avoid races between the shrinker and put_super;
  - sbi->shrinker_run_no to not revisit recently scanned objects.
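
The run-number idea can be sketched in user space as follows. This is
only an illustration under stated assumptions: struct instance, its
"cached" counter, and the circular-array walk are hypothetical
stand-ins for the mounted superblock list, and no locking is shown.
What it does take from the patch is the technique itself: bump a global
run number (skipping 0), stamp each visited instance with it, and stop
as soon as an instance already carrying the current stamp is seen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct erofs_sb_info. */
struct instance {
	unsigned int shrinker_run_no;	/* stamp of the last scan that visited us */
	unsigned long cached;		/* reclaimable objects (illustrative) */
};

static unsigned int global_run_no;

/*
 * Advance the global run number, skipping 0 so that a zero-initialised
 * instance never looks as if it was already scanned in this run.
 */
static unsigned int next_run_no(void)
{
	unsigned int run_no;

	do {
		run_no = ++global_run_no;
	} while (run_no == 0);
	return run_no;
}

/*
 * Walk the instances circularly from 'start', freeing up to nr_to_scan
 * objects. The walk ends when the quota is met or when it wraps around
 * to an instance already stamped with this run number.
 */
static unsigned long shrink_scan(struct instance *v, size_t n, size_t start,
				 unsigned long nr_to_scan)
{
	const unsigned int run_no = next_run_no();
	unsigned long freed = 0;
	size_t i = start;

	assert(n == 0 || start < n);
	if (n == 0)
		return 0;

	while (freed < nr_to_scan) {
		struct instance *p = &v[i];
		unsigned long take;

		if (p->shrinker_run_no == run_no)
			break;		/* already visited in this run */
		p->shrinker_run_no = run_no;

		take = nr_to_scan - freed;
		if (take > p->cached)
			take = p->cached;
		p->cached -= take;
		freed += take;

		i = (i + 1) % n;
	}
	return freed;
}
```

The patch itself rotates scanned superblocks to the tail of a list
rather than indexing an array; the stamp comparison plays the same role
in both cases.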

Signed-off-by: Gao Xiang 
---
 fs/erofs/internal.h |  7 
 fs/erofs/super.c|  6 +++
 fs/erofs/utils.c| 93 -
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 62f1e3ffe0a2..4bcdf32a45ad 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -63,6 +63,9 @@ struct erofs_sb_info {
 #ifdef CONFIG_EROFS_FS_ZIP
/* list for all registered superblocks, mainly for shrinker */
struct list_head list;
+   struct mutex umount_mutex;
+
+   unsigned int shrinker_run_no;
 #endif /* CONFIG_EROFS_FS_ZIP */
u32 blocks;
u32 meta_blkaddr;
@@ -408,9 +411,13 @@ extern const struct file_operations erofs_dir_fops;
 #ifdef CONFIG_EROFS_FS_ZIP
 void erofs_shrinker_register(struct super_block *sb);
 void erofs_shrinker_unregister(struct super_block *sb);
+int __init erofs_init_shrinker(void);
+void erofs_exit_shrinker(void);
 #else
 static inline void erofs_shrinker_register(struct super_block *sb) {}
 static inline void erofs_shrinker_unregister(struct super_block *sb) {}
+static inline int erofs_init_shrinker(void) { return 0; }
+static inline void erofs_exit_shrinker(void) {}
 #endif /* !CONFIG_EROFS_FS_ZIP */
 
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 2eca3b25db75..09992cc3b2fd 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -413,6 +413,9 @@ static int __init erofs_module_init(void)
if (err)
goto icache_err;
 
+   err = erofs_init_shrinker();
+   if (err)
+   goto shrinker_err;
err = register_filesystem(&erofs_fs_type);
if (err)
goto fs_err;
@@ -421,6 +424,8 @@ static int __init erofs_module_init(void)
return 0;
 
 fs_err:
+   erofs_exit_shrinker();
+shrinker_err:
erofs_exit_inode_cache();
 icache_err:
return err;
@@ -429,6 +434,7 @@ static int __init erofs_module_init(void)
 static void __exit erofs_module_exit(void)
 {
unregister_filesystem(&erofs_fs_type);
+   erofs_exit_shrinker();
erofs_exit_inode_cache();
infoln("successfully finalize erofs");
 }
diff --git a/fs/erofs/utils.c b/fs/erofs/utils.c
index 791b2df1f761..cab7d77c4e59 100644
--- a/fs/erofs/utils.c
+++ b/fs/erofs/utils.c
@@ -9,6 +9,12 @@
 #include "internal.h"
 
 #ifdef CONFIG_EROFS_FS_ZIP
+/* global shrink count (for all mounted EROFS instances) */
+static atomic_long_t erofs_global_shrink_cnt;
+
+/* protected by 'erofs_sb_list_lock' */
+static unsigned int shrinker_run_no;
+
 /* protects the mounted 'erofs_sb_list' */
 static DEFINE_SPINLOCK(erofs_sb_list_lock);
 static LIST_HEAD(erofs_sb_list);
@@ -17,6 +23,8 @@ void erofs_shrinker_register(struct super_block *sb)
 {
struct erofs_sb_info *sbi = EROFS_SB(sb);
 
+   mutex_init(&sbi->umount_mutex);
+
spin_lock(&erofs_sb_list_lock);
list_add(&sbi->list, &erofs_sb_list);
spin_unlock(&erofs_sb_list_lock);
@@ -24,9 +32,92 @@ void erofs_shrinker_register(struct super_block *sb)
 
 void erofs_shrinker_unregister(struct super_block *sb)
 {
+   struct erofs_sb_info *const sbi = EROFS_SB(sb);
+
+   mutex_lock(&sbi->umount_mutex);
+   /* will add shrink final handler here */
+
+   spin_lock(&erofs_sb_list_lock);
+   list_del(&sbi->list);
+   spin_unlock(&erofs_sb_list_lock);
+   mutex_unlock(&sbi->umount_mutex);
+}
+
+static unsigned long erofs_shrink_count(struct shrinker *shrink,
+   struct shrink_control *sc)
+{
+   return atomic_long_read(&erofs_global_shrink_cnt);
+}
+
+static unsigned long erofs_shrink_scan(struct shrinker *shrink,
+  struct shrink_control *sc)
+{
+   struct erofs_sb_info *sbi;
+   struct list_head *p;
+
+   unsigned long nr = sc->nr_to_scan;
+   unsigned int run_no;
+   unsigned long freed = 0;
+
spin_lock(&erofs_sb_list_lock);
-   list_del(&EROFS_SB(sb)->list);
+   do {
+   run_no = ++shrinker_run_no;
+   } while (run_no == 0);
+
+   /* Iterate over all mounted superblocks and try to shrink them */
+   p = erofs_sb_list.next;
+   while (p != &erofs_sb_list) {
+   sbi = list_entry(p, struct erofs_sb_info, list);
+
+   /*
+* We move the ones we do to the end of the list, so we stop
+* when we see one we have already done.
+*/
+   if (sbi->shrinker_run_no == run_no)
+   break;
+
+   if (!mutex_trylock(&sbi->umount_mutex

[PATCH v8 23/24] erofs: introduce cached decompression

2019-08-14 Thread Gao Xiang
This patch adds user-selectable strategies that cache
both incomplete ends of compressed physical clusters
as a complement to in-place I/O in order to boost
random reads, at the cost of more memory than
in-place I/O alone.
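
As a rough illustration of how the strategy names map to values, here
is a user-space sketch. The three names ("disabled", "readahead",
"readaround") are the ones accepted by the cache_strategy= mount option
in this patch; the enum, the table, and parse_cache_strategy() are
hypothetical and simply return -1 where the kernel code returns
-EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative counterpart of the EROFS_ZIP_CACHE_* values. */
enum cache_strategy {
	CACHE_DISABLED,		/* in-place I/O decompression only */
	CACHE_READAHEAD,	/* cache the last incomplete cluster */
	CACHE_READAROUND	/* cache both incomplete ends */
};

/*
 * Hypothetical parser: look the name up in a small table.
 * Returns 0 and stores the strategy in *out on success, -1 otherwise.
 */
static int parse_cache_strategy(const char *name, enum cache_strategy *out)
{
	static const struct {
		const char *name;
		enum cache_strategy val;
	} table[] = {
		{ "disabled",   CACHE_DISABLED },
		{ "readahead",  CACHE_READAHEAD },
		{ "readaround", CACHE_READAROUND },
	};
	size_t i;

	assert(name && out);

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!strcmp(name, table[i].name)) {
			*out = table[i].val;
			return 0;
		}
	}
	return -1;	/* unrecognized strategy name */
}
```

A table keeps the accepted names in one place; the patch uses an
if/else chain of strcmp() calls, which is equivalent for three entries.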

Signed-off-by: Gao Xiang 
---
 fs/erofs/internal.h |  16 +
 fs/erofs/super.c| 126 -
 fs/erofs/utils.c|  40 ---
 fs/erofs/zdata.c| 165 ++--
 fs/erofs/zdata.h|   7 +-
 5 files changed, 336 insertions(+), 18 deletions(-)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 2be1ae700aca..ad3b6ba75979 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -72,6 +72,12 @@ struct erofs_sb_info {
unsigned int max_sync_decompress_pages;
 
unsigned int shrinker_run_no;
+
+   /* current strategy of how to use managed cache */
+   unsigned char cache_strategy;
+
+   /* pseudo inode to manage cached pages */
+   struct inode *managed_cache;
 #endif /* CONFIG_EROFS_FS_ZIP */
u32 blocks;
u32 meta_blkaddr;
@@ -157,6 +163,12 @@ static inline void *erofs_kmalloc(struct erofs_sb_info *sbi,
 #define test_opt(sbi, option)  ((sbi)->mount_opt & EROFS_MOUNT_##option)
 
 #ifdef CONFIG_EROFS_FS_ZIP
+enum {
+   EROFS_ZIP_CACHE_DISABLED,
+   EROFS_ZIP_CACHE_READAHEAD,
+   EROFS_ZIP_CACHE_READAROUND
+};
+
 #define EROFS_LOCKED_MAGIC (INT_MIN | 0xE0F510CCL)
 
 /* basic unit of the workstation of a super_block */
@@ -524,6 +536,10 @@ int __init erofs_init_shrinker(void);
 void erofs_exit_shrinker(void);
 int __init z_erofs_init_zip_subsystem(void);
 void z_erofs_exit_zip_subsystem(void);
+int erofs_try_to_free_all_cached_pages(struct erofs_sb_info *sbi,
+  struct erofs_workgroup *egrp);
+int erofs_try_to_free_cached_page(struct address_space *mapping,
+ struct page *page);
 #else
 static inline void erofs_shrinker_register(struct super_block *sb) {}
 static inline void erofs_shrinker_unregister(struct super_block *sb) {}
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index bdac8abf3aa7..95187619b3e3 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -197,10 +197,45 @@ static unsigned int erofs_get_fault_rate(struct erofs_sb_info *sbi)
 }
 #endif
 
+#ifdef CONFIG_EROFS_FS_ZIP
+static int erofs_build_cache_strategy(struct erofs_sb_info *sbi,
+ substring_t *args)
+{
+   const char *cs = match_strdup(args);
+   int err = 0;
+
+   if (!cs) {
+   errln("Not enough memory to store cache strategy");
+   return -ENOMEM;
+   }
+
+   if (!strcmp(cs, "disabled")) {
+   sbi->cache_strategy = EROFS_ZIP_CACHE_DISABLED;
+   } else if (!strcmp(cs, "readahead")) {
+   sbi->cache_strategy = EROFS_ZIP_CACHE_READAHEAD;
+   } else if (!strcmp(cs, "readaround")) {
+   sbi->cache_strategy = EROFS_ZIP_CACHE_READAROUND;
+   } else {
+   errln("Unrecognized cache strategy \"%s\"", cs);
+   err = -EINVAL;
+   }
+   kfree(cs);
+   return err;
+}
+#else
+static int erofs_build_cache_strategy(struct erofs_sb_info *sbi,
+ substring_t *args)
+{
+   infoln("EROFS compression is disabled, so cache strategy is ignored");
+   return 0;
+}
+#endif
+
 /* set up default EROFS parameters */
 static void default_options(struct erofs_sb_info *sbi)
 {
 #ifdef CONFIG_EROFS_FS_ZIP
+   sbi->cache_strategy = EROFS_ZIP_CACHE_READAROUND;
sbi->max_sync_decompress_pages = 3;
 #endif
 #ifdef CONFIG_EROFS_FS_XATTR
@@ -217,6 +252,7 @@ enum {
Opt_acl,
Opt_noacl,
Opt_fault_injection,
+   Opt_cache_strategy,
Opt_err
 };
 
@@ -226,6 +262,7 @@ static match_table_t erofs_tokens = {
{Opt_acl, "acl"},
{Opt_noacl, "noacl"},
{Opt_fault_injection, "fault_injection=%u"},
+   {Opt_cache_strategy, "cache_strategy=%s"},
{Opt_err, NULL}
 };
 
@@ -283,6 +320,11 @@ static int parse_options(struct super_block *sb, char *options)
if (err)
return err;
break;
+   case Opt_cache_strategy:
+   err = erofs_build_cache_strategy(EROFS_SB(sb), args);
+   if (err)
+   return err;
+   break;
default:
errln("Unrecognized mount option \"%s\" or missing value", p);
return -EINVAL;
@@ -291,6 +333,65 @@ static int parse_options(struct super_block *sb, char *options)
return 0;
 }
 
+#ifdef CONFIG_EROFS_FS_ZIP
+static const struct address_space_operations managed_cache_aops;
+
+static int managed_cache_releasepage(struct page *page, gfp_t gfp_mask)
+{
+   int ret = 1;/* 0 - busy */
+  

[PATCH v8 24/24] erofs: add document

2019-08-14 Thread Gao Xiang
This documents key features, usage, and
on-disk design of erofs.

Signed-off-by: Gao Xiang 
---
 Documentation/filesystems/erofs.txt | 225 
 1 file changed, 225 insertions(+)
 create mode 100644 Documentation/filesystems/erofs.txt

diff --git a/Documentation/filesystems/erofs.txt b/Documentation/filesystems/erofs.txt
new file mode 100644
index ..457e601e0467
--- /dev/null
+++ b/Documentation/filesystems/erofs.txt
@@ -0,0 +1,225 @@
+Overview
+========
+
+EROFS file-system stands for Enhanced Read-Only File System. Different
+from other read-only file systems, it aims to be designed for flexibility,
+scalability, but be kept simple and high performance.
+
+It is designed as a better filesystem solution for the following scenarios:
+ - read-only storage media or
+
+ - part of a fully trusted read-only solution, which means it needs to be
+   immutable and bit-for-bit identical to the official golden image for
+   their releases due to security and other considerations and
+
+ - hope to save some extra storage space with guaranteed end-to-end performance
+   by using reduced metadata and transparent file compression, especially
+   for those embedded devices with limited memory (ex, smartphone);
+
+Here is the main features of EROFS:
+ - Little endian on-disk design;
+
+ - Currently 4KB block size (nobh) and therefore maximum 16TB address space;
+
+ - Metadata & data could be mixed by design;
+
+ - 2 inode versions for different requirements:
+                          v1            v2
+   Inode metadata size:   32 bytes      64 bytes
+   Max file size:         4 GB          16 EB (also limited by max. vol size)
+   Max uids/gids:         65536         4294967296
+   File creation time:    no            yes (64 + 32-bit timestamp)
+   Max hardlinks:         65536         4294967296
+   Metadata reserved:     4 bytes       14 bytes
+
+ - Support extended attributes (xattrs) as an option;
+
+ - Support xattr inline and tail-end data inline for all files;
+
+ - Support POSIX.1e ACLs by using xattrs;
+
+ - Support statx();
+
+ - Support transparent file compression as an option:
+   LZ4 algorithm with 4 KB fixed-output compression for high performance;
+
+The following git tree provides the file system user-space tools under
+development (ex, formatting tool mkfs.erofs):
+>> git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
+
+Bugs and patches are welcome, please kindly help us and send to the following
+linux-erofs mailing list:
+>> linux-erofs mailing list   
+
+Note that EROFS is still working in progress as a Linux staging driver,
+Cc the staging mailing list as well is highly recommended:
+>> Linux Driver Project Developer List 
+
+Mount options
+=============
+
+fault_injection=%d     Enable fault injection in all supported types with
+                       specified injection rate. Supported injection type:
+                       Type_Name                Type_Value
+                       FAULT_KMALLOC            0x1
+                       FAULT_READ_IO            0x2
+(no)user_xattr         Setup Extended User Attributes. Note: xattr is enabled
+                       by default if CONFIG_EROFS_FS_XATTR is selected.
+(no)acl                Setup POSIX Access Control List. Note: acl is enabled
+                       by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
+cache_strategy=%s      Select a strategy for cached decompression from now on:
+                         disabled: In-place I/O decompression only;
+                        readahead: Cache the last incomplete compressed physical
+                                   cluster for further reading. It still does
+                                   in-place I/O decompression for the rest
+                                   compressed physical clusters;
+                       readaround: Cache the both ends of incomplete compressed
+                                   physical clusters for further reading.
+                                   It still does in-place I/O decompression
+                                   for the rest compressed physical clusters.
+
+Module parameters
+=================
+use_vmap=[0|1]         Use vmap() instead of vm_map_ram() (default 0).
+
+On-disk details
+===============
+
+Summary
+-------
+Different from other read-only file systems, an EROFS volume is designed
+to be as simple as possible:
+
+                                |-> aligned with the block size
+   ____________________________________________________________
+  | |SB| | ... | Metadata | ... | Data | Metadata | ... | Data |
+  |_|__|_|_____|__________|_____|______|__________|_____|______|
+  0 +1K
+
+All data areas should be aligned with the block size, but metadata areas
+may not. All metadatas can be now observed in two different spaces (views):
+ 1. Inode metadata space
+Each valid inode should be aligned with an inode slot, which is a fixed
+value (32 bytes) and 

[PATCH v8 22/24] erofs: introduce the decompression frontend

2019-08-14 Thread Gao Xiang
This patch introduces the basic inplace fixed-sized
output decompression implementation for the erofs
filesystem.

In contrast to fixed-sized input compression, it has
a fixed-sized capacity for each compressed cluster to
contain compressed data, with the following advantages:
 1) improved storage density;
 2) decompression inplace support;
 3) all data in a compressed physical cluster can be
    decompressed and utilized.

"Inplace" here refers to one of the erofs decompression
strategies: instead of allocating extra compressed pages
and data management structures, it reuses the allocated
file cache pages as much as possible to store its
compressed data (called inplace I/O) and the
corresponding pagevec in a time-sharing approach, which
is particularly useful for low-memory scenarios.

In addition, decompression inplace technology is based
on inplace I/O, which eliminates page allocation and
all extra memcpy of compressed data.

Signed-off-by: Gao Xiang 
---
 fs/erofs/Kconfig|1 +
 fs/erofs/Makefile   |2 +-
 fs/erofs/internal.h |   14 +-
 fs/erofs/super.c|   11 +
 fs/erofs/zdata.c| 1254 +++
 fs/erofs/zdata.h|  192 +++
 fs/erofs/zmap.c |4 +-
 7 files changed, 1474 insertions(+), 4 deletions(-)
 create mode 100644 fs/erofs/zdata.c
 create mode 100644 fs/erofs/zdata.h

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index 5f8787c0cf89..16316d1adca3 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -76,6 +76,7 @@ config EROFS_FS_ZIP
bool "EROFS Data Compression Support"
depends on EROFS_FS
select LZ4_DECOMPRESS
+   default y
help
  Enable fixed-sized output compression for EROFS.
 
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 5594abca6f95..46f2aa4ba46c 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -7,5 +7,5 @@ ccflags-y += -DEROFS_VERSION=\"$(EROFS_VERSION)\"
 obj-$(CONFIG_EROFS_FS) += erofs.o
 erofs-objs := super.o inode.o data.o namei.o dir.o utils.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
-erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o
+erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o zdata.o
 
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 9dc3d47347db..2be1ae700aca 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -68,6 +68,9 @@ struct erofs_sb_info {
/* the dedicated workstation for compression */
struct radix_tree_root workstn_tree;
 
+   /* threshold for decompression synchronously */
+   unsigned int max_sync_decompress_pages;
+
unsigned int shrinker_run_no;
 #endif /* CONFIG_EROFS_FS_ZIP */
u32 blocks;
@@ -327,6 +330,9 @@ static inline bool is_inode_flat_inline(struct inode *inode)
 extern const struct super_operations erofs_sops;
 
 extern const struct address_space_operations erofs_raw_access_aops;
+#ifdef CONFIG_EROFS_FS_ZIP
+extern const struct address_space_operations z_erofs_vle_normalaccess_aops;
+#endif
 
 /*
  * Logical to physical block mapping, used by erofs_map_blocks()
@@ -487,7 +493,7 @@ int erofs_namei(struct inode *dir, struct qstr *name,
 /* dir.c */
 extern const struct file_operations erofs_dir_fops;
 
-/* utils.c */
+/* utils.c / zdata.c */
 struct page *erofs_allocpage(struct list_head *pool, gfp_t gfp, bool nofail);
 
 #if (EROFS_PCPUBUF_NR_PAGES > 0)
@@ -511,16 +517,20 @@ struct erofs_workgroup *erofs_find_workgroup(struct super_block *sb,
 pgoff_t index, bool *tag);
 int erofs_register_workgroup(struct super_block *sb,
 struct erofs_workgroup *grp, bool tag);
-static inline void erofs_workgroup_free_rcu(struct erofs_workgroup *grp) {}
+void erofs_workgroup_free_rcu(struct erofs_workgroup *grp);
 void erofs_shrinker_register(struct super_block *sb);
 void erofs_shrinker_unregister(struct super_block *sb);
 int __init erofs_init_shrinker(void);
 void erofs_exit_shrinker(void);
+int __init z_erofs_init_zip_subsystem(void);
+void z_erofs_exit_zip_subsystem(void);
 #else
 static inline void erofs_shrinker_register(struct super_block *sb) {}
 static inline void erofs_shrinker_unregister(struct super_block *sb) {}
 static inline int erofs_init_shrinker(void) { return 0; }
 static inline void erofs_exit_shrinker(void) {}
+static inline int z_erofs_init_zip_subsystem(void) { return 0; }
+static inline void z_erofs_exit_zip_subsystem(void) {}
 #endif /* !CONFIG_EROFS_FS_ZIP */
 
 #define EFSCORRUPTED    EUCLEAN         /* Filesystem is corrupted */
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index ea8d065068fa..bdac8abf3aa7 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -200,6 +200,9 @@ static unsigned int erofs_get_fault_rate(struct erofs_sb_info *sbi)
 /* set up default EROFS parameters */
 static void default_options(struct erofs_sb_info *sbi)
 {
+#ifdef CONFIG_EROFS_FS_ZIP
+   sbi->max_sync_decompress_pages = 3;
+#endif
 #ifdef CONFIG_EROFS_FS_XATTR
set_opt(

[PATCH] staging: gasket: apex: Make structure apex_desc constant

2019-08-14 Thread Nishka Dasgupta
Static structure apex_desc, of type gasket_driver_desc, is used only as
an argument to the functions gasket_register_device() and
gasket_unregister_device(). In the definitions of both these functions,
their parameter is declared as const. Hence make apex_desc itself
constant to protect it from modification.
Issue found with Coccinelle.
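
The pattern described above can be sketched in isolation as follows.
The type and function names (demo_driver_desc, demo_register_device,
demo_desc) are hypothetical stand-ins, not the gasket API; the point is
only that a descriptor passed exclusively to functions taking a const
pointer can itself be declared const.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver descriptor type. */
struct demo_driver_desc {
	const char *name;
	int major;
};

/* The callee takes a const pointer, so it cannot modify the descriptor. */
static int demo_register_device(const struct demo_driver_desc *desc)
{
	assert(desc != NULL);
	return desc->major;
}

/* const: any later attempt to assign to a member fails to compile. */
static const struct demo_driver_desc demo_desc = {
	.name = "demo",
	.major = 120,
};
```

With the definition marked const, a statement such as
"demo_desc.major = 0;" is rejected at compile time, and the compiler is
free to place the object in read-only data.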

Signed-off-by: Nishka Dasgupta 
---
 drivers/staging/gasket/apex_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/gasket/apex_driver.c b/drivers/staging/gasket/apex_driver.c
index 464648ee2036..2973bb920a26 100644
--- a/drivers/staging/gasket/apex_driver.c
+++ b/drivers/staging/gasket/apex_driver.c
@@ -659,7 +659,7 @@ static void apex_pci_remove(struct pci_dev *pci_dev)
pci_disable_device(pci_dev);
 }
 
-static struct gasket_driver_desc apex_desc = {
+static const struct gasket_driver_desc apex_desc = {
.name = "apex",
.driver_version = APEX_DRIVER_VERSION,
.major = 120,
-- 
2.19.1
