Re: [PATCH v6 02/12] x86/paravirt: switch time pvops functions to use static_call()

2021-03-09 Thread Jürgen Groß via Virtualization

On 09.03.21 19:57, Borislav Petkov wrote:
> On Tue, Mar 09, 2021 at 02:48:03PM +0100, Juergen Gross wrote:
>> @@ -167,6 +168,17 @@ static u64 native_steal_clock(int cpu)
>>  	return 0;
>>  }
>>  
>> +DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock);
>> +DEFINE_STATIC_CALL(pv_sched_clock, native_sched_clock);
>> +
>> +bool paravirt_using_native_sched_clock = true;
>> +
>> +void paravirt_set_sched_clock(u64 (*func)(void))
>> +{
>> +	static_call_update(pv_sched_clock, func);
>> +	paravirt_using_native_sched_clock = (func == native_sched_clock);
>> +}
>
> What's the point of this function if there's a global
> paravirt_using_native_sched_clock variable now?


It is combining the two needed actions: update the static call and
set the paravirt_using_native_sched_clock boolean.


> Looking how the bit of information whether native_sched_clock is used,
> is needed in tsc.c, it probably would be cleaner if you add a
>
> set_sched_clock_native(void);
>
> or so, to tsc.c instead and call that here and make that long var name
> a shorter and static one in tsc.c instead.


I need to transfer a boolean value, so it would need to be

set_sched_clock_native(bool state);

In the end the difference is only marginal IMO.

Just had another idea: I could add a function to static_call.h for
querying the current function. This would avoid the double book keeping
and could probably be used later when switching other pv_ops calls to
static_call, too (e.g. pv_is_native_spin_unlock()).

What do you think?
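
For illustration only, the querying idea might look roughly like the userspace sketch below. It is not the kernel's static_call implementation: struct static_call_model, model_update(), model_query() and hv_sched_clock_model() are hypothetical names, and the "static call" is reduced to a stored function pointer. The point is that the native/non-native answer is derived from the current target instead of being tracked in a second variable.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*sched_clock_fn)(void);

/* Two stand-in clock sources; the return values are arbitrary. */
uint64_t native_sched_clock(void)   { return 1; }
uint64_t hv_sched_clock_model(void) { return 2; }  /* hypothetical PV clock */

/* Userspace model of a static call key: it only records the target. */
struct static_call_model {
	sched_clock_fn func;
};

struct static_call_model pv_sched_clock = { native_sched_clock };

/* Model of static_call_update(): retarget the call. */
void model_update(struct static_call_model *key, sched_clock_fn func)
{
	key->func = func;
}

/* Hypothetical query helper: report the current target. */
sched_clock_fn model_query(const struct static_call_model *key)
{
	return key->func;
}

/* No separate boolean: the answer comes from the key itself. */
bool using_native_sched_clock(void)
{
	return model_query(&pv_sched_clock) == native_sched_clock;
}
```

With such a helper, the setter would only need to update the call target, and tsc.c could ask the key directly.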


Juergen


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 9/9] zsmalloc: remove the zsmalloc file system

2021-03-09 Thread Minchan Kim
On Tue, Mar 09, 2021 at 04:53:48PM +0100, Christoph Hellwig wrote:
> Just use the generic anon_inode file system.
> 
> Signed-off-by: Christoph Hellwig 
Acked-by: Minchan Kim 


Re: [PATCH 1/9] fs: rename alloc_anon_inode to alloc_anon_inode_sb

2021-03-09 Thread Minchan Kim
On Tue, Mar 09, 2021 at 04:53:40PM +0100, Christoph Hellwig wrote:
> Rename alloc_anon_inode to free the name for a new variant that does not
> need boilerplate to create a super_block first.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/powerpc/platforms/pseries/cmm.c | 2 +-
>  drivers/dma-buf/dma-buf.c| 2 +-
>  drivers/gpu/drm/drm_drv.c| 2 +-
>  drivers/misc/cxl/api.c   | 2 +-
>  drivers/misc/vmw_balloon.c   | 2 +-
>  drivers/scsi/cxlflash/ocxl_hw.c  | 2 +-
>  drivers/virtio/virtio_balloon.c  | 2 +-
>  fs/aio.c | 2 +-
>  fs/anon_inodes.c | 4 ++--
>  fs/libfs.c   | 2 +-
>  include/linux/fs.h   | 2 +-
>  kernel/resource.c| 2 +-
>  mm/z3fold.c  | 2 +-
>  mm/zsmalloc.c| 2 +-
>  14 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/cmm.c 
> b/arch/powerpc/platforms/pseries/cmm.c
> index 45a3a3022a85c9..6d36b858b14df1 100644
> --- a/arch/powerpc/platforms/pseries/cmm.c
> +++ b/arch/powerpc/platforms/pseries/cmm.c
> @@ -580,7 +580,7 @@ static int cmm_balloon_compaction_init(void)
>   return rc;
>   }
>  
> - b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
> + b_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
>   if (IS_ERR(b_dev_info.inode)) {
>   rc = PTR_ERR(b_dev_info.inode);
>   b_dev_info.inode = NULL;
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index f264b70c383eb4..dedcc9483352dc 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -445,7 +445,7 @@ static inline int is_dma_buf_file(struct file *file)
>  static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
>  {
>   struct file *file;
> - struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb);
> + struct inode *inode = alloc_anon_inode_sb(dma_buf_mnt->mnt_sb);
>  
>   if (IS_ERR(inode))
>   return ERR_CAST(inode);
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 20d22e41d7ce74..87e7214a8e3565 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -519,7 +519,7 @@ static struct inode *drm_fs_inode_new(void)
>   return ERR_PTR(r);
>   }
>  
> - inode = alloc_anon_inode(drm_fs_mnt->mnt_sb);
> + inode = alloc_anon_inode_sb(drm_fs_mnt->mnt_sb);
>   if (IS_ERR(inode))
>   simple_release_fs(&drm_fs_mnt, &drm_fs_cnt);
>  
> diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
> index b493de962153ba..2efbf6c98028ef 100644
> --- a/drivers/misc/cxl/api.c
> +++ b/drivers/misc/cxl/api.c
> @@ -73,7 +73,7 @@ static struct file *cxl_getfile(const char *name,
>   goto err_module;
>   }
>  
> - inode = alloc_anon_inode(cxl_vfs_mount->mnt_sb);
> + inode = alloc_anon_inode_sb(cxl_vfs_mount->mnt_sb);
>   if (IS_ERR(inode)) {
>   file = ERR_CAST(inode);
>   goto err_fs;
> diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
> index b837e7eba5f7dc..5d057a05ddbee8 100644
> --- a/drivers/misc/vmw_balloon.c
> +++ b/drivers/misc/vmw_balloon.c
> @@ -1900,7 +1900,7 @@ static __init int vmballoon_compaction_init(struct 
> vmballoon *b)
>   return PTR_ERR(vmballoon_mnt);
>  
>   b->b_dev_info.migratepage = vmballoon_migratepage;
> - b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb);
> + b->b_dev_info.inode = alloc_anon_inode_sb(vmballoon_mnt->mnt_sb);
>  
>   if (IS_ERR(b->b_dev_info.inode))
>   return PTR_ERR(b->b_dev_info.inode);
> diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
> index 244fc27215dc79..40184ed926b557 100644
> --- a/drivers/scsi/cxlflash/ocxl_hw.c
> +++ b/drivers/scsi/cxlflash/ocxl_hw.c
> @@ -88,7 +88,7 @@ static struct file *ocxlflash_getfile(struct device *dev, 
> const char *name,
>   goto err2;
>   }
>  
> - inode = alloc_anon_inode(ocxlflash_vfs_mount->mnt_sb);
> + inode = alloc_anon_inode_sb(ocxlflash_vfs_mount->mnt_sb);
>   if (IS_ERR(inode)) {
>   rc = PTR_ERR(inode);
>   dev_err(dev, "%s: alloc_anon_inode failed rc=%d\n",
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8985fc2cea8615..cae76ee5bdd688 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -916,7 +916,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
>   }
>  
>   vb->vb_dev_info.migratepage = virtballoon_migratepage;
> - vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
> + vb->vb_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
>   if (IS_ERR(vb->vb_dev_info.inode)) {
>   err = PTR_ERR(vb->vb_dev_info.inode);
>   go

Re: make alloc_anon_inode more useful

2021-03-09 Thread Matthew Wilcox
On Tue, Mar 09, 2021 at 04:53:39PM +0100, Christoph Hellwig wrote:
> this series first renames the existing alloc_anon_inode to
> alloc_anon_inode_sb to clearly mark it as requiring a superblock.
> 
> It then adds a new alloc_anon_inode that works on the anon_inode
> file system super block, thus removing tons of boilerplate code.
> 
> The few remaining callers of alloc_anon_inode_sb all use alloc_file_pseudo
> later, but might also be ripe for some cleanup.

On a somewhat related note, could I get you to look at
drivers/video/fbdev/core/fb_defio.c?

As far as I can tell, there's no need for fb_deferred_io_aops to exist.
We could just set file->f_mapping->a_ops to NULL, and set_page_dirty()
would do the exact same thing this code does (except it would get the
return value correct).

But maybe that would make something else go wrong that distinguishes
between page->mapping being NULL and page->mapping->a_ops->foo being NULL?
Completely untested patch ...

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index a591d291b231..441ec31d3e4d 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -151,17 +151,6 @@ static const struct vm_operations_struct 
fb_deferred_io_vm_ops = {
.page_mkwrite   = fb_deferred_io_mkwrite,
 };
 
-static int fb_deferred_io_set_page_dirty(struct page *page)
-{
-   if (!PageDirty(page))
-   SetPageDirty(page);
-   return 0;
-}
-
-static const struct address_space_operations fb_deferred_io_aops = {
-   .set_page_dirty = fb_deferred_io_set_page_dirty,
-};
-
 int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
vma->vm_ops = &fb_deferred_io_vm_ops;
@@ -212,14 +201,6 @@ void fb_deferred_io_init(struct fb_info *info)
 }
 EXPORT_SYMBOL_GPL(fb_deferred_io_init);
 
-void fb_deferred_io_open(struct fb_info *info,
-struct inode *inode,
-struct file *file)
-{
-   file->f_mapping->a_ops = &fb_deferred_io_aops;
-}
-EXPORT_SYMBOL_GPL(fb_deferred_io_open);
-
 void fb_deferred_io_cleanup(struct fb_info *info)
 {
struct fb_deferred_io *fbdefio = info->fbdefio;
diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index 06f5805de2de..c4ba76359f22 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1415,10 +1415,7 @@ __releases(&info->lock)
if (res)
module_put(info->fbops->owner);
}
-#ifdef CONFIG_FB_DEFERRED_IO
-   if (info->fbdefio)
-   fb_deferred_io_open(info, inode, file);
-#endif
+   file->f_mapping->a_ops = NULL;
 out:
unlock_fb_info(info);
if (res)
diff --git a/include/linux/fb.h b/include/linux/fb.h
index ecfbcc0553a5..a8dccd23c249 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -659,9 +659,6 @@ static inline void __fb_pad_aligned_buffer(u8 *dst, u32 
d_pitch,
 /* drivers/video/fb_defio.c */
 int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma);
 extern void fb_deferred_io_init(struct fb_info *info);
-extern void fb_deferred_io_open(struct fb_info *info,
-   struct inode *inode,
-   struct file *file);
 extern void fb_deferred_io_cleanup(struct fb_info *info);
 extern int fb_deferred_io_fsync(struct file *file, loff_t start,
loff_t end, int datasync);
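
The return-value remark above can be illustrated with a small userspace model. This is a sketch under assumptions, not kernel code: struct page_model and both function names are hypothetical, and the "report 1 only when the page was newly dirtied" convention is assumed from the comment about the return value rather than taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the dirty bit of a struct page. */
struct page_model {
	bool dirty;
};

/* Mirrors the removed fb_deferred_io_set_page_dirty(): always returns 0. */
int fb_style_set_dirty(struct page_model *page)
{
	if (!page->dirty)
		page->dirty = true;
	return 0;
}

/* Test-and-set variant: returns 1 only if the page was newly dirtied. */
int test_and_set_dirty(struct page_model *page)
{
	bool was_dirty = page->dirty;

	page->dirty = true;
	return was_dirty ? 0 : 1;
}
```

Both variants leave the page dirty; they differ only in what they tell the caller about the transition.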


Re: [PATCH v6] i2c: virtio: add a virtio i2c frontend driver

2021-03-09 Thread Jason Wang


On 2021/3/10 10:22 AM, Jie Deng wrote:
> On 2021/3/4 17:15, Jason Wang wrote:
>>> +    }
>>> +
>>> +    if (msgs[i].flags & I2C_M_RD)
>>> +        memcpy(msgs[i].buf, req->buf, msgs[i].len);
>>
>> Sorry if I had asked this before but any reason not to use msg[i].buf
>> directly?
>
> The msg[i].buf is passed by the I2C core. I just noticed that these
> bufs are not always allocated by kmalloc. They may come from the stack,
> which may cause the check "sg_init_one -> sg_set_buf -> virt_addr_valid"
> to fail. Therefore the msg[i].buf is not suitable for direct use here.
>
> Regards.



Right, stack is virtually mapped.

Thanks
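
The bounce-buffer pattern discussed above can be sketched in userspace as follows. This is an illustration under assumptions, not the driver's code: struct xfer_model and the three helpers are hypothetical, malloc() stands in for kmalloc(), and fake_device_fill() stands in for the device writing into the buffer it was given.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one read request. */
struct xfer_model {
	unsigned char *bounce;  /* heap buffer that is handed to the device */
	size_t len;
};

/* Allocate the bounce buffer; the caller's buffer is never handed out. */
int xfer_prepare(struct xfer_model *x, size_t len)
{
	x->bounce = malloc(len);
	if (!x->bounce)
		return -1;
	x->len = len;
	return 0;
}

/* Pretend the device completed a read by filling the bounce buffer. */
void fake_device_fill(struct xfer_model *x, unsigned char value)
{
	memset(x->bounce, value, x->len);
}

/* Copy the result into the caller's buffer, which may live on the stack. */
void xfer_complete(struct xfer_model *x, unsigned char *caller_buf)
{
	memcpy(caller_buf, x->bounce, x->len);
	free(x->bounce);
	x->bounce = NULL;
}
```

The extra memcpy() is the cost of not having to care where the caller's buffer was allocated.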


Re: [PATCH] virtio-mmio: read[wl]()/write[wl] are already little-endian

2021-03-09 Thread kernel test robot
Hi Laurent,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linux/master]
[also build test WARNING on linus/master v5.12-rc2 next-20210309]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Laurent-Vivier/virtio-mmio-read-wl-write-wl-are-already-little-endian/20210310-064527
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
144c79ef33536b4ecb4951e07dbc1f2b7fa99d32
config: x86_64-randconfig-s022-20210310 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-262-g5e674421-dirty
# 
https://github.com/0day-ci/linux/commit/1fd3d4da486545f554eb33663c6afe068bbcbcf8
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Laurent-Vivier/virtio-mmio-read-wl-write-wl-are-already-little-endian/20210310-064527
git checkout 1fd3d4da486545f554eb33663c6afe068bbcbcf8
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"sparse warnings: (new ones prefixed by >>)"
   drivers/virtio/virtio_mmio.c:171:19: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __le16 [usertype] 
w @@ got unsigned short @@
   drivers/virtio/virtio_mmio.c:171:19: sparse: expected restricted __le16 
[usertype] w
   drivers/virtio/virtio_mmio.c:171:19: sparse: got unsigned short
   drivers/virtio/virtio_mmio.c:175:19: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __le32 [usertype] 
l @@ got unsigned int @@
   drivers/virtio/virtio_mmio.c:175:19: sparse: expected restricted __le32 
[usertype] l
   drivers/virtio/virtio_mmio.c:175:19: sparse: got unsigned int
   drivers/virtio/virtio_mmio.c:179:19: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __le32 
[addressable] [usertype] l @@ got unsigned int @@
   drivers/virtio/virtio_mmio.c:179:19: sparse: expected restricted __le32 
[addressable] [usertype] l
   drivers/virtio/virtio_mmio.c:179:19: sparse: got unsigned int
   drivers/virtio/virtio_mmio.c:181:19: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __le32 
[addressable] [usertype] l @@ got unsigned int @@
   drivers/virtio/virtio_mmio.c:181:19: sparse: expected restricted __le32 
[addressable] [usertype] l
   drivers/virtio/virtio_mmio.c:181:19: sparse: got unsigned int
>> drivers/virtio/virtio_mmio.c:215:24: sparse: sparse: incorrect type in 
>> argument 1 (different base types) @@ expected unsigned short val @@ 
>> got restricted __le16 [addressable] [usertype] w @@
   drivers/virtio/virtio_mmio.c:215:24: sparse: expected unsigned short val
   drivers/virtio/virtio_mmio.c:215:24: sparse: got restricted __le16 
[addressable] [usertype] w
>> drivers/virtio/virtio_mmio.c:219:24: sparse: sparse: incorrect type in 
>> argument 1 (different base types) @@ expected unsigned int val @@ 
>> got restricted __le32 [addressable] [usertype] l @@
   drivers/virtio/virtio_mmio.c:219:24: sparse: expected unsigned int val
   drivers/virtio/virtio_mmio.c:219:24: sparse: got restricted __le32 
[addressable] [usertype] l
   drivers/virtio/virtio_mmio.c:223:24: sparse: sparse: incorrect type in 
argument 1 (different base types) @@ expected unsigned int val @@ got 
restricted __le32 [addressable] [usertype] l @@
   drivers/virtio/virtio_mmio.c:223:24: sparse: expected unsigned int val
   drivers/virtio/virtio_mmio.c:223:24: sparse: got restricted __le32 
[addressable] [usertype] l
   drivers/virtio/virtio_mmio.c:225:24: sparse: sparse: incorrect type in 
argument 1 (different base types) @@ expected unsigned int val @@ got 
restricted __le32 [addressable] [usertype] l @@
   drivers/virtio/virtio_mmio.c:225:24: sparse: expected unsigned int val
   drivers/virtio/virtio_mmio.c:225:24: sparse: got restricted __le32 
[addressable] [usertype] l
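
As background for the warnings above, the claim in the subject line can be modelled in userspace. This is a sketch with hypothetical helper names (model_readw, model_swab16), not the kernel accessors: an accessor that decodes a little-endian register already yields a CPU-order value, so treating that value as still little-endian and converting it again would amount to a second byte swap on a big-endian host.

```c
#include <stdint.h>

/*
 * Hypothetical model of readw(): decode a little-endian 16-bit register
 * into a CPU-order value, independent of the host byte order.
 */
uint16_t model_readw(const uint8_t reg[2])
{
	return (uint16_t)(reg[0] | ((uint16_t)reg[1] << 8));
}

/*
 * Unconditional byte swap: what an extra little-endian conversion of an
 * already converted value would do on a big-endian host.
 */
uint16_t model_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

For a register holding the bytes 0x34 0x12, model_readw() gives 0x1234; swapping that result again gives 0x3412, which is the wrong value.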

vim +215 drivers/virtio/virtio_mmio.c

   188  
   189  static void vm_set(struct virtio_device *vdev, unsigned offset,
   190 const void *buf, unsigned len)
   191  {
   192  struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
   193  void __iomem *base = vm_dev->base + VIRTIO_MMIO_CONFIG;
   194  u8 b;
   195  __le16 w;
   196  __le32 l;
   197  
   198  if (vm_dev->version == 1) {
   199   

Re: [PATCH v6] i2c: virtio: add a virtio i2c frontend driver

2021-03-09 Thread Jie Deng


On 2021/3/4 17:15, Jason Wang wrote:
>> +    }
>> +
>> +    if (msgs[i].flags & I2C_M_RD)
>> +        memcpy(msgs[i].buf, req->buf, msgs[i].len);
>
> Sorry if I had asked this before but any reason not to use msg[i].buf
> directly?

The msg[i].buf is passed by the I2C core. I just noticed that these bufs
are not always allocated by kmalloc. They may come from the stack, which
may cause the check "sg_init_one -> sg_set_buf -> virt_addr_valid" to
fail. Therefore the msg[i].buf is not suitable for direct use here.

Regards.


Re: [PATCH v6 02/12] x86/paravirt: switch time pvops functions to use static_call()

2021-03-09 Thread Borislav Petkov
On Tue, Mar 09, 2021 at 02:48:03PM +0100, Juergen Gross wrote:
> @@ -167,6 +168,17 @@ static u64 native_steal_clock(int cpu)
>   return 0;
>  }
>  
> +DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock);
> +DEFINE_STATIC_CALL(pv_sched_clock, native_sched_clock);
> +
> +bool paravirt_using_native_sched_clock = true;
> +
> +void paravirt_set_sched_clock(u64 (*func)(void))
> +{
> + static_call_update(pv_sched_clock, func);
> + paravirt_using_native_sched_clock = (func == native_sched_clock);
> +}

What's the point of this function if there's a global
paravirt_using_native_sched_clock variable now?

Looking how the bit of information whether native_sched_clock is used,
is needed in tsc.c, it probably would be cleaner if you add a

set_sched_clock_native(void);

or so, to tsc.c instead and call that here and make that long var name
a shorter and static one in tsc.c instead.

Hmm?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v7 net-next] virtio-net: support XDP when not more queues

2021-03-09 Thread Michael S. Tsirkin
On Mon, Mar 08, 2021 at 04:52:16PM +0800, Xuan Zhuo wrote:
> The number of queues implemented by many virtio backends is limited,
> while some machines have a large number of CPUs. In this case, it
> is often impossible to allocate a separate queue for
> XDP_TX/XDP_REDIRECT, so XDP cannot be loaded at all, even if the
> program does not use XDP_TX/XDP_REDIRECT.
> 
> This patch allows XDP_TX/XDP_REDIRECT to run by reusing the existing SQ
> with __netif_tx_lock() held when there are not enough queues.
> 
> Signed-off-by: Xuan Zhuo 
> Reviewed-by: Dust Li 
> ---
> v7: 1. use macros to implement get/put
>     2. remove 'flag'. (suggested by Jason Wang)
> 
> v6: 1. use __netif_tx_acquire()/__netif_tx_release(). (suggested by Jason Wang)
>     2. add note for why not lock. (suggested by Jason Wang)
>     3. Use variable 'flag' to record with or without locked. It is not safe to
>        use curr_queue_pairs in "virtnet_put_xdp_sq", because it may change after
>        "virtnet_get_xdp_sq".
> 
> v5: change subject from 'support XDP_TX when not more queues'
> 
> v4: make sparse happy
> suggested by Jakub Kicinski
> 
> v3: add warning when no more queues
> suggested by Jesper Dangaard Brouer
> 
>  drivers/net/virtio_net.c | 55 
> 
>  1 file changed, 42 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ba8e637..5ce40ec 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -195,6 +195,9 @@ struct virtnet_info {
>   /* # of XDP queue pairs currently used by the driver */
>   u16 xdp_queue_pairs;
> 
> + /* xdp_queue_pairs may be 0, when xdp is already loaded. So add this. */
> + bool xdp_enabled;
> +
>   /* I like... big packets and I cannot lie! */
>   bool big_packets;
> 
> @@ -481,12 +484,34 @@ static int __virtnet_xdp_xmit_one(struct virtnet_info 
> *vi,
>   return 0;
>  }
> 
> -static struct send_queue *virtnet_xdp_sq(struct virtnet_info *vi)
> -{
> - unsigned int qp;
> -
> - qp = vi->curr_queue_pairs - vi->xdp_queue_pairs + smp_processor_id();
> - return &vi->sq[qp];
> +/* when vi->curr_queue_pairs > nr_cpu_ids, the txq/sq is only used for xdp tx on
> + * the current cpu, so it does not need to be locked.
> + */

pls also explain why these are macros not inline functions in the
comment.



> +#define virtnet_xdp_get_sq(vi) ({ \
> + struct netdev_queue *txq; \
> + typeof(vi) v = (vi);  \
> + unsigned int qp;  \


empty line here after variable definitions.

same elsewhere

> + if (v->curr_queue_pairs > nr_cpu_ids) {   \
> + qp = v->curr_queue_pairs - v->xdp_queue_pairs;\
> + qp += smp_processor_id(); \
> + txq = netdev_get_tx_queue(v->dev, qp);\
> + __netif_tx_acquire(txq);  \
> + } else {  \
> + qp = smp_processor_id() % v->curr_queue_pairs;\
> + txq = netdev_get_tx_queue(v->dev, qp);\
> + __netif_tx_lock(txq, raw_smp_processor_id()); \
> + } \
> + v->sq + qp;   \
> +})
> +
> +#define virtnet_xdp_put_sq(vi, q) {   \
> + struct netdev_queue *txq; \
> + typeof(vi) v = (vi);  \
> + txq = netdev_get_tx_queue(v->dev, (q) - v->sq);   \
> + if (v->curr_queue_pairs > nr_cpu_ids) \
> + __netif_tx_release(txq);  \
> + else  \
> + __netif_tx_unlock(txq);   \
>  }
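
The queue selection done by the quoted macros can be restated as a plain function, which may make the two cases easier to compare. This is a userspace sketch, not a proposed replacement: struct sq_choice and pick_xdp_sq() are hypothetical names, the CPU number and nr_cpu_ids are passed in as parameters, and no locking is performed (the function only reports whether a real lock would be needed).

```c
#include <stdbool.h>

/* Hypothetical result type: which queue to use and whether to lock it. */
struct sq_choice {
	unsigned int qp;
	bool needs_lock;
};

struct sq_choice pick_xdp_sq(unsigned int curr_queue_pairs,
			     unsigned int xdp_queue_pairs,
			     unsigned int nr_cpu_ids,
			     unsigned int cpu)
{
	struct sq_choice c;

	if (curr_queue_pairs > nr_cpu_ids) {
		/* Dedicated XDP queue per CPU: no contention, no lock. */
		c.qp = curr_queue_pairs - xdp_queue_pairs + cpu;
		c.needs_lock = false;
	} else {
		/* Queue shared with the regular TX path: must be locked. */
		c.qp = cpu % curr_queue_pairs;
		c.needs_lock = true;
	}
	return c;
}
```

For example, with 8 queue pairs of which 4 are XDP queues on a 4-CPU machine, CPU 2 gets queue 6 without locking; with only 2 queue pairs on the same machine, CPU 3 shares queue 1 and has to take the lock.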


>  static int virtnet_xdp_xmit(struct net_device *dev,
> @@ -512,7 +537,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
>   if (!xdp_prog)
>   return -ENXIO;
> 
> - sq = virtnet_xdp_sq(vi);
> + sq = virtnet_xdp_get_sq(vi);
> 
>   if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK)) {
>   ret = -EINVAL;
> @@ -560,12 +585,13 @@ static int virtnet_xdp_xmit(struct net_device *dev,
>   sq->stats.kicks += kicks;
>   u64_stats_update_end(&sq->stats.syncp);
> 
> + virtnet_xdp_put_sq(vi, sq);
>   return ret;
>  }
> 
>  static unsigned int virtnet_get_headroom(struct virtnet_info *vi)
>  {
> - return vi->xdp_queue_pairs ? VIRTIO_XDP_HEADROOM : 0;
> + return vi->xdp_enabled ? VIRTIO_XDP_HE

Re: [PATCH 6/9] virtio_balloon: remove the balloon-kvm file system

2021-03-09 Thread David Hildenbrand

On 09.03.21 16:53, Christoph Hellwig wrote:

Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
  drivers/virtio/virtio_balloon.c | 30 +++---
  1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index cae76ee5bdd688..1efb890cd3ff09 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -6,6 +6,7 @@
   *  Copyright 2008 Rusty Russell IBM Corporation
   */
  
+#include 

  #include 
  #include 
  #include 
@@ -42,10 +43,6 @@
(1 << (VIRTIO_BALLOON_HINT_BLOCK_ORDER + PAGE_SHIFT))
  #define VIRTIO_BALLOON_HINT_BLOCK_PAGES (1 << VIRTIO_BALLOON_HINT_BLOCK_ORDER)
  
-#ifdef CONFIG_BALLOON_COMPACTION

-static struct vfsmount *balloon_mnt;
-#endif
-
  enum virtio_balloon_vq {
VIRTIO_BALLOON_VQ_INFLATE,
VIRTIO_BALLOON_VQ_DEFLATE,
@@ -805,18 +802,6 @@ static int virtballoon_migratepage(struct balloon_dev_info 
*vb_dev_info,
  
  	return MIGRATEPAGE_SUCCESS;

  }
-
-static int balloon_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, BALLOON_KVM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type balloon_fs = {
-   .name   = "balloon-kvm",
-   .init_fs_context = balloon_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
  #endif /* CONFIG_BALLOON_COMPACTION */
  
  static unsigned long shrink_free_pages(struct virtio_balloon *vb,

@@ -909,17 +894,11 @@ static int virtballoon_probe(struct virtio_device *vdev)
goto out_free_vb;
  
  #ifdef CONFIG_BALLOON_COMPACTION

-   balloon_mnt = kern_mount(&balloon_fs);
-   if (IS_ERR(balloon_mnt)) {
-   err = PTR_ERR(balloon_mnt);
-   goto out_del_vqs;
-   }
-
vb->vb_dev_info.migratepage = virtballoon_migratepage;
-   vb->vb_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
+   vb->vb_dev_info.inode = alloc_anon_inode();
if (IS_ERR(vb->vb_dev_info.inode)) {
err = PTR_ERR(vb->vb_dev_info.inode);
-   goto out_kern_unmount;
+   goto out_del_vqs;
}
vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
  #endif
@@ -1016,8 +995,6 @@ static int virtballoon_probe(struct virtio_device *vdev)
  out_iput:
  #ifdef CONFIG_BALLOON_COMPACTION
iput(vb->vb_dev_info.inode);
-out_kern_unmount:
-   kern_unmount(balloon_mnt);
  out_del_vqs:
  #endif
vdev->config->del_vqs(vdev);
@@ -1070,7 +1047,6 @@ static void virtballoon_remove(struct virtio_device *vdev)
if (vb->vb_dev_info.inode)
iput(vb->vb_dev_info.inode);
  
-	kern_unmount(balloon_mnt);

  #endif
kfree(vb);
  }



... you might know what I am going to say :)

Apart from that LGTM.

--
Thanks,

David / dhildenb



Re: [PATCH 5/9] vmw_balloon: remove the balloon-vmware file system

2021-03-09 Thread David Hildenbrand

On 09.03.21 16:53, Christoph Hellwig wrote:

Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
  drivers/misc/vmw_balloon.c | 24 ++--
  1 file changed, 2 insertions(+), 22 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 5d057a05ddbee8..be4be32f858253 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -16,6 +16,7 @@
  //#define DEBUG
  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
  
+#include 

  #include 
  #include 
  #include 
@@ -1735,20 +1736,6 @@ static inline void vmballoon_debugfs_exit(struct 
vmballoon *b)
  
  
  #ifdef CONFIG_BALLOON_COMPACTION

-
-static int vmballoon_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, BALLOON_VMW_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type vmballoon_fs = {
-   .name   = "balloon-vmware",
-   .init_fs_context= vmballoon_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
-static struct vfsmount *vmballoon_mnt;
-
  /**
   * vmballoon_migratepage() - migrates a balloon page.
   * @b_dev_info: balloon device information descriptor.
@@ -1878,8 +1865,6 @@ static void vmballoon_compaction_deinit(struct vmballoon 
*b)
iput(b->b_dev_info.inode);
  
  	b->b_dev_info.inode = NULL;

-   kern_unmount(vmballoon_mnt);
-   vmballoon_mnt = NULL;
  }
  
  /**

@@ -1895,13 +1880,8 @@ static void vmballoon_compaction_deinit(struct vmballoon 
*b)
   */
  static __init int vmballoon_compaction_init(struct vmballoon *b)
  {
-   vmballoon_mnt = kern_mount(&vmballoon_fs);
-   if (IS_ERR(vmballoon_mnt))
-   return PTR_ERR(vmballoon_mnt);
-
b->b_dev_info.migratepage = vmballoon_migratepage;
-   b->b_dev_info.inode = alloc_anon_inode_sb(vmballoon_mnt->mnt_sb);
-
+   b->b_dev_info.inode = alloc_anon_inode();
if (IS_ERR(b->b_dev_info.inode))
return PTR_ERR(b->b_dev_info.inode);
  



Same comment regarding BALLOON_VMW_MAGIC and includes (mount.h, 
pseudo_fs.h).


Apart from that looks good.

--
Thanks,

David / dhildenb



Re: [PATCH 3/9] powerpc/pseries: remove the ppc-cmm file system

2021-03-09 Thread David Hildenbrand

On 09.03.21 16:53, Christoph Hellwig wrote:

Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
  arch/powerpc/platforms/pseries/cmm.c | 27 ++-
  1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 6d36b858b14df1..9d07e6bea7126c 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -6,6 +6,7 @@
   * Author(s): Brian King (brk...@linux.vnet.ibm.com),
   */
  
+#include 

  #include 
  #include 
  #include 
@@ -502,19 +503,6 @@ static struct notifier_block cmm_mem_nb = {
  };
  
  #ifdef CONFIG_BALLOON_COMPACTION

-static struct vfsmount *balloon_mnt;
-
-static int cmm_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type balloon_fs = {
-   .name = "ppc-cmm",
-   .init_fs_context = cmm_init_fs_context,
-   .kill_sb = kill_anon_super,
-};
-
  static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
   struct page *newpage, struct page *page,
   enum migrate_mode mode)
@@ -573,19 +561,10 @@ static int cmm_balloon_compaction_init(void)
balloon_devinfo_init(&b_dev_info);
b_dev_info.migratepage = cmm_migratepage;
  
-	balloon_mnt = kern_mount(&balloon_fs);

-   if (IS_ERR(balloon_mnt)) {
-   rc = PTR_ERR(balloon_mnt);
-   balloon_mnt = NULL;
-   return rc;
-   }
-
-   b_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
+   b_dev_info.inode = alloc_anon_inode();
if (IS_ERR(b_dev_info.inode)) {
rc = PTR_ERR(b_dev_info.inode);
b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
return rc;
}
  
@@ -597,8 +576,6 @@ static void cmm_balloon_compaction_deinit(void)

if (b_dev_info.inode)
iput(b_dev_info.inode);
b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
  }
  #else /* CONFIG_BALLOON_COMPACTION */
  static int cmm_balloon_compaction_init(void)



I always wondered why that was necessary after all (with my limited fs 
knowledge :) ).


a) I assume you want to remove PPC_CMM_MAGIC from 
include/uapi/linux/magic.h as well?


b) Do we still need #include , #include  
and #include ?


Apart from that looks much cleaner.

--
Thanks,

David / dhildenb



Re: [PATCH 2/9] fs: add an argument-less alloc_anon_inode

2021-03-09 Thread David Hildenbrand

On 09.03.21 16:53, Christoph Hellwig wrote:

Add a new alloc_anon_inode helper that allocates an inode on
the anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
  fs/anon_inodes.c| 15 +--
  include/linux/anon_inodes.h |  1 +
  2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 4745fc37014332..b6a8ea71920bc3 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -63,7 +63,7 @@ static struct inode *anon_inode_make_secure_inode(
const struct qstr qname = QSTR_INIT(name, strlen(name));
int error;
  
-	inode = alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);

+   inode = alloc_anon_inode();
if (IS_ERR(inode))
return inode;
inode->i_flags &= ~S_PRIVATE;
@@ -225,13 +225,24 @@ int anon_inode_getfd_secure(const char *name, const 
struct file_operations *fops
  }
  EXPORT_SYMBOL_GPL(anon_inode_getfd_secure);
  
+/**

+ * alloc_anon_inode - create a new anonymous inode
+ *
+ * Create an inode on the anon_inode file system and return it.
+ */
+struct inode *alloc_anon_inode(void)
+{
+   return alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);
+}
+EXPORT_SYMBOL_GPL(alloc_anon_inode);
+
  static int __init anon_inode_init(void)
  {
anon_inode_mnt = kern_mount(&anon_inode_fs_type);
if (IS_ERR(anon_inode_mnt))
panic("anon_inode_init() kernel mount failed (%ld)\n", 
PTR_ERR(anon_inode_mnt));
  
-	anon_inode_inode = alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);

+   anon_inode_inode = alloc_anon_inode();
if (IS_ERR(anon_inode_inode))
panic("anon_inode_init() inode allocation failed (%ld)\n", 
PTR_ERR(anon_inode_inode));
  
diff --git a/include/linux/anon_inodes.h b/include/linux/anon_inodes.h

index 71881a2b6f7860..b5ae9a6eda9923 100644
--- a/include/linux/anon_inodes.h
+++ b/include/linux/anon_inodes.h
@@ -21,6 +21,7 @@ int anon_inode_getfd_secure(const char *name,
const struct file_operations *fops,
void *priv, int flags,
const struct inode *context_inode);
+struct inode *alloc_anon_inode(void);
  
  #endif /* _LINUX_ANON_INODES_H */
  



Reviewed-by: David Hildenbrand 

--
Thanks,

David / dhildenb



Re: [PATCH 1/9] fs: rename alloc_anon_inode to alloc_anon_inode_sb

2021-03-09 Thread David Hildenbrand

On 09.03.21 16:53, Christoph Hellwig wrote:

Rename alloc_anon_inode to free the name for a new variant that does not
need boilerplate to create a super_block first.

Signed-off-by: Christoph Hellwig 
---
  arch/powerpc/platforms/pseries/cmm.c | 2 +-
  drivers/dma-buf/dma-buf.c| 2 +-
  drivers/gpu/drm/drm_drv.c| 2 +-
  drivers/misc/cxl/api.c   | 2 +-
  drivers/misc/vmw_balloon.c   | 2 +-
  drivers/scsi/cxlflash/ocxl_hw.c  | 2 +-
  drivers/virtio/virtio_balloon.c  | 2 +-
  fs/aio.c | 2 +-
  fs/anon_inodes.c | 4 ++--
  fs/libfs.c   | 2 +-
  include/linux/fs.h   | 2 +-
  kernel/resource.c| 2 +-
  mm/z3fold.c  | 2 +-
  mm/zsmalloc.c| 2 +-
  14 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 45a3a3022a85c9..6d36b858b14df1 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -580,7 +580,7 @@ static int cmm_balloon_compaction_init(void)
return rc;
}
  
-	b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
+	b_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
if (IS_ERR(b_dev_info.inode)) {
rc = PTR_ERR(b_dev_info.inode);
b_dev_info.inode = NULL;
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383eb4..dedcc9483352dc 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -445,7 +445,7 @@ static inline int is_dma_buf_file(struct file *file)
  static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
  {
struct file *file;
-   struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb);
+   struct inode *inode = alloc_anon_inode_sb(dma_buf_mnt->mnt_sb);
  
 	if (IS_ERR(inode))
 		return ERR_CAST(inode);
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 20d22e41d7ce74..87e7214a8e3565 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -519,7 +519,7 @@ static struct inode *drm_fs_inode_new(void)
return ERR_PTR(r);
}
  
-	inode = alloc_anon_inode(drm_fs_mnt->mnt_sb);
+	inode = alloc_anon_inode_sb(drm_fs_mnt->mnt_sb);
if (IS_ERR(inode))
simple_release_fs(&drm_fs_mnt, &drm_fs_cnt);
  
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index b493de962153ba..2efbf6c98028ef 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -73,7 +73,7 @@ static struct file *cxl_getfile(const char *name,
goto err_module;
}
  
-	inode = alloc_anon_inode(cxl_vfs_mount->mnt_sb);
+	inode = alloc_anon_inode_sb(cxl_vfs_mount->mnt_sb);
if (IS_ERR(inode)) {
file = ERR_CAST(inode);
goto err_fs;
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index b837e7eba5f7dc..5d057a05ddbee8 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1900,7 +1900,7 @@ static __init int vmballoon_compaction_init(struct 
vmballoon *b)
return PTR_ERR(vmballoon_mnt);
  
 	b->b_dev_info.migratepage = vmballoon_migratepage;
-	b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb);
+	b->b_dev_info.inode = alloc_anon_inode_sb(vmballoon_mnt->mnt_sb);
  
 	if (IS_ERR(b->b_dev_info.inode))
 		return PTR_ERR(b->b_dev_info.inode);
diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 244fc27215dc79..40184ed926b557 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -88,7 +88,7 @@ static struct file *ocxlflash_getfile(struct device *dev, 
const char *name,
goto err2;
}
  
-	inode = alloc_anon_inode(ocxlflash_vfs_mount->mnt_sb);
+	inode = alloc_anon_inode_sb(ocxlflash_vfs_mount->mnt_sb);
if (IS_ERR(inode)) {
rc = PTR_ERR(inode);
dev_err(dev, "%s: alloc_anon_inode failed rc=%d\n",
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8985fc2cea8615..cae76ee5bdd688 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -916,7 +916,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
}
  
 	vb->vb_dev_info.migratepage = virtballoon_migratepage;
-	vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
+	vb->vb_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
if (IS_ERR(vb->vb_dev_info.inode)) {
err = PTR_ERR(vb->vb_dev_info.inode);
goto out_kern_unmount;
diff --git a/fs/aio.c b/fs/aio.c
index 1f32da13d39ee6..d1c2aa7fd6de7c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -234,7 +234,7 @@ static const struct ad

[PATCH 8/9] z3fold: remove the z3fold file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 mm/z3fold.c | 38 ++
 1 file changed, 2 insertions(+), 36 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index e7cd9298b221f5..e0749a3d8987de 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -23,6 +23,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -345,38 +346,10 @@ static inline void free_handle(unsigned long handle, 
struct z3fold_header *zhdr)
}
 }
 
-static int z3fold_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, Z3FOLD_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type z3fold_fs = {
-   .name   = "z3fold",
-   .init_fs_context = z3fold_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
-static struct vfsmount *z3fold_mnt;
-static int z3fold_mount(void)
-{
-   int ret = 0;
-
-   z3fold_mnt = kern_mount(&z3fold_fs);
-   if (IS_ERR(z3fold_mnt))
-   ret = PTR_ERR(z3fold_mnt);
-
-   return ret;
-}
-
-static void z3fold_unmount(void)
-{
-   kern_unmount(z3fold_mnt);
-}
-
 static const struct address_space_operations z3fold_aops;
 static int z3fold_register_migration(struct z3fold_pool *pool)
 {
-   pool->inode = alloc_anon_inode_sb(z3fold_mnt->mnt_sb);
+   pool->inode = alloc_anon_inode();
if (IS_ERR(pool->inode)) {
pool->inode = NULL;
return 1;
@@ -1787,22 +1760,15 @@ MODULE_ALIAS("zpool-z3fold");
 
 static int __init init_z3fold(void)
 {
-   int ret;
-
/* Make sure the z3fold header is not larger than the page size */
BUILD_BUG_ON(ZHDR_SIZE_ALIGNED > PAGE_SIZE);
-   ret = z3fold_mount();
-   if (ret)
-   return ret;
 
zpool_register_driver(&z3fold_zpool_driver);
-
return 0;
 }
 
 static void __exit exit_z3fold(void)
 {
-   z3fold_unmount();
zpool_unregister_driver(&z3fold_zpool_driver);
 }
 
-- 
2.30.1



[PATCH 9/9] zsmalloc: remove the zsmalloc file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 mm/zsmalloc.c | 48 +++-
 1 file changed, 3 insertions(+), 45 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index a6449a2ad861de..a7d2f471935447 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -176,10 +177,6 @@ struct zs_size_stat {
 static struct dentry *zs_stat_root;
 #endif
 
-#ifdef CONFIG_COMPACTION
-static struct vfsmount *zsmalloc_mnt;
-#endif
-
 /*
  * We assign a page to ZS_ALMOST_EMPTY fullness group when:
  * n <= N / f, where
@@ -308,8 +305,6 @@ static void kick_deferred_free(struct zs_pool *pool);
 static void init_deferred_free(struct zs_pool *pool);
 static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage);
 #else
-static int zsmalloc_mount(void) { return 0; }
-static void zsmalloc_unmount(void) {}
 static int zs_register_migration(struct zs_pool *pool) { return 0; }
 static void zs_unregister_migration(struct zs_pool *pool) {}
 static void migrate_lock_init(struct zspage *zspage) {}
@@ -1751,33 +1746,6 @@ static void lock_zspage(struct zspage *zspage)
} while ((page = get_next_page(page)) != NULL);
 }
 
-static int zs_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, ZSMALLOC_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type zsmalloc_fs = {
-   .name   = "zsmalloc",
-   .init_fs_context = zs_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
-static int zsmalloc_mount(void)
-{
-   int ret = 0;
-
-   zsmalloc_mnt = kern_mount(&zsmalloc_fs);
-   if (IS_ERR(zsmalloc_mnt))
-   ret = PTR_ERR(zsmalloc_mnt);
-
-   return ret;
-}
-
-static void zsmalloc_unmount(void)
-{
-   kern_unmount(zsmalloc_mnt);
-}
-
 static void migrate_lock_init(struct zspage *zspage)
 {
rwlock_init(&zspage->lock);
@@ -2086,7 +2054,7 @@ static const struct address_space_operations 
zsmalloc_aops = {
 
 static int zs_register_migration(struct zs_pool *pool)
 {
-   pool->inode = alloc_anon_inode_sb(zsmalloc_mnt->mnt_sb);
+   pool->inode = alloc_anon_inode();
if (IS_ERR(pool->inode)) {
pool->inode = NULL;
return 1;
@@ -2506,14 +2474,10 @@ static int __init zs_init(void)
 {
int ret;
 
-   ret = zsmalloc_mount();
-   if (ret)
-   goto out;
-
ret = cpuhp_setup_state(CPUHP_MM_ZS_PREPARE, "mm/zsmalloc:prepare",
zs_cpu_prepare, zs_cpu_dead);
if (ret)
-   goto hp_setup_fail;
+   return ret;
 
 #ifdef CONFIG_ZPOOL
zpool_register_driver(&zs_zpool_driver);
@@ -2522,11 +2486,6 @@ static int __init zs_init(void)
zs_stat_init();
 
return 0;
-
-hp_setup_fail:
-   zsmalloc_unmount();
-out:
-   return ret;
 }
 
 static void __exit zs_exit(void)
@@ -2534,7 +2493,6 @@ static void __exit zs_exit(void)
 #ifdef CONFIG_ZPOOL
zpool_unregister_driver(&zs_zpool_driver);
 #endif
-   zsmalloc_unmount();
cpuhp_remove_state(CPUHP_MM_ZS_PREPARE);
 
zs_stat_exit();
-- 
2.30.1



[PATCH 7/9] iomem: remove the iomem file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 kernel/resource.c | 30 --
 1 file changed, 4 insertions(+), 26 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 0fd091a3f2fc66..12560553c26796 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1838,37 +1839,14 @@ static int __init strict_iomem(char *str)
return 1;
 }
 
-static int iomem_fs_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, DEVMEM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type iomem_fs_type = {
-   .name   = "iomem",
-   .owner  = THIS_MODULE,
-   .init_fs_context = iomem_fs_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
 static int __init iomem_init_inode(void)
 {
-   static struct vfsmount *iomem_vfs_mount;
-   static int iomem_fs_cnt;
struct inode *inode;
-   int rc;
-
-   rc = simple_pin_fs(&iomem_fs_type, &iomem_vfs_mount, &iomem_fs_cnt);
-   if (rc < 0) {
-   pr_err("Cannot mount iomem pseudo filesystem: %d\n", rc);
-   return rc;
-   }
 
-   inode = alloc_anon_inode_sb(iomem_vfs_mount->mnt_sb);
+   inode = alloc_anon_inode();
if (IS_ERR(inode)) {
-   rc = PTR_ERR(inode);
-   pr_err("Cannot allocate inode for iomem: %d\n", rc);
-   simple_release_fs(&iomem_vfs_mount, &iomem_fs_cnt);
-   return rc;
+   pr_err("Cannot allocate inode for iomem: %ld\n", PTR_ERR(inode));
+   return PTR_ERR(inode);
}
 
/*
-- 
2.30.1



[PATCH 6/9] virtio_balloon: remove the balloon-kvm file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 drivers/virtio/virtio_balloon.c | 30 +++---
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index cae76ee5bdd688..1efb890cd3ff09 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -6,6 +6,7 @@
  *  Copyright 2008 Rusty Russell IBM Corporation
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -42,10 +43,6 @@
(1 << (VIRTIO_BALLOON_HINT_BLOCK_ORDER + PAGE_SHIFT))
 #define VIRTIO_BALLOON_HINT_BLOCK_PAGES (1 << VIRTIO_BALLOON_HINT_BLOCK_ORDER)
 
-#ifdef CONFIG_BALLOON_COMPACTION
-static struct vfsmount *balloon_mnt;
-#endif
-
 enum virtio_balloon_vq {
VIRTIO_BALLOON_VQ_INFLATE,
VIRTIO_BALLOON_VQ_DEFLATE,
@@ -805,18 +802,6 @@ static int virtballoon_migratepage(struct balloon_dev_info 
*vb_dev_info,
 
return MIGRATEPAGE_SUCCESS;
 }
-
-static int balloon_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, BALLOON_KVM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type balloon_fs = {
-   .name   = "balloon-kvm",
-   .init_fs_context = balloon_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
 #endif /* CONFIG_BALLOON_COMPACTION */
 
 static unsigned long shrink_free_pages(struct virtio_balloon *vb,
@@ -909,17 +894,11 @@ static int virtballoon_probe(struct virtio_device *vdev)
goto out_free_vb;
 
 #ifdef CONFIG_BALLOON_COMPACTION
-   balloon_mnt = kern_mount(&balloon_fs);
-   if (IS_ERR(balloon_mnt)) {
-   err = PTR_ERR(balloon_mnt);
-   goto out_del_vqs;
-   }
-
vb->vb_dev_info.migratepage = virtballoon_migratepage;
-   vb->vb_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
+   vb->vb_dev_info.inode = alloc_anon_inode();
if (IS_ERR(vb->vb_dev_info.inode)) {
err = PTR_ERR(vb->vb_dev_info.inode);
-   goto out_kern_unmount;
+   goto out_del_vqs;
}
vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops;
 #endif
@@ -1016,8 +995,6 @@ static int virtballoon_probe(struct virtio_device *vdev)
 out_iput:
 #ifdef CONFIG_BALLOON_COMPACTION
iput(vb->vb_dev_info.inode);
-out_kern_unmount:
-   kern_unmount(balloon_mnt);
 out_del_vqs:
 #endif
vdev->config->del_vqs(vdev);
@@ -1070,7 +1047,6 @@ static void virtballoon_remove(struct virtio_device *vdev)
if (vb->vb_dev_info.inode)
iput(vb->vb_dev_info.inode);
 
-   kern_unmount(balloon_mnt);
 #endif
kfree(vb);
 }
-- 
2.30.1



[PATCH 5/9] vmw_balloon: remove the balloon-vmware file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 drivers/misc/vmw_balloon.c | 24 ++--
 1 file changed, 2 insertions(+), 22 deletions(-)

diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index 5d057a05ddbee8..be4be32f858253 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -16,6 +16,7 @@
 //#define DEBUG
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -1735,20 +1736,6 @@ static inline void vmballoon_debugfs_exit(struct 
vmballoon *b)
 
 
 #ifdef CONFIG_BALLOON_COMPACTION
-
-static int vmballoon_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, BALLOON_VMW_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type vmballoon_fs = {
-   .name   = "balloon-vmware",
-   .init_fs_context= vmballoon_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
-static struct vfsmount *vmballoon_mnt;
-
 /**
  * vmballoon_migratepage() - migrates a balloon page.
  * @b_dev_info: balloon device information descriptor.
@@ -1878,8 +1865,6 @@ static void vmballoon_compaction_deinit(struct vmballoon 
*b)
iput(b->b_dev_info.inode);
 
b->b_dev_info.inode = NULL;
-   kern_unmount(vmballoon_mnt);
-   vmballoon_mnt = NULL;
 }
 
 /**
@@ -1895,13 +1880,8 @@ static void vmballoon_compaction_deinit(struct vmballoon 
*b)
  */
 static __init int vmballoon_compaction_init(struct vmballoon *b)
 {
-   vmballoon_mnt = kern_mount(&vmballoon_fs);
-   if (IS_ERR(vmballoon_mnt))
-   return PTR_ERR(vmballoon_mnt);
-
b->b_dev_info.migratepage = vmballoon_migratepage;
-   b->b_dev_info.inode = alloc_anon_inode_sb(vmballoon_mnt->mnt_sb);
-
+   b->b_dev_info.inode = alloc_anon_inode();
if (IS_ERR(b->b_dev_info.inode))
return PTR_ERR(b->b_dev_info.inode);
 
-- 
2.30.1



[PATCH 4/9] drm: remove the drm file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 drivers/gpu/drm/drm_drv.c | 64 ++-
 1 file changed, 3 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 87e7214a8e3565..af293d76f979e5 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -26,6 +26,7 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -475,65 +476,6 @@ void drm_dev_unplug(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_dev_unplug);
 
-/*
- * DRM internal mount
- * We want to be able to allocate our own "struct address_space" to control
- * memory-mappings in VRAM (or stolen RAM, ...). However, core MM does not 
allow
- * stand-alone address_space objects, so we need an underlying inode. As there
- * is no way to allocate an independent inode easily, we need a fake internal
- * VFS mount-point.
- *
- * The drm_fs_inode_new() function allocates a new inode, drm_fs_inode_free()
- * frees it again. You are allowed to use iget() and iput() to get references 
to
- * the inode. But each drm_fs_inode_new() call must be paired with exactly one
- * drm_fs_inode_free() call (which does not have to be the last iput()).
- * We use drm_fs_inode_*() to manage our internal VFS mount-point and share it
- * between multiple inode-users. You could, technically, call
- * iget() + drm_fs_inode_free() directly after alloc and sometime later do an
- * iput(), but this way you'd end up with a new vfsmount for each inode.
- */
-
-static int drm_fs_cnt;
-static struct vfsmount *drm_fs_mnt;
-
-static int drm_fs_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, 0x010203ff) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type drm_fs_type = {
-   .name   = "drm",
-   .owner  = THIS_MODULE,
-   .init_fs_context = drm_fs_init_fs_context,
-   .kill_sb= kill_anon_super,
-};
-
-static struct inode *drm_fs_inode_new(void)
-{
-   struct inode *inode;
-   int r;
-
-   r = simple_pin_fs(&drm_fs_type, &drm_fs_mnt, &drm_fs_cnt);
-   if (r < 0) {
-   DRM_ERROR("Cannot mount pseudo fs: %d\n", r);
-   return ERR_PTR(r);
-   }
-
-   inode = alloc_anon_inode_sb(drm_fs_mnt->mnt_sb);
-   if (IS_ERR(inode))
-   simple_release_fs(&drm_fs_mnt, &drm_fs_cnt);
-
-   return inode;
-}
-
-static void drm_fs_inode_free(struct inode *inode)
-{
-   if (inode) {
-   iput(inode);
-   simple_release_fs(&drm_fs_mnt, &drm_fs_cnt);
-   }
-}
-
 /**
  * DOC: component helper usage recommendations
  *
@@ -563,7 +505,7 @@ static void drm_dev_init_release(struct drm_device *dev, 
void *res)
 {
drm_legacy_ctxbitmap_cleanup(dev);
drm_legacy_remove_map_hash(dev);
-   drm_fs_inode_free(dev->anon_inode);
+   iput(dev->anon_inode);
 
put_device(dev->dev);
/* Prevent use-after-free in drm_managed_release when debugging is
@@ -616,7 +558,7 @@ static int drm_dev_init(struct drm_device *dev,
if (ret)
return ret;
 
-   dev->anon_inode = drm_fs_inode_new();
+   dev->anon_inode = alloc_anon_inode();
if (IS_ERR(dev->anon_inode)) {
ret = PTR_ERR(dev->anon_inode);
DRM_ERROR("Cannot allocate anonymous inode: %d\n", ret);
-- 
2.30.1
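
For readers unfamiliar with the pattern being deleted here: the removed drm_drv.c comment describes sharing one internal mount between several inode users, with the mount created on the first pin and dropped on the last release. The following is a minimal user-space sketch of that counting idea only. It is not the kernel's simple_pin_fs()/simple_release_fs() implementation, and every `demo_` name is hypothetical.

```c
#include <assert.h>

/* Hypothetical model of a shared pseudo-filesystem mount. */
struct demo_pinned_fs {
	int mounted;	/* 1 while the mock mount exists */
	int count;	/* number of outstanding pins */
	int mounts;	/* how many times a mount was created */
};

/* Take a pin; the first pin creates the (mock) mount. */
int demo_pin_fs(struct demo_pinned_fs *fs)
{
	if (fs->count == 0) {
		fs->mounted = 1;
		fs->mounts++;
	}
	fs->count++;
	return 0;
}

/* Drop a pin; the last release tears the (mock) mount down. */
void demo_release_fs(struct demo_pinned_fs *fs)
{
	if (fs->count == 0)
		return;
	if (--fs->count == 0)
		fs->mounted = 0;
}
```

With the shared anon_inode mount, callers no longer need this bookkeeping: in the patch above, the driver calls alloc_anon_inode() and a plain iput().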



Re: [PATCH net v3 2/2] net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0

2021-03-09 Thread David Ahern
On 3/9/21 4:31 AM, Balazs Nemeth wrote:
> A packet with skb_inner_network_header(skb) == skb_network_header(skb)
> and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
> from the packet. Subsequently, the call to skb_mac_gso_segment will
> again call mpls_gso_segment with the same packet leading to an infinite
> loop. In addition, ensure that the header length is a multiple of four,
> which should hold irrespective of the number of stacked labels.
> 
> Signed-off-by: Balazs Nemeth 
> ---
>  net/mpls/mpls_gso.c | 3 +++
>  1 file changed, 3 insertions(+)
> 


Reviewed-by: David Ahern 



[PATCH 3/9] powerpc/pseries: remove the ppc-cmm file system

2021-03-09 Thread Christoph Hellwig
Just use the generic anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/pseries/cmm.c | 27 ++-
 1 file changed, 2 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 6d36b858b14df1..9d07e6bea7126c 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -6,6 +6,7 @@
  * Author(s): Brian King (brk...@linux.vnet.ibm.com),
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -502,19 +503,6 @@ static struct notifier_block cmm_mem_nb = {
 };
 
 #ifdef CONFIG_BALLOON_COMPACTION
-static struct vfsmount *balloon_mnt;
-
-static int cmm_init_fs_context(struct fs_context *fc)
-{
-   return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type balloon_fs = {
-   .name = "ppc-cmm",
-   .init_fs_context = cmm_init_fs_context,
-   .kill_sb = kill_anon_super,
-};
-
 static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
   struct page *newpage, struct page *page,
   enum migrate_mode mode)
@@ -573,19 +561,10 @@ static int cmm_balloon_compaction_init(void)
balloon_devinfo_init(&b_dev_info);
b_dev_info.migratepage = cmm_migratepage;
 
-   balloon_mnt = kern_mount(&balloon_fs);
-   if (IS_ERR(balloon_mnt)) {
-   rc = PTR_ERR(balloon_mnt);
-   balloon_mnt = NULL;
-   return rc;
-   }
-
-   b_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
+   b_dev_info.inode = alloc_anon_inode();
if (IS_ERR(b_dev_info.inode)) {
rc = PTR_ERR(b_dev_info.inode);
b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
return rc;
}
 
@@ -597,8 +576,6 @@ static void cmm_balloon_compaction_deinit(void)
if (b_dev_info.inode)
iput(b_dev_info.inode);
b_dev_info.inode = NULL;
-   kern_unmount(balloon_mnt);
-   balloon_mnt = NULL;
 }
 #else /* CONFIG_BALLOON_COMPACTION */
 static int cmm_balloon_compaction_init(void)
-- 
2.30.1



[PATCH 1/9] fs: rename alloc_anon_inode to alloc_anon_inode_sb

2021-03-09 Thread Christoph Hellwig
Rename alloc_anon_inode to free the name for a new variant that does not
need boilerplate to create a super_block first.

Signed-off-by: Christoph Hellwig 
---
 arch/powerpc/platforms/pseries/cmm.c | 2 +-
 drivers/dma-buf/dma-buf.c| 2 +-
 drivers/gpu/drm/drm_drv.c| 2 +-
 drivers/misc/cxl/api.c   | 2 +-
 drivers/misc/vmw_balloon.c   | 2 +-
 drivers/scsi/cxlflash/ocxl_hw.c  | 2 +-
 drivers/virtio/virtio_balloon.c  | 2 +-
 fs/aio.c | 2 +-
 fs/anon_inodes.c | 4 ++--
 fs/libfs.c   | 2 +-
 include/linux/fs.h   | 2 +-
 kernel/resource.c| 2 +-
 mm/z3fold.c  | 2 +-
 mm/zsmalloc.c| 2 +-
 14 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 45a3a3022a85c9..6d36b858b14df1 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -580,7 +580,7 @@ static int cmm_balloon_compaction_init(void)
return rc;
}
 
-   b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
+   b_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
if (IS_ERR(b_dev_info.inode)) {
rc = PTR_ERR(b_dev_info.inode);
b_dev_info.inode = NULL;
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383eb4..dedcc9483352dc 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -445,7 +445,7 @@ static inline int is_dma_buf_file(struct file *file)
 static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags)
 {
struct file *file;
-   struct inode *inode = alloc_anon_inode(dma_buf_mnt->mnt_sb);
+   struct inode *inode = alloc_anon_inode_sb(dma_buf_mnt->mnt_sb);
 
if (IS_ERR(inode))
return ERR_CAST(inode);
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 20d22e41d7ce74..87e7214a8e3565 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -519,7 +519,7 @@ static struct inode *drm_fs_inode_new(void)
return ERR_PTR(r);
}
 
-   inode = alloc_anon_inode(drm_fs_mnt->mnt_sb);
+   inode = alloc_anon_inode_sb(drm_fs_mnt->mnt_sb);
if (IS_ERR(inode))
simple_release_fs(&drm_fs_mnt, &drm_fs_cnt);
 
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index b493de962153ba..2efbf6c98028ef 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -73,7 +73,7 @@ static struct file *cxl_getfile(const char *name,
goto err_module;
}
 
-   inode = alloc_anon_inode(cxl_vfs_mount->mnt_sb);
+   inode = alloc_anon_inode_sb(cxl_vfs_mount->mnt_sb);
if (IS_ERR(inode)) {
file = ERR_CAST(inode);
goto err_fs;
diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c
index b837e7eba5f7dc..5d057a05ddbee8 100644
--- a/drivers/misc/vmw_balloon.c
+++ b/drivers/misc/vmw_balloon.c
@@ -1900,7 +1900,7 @@ static __init int vmballoon_compaction_init(struct 
vmballoon *b)
return PTR_ERR(vmballoon_mnt);
 
b->b_dev_info.migratepage = vmballoon_migratepage;
-   b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb);
+   b->b_dev_info.inode = alloc_anon_inode_sb(vmballoon_mnt->mnt_sb);
 
if (IS_ERR(b->b_dev_info.inode))
return PTR_ERR(b->b_dev_info.inode);
diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 244fc27215dc79..40184ed926b557 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -88,7 +88,7 @@ static struct file *ocxlflash_getfile(struct device *dev, 
const char *name,
goto err2;
}
 
-   inode = alloc_anon_inode(ocxlflash_vfs_mount->mnt_sb);
+   inode = alloc_anon_inode_sb(ocxlflash_vfs_mount->mnt_sb);
if (IS_ERR(inode)) {
rc = PTR_ERR(inode);
dev_err(dev, "%s: alloc_anon_inode failed rc=%d\n",
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8985fc2cea8615..cae76ee5bdd688 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -916,7 +916,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
}
 
vb->vb_dev_info.migratepage = virtballoon_migratepage;
-   vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
+   vb->vb_dev_info.inode = alloc_anon_inode_sb(balloon_mnt->mnt_sb);
if (IS_ERR(vb->vb_dev_info.inode)) {
err = PTR_ERR(vb->vb_dev_info.inode);
goto out_kern_unmount;
diff --git a/fs/aio.c b/fs/aio.c
index 1f32da13d39ee6..d1c2aa7fd6de7c 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -234,7 +234,7 @@ static const struct address_space_operations aio_ctx_aops;

[PATCH 2/9] fs: add an argument-less alloc_anon_inode

2021-03-09 Thread Christoph Hellwig
Add a new alloc_anon_inode helper that allocates an inode on
the anon_inode file system.

Signed-off-by: Christoph Hellwig 
---
 fs/anon_inodes.c| 15 +--
 include/linux/anon_inodes.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 4745fc37014332..b6a8ea71920bc3 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -63,7 +63,7 @@ static struct inode *anon_inode_make_secure_inode(
const struct qstr qname = QSTR_INIT(name, strlen(name));
int error;
 
-   inode = alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);
+   inode = alloc_anon_inode();
if (IS_ERR(inode))
return inode;
inode->i_flags &= ~S_PRIVATE;
@@ -225,13 +225,24 @@ int anon_inode_getfd_secure(const char *name, const 
struct file_operations *fops
 }
 EXPORT_SYMBOL_GPL(anon_inode_getfd_secure);
 
+/**
+ * alloc_anon_inode - create a new anonymous inode
+ *
+ * Create an inode on the anon_inode file system and return it.
+ */
+struct inode *alloc_anon_inode(void)
+{
+   return alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);
+}
+EXPORT_SYMBOL_GPL(alloc_anon_inode);
+
 static int __init anon_inode_init(void)
 {
anon_inode_mnt = kern_mount(&anon_inode_fs_type);
if (IS_ERR(anon_inode_mnt))
panic("anon_inode_init() kernel mount failed (%ld)\n", 
PTR_ERR(anon_inode_mnt));
 
-   anon_inode_inode = alloc_anon_inode_sb(anon_inode_mnt->mnt_sb);
+   anon_inode_inode = alloc_anon_inode();
if (IS_ERR(anon_inode_inode))
panic("anon_inode_init() inode allocation failed (%ld)\n", 
PTR_ERR(anon_inode_inode));
 
diff --git a/include/linux/anon_inodes.h b/include/linux/anon_inodes.h
index 71881a2b6f7860..b5ae9a6eda9923 100644
--- a/include/linux/anon_inodes.h
+++ b/include/linux/anon_inodes.h
@@ -21,6 +21,7 @@ int anon_inode_getfd_secure(const char *name,
const struct file_operations *fops,
void *priv, int flags,
const struct inode *context_inode);
+struct inode *alloc_anon_inode(void);
 
 #endif /* _LINUX_ANON_INODES_H */
 
-- 
2.30.1



make alloc_anon_inode more useful

2021-03-09 Thread Christoph Hellwig
Hi all,

this series first renames the existing alloc_anon_inode to
alloc_anon_inode_sb to clearly mark it as requiring a superblock.

It then adds a new alloc_anon_inode that works on the anon_inode
file system super block, thus removing tons of boilerplate code.

The few remaining callers of alloc_anon_inode_sb all use alloc_file_pseudo
later, but might also be ripe for some cleanup.

Diffstat:
 arch/powerpc/platforms/pseries/cmm.c |   27 +-
 drivers/dma-buf/dma-buf.c|2 -
 drivers/gpu/drm/drm_drv.c|   64 +--
 drivers/misc/cxl/api.c   |2 -
 drivers/misc/vmw_balloon.c   |   24 +
 drivers/scsi/cxlflash/ocxl_hw.c  |2 -
 drivers/virtio/virtio_balloon.c  |   30 +---
 fs/aio.c |2 -
 fs/anon_inodes.c |   15 +++-
 fs/libfs.c   |2 -
 include/linux/anon_inodes.h  |1 
 include/linux/fs.h   |2 -
 kernel/resource.c|   30 ++--
 mm/z3fold.c  |   38 +---
 mm/zsmalloc.c|   48 +-
 15 files changed, 39 insertions(+), 250 deletions(-)
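
As a rough illustration of the shape of the change (user-space C, not kernel code): one allocator takes an explicit superblock, and an argument-less wrapper passes a shared default superblock. The sketch uses simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention. All `mock_` names and `MOCK_MAX_ERRNO` are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MOCK_MAX_ERRNO 4095

/* Simplified stand-ins for ERR_PTR()/IS_ERR()/PTR_ERR(). */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

static inline long mock_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct mock_sb { int live_inodes; };
struct mock_inode { struct mock_sb *sb; };

/* Shared default superblock, playing the role of anon_inode_mnt->mnt_sb. */
struct mock_sb mock_anon_sb;

/* Variant that needs a caller-supplied superblock. */
struct mock_inode *mock_alloc_anon_inode_sb(struct mock_sb *sb)
{
	struct mock_inode *inode;

	if (!sb)
		return mock_err_ptr(-EINVAL);
	inode = malloc(sizeof(*inode));
	if (!inode)
		return mock_err_ptr(-ENOMEM);
	inode->sb = sb;
	sb->live_inodes++;
	return inode;
}

/* Argument-less variant: always uses the shared superblock. */
struct mock_inode *mock_alloc_anon_inode(void)
{
	return mock_alloc_anon_inode_sb(&mock_anon_sb);
}

/* Drop the inode again, as iput() would on the last reference. */
void mock_iput(struct mock_inode *inode)
{
	inode->sb->live_inodes--;
	free(inode);
}
```

A caller of the argument-less variant only needs the error check and a matching put. It needs no file_system_type, no mount, and no unmount on the error path.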


Re: [PATCH net v3 1/2] net: check if protocol extracted by virtio_net_hdr_set_proto is correct

2021-03-09 Thread Willem de Bruijn
On Tue, Mar 9, 2021 at 6:32 AM Balazs Nemeth  wrote:
>
> For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
> set) based on the type in the virtio net hdr, but the skb could contain
> anything since it could come from packet_snd through a raw socket. If
> there is a mismatch between what virtio_net_hdr_set_proto sets and
> the actual protocol, then the skb could be handled incorrectly later
> on.
>
> An example where this poses an issue is with the subsequent call to
> skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
> correctly. A specially crafted packet could fool
> skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.
>
> Avoid blindly trusting the information provided by the virtio net header
> by checking that the protocol in the packet actually matches the
> protocol set by virtio_net_hdr_set_proto. Note that since the protocol
> is only checked if skb->dev implements header_ops->parse_protocol,
> packets from devices without the implementation are not checked at this
> stage.
>
> Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets")
> Signed-off-by: Balazs Nemeth 

Acked-by: Willem de Bruijn 

This still relies entirely on data from the untrusted process. But it
adds the constraint that the otherwise untrusted data at least has to
be consistent, closing one loophole.

As responded in v2, we may want to look at the (few) callers and make
sure that they initialize skb->protocol before the call to
virtio_net_hdr_to_skb where possible. That will avoid this entire
branch.
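
The consistency check discussed above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the code from the patch: it reads the EtherType from a plain Ethernet header and compares it with a claimed protocol value. The `demo_` names and constants are hypothetical, and a frame that cannot be parsed is left unchecked, mirroring the commit message's note about devices without parse_protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ETH_HLEN	14
#define DEMO_ETH_P_IP	0x0800
#define DEMO_ETH_P_IPV6	0x86DD

/* Return the EtherType in host order, or 0 if the frame is too short. */
uint16_t demo_parse_eth_proto(const uint8_t *frame, size_t len)
{
	if (len < DEMO_ETH_HLEN)
		return 0;
	return (uint16_t)((frame[12] << 8) | frame[13]);
}

/*
 * Return 0 if the claimed protocol is consistent with the frame (or the
 * frame cannot be parsed), -1 on a mismatch (the analogue of -EINVAL).
 */
int demo_check_proto(const uint8_t *frame, size_t len, uint16_t claimed)
{
	uint16_t actual = demo_parse_eth_proto(frame, len);

	if (actual == 0)
		return 0;	/* nothing to compare against */
	return actual == claimed ? 0 : -1;
}
```

Both inputs still come from the sender, so the check only enforces that they agree with each other, which is the point made in the reply above.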


Re: [PATCH net v3 2/2] net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0

2021-03-09 Thread Willem de Bruijn
On Tue, Mar 9, 2021 at 6:32 AM Balazs Nemeth  wrote:
>
> A packet with skb_inner_network_header(skb) == skb_network_header(skb)
> and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers
> from the packet. Subsequently, the call to skb_mac_gso_segment will
> again call mpls_gso_segment with the same packet leading to an infinite
> loop. In addition, ensure that the header length is a multiple of four,
> which should hold irrespective of the number of stacked labels.
>
> Signed-off-by: Balazs Nemeth 

Acked-by: Willem de Bruijn 

The compiler will convert that modulo into a cheap & (ETH_HLEN - 1)
test for this constant.
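
A small sketch of the equivalence being relied on, assuming the divisor is a
power of two (the commit message speaks of a multiple of four). HDR_UNIT and
the helper names are illustrative only, not kernel identifiers. For unsigned
values, a remainder by a power of two and a mask with that power minus one
select the same bits, so both tests agree:

```c
#include <assert.h>
#include <stdbool.h>

#define HDR_UNIT 4u /* illustrative: a power-of-two unit of four bytes */

/* Length test written with the modulo operator. */
static bool len_ok_mod(unsigned int len)
{
	return len != 0 && len % HDR_UNIT == 0;
}

/* Same test written with the mask a compiler can emit instead. */
static bool len_ok_mask(unsigned int len)
{
	return len != 0 && (len & (HDR_UNIT - 1)) == 0;
}

/* Checks that both forms agree for every length below limit. */
static void check_equivalence(unsigned int limit)
{
	for (unsigned int len = 0; len < limit; len++)
		assert(len_ok_mod(len) == len_ok_mask(len));
}
```

Note that the mask form only works for power-of-two constants; for any other
divisor the compiler has to use a different strategy.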


Re: [PATCH v2 1/2] net: check if protocol extracted by virtio_net_hdr_set_proto is correct

2021-03-09 Thread Willem de Bruijn
On Tue, Mar 9, 2021 at 6:26 AM Michael S. Tsirkin  wrote:
>
> On Mon, Mar 08, 2021 at 11:31:25AM +0100, Balazs Nemeth wrote:
> > For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
> > set) based on the type in the virtio net hdr, but the skb could contain
> > anything since it could come from packet_snd through a raw socket. If
> > there is a mismatch between what virtio_net_hdr_set_proto sets and
> > the actual protocol, then the skb could be handled incorrectly later
> > on.
> >
> > An example where this poses an issue is with the subsequent call to
> > skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
> > correctly. A specially crafted packet could fool
> > skb_flow_dissect_flow_keys_basic, preventing EINVAL from being returned.
> >
> > Avoid blindly trusting the information provided by the virtio net header
> > by checking that the protocol in the packet actually matches the
> > protocol set by virtio_net_hdr_set_proto. Note that since the protocol
> > is only checked if skb->dev implements header_ops->parse_protocol,
> > packets from devices without the implementation are not checked at this
> > stage.
> >
> > Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets")
> > Signed-off-by: Balazs Nemeth 
> > ---
> >  include/linux/virtio_net.h | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> > index e8a924eeea3d..6c478eee0452 100644
> > --- a/include/linux/virtio_net.h
> > +++ b/include/linux/virtio_net.h
> > @@ -79,8 +79,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
> >   if (gso_type && skb->network_header) {
> >   struct flow_keys_basic keys;
> >
> > - if (!skb->protocol)
> > + if (!skb->protocol) {
> > + const struct ethhdr *eth = skb_eth_hdr(skb);
> > + __be16 etype = dev_parse_header_protocol(skb);
> > +
> >   virtio_net_hdr_set_proto(skb, hdr);
> > + if (etype && etype != skb->protocol)
> > + return -EINVAL;
> > + }
>
>
> Well the protocol in the header is an attempt at an optimization to
> remove the need to parse the packet ... any data on whether this
> affects performance?

This adds a branch and reading a cacheline that is inevitably read not
much later. It shouldn't be significant.

And this branch is only taken if skb->protocol is not set. So the cost
can easily be avoided by passing the information.

But you raise a good point, because TUNTAP does set it, but only after
the call to virtio_net_hdr_to_skb.

That should perhaps be inverted (in a separate net-next patch).


[PATCH v6 10/12] x86/paravirt: add new macros PVOP_ALT* supporting pvops in ALTERNATIVEs

2021-03-09 Thread Juergen Gross via Virtualization
Instead of using paravirt patching for custom code sequences add
support for using ALTERNATIVE handling combined with paravirt call
patching.

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V3:
- drop PVOP_ALT_VCALL() macro
---
 arch/x86/include/asm/paravirt_types.h | 49 ++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 0afdac83f926..0ed976286d49 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -477,44 +477,91 @@ int paravirt_disable_iospace(void);
ret;\
})
 
+#define PVOP_ALT_CALL(ret, op, alt, cond, clbr, call_clbr, \
+ extra_clbr, ...)  \
+   ({  \
+   PVOP_CALL_ARGS; \
+   PVOP_TEST_NULL(op); \
+   asm volatile(ALTERNATIVE(paravirt_alt(PARAVIRT_CALL),   \
+alt, cond) \
+: call_clbr, ASM_CALL_CONSTRAINT   \
+: paravirt_type(op),   \
+  paravirt_clobber(clbr),  \
+  ##__VA_ARGS__\
+: "memory", "cc" extra_clbr);  \
+   ret;\
+   })
+
 #define __PVOP_CALL(rettype, op, ...)  \
PVOP_CALL(PVOP_RETVAL(rettype), op, CLBR_ANY,   \
  PVOP_CALL_CLOBBERS, EXTRA_CLOBBERS, ##__VA_ARGS__)
 
+#define __PVOP_ALT_CALL(rettype, op, alt, cond, ...)   \
+   PVOP_ALT_CALL(PVOP_RETVAL(rettype), op, alt, cond, CLBR_ANY,\
+ PVOP_CALL_CLOBBERS, EXTRA_CLOBBERS,   \
+ ##__VA_ARGS__)
+
 #define __PVOP_CALLEESAVE(rettype, op, ...)\
PVOP_CALL(PVOP_RETVAL(rettype), op.func, CLBR_RET_REG,  \
  PVOP_CALLEE_CLOBBERS, , ##__VA_ARGS__)
 
+#define __PVOP_ALT_CALLEESAVE(rettype, op, alt, cond, ...) \
+   PVOP_ALT_CALL(PVOP_RETVAL(rettype), op.func, alt, cond, \
+ CLBR_RET_REG, PVOP_CALLEE_CLOBBERS, , ##__VA_ARGS__)
+
+
 #define __PVOP_VCALL(op, ...)  \
(void)PVOP_CALL(, op, CLBR_ANY, PVOP_VCALL_CLOBBERS,\
   VEXTRA_CLOBBERS, ##__VA_ARGS__)
 
+#define __PVOP_ALT_VCALL(op, alt, cond, ...)   \
+   (void)PVOP_ALT_CALL(, op, alt, cond, CLBR_ANY,  \
+   PVOP_VCALL_CLOBBERS, VEXTRA_CLOBBERS,   \
+   ##__VA_ARGS__)
+
 #define __PVOP_VCALLEESAVE(op, ...)\
(void)PVOP_CALL(, op.func, CLBR_RET_REG,\
- PVOP_VCALLEE_CLOBBERS, , ##__VA_ARGS__)
+   PVOP_VCALLEE_CLOBBERS, , ##__VA_ARGS__)
 
+#define __PVOP_ALT_VCALLEESAVE(op, alt, cond, ...) \
+   (void)PVOP_ALT_CALL(, op.func, alt, cond, CLBR_RET_REG, \
+   PVOP_VCALLEE_CLOBBERS, , ##__VA_ARGS__)
 
 
 #define PVOP_CALL0(rettype, op) \
__PVOP_CALL(rettype, op)
 #define PVOP_VCALL0(op) \
__PVOP_VCALL(op)
+#define PVOP_ALT_CALL0(rettype, op, alt, cond) \
+   __PVOP_ALT_CALL(rettype, op, alt, cond)
+#define PVOP_ALT_VCALL0(op, alt, cond) \
+   __PVOP_ALT_VCALL(op, alt, cond)
 
 #define PVOP_CALLEE0(rettype, op)  \
__PVOP_CALLEESAVE(rettype, op)
 #define PVOP_VCALLEE0(op)  \
__PVOP_VCALLEESAVE(op)
+#define PVOP_ALT_CALLEE0(rettype, op, alt, cond)   \
+   __PVOP_ALT_CALLEESAVE(rettype, op, alt, cond)
+#define PVOP_ALT_VCALLEE0(op, alt, cond)   \
+   __PVOP_ALT_VCALLEESAVE(op, alt, cond)
 
 
 #define PVOP_CALL1(rettype, op, arg1)  \
__PVOP_CALL(rettype, op, PVOP_CALL_ARG1(arg1))
 #define PVOP_VCALL1(op, arg1)  \
__PVOP_VCALL(op, PVOP_CALL_ARG1(arg1))
+#define PVOP_ALT_VCALL1(op, arg1, alt, cond)   \
+   __PVOP_ALT_VCALL(op, alt, cond, PVOP_CALL_ARG1(arg1))
 
 #define PVOP_CALLEE1(rettype, op, arg1)  

[PATCH v6 02/12] x86/paravirt: switch time pvops functions to use static_call()

2021-03-09 Thread Juergen Gross via Virtualization
The time pvops functions are the only ones left which might be
used in 32-bit mode and which return a 64-bit value.

Switch them to use the static_call() mechanism instead of pvops, as
this allows quite some simplification of the pvops implementation.
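
For readers unfamiliar with the mechanism, here is a rough userspace model of
what the switch means for callers. It is only a sketch under assumptions: a
plain function pointer stands in for the static call (the real static_call()
patches the call sites rather than loading a pointer), and
steal_clock_update() and hypervisor_steal_clock() are hypothetical names used
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*steal_clock_fn)(int cpu);

/* Default backend: no hypervisor, so no stolen time. */
static uint64_t native_steal_clock(int cpu)
{
	(void)cpu;
	return 0;
}

/* Hypothetical hypervisor backend returning a made-up value. */
static uint64_t hypervisor_steal_clock(int cpu)
{
	return 1000u * (uint64_t)(cpu + 1);
}

/* Models DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock). */
static steal_clock_fn pv_steal_clock = native_steal_clock;

/* Models static_call_update(pv_steal_clock, fn). */
static void steal_clock_update(steal_clock_fn fn)
{
	assert(fn != NULL);
	pv_steal_clock = fn;
}

/* Models paravirt_steal_clock(): callers never see which backend runs. */
static uint64_t paravirt_steal_clock(int cpu)
{
	return pv_steal_clock(cpu);
}
```

In this model a hypervisor backend would call
steal_clock_update(hypervisor_steal_clock) once during setup, and every later
paravirt_steal_clock() call reaches the new backend.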

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V4:
- drop paravirt_time.h again
- don't move Hyper-V code (Michael Kelley)
V5:
- drop no longer needed Hyper-V modification (Michael Kelley)
- switch Arm and Arm64 to static_call(), too (kernel test robot)
V6:
- factor out common parts in Xen pv/pvh initialization (Boris Petkov)
---
 arch/arm/include/asm/paravirt.h   | 14 +-
 arch/arm/kernel/paravirt.c|  9 +++--
 arch/arm64/include/asm/paravirt.h | 14 +-
 arch/arm64/kernel/paravirt.c  | 13 +
 arch/x86/Kconfig  |  1 +
 arch/x86/include/asm/mshyperv.h   |  2 +-
 arch/x86/include/asm/paravirt.h   | 17 ++---
 arch/x86/include/asm/paravirt_types.h |  6 --
 arch/x86/kernel/cpu/vmware.c  |  5 +++--
 arch/x86/kernel/kvm.c |  2 +-
 arch/x86/kernel/kvmclock.c|  2 +-
 arch/x86/kernel/paravirt.c| 16 
 arch/x86/kernel/tsc.c |  2 +-
 arch/x86/xen/time.c   | 26 +-
 drivers/xen/time.c|  3 ++-
 15 files changed, 75 insertions(+), 57 deletions(-)

diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h
index cdbf02d9c1d4..95d5b0d625cd 100644
--- a/arch/arm/include/asm/paravirt.h
+++ b/arch/arm/include/asm/paravirt.h
@@ -3,23 +3,19 @@
 #define _ASM_ARM_PARAVIRT_H
 
 #ifdef CONFIG_PARAVIRT
+#include 
+
 struct static_key;
 extern struct static_key paravirt_steal_enabled;
 extern struct static_key paravirt_steal_rq_enabled;
 
-struct pv_time_ops {
-   unsigned long long (*steal_clock)(int cpu);
-};
-
-struct paravirt_patch_template {
-   struct pv_time_ops time;
-};
+u64 dummy_steal_clock(int cpu);
 
-extern struct paravirt_patch_template pv_ops;
+DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock);
 
 static inline u64 paravirt_steal_clock(int cpu)
 {
-   return pv_ops.time.steal_clock(cpu);
+   return static_call(pv_steal_clock)(cpu);
 }
 #endif
 
diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c
index 4cfed91fe256..7dd9806369fb 100644
--- a/arch/arm/kernel/paravirt.c
+++ b/arch/arm/kernel/paravirt.c
@@ -9,10 +9,15 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
-struct paravirt_patch_template pv_ops;
-EXPORT_SYMBOL_GPL(pv_ops);
+static u64 native_steal_clock(int cpu)
+{
+   return 0;
+}
+
+DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock);
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h
index cf3a0fd7c1a7..9aa193e0e8f2 100644
--- a/arch/arm64/include/asm/paravirt.h
+++ b/arch/arm64/include/asm/paravirt.h
@@ -3,23 +3,19 @@
 #define _ASM_ARM64_PARAVIRT_H
 
 #ifdef CONFIG_PARAVIRT
+#include 
+
 struct static_key;
 extern struct static_key paravirt_steal_enabled;
 extern struct static_key paravirt_steal_rq_enabled;
 
-struct pv_time_ops {
-   unsigned long long (*steal_clock)(int cpu);
-};
-
-struct paravirt_patch_template {
-   struct pv_time_ops time;
-};
+u64 dummy_steal_clock(int cpu);
 
-extern struct paravirt_patch_template pv_ops;
+DECLARE_STATIC_CALL(pv_steal_clock, dummy_steal_clock);
 
 static inline u64 paravirt_steal_clock(int cpu)
 {
-   return pv_ops.time.steal_clock(cpu);
+   return static_call(pv_steal_clock)(cpu);
 }
 
 int __init pv_time_init(void);
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
index c07d7a034941..75fed4460407 100644
--- a/arch/arm64/kernel/paravirt.c
+++ b/arch/arm64/kernel/paravirt.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -26,8 +27,12 @@
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
-struct paravirt_patch_template pv_ops;
-EXPORT_SYMBOL_GPL(pv_ops);
+static u64 native_steal_clock(int cpu)
+{
+   return 0;
+}
+
+DEFINE_STATIC_CALL(pv_steal_clock, native_steal_clock);
 
 struct pv_time_stolen_time_region {
struct pvclock_vcpu_stolen_time *kaddr;
@@ -45,7 +50,7 @@ static int __init parse_no_stealacc(char *arg)
 early_param("no-steal-acc", parse_no_stealacc);
 
 /* return stolen time in ns by asking the hypervisor */
-static u64 pv_steal_clock(int cpu)
+static u64 para_steal_clock(int cpu)
 {
struct pv_time_stolen_time_region *reg;
 
@@ -150,7 +155,7 @@ int __init pv_time_init(void)
if (ret)
return ret;
 
-   pv_ops.time.steal_clock = pv_steal_clock;
+   static_call_update(pv_steal_clock, para_steal_clock);
 
    static_key_slow_inc(&paravirt_steal_enabled);
if (steal_acc)
diff --git

[PATCH v6 11/12] x86/paravirt: switch functions with custom code to ALTERNATIVE

2021-03-09 Thread Juergen Gross via Virtualization
Instead of using paravirt patching for custom code sequences use
ALTERNATIVE for the functions with custom code replacements.

Instead of patching an ud2 instruction for unpopulated vector entries
into the caller site, use a simple function just calling BUG() as a
replacement.

Simplify the register defines for assembler paravirt calling, as there
isn't much usage left.

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V4:
- fixed SAVE_FLAGS() (kernel test robot)
- added assembler paravirt cleanup
---
 arch/x86/entry/entry_64.S |   2 +-
 arch/x86/include/asm/irqflags.h   |   2 +-
 arch/x86/include/asm/paravirt.h   | 101 +-
 arch/x86/include/asm/paravirt_types.h |   6 --
 arch/x86/kernel/paravirt.c|  16 ++--
 arch/x86/kernel/paravirt_patch.c  |  88 --
 6 files changed, 58 insertions(+), 157 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 400908dff42e..12e2e3cd58be 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -305,7 +305,7 @@ SYM_CODE_END(ret_from_fork)
 .macro DEBUG_ENTRY_ASSERT_IRQS_OFF
 #ifdef CONFIG_DEBUG_ENTRY
pushq %rax
-   SAVE_FLAGS(CLBR_RAX)
+   SAVE_FLAGS
testl $X86_EFLAGS_IF, %eax
jz .Lokay_\@
ud2
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index a0efbcd24b86..c5ce9845c999 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -111,7 +111,7 @@ static __always_inline unsigned long arch_local_irq_save(void)
 
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_DEBUG_ENTRY
-#define SAVE_FLAGS(x)  pushfq; popq %rax
+#define SAVE_FLAGS pushfq; popq %rax
 #endif
 
 #define INTERRUPT_RETURN   jmp native_iret
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 36cd71fa097f..b32b408958e8 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -137,7 +137,9 @@ static inline void write_cr0(unsigned long x)
 
 static inline unsigned long read_cr2(void)
 {
-   return PVOP_CALLEE0(unsigned long, mmu.read_cr2);
+   return PVOP_ALT_CALLEE0(unsigned long, mmu.read_cr2,
+   "mov %%cr2, %%rax;",
+   ALT_NOT(X86_FEATURE_XENPV));
 }
 
 static inline void write_cr2(unsigned long x)
@@ -147,12 +149,14 @@ static inline void write_cr2(unsigned long x)
 
 static inline unsigned long __read_cr3(void)
 {
-   return PVOP_CALL0(unsigned long, mmu.read_cr3);
+   return PVOP_ALT_CALL0(unsigned long, mmu.read_cr3,
+ "mov %%cr3, %%rax;", ALT_NOT(X86_FEATURE_XENPV));
 }
 
 static inline void write_cr3(unsigned long x)
 {
-   PVOP_VCALL1(mmu.write_cr3, x);
+   PVOP_ALT_VCALL1(mmu.write_cr3, x,
+   "mov %%rdi, %%cr3", ALT_NOT(X86_FEATURE_XENPV));
 }
 
 static inline void __write_cr4(unsigned long x)
@@ -172,7 +176,7 @@ static inline void halt(void)
 
 static inline void wbinvd(void)
 {
-   PVOP_VCALL0(cpu.wbinvd);
+   PVOP_ALT_VCALL0(cpu.wbinvd, "wbinvd", ALT_NOT(X86_FEATURE_XENPV));
 }
 
 static inline u64 paravirt_read_msr(unsigned msr)
@@ -386,22 +390,28 @@ static inline void paravirt_release_p4d(unsigned long pfn)
 
 static inline pte_t __pte(pteval_t val)
 {
-   return (pte_t) { PVOP_CALLEE1(pteval_t, mmu.make_pte, val) };
+   return (pte_t) { PVOP_ALT_CALLEE1(pteval_t, mmu.make_pte, val,
+ "mov %%rdi, %%rax",
+ ALT_NOT(X86_FEATURE_XENPV)) };
 }
 
 static inline pteval_t pte_val(pte_t pte)
 {
-   return PVOP_CALLEE1(pteval_t, mmu.pte_val, pte.pte);
+   return PVOP_ALT_CALLEE1(pteval_t, mmu.pte_val, pte.pte,
+   "mov %%rdi, %%rax", ALT_NOT(X86_FEATURE_XENPV));
 }
 
 static inline pgd_t __pgd(pgdval_t val)
 {
-   return (pgd_t) { PVOP_CALLEE1(pgdval_t, mmu.make_pgd, val) };
+   return (pgd_t) { PVOP_ALT_CALLEE1(pgdval_t, mmu.make_pgd, val,
+ "mov %%rdi, %%rax",
+ ALT_NOT(X86_FEATURE_XENPV)) };
 }
 
 static inline pgdval_t pgd_val(pgd_t pgd)
 {
-   return PVOP_CALLEE1(pgdval_t, mmu.pgd_val, pgd.pgd);
+   return PVOP_ALT_CALLEE1(pgdval_t, mmu.pgd_val, pgd.pgd,
+   "mov %%rdi, %%rax", ALT_NOT(X86_FEATURE_XENPV));
 }
 
 #define  __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
@@ -434,12 +444,15 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 
 static inline pmd_t __pmd(pmdval_t val)
 {
-   return (pmd_t) { PVOP_CALLEE1(pmdval_t, mmu.make_pmd, val) };
+   return (pmd_t) { PVOP_ALT_CALLEE1(pmdval_t, mmu.make_pmd, val,
+ "mov %%rdi, %%rax",
+ ALT_NOT(X86_FEATURE_XENPV)) };
 }
 
 static inline pmdval_t pmd_val(pmd_t pmd)
 {
-   retu

[PATCH v6 12/12] x86/paravirt: have only one paravirt patch function

2021-03-09 Thread Juergen Gross via Virtualization
There is no need any longer to have different paravirt patch functions
for native and Xen. Eliminate native_patch() and rename
paravirt_patch_default() to paravirt_patch().

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V3:
- remove paravirt_patch_insns() (kernel test robot)
---
 arch/x86/include/asm/paravirt_types.h | 19 +--
 arch/x86/kernel/Makefile  |  3 +--
 arch/x86/kernel/alternative.c |  2 +-
 arch/x86/kernel/paravirt.c| 20 ++--
 arch/x86/kernel/paravirt_patch.c  | 11 ---
 arch/x86/xen/enlighten_pv.c   |  1 -
 6 files changed, 5 insertions(+), 51 deletions(-)
 delete mode 100644 arch/x86/kernel/paravirt_patch.c

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 588ff14ce969..9d1ddb7b4350 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -68,19 +68,6 @@ struct pv_info {
const char *name;
 };
 
-struct pv_init_ops {
-   /*
-* Patch may replace one of the defined code sequences with
-* arbitrary code, subject to the same register constraints.
-* This generally means the code is not free to clobber any
-* registers other than EAX.  The patch function should return
-* the number of bytes of code generated, as we nop pad the
-* rest in generic code.
-*/
-   unsigned (*patch)(u8 type, void *insn_buff,
- unsigned long addr, unsigned len);
-} __no_randomize_layout;
-
 #ifdef CONFIG_PARAVIRT_XXL
 struct pv_lazy_ops {
/* Set deferred update mode, used for batching operations. */
@@ -276,7 +263,6 @@ struct pv_lock_ops {
  * number for each function using the offset which we use to indicate
  * what to patch. */
 struct paravirt_patch_template {
-   struct pv_init_ops  init;
struct pv_cpu_ops   cpu;
struct pv_irq_ops   irq;
struct pv_mmu_ops   mmu;
@@ -317,10 +303,7 @@ extern void (*paravirt_iret)(void);
 /* Simple instruction patching code. */
 #define NATIVE_LABEL(a,x,b) "\n\t.globl " a #x "_" #b "\n" a #x "_" #b ":\n\t"
 
-unsigned paravirt_patch_default(u8 type, void *insn_buff, unsigned long addr, unsigned len);
-unsigned paravirt_patch_insns(void *insn_buff, unsigned len, const char *start, const char *end);
-
-unsigned native_patch(u8 type, void *insn_buff, unsigned long addr, unsigned len);
+unsigned int paravirt_patch(u8 type, void *insn_buff, unsigned long addr, unsigned int len);
 
 int paravirt_disable_iospace(void);
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ddf08351f0b..0704c2a94272 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -35,7 +35,6 @@ KASAN_SANITIZE_sev-es.o := n
 KCSAN_SANITIZE := n
 
 OBJECT_FILES_NON_STANDARD_test_nx.o:= y
-OBJECT_FILES_NON_STANDARD_paravirt_patch.o := y
 
 ifdef CONFIG_FRAME_POINTER
 OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o := y
@@ -121,7 +120,7 @@ obj-$(CONFIG_AMD_NB)+= amd_nb.o
 obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
 
 obj-$(CONFIG_KVM_GUEST)+= kvm.o kvmclock.o
-obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch.o
+obj-$(CONFIG_PARAVIRT) += paravirt.o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)   += pvclock.o
 obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 1f12901e75f2..cb3eb8c2f50d 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -615,7 +615,7 @@ void __init_or_module apply_paravirt(struct paravirt_patch_site *start,
BUG_ON(p->len > MAX_PATCH_LEN);
/* prep the buffer with the original instructions */
memcpy(insn_buff, p->instr, p->len);
-   used = pv_ops.init.patch(p->type, insn_buff, (unsigned long)p->instr, p->len);
+   used = paravirt_patch(p->type, insn_buff, (unsigned long)p->instr, p->len);
 
BUG_ON(used > p->len);
 
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 082954930809..3d7b989ed6be 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -99,8 +99,8 @@ void __init native_pv_lock_init(void)
static_branch_disable(&virt_spin_lock_key);
 }
 
-unsigned paravirt_patch_default(u8 type, void *insn_buff,
-   unsigned long addr, unsigned len)
+unsigned int paravirt_patch(u8 type, void *insn_buff, unsigned long addr,
+   unsigned int len)
 {
/*
 * Neat trick to map patch type back to the call within the
@@ -121,19 +121,6 @@ unsigned paravirt_patch_default(u8 type, void *insn_buff,
return ret;
 }
 
-unsigned paravirt_patch_insns(void *ins

[PATCH v6 06/12] x86: add new features for paravirt patching

2021-03-09 Thread Juergen Gross via Virtualization
To be able to switch paravirt patching from special-cased custom
code sequences to ALTERNATIVE handling, some new X86_FEATURE_* flags
are needed. This allows the standard indirect pv call to be the
default code, which is later patched with the non-Xen custom code
sequence via ALTERNATIVE patching.

Make sure paravirt patching is performed before alternative patching.
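
Why the order matters can be illustrated with a deliberately simplified model
of a single call site. This is a sketch under assumptions, not the kernel's
patching code: the enum and the *_model() helpers are hypothetical, and the
model assumes paravirt patching always writes a direct call while alternative
patching writes the inline sequence only when the feature bit is set.

```c
#include <assert.h>
#include <stdbool.h>

/* What currently sits at one (imaginary) patch site. */
enum site {
	SITE_INDIRECT_CALL, /* default code: call through pv_ops */
	SITE_DIRECT_CALL,   /* after paravirt patching */
	SITE_INLINE_CODE    /* after alternative patching, e.g. "mov %rdi, %rax" */
};

/* Paravirt patching: overwrites the site with a direct call. */
static enum site apply_paravirt_model(enum site s)
{
	(void)s;
	return SITE_DIRECT_CALL;
}

/* Alternative patching: inline sequence if the feature bit is set. */
static enum site apply_alternatives_model(enum site s, bool feature_set)
{
	return feature_set ? SITE_INLINE_CODE : s;
}

/* Order required by this patch: paravirt first, alternatives second. */
static enum site patch_paravirt_first(bool feature_set)
{
	enum site s = apply_paravirt_model(SITE_INDIRECT_CALL);

	return apply_alternatives_model(s, feature_set);
}

/* Old order: the direct call clobbers the inline sequence again. */
static enum site patch_alternatives_first(bool feature_set)
{
	enum site s = apply_alternatives_model(SITE_INDIRECT_CALL, feature_set);

	return apply_paravirt_model(s);
}

/* Small self-check of both orders with the feature bit set. */
static void patch_order_selfcheck(void)
{
	assert(patch_paravirt_first(true) == SITE_INLINE_CODE);
	assert(patch_alternatives_first(true) == SITE_DIRECT_CALL);
}
```

With the feature bit set, only the paravirt-first order ends with the inline
sequence in place; the reverse order leaves a plain function call.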

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V3:
- add comment (Boris Petkov)
- no negative features (Boris Petkov)
V4:
- move paravirt_set_cap() to paravirt-spinlocks.c
---
 arch/x86/include/asm/cpufeatures.h   |  2 ++
 arch/x86/include/asm/paravirt.h  | 10 ++
 arch/x86/kernel/alternative.c| 30 ++--
 arch/x86/kernel/paravirt-spinlocks.c |  9 +
 4 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index cc96e26d69f7..b440c950246d 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -236,6 +236,8 @@
 #define X86_FEATURE_EPT_AD ( 8*32+17) /* Intel Extended Page Table access-dirty bit */
 #define X86_FEATURE_VMCALL ( 8*32+18) /* "" Hypervisor supports the VMCALL instruction */
 #define X86_FEATURE_VMW_VMMCALL ( 8*32+19) /* "" VMware prefers VMMCALL hypercall instruction */
+#define X86_FEATURE_PVUNLOCK   ( 8*32+20) /* "" PV unlock function */
+#define X86_FEATURE_VCPUPREEMPT ( 8*32+21) /* "" PV vcpu_is_preempted function */
 
 /* Intel-defined CPU features, CPUID level 0x0007:0 (EBX), word 9 */
 #define X86_FEATURE_FSGSBASE   ( 9*32+ 0) /* RDFSBASE, WRFSBASE, RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 1e45b46fae84..8c354099d9c3 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -47,6 +47,10 @@ static inline u64 paravirt_steal_clock(int cpu)
return static_call(pv_steal_clock)(cpu);
 }
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+void __init paravirt_set_cap(void);
+#endif
+
 /* The paravirtualized I/O functions */
 static inline void slow_down_io(void)
 {
@@ -811,5 +815,11 @@ static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 {
 }
 #endif
+
+#ifndef CONFIG_PARAVIRT_SPINLOCKS
+static inline void paravirt_set_cap(void)
+{
+}
+#endif
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PARAVIRT_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index d8e669a1546f..1f12901e75f2 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 int __read_mostly alternatives_patched;
 
@@ -732,6 +733,33 @@ void __init alternative_instructions(void)
 * patching.
 */
 
+   /*
+* Paravirt patching and alternative patching can be combined to
+* replace a function call with a short direct code sequence (e.g.
+* by setting a constant return value instead of doing that in an
+* external function).
+* In order to make this work the following sequence is required:
+* 1. set (artificial) features depending on used paravirt
+*functions which can later influence alternative patching
+* 2. apply paravirt patching (generally replacing an indirect
+*function call with a direct one)
+* 3. apply alternative patching (e.g. replacing a direct function
+*call with a custom code sequence)
+* Doing paravirt patching after alternative patching would clobber
+* the optimization of the custom code with a function call again.
+*/
+   paravirt_set_cap();
+
+   /*
+* First patch paravirt functions, such that we overwrite the indirect
+* call with the direct call.
+*/
+   apply_paravirt(__parainstructions, __parainstructions_end);
+
+   /*
+* Then patch alternatives, such that those paravirt calls that are in
+* alternatives can be overwritten by their immediate fragments.
+*/
apply_alternatives(__alt_instructions, __alt_instructions_end);
 
 #ifdef CONFIG_SMP
@@ -750,8 +778,6 @@ void __init alternative_instructions(void)
}
 #endif
 
-   apply_paravirt(__parainstructions, __parainstructions_end);
-
restart_nmi();
alternatives_patched = 1;
 }
diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c
index 4f75d0cf6305..9e1ea99ad9df 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -32,3 +32,12 @@ bool pv_is_native_vcpu_is_preempted(void)
return pv_ops.lock.vcpu_is_preempted.func ==
__raw_callee_save___native_vcpu_is_preempted;
 }
+
+void __init paravirt_set_cap(void)
+{
+   if (!pv_is_native_spin_unlock())
+   setup_

[PATCH v6 07/12] x86/paravirt: remove no longer needed 32-bit pvops cruft

2021-03-09 Thread Juergen Gross via Virtualization
PVOP_VCALL4() is only used for Xen PV, while PVOP_CALL4() isn't used
at all. Keep PVOP_CALL4() for 64 bits due to symmetry reasons.

This allows removing the 32-bit definitions of those macros, leading
to a substantial simplification of the paravirt macros, as those were
the only ones needing non-empty "pre" and "post" parameters.

PVOP_CALLEE2() and PVOP_VCALLEE2() are used nowhere, so remove them.

Another no longer needed case is special handling of return types
larger than unsigned long. Replace that with a BUILD_BUG_ON().
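
The idea can be sketched in standard C11 as follows: rather than special-casing
wide return types, refuse to compile them. This is an approximation under
assumptions. PV_CHECK_RETTYPE and checked_ret_example() are hypothetical names,
and _Static_assert is used here in place of the kernel's BUILD_BUG_ON().

```c
#include <assert.h>

/* Hypothetical macro: compile-time check on the size of a return type. */
#define PV_CHECK_RETTYPE(rettype) \
	_Static_assert(sizeof(rettype) <= sizeof(unsigned long), \
		       "paravirt return type wider than unsigned long")

/* These compile on any target, since neither type exceeds unsigned long. */
PV_CHECK_RETTYPE(unsigned long);
PV_CHECK_RETTYPE(int);

/*
 * PV_CHECK_RETTYPE(unsigned long long) would break the build on targets
 * where unsigned long is 32 bits wide, which is the case being rejected.
 */

/* The check can also sit next to the code that produces the value. */
static unsigned long checked_ret_example(void)
{
	PV_CHECK_RETTYPE(unsigned int);
	return (unsigned long)42u;
}
```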

DISABLE_INTERRUPTS() is used in 32-bit code only, so it can just be
replaced by cli.

INTERRUPT_RETURN in 32-bit code can be replaced by iret.

ENABLE_INTERRUPTS is used nowhere, so it can be removed.

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/x86/entry/entry_32.S |   4 +-
 arch/x86/include/asm/irqflags.h   |   5 --
 arch/x86/include/asm/paravirt.h   |  35 +---
 arch/x86/include/asm/paravirt_types.h | 112 --
 arch/x86/kernel/asm-offsets.c |   2 -
 5 files changed, 35 insertions(+), 123 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index df8c017e6161..765487e57d6e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -430,7 +430,7 @@
 * will soon execute iret and the tracer was already set to
 * the irqstate after the IRET:
 */
-   DISABLE_INTERRUPTS(CLBR_ANY)
+   cli
lss (%esp), %esp/* switch to espfix segment */
 .Lend_\@:
 #endif /* CONFIG_X86_ESPFIX32 */
@@ -1077,7 +1077,7 @@ restore_all_switch_stack:
 * when returning from IPI handler and when returning from
 * scheduler to user-space.
 */
-   INTERRUPT_RETURN
+   iret
 
 .section .fixup, "ax"
 SYM_CODE_START(asm_iret_error)
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 144d70ea4393..a0efbcd24b86 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -109,9 +109,6 @@ static __always_inline unsigned long arch_local_irq_save(void)
 }
 #else
 
-#define ENABLE_INTERRUPTS(x)   sti
-#define DISABLE_INTERRUPTS(x)  cli
-
 #ifdef CONFIG_X86_64
 #ifdef CONFIG_DEBUG_ENTRY
 #define SAVE_FLAGS(x)  pushfq; popq %rax
@@ -119,8 +116,6 @@ static __always_inline unsigned long arch_local_irq_save(void)
 
 #define INTERRUPT_RETURN   jmp native_iret
 
-#else
-#define INTERRUPT_RETURN   iret
 #endif
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 8c354099d9c3..c6496a82fad1 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -721,6 +721,7 @@ extern void default_banner(void);
.if ((~(set)) & mask); pop %reg; .endif
 
 #ifdef CONFIG_X86_64
+#ifdef CONFIG_PARAVIRT_XXL
 
 #define PV_SAVE_REGS(set)  \
COND_PUSH(set, CLBR_RAX, rax);  \
@@ -746,46 +747,12 @@ extern void default_banner(void);
 #define PARA_PATCH(off)((off) / 8)
 #define PARA_SITE(ptype, ops)  _PVSITE(ptype, ops, .quad, 8)
 #define PARA_INDIRECT(addr)*addr(%rip)
-#else
-#define PV_SAVE_REGS(set)  \
-   COND_PUSH(set, CLBR_EAX, eax);  \
-   COND_PUSH(set, CLBR_EDI, edi);  \
-   COND_PUSH(set, CLBR_ECX, ecx);  \
-   COND_PUSH(set, CLBR_EDX, edx)
-#define PV_RESTORE_REGS(set)   \
-   COND_POP(set, CLBR_EDX, edx);   \
-   COND_POP(set, CLBR_ECX, ecx);   \
-   COND_POP(set, CLBR_EDI, edi);   \
-   COND_POP(set, CLBR_EAX, eax)
-
-#define PARA_PATCH(off)((off) / 4)
-#define PARA_SITE(ptype, ops)  _PVSITE(ptype, ops, .long, 4)
-#define PARA_INDIRECT(addr)*%cs:addr
-#endif
 
-#ifdef CONFIG_PARAVIRT_XXL
 #define INTERRUPT_RETURN   \
PARA_SITE(PARA_PATCH(PV_CPU_iret),  \
  ANNOTATE_RETPOLINE_SAFE;  \
  jmp PARA_INDIRECT(pv_ops+PV_CPU_iret);)
 
-#define DISABLE_INTERRUPTS(clobbers)   \
-   PARA_SITE(PARA_PATCH(PV_IRQ_irq_disable),   \
- PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);\
- ANNOTATE_RETPOLINE_SAFE;  \
- call PARA_INDIRECT(pv_ops+PV_IRQ_irq_disable);\
- PV_RESTORE_REGS(clobbers | CLBR_CALLEE_SAVE);)
-
-#define ENABLE_INTERRUPTS(clobbers)\
-   PARA_SITE(PARA_PATCH(PV_IRQ_irq_enable),\
- PV_SAVE_REGS(clobbers | CLBR_CALLEE_SAVE);\
- ANNOTATE_RETPOLINE_SAFE;  \
- call PARA_INDIRECT(pv_ops+PV_IRQ_irq_enable); \
-

[PATCH v6 09/12] x86/paravirt: switch iret pvops to ALTERNATIVE

2021-03-09 Thread Juergen Gross via Virtualization
The iret paravirt op is rather special as it is using a jmp instead
of a call instruction. Switch it to ALTERNATIVE.
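
The selection performed by ALTERNATIVE_TERNARY in this patch can be pictured
with a small model. It is a sketch only: iret_insn_model() is a hypothetical
helper that returns the instruction text as a string, whereas the real macro
patches the instruction bytes at boot. The assumed semantics are: before
patching the original instruction is in place; after patching the "yes"
variant is used if the feature is set, otherwise the "no" variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns the instruction the INTERRUPT_RETURN site would contain. */
static const char *iret_insn_model(bool patched, bool xenpv)
{
	if (!patched)
		return "jmp *paravirt_iret(%rip)"; /* original instruction */

	/* After patching: feature set -> Xen variant, else native variant. */
	return xenpv ? "jmp xen_iret" : "jmp native_iret";
}

/* Small self-check of the two patched outcomes. */
static void iret_model_selfcheck(void)
{
	assert(strcmp(iret_insn_model(true, true), "jmp xen_iret") == 0);
	assert(strcmp(iret_insn_model(true, false), "jmp native_iret") == 0);
}
```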

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V3:
- use ALTERNATIVE_TERNARY
---
 arch/x86/include/asm/paravirt.h   |  6 +++---
 arch/x86/include/asm/paravirt_types.h |  5 +
 arch/x86/kernel/asm-offsets.c |  5 -
 arch/x86/kernel/paravirt.c| 26 ++
 arch/x86/xen/enlighten_pv.c   |  3 +--
 5 files changed, 7 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c6496a82fad1..36cd71fa097f 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -749,9 +749,9 @@ extern void default_banner(void);
 #define PARA_INDIRECT(addr)*addr(%rip)
 
 #define INTERRUPT_RETURN   \
-   PARA_SITE(PARA_PATCH(PV_CPU_iret),  \
- ANNOTATE_RETPOLINE_SAFE;  \
- jmp PARA_INDIRECT(pv_ops+PV_CPU_iret);)
+   ANNOTATE_RETPOLINE_SAFE;\
+   ALTERNATIVE_TERNARY("jmp *paravirt_iret(%rip);",\
+   X86_FEATURE_XENPV, "jmp xen_iret;", "jmp native_iret;")
 
 #ifdef CONFIG_DEBUG_ENTRY
 #define SAVE_FLAGS(clobbers)\
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 45bd21647dd8..0afdac83f926 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -151,10 +151,6 @@ struct pv_cpu_ops {
 
u64 (*read_pmc)(int counter);
 
-   /* Normal iret.  Jump to this with the standard iret stack
-  frame set up. */
-   void (*iret)(void);
-
void (*start_context_switch)(struct task_struct *prev);
void (*end_context_switch)(struct task_struct *next);
 #endif
@@ -294,6 +290,7 @@ struct paravirt_patch_template {
 
 extern struct pv_info pv_info;
 extern struct paravirt_patch_template pv_ops;
+extern void (*paravirt_iret)(void);
 
 #define PARAVIRT_PATCH(x)  \
(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 736508004b30..ecd3fd6993d1 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -61,11 +61,6 @@ static void __used common(void)
OFFSET(IA32_RT_SIGFRAME_sigcontext, rt_sigframe_ia32, uc.uc_mcontext);
 #endif
 
-#ifdef CONFIG_PARAVIRT_XXL
-   BLANK();
-   OFFSET(PV_CPU_iret, paravirt_patch_template, cpu.iret);
-#endif
-
 #ifdef CONFIG_XEN
BLANK();
OFFSET(XEN_vcpu_info_mask, vcpu_info, evtchn_upcall_mask);
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 44e5b0fe28cb..0553a339d850 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -86,25 +86,6 @@ u64 notrace _paravirt_ident_64(u64 x)
 {
return x;
 }
-
-static unsigned paravirt_patch_jmp(void *insn_buff, const void *target,
-  unsigned long addr, unsigned len)
-{
-   struct branch *b = insn_buff;
-   unsigned long delta = (unsigned long)target - (addr+5);
-
-   if (len < 5) {
-#ifdef CONFIG_RETPOLINE
-   WARN_ONCE(1, "Failing to patch indirect JMP in %ps\n", (void *)addr);
-#endif
-   return len; /* call too long for patch site */
-   }
-
-   b->opcode = 0xe9;   /* jmp */
-   b->delta = delta;
-
-   return 5;
-}
 #endif
 
 DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
@@ -136,9 +117,6 @@ unsigned paravirt_patch_default(u8 type, void *insn_buff,
else if (opfunc == _paravirt_ident_64)
ret = paravirt_patch_ident_64(insn_buff, len);
 
-   else if (type == PARAVIRT_PATCH(cpu.iret))
-   /* If operation requires a jmp, then jmp */
-   ret = paravirt_patch_jmp(insn_buff, opfunc, addr, len);
 #endif
else
/* Otherwise call the function. */
@@ -316,8 +294,6 @@ struct paravirt_patch_template pv_ops = {
 
.cpu.load_sp0   = native_load_sp0,
 
-   .cpu.iret   = native_iret,
-
 #ifdef CONFIG_X86_IOPL_IOPERM
.cpu.invalidate_io_bitmap   = native_tss_invalidate_io_bitmap,
.cpu.update_io_bitmap   = native_tss_update_io_bitmap,
@@ -422,6 +398,8 @@ struct paravirt_patch_template pv_ops = {
 NOKPROBE_SYMBOL(native_get_debugreg);
 NOKPROBE_SYMBOL(native_set_debugreg);
 NOKPROBE_SYMBOL(native_load_idt);
+
+void (*paravirt_iret)(void) = native_iret;
 #endif
 
 EXPORT_SYMBOL(pv_ops);
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index dc0a337f985b..08dca7bebb30 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1070,8 +1070,6 @@ static const struct pv

[PATCH v6 08/12] x86/paravirt: simplify paravirt macros

2021-03-09 Thread Juergen Gross via Virtualization
The central pvops call macros PVOP_CALL() and PVOP_VCALL() are
looking very similar now.

The main differences are using PVOP_VCALL_ARGS or PVOP_CALL_ARGS, which
are identical, and the return value handling.

So drop PVOP_VCALL_ARGS and instead of PVOP_VCALL() just use
(void)PVOP_CALL(long, ...).

Note that it isn't easily possible to just redefine PVOP_VCALL()
to use PVOP_CALL() instead, as this would require further hiding of
commas in macro parameters.
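
The return value handling that remains after this change can be shown in isolation. A minimal user-space sketch, assuming a hypothetical helper retval_mask() and a hypothetical macro RETVAL_MODEL() (neither exists in the kernel): the raw register value is masked down to the width of the requested type, and a caller that has no use for the result simply casts it to void.

```c
#include <stddef.h>

/* Mask covering the low 'size' bytes of an unsigned long (hypothetical helper). */
static unsigned long retval_mask(size_t size)
{
	switch (size) {
	case 1: return 0xffUL;
	case 2: return 0xffffUL;
	case 4: return 0xffffffffUL;
	default: return ~0UL;	/* full register width */
	}
}

/* Narrow the raw return register 'reg' to 'rettype' (hypothetical macro). */
#define RETVAL_MODEL(rettype, reg) \
	((rettype)((reg) & retval_mask(sizeof(rettype))))

/* A "void" call in this model just discards the full-width value. */
static void discard_result(unsigned long reg)
{
	(void)RETVAL_MODEL(long, reg);
}
```

With this split the call itself does not need to know whether a result is wanted, which is the property that makes a separate void variant of the call macro unnecessary in the model.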

Signed-off-by: Juergen Gross 
Acked-by: Peter Zijlstra (Intel) 
---
V3:
- new patch
V4:
- fix build warnings with clang (kernel test robot)
---
 arch/x86/include/asm/paravirt_types.h | 41 ---
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 42f9eef84131..45bd21647dd8 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -408,11 +408,9 @@ int paravirt_disable_iospace(void);
  * makes sure the incoming and outgoing types are always correct.
  */
 #ifdef CONFIG_X86_32
-#define PVOP_VCALL_ARGS\
+#define PVOP_CALL_ARGS \
unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;
 
-#define PVOP_CALL_ARGS PVOP_VCALL_ARGS
-
 #define PVOP_CALL_ARG1(x)  "a" ((unsigned long)(x))
 #define PVOP_CALL_ARG2(x)  "d" ((unsigned long)(x))
 #define PVOP_CALL_ARG3(x)  "c" ((unsigned long)(x))
@@ -428,12 +426,10 @@ int paravirt_disable_iospace(void);
 #define VEXTRA_CLOBBERS
 #else  /* CONFIG_X86_64 */
 /* [re]ax isn't an arg, but the return val */
-#define PVOP_VCALL_ARGS\
+#define PVOP_CALL_ARGS \
unsigned long __edi = __edi, __esi = __esi, \
__edx = __edx, __ecx = __ecx, __eax = __eax;
 
-#define PVOP_CALL_ARGS PVOP_VCALL_ARGS
-
 #define PVOP_CALL_ARG1(x)  "D" ((unsigned long)(x))
 #define PVOP_CALL_ARG2(x)  "S" ((unsigned long)(x))
 #define PVOP_CALL_ARG3(x)  "d" ((unsigned long)(x))
@@ -458,59 +454,46 @@ int paravirt_disable_iospace(void);
 #define PVOP_TEST_NULL(op) ((void)pv_ops.op)
 #endif
 
-#define PVOP_RETMASK(rettype)  \
+#define PVOP_RETVAL(rettype)   \
({  unsigned long __mask = ~0UL;\
+   BUILD_BUG_ON(sizeof(rettype) > sizeof(unsigned long));  \
switch (sizeof(rettype)) {  \
case 1: __mask =   0xffUL; break;   \
case 2: __mask = 0xffffUL; break;   \
case 4: __mask = 0xffffffffUL; break;   \
default: break; \
}   \
-   __mask; \
+   __mask & __eax; \
})
 
 
-#define PVOP_CALL(rettype, op, clbr, call_clbr, extra_clbr, ...)   \
+#define PVOP_CALL(ret, op, clbr, call_clbr, extra_clbr, ...)   \
({  \
PVOP_CALL_ARGS; \
PVOP_TEST_NULL(op); \
-   BUILD_BUG_ON(sizeof(rettype) > sizeof(unsigned long));  \
asm volatile(paravirt_alt(PARAVIRT_CALL)\
 : call_clbr, ASM_CALL_CONSTRAINT   \
 : paravirt_type(op),   \
   paravirt_clobber(clbr),  \
   ##__VA_ARGS__\
 : "memory", "cc" extra_clbr);  \
-   (rettype)(__eax & PVOP_RETMASK(rettype));   \
+   ret;\
})
 
 #define __PVOP_CALL(rettype, op, ...)  \
-   PVOP_CALL(rettype, op, CLBR_ANY, PVOP_CALL_CLOBBERS,\
- EXTRA_CLOBBERS, ##__VA_ARGS__)
+   PVOP_CALL(PVOP_RETVAL(rettype), op, CLBR_ANY,   \
+ PVOP_CALL_CLOBBERS, EXTRA_CLOBBERS, ##__VA_ARGS__)
 
 #define __PVOP_CALLEESAVE(rettype, op, ...)\
-   PVOP_CALL(rettype, op.func, CLBR_RET_REG,   \
+   PVOP_CALL(PVOP_RETVAL(rettype), op.func, CLBR_RET_REG,  \
  PVOP_CALLEE_CLOBBERS, , ##__VA_ARGS__)
 
-
-#define PVOP

[PATCH v6 00/12] x86: major paravirt cleanup

2021-03-09 Thread Juergen Gross via Virtualization
This is a major cleanup of the paravirt infrastructure aiming at
eliminating all custom code patching via paravirt patching.

This is achieved by using ALTERNATIVE instead, leading to the ability
to give objtool access to the patched in instructions.

In order to remove most of the 32-bit special handling from pvops the
time related operations are switched to use static_call() instead.

At the end of this series all paravirt patching has to do is to
replace indirect calls with direct ones. In a further step this could
be switched to static_call(), too.
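
The semantics of a replaceable call target can be modelled in plain C. This is a sketch only, with hypothetical names throughout (struct static_call_model, static_call_update_model() and the two clock functions are made up for illustration). The real static_call() rewrites the call instruction at the call site, whereas this model merely swaps a function pointer and therefore still performs an indirect call.

```c
#include <stdint.h>

typedef uint64_t (*clock_fn)(void);

/* Hypothetical clock sources returning fixed values for illustration. */
static uint64_t native_clock_model(void) { return 1; }
static uint64_t hv_clock_model(void)     { return 2; }

/* Model of a static call key: it only records the current target. */
struct static_call_model {
	clock_fn func;
};

static struct static_call_model pv_sched_clock_model = { native_clock_model };

/* Model of updating the target; the kernel would patch call sites here. */
static void static_call_update_model(struct static_call_model *sc, clock_fn func)
{
	sc->func = func;
}

/* Model of a call site using the current target. */
static uint64_t sched_clock_model(void)
{
	return pv_sched_clock_model.func();
}
```

A hypervisor-specific setup path would call static_call_update_model() once during boot; every later sched_clock_model() call then reaches the new target.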

Changes in V6:
- switched back to "not" bit in feature value for "not feature"
- other minor comments addressed

Changes in V5:
- patches 1-5 of V4 dropped, as already applied
- new patches 1+3
- fixed patch 2
- split V4 patch 8 into patches 4+5
- use flag byte instead of negative feature bit for "not feature"

Changes in V4:
- fixed several build failures
- removed objtool patch, as objtool patches are in tip now
- added patch 1 for making usage of static_call easier
- even more cleanup

Changes in V3:
- added patches 7 and 12
- addressed all comments

Changes in V2:
- added patches 5-12

Juergen Gross (12):
  static_call: move struct static_call_key definition to
static_call_types.h
  x86/paravirt: switch time pvops functions to use static_call()
  x86/alternative: drop feature parameter from ALTINSTR_REPLACEMENT()
  x86/alternative: support not-feature
  x86/alternative: support ALTERNATIVE_TERNARY
  x86: add new features for paravirt patching
  x86/paravirt: remove no longer needed 32-bit pvops cruft
  x86/paravirt: simplify paravirt macros
  x86/paravirt: switch iret pvops to ALTERNATIVE
  x86/paravirt: add new macros PVOP_ALT* supporting pvops in
ALTERNATIVEs
  x86/paravirt: switch functions with custom code to ALTERNATIVE
  x86/paravirt: have only one paravirt patch function

 arch/arm/include/asm/paravirt.h         |  14 +-
 arch/arm/kernel/paravirt.c              |   9 +-
 arch/arm64/include/asm/paravirt.h       |  14 +-
 arch/arm64/kernel/paravirt.c            |  13 +-
 arch/x86/Kconfig                        |   1 +
 arch/x86/entry/entry_32.S               |   4 +-
 arch/x86/entry/entry_64.S               |   2 +-
 arch/x86/include/asm/alternative-asm.h  |   7 +
 arch/x86/include/asm/alternative.h      |  23 ++-
 arch/x86/include/asm/cpufeatures.h      |   2 +
 arch/x86/include/asm/irqflags.h         |   7 +-
 arch/x86/include/asm/mshyperv.h         |   2 +-
 arch/x86/include/asm/paravirt.h         | 169 +--
 arch/x86/include/asm/paravirt_types.h   | 210 +---
 arch/x86/kernel/Makefile                |   3 +-
 arch/x86/kernel/alternative.c           |  51 +-
 arch/x86/kernel/asm-offsets.c           |   7 -
 arch/x86/kernel/cpu/vmware.c            |   5 +-
 arch/x86/kernel/kvm.c                   |   2 +-
 arch/x86/kernel/kvmclock.c              |   2 +-
 arch/x86/kernel/paravirt-spinlocks.c    |   9 +
 arch/x86/kernel/paravirt.c              |  78 +++--
 arch/x86/kernel/paravirt_patch.c        |  99 ---
 arch/x86/kernel/tsc.c                   |   2 +-
 arch/x86/xen/enlighten_pv.c             |   4 +-
 arch/x86/xen/time.c                     |  26 +--
 drivers/xen/time.c                      |   3 +-
 include/linux/static_call.h             |  18 --
 include/linux/static_call_types.h       |  18 ++
 tools/include/linux/static_call_types.h |  18 ++
 30 files changed, 348 insertions(+), 474 deletions(-)
 delete mode 100644 arch/x86/kernel/paravirt_patch.c

-- 
2.26.2



Re: [PATCH v2 1/2] net: check if protocol extracted by virtio_net_hdr_set_proto is correct

2021-03-09 Thread Michael S. Tsirkin
On Mon, Mar 08, 2021 at 11:31:25AM +0100, Balazs Nemeth wrote:
> For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't
> set) based on the type in the virtio net hdr, but the skb could contain
> anything since it could come from packet_snd through a raw socket. If
> there is a mismatch between what virtio_net_hdr_set_proto sets and
> the actual protocol, then the skb could be handled incorrectly later
> on.
> 
> An example where this poses an issue is with the subsequent call to
> skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set
> correctly. A specially crafted packet could fool
> skb_flow_dissect_flow_keys_basic preventing EINVAL from being returned.
> 
> Avoid blindly trusting the information provided by the virtio net header
> by checking that the protocol in the packet actually matches the
> protocol set by virtio_net_hdr_set_proto. Note that since the protocol
> is only checked if skb->dev implements header_ops->parse_protocol,
> packets from devices without the implementation are not checked at this
> stage.
> 
> Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets")
> Signed-off-by: Balazs Nemeth 
> ---
>  include/linux/virtio_net.h | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> index e8a924eeea3d..6c478eee0452 100644
> --- a/include/linux/virtio_net.h
> +++ b/include/linux/virtio_net.h
> @@ -79,8 +79,14 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb,
>   if (gso_type && skb->network_header) {
>   struct flow_keys_basic keys;
>  
> - if (!skb->protocol)
> + if (!skb->protocol) {
> + const struct ethhdr *eth = skb_eth_hdr(skb);
> + __be16 etype = dev_parse_header_protocol(skb);
> +
>   virtio_net_hdr_set_proto(skb, hdr);
> + if (etype && etype != skb->protocol)
> + return -EINVAL;
> + }


Well, the protocol in the header is an attempt at an optimization to
remove the need to parse the packet ... any data on whether this
affects performance?

>  retry:
>   if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys,
> NULL, 0, 0, 0,
> -- 
> 2.29.2
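
The check under discussion can be sketched in user space. This is a simplified model under stated assumptions, not the kernel code: struct pkt_model, enum gso_model and set_and_check_proto() are hypothetical, values are kept in host byte order, and parsed_etype stands in for the result of parsing the link-layer header (0 meaning the device cannot parse it).

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_ETH_P_IP   0x0800	/* IPv4 ethertype */
#define MODEL_ETH_P_IPV6 0x86DD	/* IPv6 ethertype */

/* Hypothetical, simplified GSO types as announced by the header. */
enum gso_model { GSO_MODEL_TCPV4, GSO_MODEL_TCPV6 };

struct pkt_model {
	uint16_t protocol;	/* 0 = not set yet */
	uint16_t parsed_etype;	/* ethertype found in the packet, 0 = unknown */
};

/*
 * If the protocol is unset, derive it from the announced GSO type and
 * reject the packet when the packet contents say something different.
 */
static int set_and_check_proto(struct pkt_model *p, enum gso_model gso)
{
	if (p->protocol)
		return 0;	/* already set, nothing to derive */

	p->protocol = (gso == GSO_MODEL_TCPV4) ? MODEL_ETH_P_IP
					      : MODEL_ETH_P_IPV6;

	if (p->parsed_etype && p->parsed_etype != p->protocol)
		return -EINVAL;	/* header and packet disagree */

	return 0;
}
```

The additional work in this model is one header parse plus one comparison, and only for packets arriving without a protocol set; whether that is measurable in the real code path is exactly the open question above.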



Re: [PATCH 6/7] x86/boot/compressed/64: Check SEV encryption in 32-bit boot-path

2021-03-09 Thread Joerg Roedel
On Tue, Mar 02, 2021 at 08:43:53PM +0100, Borislav Petkov wrote:
> On Wed, Feb 10, 2021 at 11:21:34AM +0100, Joerg Roedel wrote:
> > +   /*
> > +* Store the sme_me_mask as an indicator that SEV is active. It will be
> > +* set again in startup_64().
> 
> So why bother? Or does something need it before that?

This was actually a bug. The startup32_check_sev_cbit() needs something
to skip the check when SEV is not active. Therefore the value is set
here in sme_me_mask, but the function later checks sev_status.

I fixed it by setting sev_status to 1 here (indicates SEV is active).

Regards,

Joerg