[RFC PATCH 18/68] vfs: Convert virtio_balloon to use the new mount API

2019-03-27 Thread David Howells
Convert the virtio_balloon filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells 
cc: "Michael S. Tsirkin" 
cc: Jason Wang 
cc: virtualization@lists.linux-foundation.org
---

 drivers/virtio/virtio_balloon.c |   19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f19061b585a4..89d67c8aa719 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -757,21 +757,22 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 
return MIGRATEPAGE_SUCCESS;
 }
+#include 
 
-static struct dentry *balloon_mount(struct file_system_type *fs_type,
-   int flags, const char *dev_name, void *data)
-{
-   static const struct dentry_operations ops = {
-   .d_dname = simple_dname,
-   };
+static const struct dentry_operations balloon_dops = {
+   .d_dname = simple_dname,
+};
 
-   return mount_pseudo(fs_type, "balloon-kvm:", NULL, &ops,
-   BALLOON_KVM_MAGIC);
+static int balloon_init_fs_context(struct fs_context *fc)
+{
+   return vfs_init_pseudo_fs_context(fc, "balloon-kvm:",
+ NULL, NULL,
+ &balloon_dops, BALLOON_KVM_MAGIC);
 }
 
 static struct file_system_type balloon_fs = {
.name   = "balloon-kvm",
-   .mount  = balloon_mount,
+   .init_fs_context = balloon_init_fs_context,
.kill_sb= kill_anon_super,
 };
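
For anyone reading along without the tree to hand, the shape of the
conversion can be modelled in userspace.  This is only a sketch, not kernel
code: the structures below are cut-down stand-ins, init_pseudo_fs_context()
is a made-up helper that merely resembles the one used in the patch, and the
magic value is assumed for illustration.

```c
/* Simplified, hypothetical stand-ins for the kernel structures. */
struct dentry_operations {
	const char *d_dname;		/* stand-in for the d_dname callback */
};

struct fs_context {
	const char *prefix;		/* name prefix, e.g. "balloon-kvm:" */
	const struct dentry_operations *dops;
	unsigned long magic;
};

struct file_system_type {
	const char *name;
	int (*init_fs_context)(struct fs_context *fc);
};

#define MODEL_BALLOON_MAGIC 0x13661366UL /* assumed value, illustration only */

/* File-scope ops table, replacing the function-local static of the old code. */
static const struct dentry_operations balloon_dops = {
	.d_dname = "simple_dname",
};

/* Hypothetical helper: records the pseudo-fs parameters in the context. */
static int init_pseudo_fs_context(struct fs_context *fc, const char *prefix,
				  const struct dentry_operations *dops,
				  unsigned long magic)
{
	if (!fc || !prefix)
		return -1;
	fc->prefix = prefix;
	fc->dops = dops;
	fc->magic = magic;
	return 0;
}

/* The filesystem only fills in a context; it no longer returns a dentry. */
static int balloon_init_fs_context(struct fs_context *fc)
{
	return init_pseudo_fs_context(fc, "balloon-kvm:", &balloon_dops,
				      MODEL_BALLOON_MAGIC);
}

static struct file_system_type balloon_fs = {
	.name		 = "balloon-kvm",
	.init_fs_context = balloon_init_fs_context,
};
```

The point of the model is the division of labour: the filesystem type
supplies an init callback that describes the superblock it wants, and the
caller (the VFS in the real thing) decides when to act on that description.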
 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH 00/68] VFS: Convert a bunch of filesystems to the new mount API

2019-03-27 Thread David Howells


Hi Al,

Here's a set of patches that converts a bunch of filesystems (but not yet all!)
to the new mount API.  To this end, it makes the following changes:

 (1) Provides a convenience member in struct fs_context that is OR'd into
 sb->s_iflags by sget_fc().

 (2) Provides a convenience helper function, vfs_init_pseudo_fs_context(),
 for doing most of the work in mounting a pseudo filesystem.

 (3) Provides a convenience helper function, vfs_get_block_super(), for
 doing the work in setting up a block-based superblock.

 (4) Improves the handling of fd-type parameters.

 (5) Moves some of the subtype handling into fuse.

 (6) Provides a convenience helper function, vfs_get_mtd_super(), for
 doing the work in setting up an MTD device-based superblock.

 (7) Kills off mount_pseudo(), mount_pseudo_xattr(), mount_ns(),
 sget_userns(), mount_mtd(), mount_single().

 (8) Converts a slew of filesystems to use the mount API.

 (9) Fixes a bug in hypfs.

The patches can be found here also:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git

on branch:

mount-api-viro

David
---
Andrew Price (1):
  gfs2: Convert gfs2 to fs_context

David Howells (66):
  vfs: Update mount API docs
  vfs: Fix refcounting of filenames in fs_parser
  vfs: Provide sb->s_iflags settings in fs_context struct
  vfs: Provide a mount_pseudo-replacement for the new mount API
  vfs: Convert aio to use the new mount API
  vfs: Convert anon_inodes to use the new mount API
  vfs: Convert bdev to use the new mount API
  vfs: Convert nsfs to use the new mount API
  vfs: Convert pipe to use the new mount API
  vfs: Convert zsmalloc to use the new mount API
  vfs: Convert sockfs to use the new mount API
  vfs: Convert dax to use the new mount API
  vfs: Convert drm to use the new mount API
  vfs: Convert ia64 perfmon to use the new mount API
  vfs: Convert cxl to use the new mount API
  vfs: Convert ocxlflash to use the new mount API
  vfs: Convert virtio_balloon to use the new mount API
  vfs: Convert btrfs_test to use the new mount API
  vfs: Kill off mount_pseudo() and mount_pseudo_xattr()
  vfs: Use sget_fc() for pseudo-filesystems
  vfs: Convert binderfs to use the new mount API
  vfs: Convert nfsctl to use the new mount API
  vfs: Convert rpc_pipefs to use the new mount API
  vfs: Kill mount_ns()
  vfs: Kill sget_userns()
  vfs: Convert binfmt_misc to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert fusectl to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert smackfs to use the new mount API
  vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API
  vfs: Create fs_context-aware mount_bdev() replacement
  vfs: Make fs_parse() handle fs_param_is_fd-type params better
  vfs: Convert fuse to use the new mount API
  vfs: Move the subtype parameter into fuse
  mtd: Provide fs_context-aware mount_mtd() replacement
  vfs: Convert romfs to use the new mount API
  vfs: Convert cramfs to use the new mount API
  vfs: Convert jffs2 to use the new mount API
  mtd: Kill mount_mtd()
  vfs: Convert squashfs to use the new mount API
  vfs: Convert ceph to use the new mount API
  vfs: Convert functionfs to use the new mount API
  vfs: Add a single-or-reconfig keying to vfs_get_super()
  vfs: Convert debugfs to use the new mount API
  vfs: Convert tracefs to use the new mount API
  vfs: Convert pstore to use the new mount API
  hypfs: Fix error number left in struct pointer member
  vfs: Convert hypfs to use the new mount API
  vfs: Convert spufs to use the new mount API
  vfs: Kill mount_single()
  vfs: Convert coda to use the new mount API
  vfs: Convert autofs to use the new mount API
  vfs: Convert devpts to use the new mount API
  vfs: Convert bpf to use the new mount API
  vfs: Convert ubifs to use the new mount API
  vfs: Convert orangefs to use the new mount API

Masahiro Yamada (1):
  kbuild: skip sub-make for in-tree build with GNU Make 4.x


 Documentation/filesystems/mount_api.txt   |  367 ---
 Documentation/filesystems/vfs.txt |4 
 Makefile  |   31 +
 arch/ia64/kernel/perfmon.c|   14 -
 arch/powerpc/platforms/cell/spufs/inode.c |  207 

Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces

2019-03-27 Thread Michael S. Tsirkin
On Wed, Mar 27, 2019 at 01:10:10PM -0700, si-wei liu wrote:
> Another less safe option is that we just notify userspace anyway without
> sending down/up event around, as I don't see *any real application* cares
> about the link state or whatsoever when it attempts to detect rename.

How do you write a race-free handler then? ATM just detecting link up is
sufficient and covers 100% of cases. Seems like a good idea to keep it
that way.

-- 
MST


Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Paolo Bonzini
On 27/03/19 17:57, Stefan Hajnoczi wrote:
> On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
>> Jason doesn't really have the time to review blk/scsi
>> patches. Paolo and Stefan agreed to help out.
>>
>> Thanks guys!
>>
>> Signed-off-by: Michael S. Tsirkin 
>>
>> ---
> 
> There is relatively little activity in this area so I'd like to reply
> with Reviewed-by/Acked-by on the mailing list and have patches merged
> via your virtio tree.  That way I do not maintain a sub-tree and send
> you pull requests.  Does this sound good?

FWIW me too, that's why I suggested that Michael add us as reviewers
rather than maintainers. So,

Acked-by: Paolo Bonzini 

too.

Paolo

> Acked-by: Stefan Hajnoczi 
> 





Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Michael S. Tsirkin
On Wed, Mar 27, 2019 at 04:57:54PM +, Stefan Hajnoczi wrote:
> On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
> > Jason doesn't really have the time to review blk/scsi
> > patches. Paolo and Stefan agreed to help out.
> > 
> > Thanks guys!
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > 
> > ---
> 
> There is relatively little activity in this area so I'd like to reply
> with Reviewed-by/Acked-by on the mailing list and have patches merged
> via your virtio tree.  That way I do not maintain a sub-tree and send
> you pull requests.  Does this sound good?
> 
> Acked-by: Stefan Hajnoczi 


v2 does that. pls ack that one then.


Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Stefan Hajnoczi
On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
> Jason doesn't really have the time to review blk/scsi
> patches. Paolo and Stefan agreed to help out.
> 
> Thanks guys!
> 
> Signed-off-by: Michael S. Tsirkin 
> 
> ---

There is relatively little activity in this area so I'd like to reply
with Reviewed-by/Acked-by on the mailing list and have patches merged
via your virtio tree.  That way I do not maintain a sub-tree and send
you pull requests.  Does this sound good?

Acked-by: Stefan Hajnoczi 



Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Michael S. Tsirkin
On Wed, Mar 27, 2019 at 04:08:05PM +0100, Paolo Bonzini wrote:
> On 27/03/19 15:33, Michael S. Tsirkin wrote:
> > Jason doesn't really have the time to review blk/scsi
> > patches. Paolo and Stefan agreed to help out.
> > 
> > Thanks guys!
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > 
> > ---
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 95a5ebecd04f..8326d19c1681 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -16247,7 +16247,7 @@ F:  drivers/char/virtio_console.c
> >  F: include/linux/virtio_console.h
> >  F: include/uapi/linux/virtio_console.h
> >  
> > -VIRTIO CORE, NET AND BLOCK DRIVERS
> > +VIRTIO CORE AND NET DRIVERS
> >  M: "Michael S. Tsirkin" 
> >  M: Jason Wang 
> >  L: virtualization@lists.linux-foundation.org
> > @@ -16262,6 +16262,18 @@ F: include/uapi/linux/virtio_*.h
> >  F: drivers/crypto/virtio/
> >  F: mm/balloon_compaction.c
> >  
> > +VIRTIO BLOCK AND SCSI DRIVERS
> > +M: "Michael S. Tsirkin" 
> > +M: Paolo Bonzini 
> > +M: Stefan Hajnoczi 
> 
> Please make this R at least for me, so that it's clear that patches
> still flow through your and Jason's tree.  Not sure if you want to keep
> Jason as M.
> 
> Paolo

Not for block I think. He never reviews these.

> > +L: virtualization@lists.linux-foundation.org
> > +S: Maintained
> > +F: drivers/block/virtio_blk.c
> > +F: drivers/scsi/virtio_scsi.c
> > +F: include/uapi/linux/virtio_blk.h
> > +F: include/uapi/linux/virtio_scsi.h
> > +F: drivers/vhost/scsi.c
> > +
> >  VIRTIO CRYPTO DRIVER
> >  M: Gonglei 
> >  L: virtualization@lists.linux-foundation.org
> > 


Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Paolo Bonzini
On 27/03/19 15:33, Michael S. Tsirkin wrote:
> Jason doesn't really have the time to review blk/scsi
> patches. Paolo and Stefan agreed to help out.
> 
> Thanks guys!
> 
> Signed-off-by: Michael S. Tsirkin 
> 
> ---
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 95a5ebecd04f..8326d19c1681 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16247,7 +16247,7 @@ F:drivers/char/virtio_console.c
>  F:   include/linux/virtio_console.h
>  F:   include/uapi/linux/virtio_console.h
>  
> -VIRTIO CORE, NET AND BLOCK DRIVERS
> +VIRTIO CORE AND NET DRIVERS
>  M:   "Michael S. Tsirkin" 
>  M:   Jason Wang 
>  L:   virtualization@lists.linux-foundation.org
> @@ -16262,6 +16262,18 @@ F:   include/uapi/linux/virtio_*.h
>  F:   drivers/crypto/virtio/
>  F:   mm/balloon_compaction.c
>  
> +VIRTIO BLOCK AND SCSI DRIVERS
> +M:   "Michael S. Tsirkin" 
> +M:   Paolo Bonzini 
> +M:   Stefan Hajnoczi 

Please make this R at least for me, so that it's clear that patches
still flow through your and Jason's tree.  Not sure if you want to keep
Jason as M.

Paolo

> +L:   virtualization@lists.linux-foundation.org
> +S:   Maintained
> +F:   drivers/block/virtio_blk.c
> +F:   drivers/scsi/virtio_scsi.c
> +F:   include/uapi/linux/virtio_blk.h
> +F:   include/uapi/linux/virtio_scsi.h
> +F:   drivers/vhost/scsi.c
> +
>  VIRTIO CRYPTO DRIVER
>  M:   Gonglei 
>  L:   virtualization@lists.linux-foundation.org
> 



Re: [PATCH v3 5/5] drm/virtio: rework resource creation workflow.

2019-03-27 Thread Noralf Trønnes


Den 18.03.2019 12.33, skrev Gerd Hoffmann:
> This patch moves the virtio_gpu_cmd_create_resource() call (which
> notifies the host about the new resource created) into the
> virtio_gpu_object_create() function.  That way we can call
> virtio_gpu_cmd_create_resource() before ttm_bo_init(), so the host
> already knows about the object when ttm initializes the object and calls
> our driver callbacks.
> 
> Specifically the object is already created when the
> virtio_gpu_ttm_tt_bind() callback invokes virtio_gpu_object_attach(),
> so the extra virtio_gpu_object_attach() calls done after
> virtio_gpu_object_create() are not needed any more.
> 
> The fence support for the create ioctl becomes a bit more tricky though.
> The code moved into virtio_gpu_object_create() too.  We first submit the
> (fenced) virtio_gpu_cmd_create_resource() command, then initialize the
> ttm object, and finally attach just created object to the fence for the
> command in case it didn't finish yet.
> 
> Signed-off-by: Gerd Hoffmann 
> ---

Acked-by: Noralf Trønnes 

Re: [PATCH v3 2/5] drm/virtio: use struct to pass params to virtio_gpu_object_create()

2019-03-27 Thread Noralf Trønnes


Den 18.03.2019 12.33, skrev Gerd Hoffmann:
> Create virtio_gpu_object_params, use that to pass object parameters to
> virtio_gpu_object_create.  This is just the first step, followup patches
> will add more parameters to the struct.  The plan is to use the struct
> for all object parameters.
> 
> Drop unused "kernel" parameter for virtio_gpu_alloc_object(), it is
> unused and always false.
> 
> Also drop "pinned" parameter.  virtio-gpu doesn't shuffle around
> objects, so effecively they all are pinned anyway.  Hardcode
> TTM_PL_FLAG_NO_EVICT so ttm knows.  Doesn't change much for the moment
> as virtio-gpu supports TTM_PL_FLAG_TT only so there is no opportunity to
> move around objects.  That'll probably change in the future though.
> 
> Signed-off-by: Gerd Hoffmann 
> ---

Acked-by: Noralf Trønnes 

[PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi

2019-03-27 Thread Michael S. Tsirkin
Jason doesn't really have the time to review blk/scsi
patches. Paolo and Stefan agreed to help out.

Thanks guys!

Signed-off-by: Michael S. Tsirkin 

---

diff --git a/MAINTAINERS b/MAINTAINERS
index 95a5ebecd04f..8326d19c1681 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16247,7 +16247,7 @@ F:  drivers/char/virtio_console.c
 F: include/linux/virtio_console.h
 F: include/uapi/linux/virtio_console.h
 
-VIRTIO CORE, NET AND BLOCK DRIVERS
+VIRTIO CORE AND NET DRIVERS
 M: "Michael S. Tsirkin" 
 M: Jason Wang 
 L: virtualization@lists.linux-foundation.org
@@ -16262,6 +16262,18 @@ F: include/uapi/linux/virtio_*.h
 F: drivers/crypto/virtio/
 F: mm/balloon_compaction.c
 
+VIRTIO BLOCK AND SCSI DRIVERS
+M: "Michael S. Tsirkin" 
+M: Paolo Bonzini 
+M: Stefan Hajnoczi 
+L: virtualization@lists.linux-foundation.org
+S: Maintained
+F: drivers/block/virtio_blk.c
+F: drivers/scsi/virtio_scsi.c
+F: include/uapi/linux/virtio_blk.h
+F: include/uapi/linux/virtio_scsi.h
+F: drivers/vhost/scsi.c
+
 VIRTIO CRYPTO DRIVER
 M: Gonglei 
 L: virtualization@lists.linux-foundation.org


Re: [PATCH] drm/virtio: add virtio-gpu-features debugfs file.

2019-03-27 Thread Noralf Trønnes


Den 20.03.2019 09.36, skrev Gerd Hoffmann:
> This file prints which features the virtio-gpu device has.
> 
> Also add "virtio-gpu-" prefix to the existing fence file,
> to make clear this is a driver-specific debugfs file.
> 
> Signed-off-by: Gerd Hoffmann 
> ---

Acked-by: Noralf Trønnes 

Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces

2019-03-27 Thread Michael S. Tsirkin
On Tue, Mar 26, 2019 at 07:13:42PM -0700, Stephen Hemminger wrote:
> On Tue, 26 Mar 2019 19:48:13 -0400
> Si-Wei Liu  wrote:
> 
> > When a netdev appears through hot plug then gets enslaved by a failover
> > master that is already up and running, the slave will be opened
> > right away after getting enslaved. Today there's a race that userspace
> > (udev) may fail to rename the slave if the kernel (net_failover)
> > opens the slave earlier than when the userspace rename happens.
> > Unlike bond or team, the primary slave of failover can't be renamed by
> > userspace ahead of time, since the kernel initiated auto-enslavement is
> > unable to, or rather, is never meant to be synchronized with the rename
> > request from userspace.
> > 
> > As the failover slave interfaces are not designed to be operated
> > directly by userspace apps: IP configuration, filter rules with
> > regard to network traffic passing and etc., should all be done on master
> > interface. In general, userspace apps only care about the
> > name of master interface, while slave names are less important as long
> > as admin users can see reliable names that may carry
> > other information describing the netdev. For e.g., they can infer that
> > "ens3nsby" is a standby slave of "ens3", while for a
> > name like "eth0" they can't tell which master it belongs to.
> > 
> > Historically the name of IFF_UP interface can't be changed because
> > there might be admin script or management software that is already
> > relying on such behavior and assumes that the slave name can't be
> > changed once UP. But failover is special: with the in-kernel
> > auto-enslavement mechanism, the userspace expectation for device
> > enumeration and bring-up order is already broken. Previously initramfs
> > and various userspace config tools were modified to bypass failover
> > slaves because of auto-enslavement and duplicate MAC address. Similarly,
> > in case that users care about seeing reliable slave name, the new type
> > of failover slaves needs to be taken care of specifically in userspace
> > anyway.
> > 
> > It's less risky to lift up the rename restriction on failover slave
> > which is already UP. Although it's possible this change may potentially
> > break userspace component (most likely configuration scripts or
> > management software) that assumes slave name can't be changed while
> > UP, it's relatively a limited and controllable set among all userspace
> > components, which can be fixed specifically to listen for the rename
> > and/or link down/up events on failover slaves. Userspace component
> > interacting with slaves is expected to be changed to operate on failover
> > master interface instead, as the failover slave is dynamic in nature
> > which may come and go at any point.  The goal is to make the role of
> > failover slaves less relevant, and userspace components should only
> > deal with failover master in the long run.
> > 
> > Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> > Signed-off-by: Si-Wei Liu 
> > Reviewed-by: Liran Alon 
> 
> 
> Why do you need to do dev_close/dev_open which will bounce
> the link?

What we need is notify userspace that link went up/down.
close/open will do that but just sending notifications
would do that as well without playing with link states.
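
To make the distinction concrete, here is a userspace sketch of the policy
under discussion.  It is only a model: rename_allowed(), struct model_netdev
and the MODEL_* flag values are made up for illustration and are not the
kernel's definitions.  It separates the question "may the rename proceed?"
from "should userspace be notified?", and leaves the link state untouched.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values only; not the kernel's definitions. */
#define MODEL_IFF_UP			(1u << 0)
#define MODEL_IFF_FAILOVER_SLAVE	(1u << 1)

struct model_netdev {
	unsigned int flags;		/* administrative state bits */
	unsigned int priv_flags;	/* role bits such as "failover slave" */
};

/*
 * Returns 0 if the rename may proceed, -EBUSY otherwise.  *notify is set
 * when the interface is up, so the caller knows that down/up notifications
 * should be sent around the rename.
 */
static int rename_allowed(const struct model_netdev *dev, bool *notify)
{
	*notify = false;

	if (dev->flags & MODEL_IFF_UP) {
		/* Ordinary interfaces keep the historical restriction. */
		if (!(dev->priv_flags & MODEL_IFF_FAILOVER_SLAVE))
			return -EBUSY;
		/* Failover slaves may be renamed while up. */
		*notify = true;
	}
	return 0;
}
```

In this model a caller would emit the notifications itself when *notify is
true, instead of closing and reopening the device.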

-- 
MST


Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces

2019-03-27 Thread Jiri Pirko
Wed, Mar 27, 2019 at 12:48:13AM CET, si-wei@oracle.com wrote:
>When a netdev appears through hot plug then gets enslaved by a failover
>master that is already up and running, the slave will be opened
>right away after getting enslaved. Today there's a race that userspace
>(udev) may fail to rename the slave if the kernel (net_failover)
>opens the slave earlier than when the userspace rename happens.
>Unlike bond or team, the primary slave of failover can't be renamed by
>userspace ahead of time, since the kernel initiated auto-enslavement is
>unable to, or rather, is never meant to be synchronized with the rename
>request from userspace.
>
>As the failover slave interfaces are not designed to be operated
>directly by userspace apps: IP configuration, filter rules with
>regard to network traffic passing and etc., should all be done on master
>interface. In general, userspace apps only care about the
>name of master interface, while slave names are less important as long
>as admin users can see reliable names that may carry
>other information describing the netdev. For e.g., they can infer that
>"ens3nsby" is a standby slave of "ens3", while for a
>name like "eth0" they can't tell which master it belongs to.
>
>Historically the name of IFF_UP interface can't be changed because
>there might be admin script or management software that is already
>relying on such behavior and assumes that the slave name can't be
>changed once UP. But failover is special: with the in-kernel
>auto-enslavement mechanism, the userspace expectation for device
>enumeration and bring-up order is already broken. Previously initramfs
>and various userspace config tools were modified to bypass failover
>slaves because of auto-enslavement and duplicate MAC address. Similarly,
>in case that users care about seeing reliable slave name, the new type
>of failover slaves needs to be taken care of specifically in userspace
>anyway.
>
>It's less risky to lift up the rename restriction on failover slave
>which is already UP. Although it's possible this change may potentially
>break userspace component (most likely configuration scripts or
>management software) that assumes slave name can't be changed while
>UP, it's relatively a limited and controllable set among all userspace
>components, which can be fixed specifically to listen for the rename
>and/or link down/up events on failover slaves. Userspace component
>interacting with slaves is expected to be changed to operate on failover
>master interface instead, as the failover slave is dynamic in nature
>which may come and go at any point.  The goal is to make the role of
>failover slaves less relevant, and userspace components should only
>deal with failover master in the long run.
>
>Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
>Signed-off-by: Si-Wei Liu 
>Reviewed-by: Liran Alon 
>
>--
>v1 -> v2:
>- Drop configurable module parameter (Sridhar)
>
>v2 -> v3:
>- Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
>- Send down and up events around rename (Michael S. Tsirkin)
>---
> net/core/dev.c | 37 ++---
> 1 file changed, 34 insertions(+), 3 deletions(-)
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 722d50d..3e0cd80 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -1171,6 +1171,7 @@ int dev_get_valid_name(struct net *net, struct net_device *dev,
> int dev_change_name(struct net_device *dev, const char *newname)
> {
>   unsigned char old_assign_type;
>+  bool reopen_needed = false;
>   char oldname[IFNAMSIZ];
>   int err = 0;
>   int ret;
>@@ -1180,8 +1181,24 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   BUG_ON(!dev_net(dev));
> 
>   net = dev_net(dev);
>-  if (dev->flags & IFF_UP)
>-  return -EBUSY;
>+
>+  /* Allow failover slave to rename even when
>+   * it is up and running.
>+   *
>+   * Failover slaves are special, since userspace
>+   * might rename the slave after the interface
>+   * has been brought up and running due to
>+   * auto-enslavement.
>+   *
>+   * Failover users don't actually care about slave
>+   * name change, as they are only expected to operate
>+   * on master interface directly.
>+   */
>+  if (dev->flags & IFF_UP) {
>+  if (likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
>+  return -EBUSY;
>+  reopen_needed = true;
>+  }
> 
>   write_seqcount_begin(_rename_seq);
> 
>@@ -1198,6 +1215,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   return err;
>   }
> 
>+  if (reopen_needed)
>+  dev_close(dev);

Ugh. Don't dev_close/dev_open on name change.


>+
>   if (oldname[0] && !strchr(oldname, '%'))
>   netdev_info(dev, "renamed from %s\n", oldname);
> 
>@@ -1210,7 +1230,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   

[PATCH 2/2] scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids

2019-03-27 Thread Dongli Zhang
When tag_set->nr_maps is 1, the block layer limits the number of hw queues
by nr_cpu_ids. No matter how many hw queues are used by virtio-scsi, as it
has (tag_set->nr_maps == 1), it can use at most nr_cpu_ids hw queues.

In addition, specifically for pci scenario, when the 'num_queues' specified
by qemu is more than maxcpus, virtio-scsi would not be able to allocate
more than maxcpus vectors in order to have a vector for each queue. As a
result, it falls back into MSI-X with one vector for config and one shared
for queues.

Considering above reasons, this patch limits the number of hw queues used
by virtio-scsi by nr_cpu_ids.

Signed-off-by: Dongli Zhang 
---
 drivers/scsi/virtio_scsi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 8af0177..9c4a3e1 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -793,6 +793,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
 
/* We need to know how many queues before we allocate. */
num_queues = virtscsi_config_get(vdev, num_queues) ? : 1;
+   num_queues = min_t(unsigned int, nr_cpu_ids, num_queues);
 
num_targets = virtscsi_config_get(vdev, max_target) + 1;
 
-- 
2.7.4
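
For illustration, the two probe-time steps touched by the hunk above (the
fallback to one queue and the new clamp) can be written as a standalone
helper.  This is a sketch only: effective_hw_queues() is a hypothetical name,
and nr_cpu_ids is passed in as a plain argument rather than read from the
kernel.

```c
/* Hypothetical helper mirroring the fallback and the clamp in the probe path. */
static unsigned int effective_hw_queues(unsigned int configured,
					unsigned int nr_cpu_ids)
{
	/* A device that reports 0 queues is treated as having one. */
	unsigned int n = configured ? configured : 1;

	/* Never use more hw queues than there are possible CPUs. */
	return n < nr_cpu_ids ? n : nr_cpu_ids;
}
```

With num_queues=8 and four possible CPUs the helper yields 4, which matches
the number of entries under /sys/block/sda/mq in the cover letter.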



[PATCH 1/2] virtio-blk: limit number of hw queues by nr_cpu_ids

2019-03-27 Thread Dongli Zhang
When tag_set->nr_maps is 1, the block layer limits the number of hw queues
by nr_cpu_ids. No matter how many hw queues are used by virtio-blk, as it
has (tag_set->nr_maps == 1), it can use at most nr_cpu_ids hw queues.

In addition, specifically for pci scenario, when the 'num-queues' specified
by qemu is more than maxcpus, virtio-blk would not be able to allocate more
than maxcpus vectors in order to have a vector for each queue. As a result,
it falls back into MSI-X with one vector for config and one shared for
queues.

Considering above reasons, this patch limits the number of hw queues used
by virtio-blk by nr_cpu_ids.

Signed-off-by: Dongli Zhang 
---
 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 4bc083b..b83cb45 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -513,6 +513,8 @@ static int init_vq(struct virtio_blk *vblk)
if (err)
num_vqs = 1;
 
+   num_vqs = min_t(unsigned int, nr_cpu_ids, num_vqs);
+
vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
if (!vblk->vqs)
return -ENOMEM;
-- 
2.7.4



[PATCH 0/2] Limit number of hw queues by nr_cpu_ids for virtio-blk and virtio-scsi

2019-03-27 Thread Dongli Zhang
When tag_set->nr_maps is 1, the block layer limits the number of hw queues
by nr_cpu_ids. No matter how many hw queues are used by
virtio-blk/virtio-scsi, as they both have (tag_set->nr_maps == 1), they
can use at most nr_cpu_ids hw queues.

In addition, specifically for pci scenario, when the 'num-queues' specified
by qemu is more than maxcpus, virtio-blk/virtio-scsi would not be able to
allocate more than maxcpus vectors in order to have a vector for each
queue. As a result, they fall back into MSI-X with one vector for config
and one shared for queues.

Considering above reasons, this patch set limits the number of hw queues
used by nr_cpu_ids for both virtio-blk and virtio-scsi.
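
The vector accounting behind the two cases can be sketched as follows.  This
is a simplified model of the behaviour described above, not the driver code:
plan_vectors() and struct vector_plan are hypothetical names, and the rule
"per-queue vectors only if the request queues do not exceed nr_cpu_ids" is
an assumption that condenses the explanation in this cover letter.

```c
struct vector_plan {
	unsigned int vectors;	/* total MSI-X vectors in use */
	int per_queue;		/* 1 if each request queue has its own vector */
};

static struct vector_plan plan_vectors(unsigned int request_queues,
				       unsigned int other_queues,
				       unsigned int nr_cpu_ids)
{
	struct vector_plan p;

	if (request_queues <= nr_cpu_ids) {
		/* One for config, one per auxiliary queue, one per request queue. */
		p.vectors = 1 + other_queues + request_queues;
		p.per_queue = 1;
	} else {
		/* Fallback: one for config, one shared by all virtqueues. */
		p.vectors = 2;
		p.per_queue = 0;
	}
	return p;
}
```

For virtio-scsi (two auxiliary queues, control and event) with four possible
CPUs, 8 request queues give 2 vectors and 4 request queues give 7; for
virtio-blk (no auxiliary queues) 4 request queues give 5.  These counts agree
with the /proc/interrupts listings in the test results.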

-

Here is test result of virtio-scsi:

qemu cmdline:

-smp 2,maxcpus=4, \
-device virtio-scsi-pci,id=scsi0,num_queues=8, \
-device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0, \
-drive file=test.img,if=none,id=drive0

Although maxcpus=4 and num_queues=8, 4 queues are used while 2 interrupts
are allocated.

# cat /proc/interrupts
... ...
 24:  0  0   PCI-MSI 65536-edge  virtio0-config
 25:  0369   PCI-MSI 65537-edge  virtio0-virtqueues
... ...

# ls /sys/block/sda/mq/
0  1  2  3   --> 4 queues although qemu sets num_queues=8


With the patch set, there is a per-queue interrupt.

# cat /proc/interrupts
 24:  0  0   PCI-MSI 65536-edge  virtio0-config
 25:  0  0   PCI-MSI 65537-edge  virtio0-control
 26:  0  0   PCI-MSI 65538-edge  virtio0-event
 27:296  0   PCI-MSI 65539-edge  virtio0-request
 28:  0139   PCI-MSI 65540-edge  virtio0-request
 29:  0  0   PCI-MSI 65541-edge  virtio0-request
 30:  0  0   PCI-MSI 65542-edge  virtio0-request

# ls /sys/block/sda/mq
0  1  2  3

-

Here is test result of virtio-blk:

qemu cmdline:

-smp 2,maxcpus=4, \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,num-queues=8 \
-drive file=test.img,format=raw,if=none,id=drive-virtio-disk0

Although maxcpus=4 and num-queues=8, 4 queues are used while 2 interrupts
are allocated.

# cat /proc/interrupts
... ...
 24:  0  0   PCI-MSI 65536-edge  virtio0-config
 25:  0 65   PCI-MSI 65537-edge  virtio0-virtqueues
... ...

# ls /sys/block/vda/mq
0  1  2  3   --> 4 queues although qemu sets num-queues=8


With the patch set, there is a per-queue interrupt.

# cat /proc/interrupts
 24:  0  0   PCI-MSI 65536-edge  virtio0-config
 25: 64  0   PCI-MSI 65537-edge  virtio0-req.0
 26:  0  10290   PCI-MSI 65538-edge  virtio0-req.1
 27:  0  0   PCI-MSI 65539-edge  virtio0-req.2
 28:  0  0   PCI-MSI 65540-edge  virtio0-req.3

# ls /sys/block/vda/mq/
0  1  2  3


Reference: 
https://lore.kernel.org/lkml/e4afe4c5-0262-4500-aeec-60f30734b4fc@default/

Thank you very much!

Dongli Zhang
