Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-25 Thread Hillf Danton
On Fri, 25 Mar 2022 15:53:09 +0800 Jason Wang wrote:
> 
> Ok, Hillf, does this make sense for you? We want the issue to be fixed
> soon, it's near to our product release.

Feel free to go ahead - product release is important.

BR
Hillf


Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-24 Thread Hillf Danton
On Thu, 24 Mar 2022 16:20:34 +0800 Jason Wang wrote:
> On Thu, Mar 24, 2022 at 2:17 PM Michael S. Tsirkin  wrote:
> > On Thu, Mar 24, 2022 at 02:04:19PM +0800, Hillf Danton wrote:
> > > On Thu, 24 Mar 2022 10:34:09 +0800 Jason Wang wrote:
> > > > On Thu, Mar 24, 2022 at 8:54 AM Hillf Danton  wrote:
> > > > >
> > > > > On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> > > > > >
> > > > > > Yes, there will be no "infinite" loop, but since the loop is 
> > > > > > triggered
> > > > > > by userspace. It looks to me it will delay the flush/drain of the
> > > > > > workqueue forever which is still suboptimal.
> > > > >
> > > > > Usually it is barely possible to shoot two birds using a stone.
> > > > >
> > > > > Given the "forever", I am inclined to not running faster, hehe, though
> > > > > another cobble is to add another line in the loop checking if mvdev is
> > > > > unregistered, and for example make mvdev->cvq unready before 
> > > > > destroying
> > > > > workqueue.
> > > > >
> > > > > static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
> > > > > {
> > > > > 	struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
> > > > > 	struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
> > > > > 	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> > > > >
> > > > > 	mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
> > > > > 	destroy_workqueue(mvdev->wq);
> > > > > 	_vdpa_unregister_device(dev);
> > > > > 	mgtdev->ndev = NULL;
> > > > > }
> > > > >
> > > >
> > > > Yes, so we had
> > > >
> > > > 1) using a quota for re-requeue
> > > > 2) using something like
> > > >
> > > > while (READ_ONCE(cvq->ready)) {
> > > > ...
> > > > cond_resched();
> > > > }
> > > >
> > > > There should not be too much difference except we need to use
> > > > cancel_work_sync() instead of flush_work for 1).
> > > >
> > > > I would keep the code as is but if you stick I can change.
> > >
> > > No Sir I would not - I am simply not a fan of work requeue.
> > >
> > > Hillf
> >
> > I think I agree - requeue adds latency spikes under heavy load -
> > unfortunately, not measured by netperf but still important
> > for latency sensitive workloads. Checking a flag is cheaper.
> 
> Just spot another possible issue.
> 
> The workqueue will be used by another work to update the carrier
> (event_handler()). Using cond_resched() may still have unfair issue
> which blocks the carrier update for infinite time,

Then would you please specify the reason why mvdev->wq is single-threaded?
Given the requeue, the serialization of the two works is not strong anyway.
Otherwise an unbound WQ that can process works in parallel would be a cure
for the unfairness above.
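
For what it is worth, a minimal sketch of that unbound alternative (an
assumption on my side that nothing in mlx5_vdpa relies on the implicit
ordering the single-threaded queue provides, which is exactly the question
above):

	/* instead of create_singlethread_workqueue("mlx5_vdpa_wq") */
	mvdev->wq = alloc_workqueue("mlx5_vdpa_wq", WQ_UNBOUND, 0);
	if (!mvdev->wq)
		return -ENOMEM;

With an unbound queue the cvq kick work and the carrier-update work no
longer serialize on one kthread, so neither can starve the other.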

Thanks
Hillf


Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-24 Thread Hillf Danton
On Thu, 24 Mar 2022 10:34:09 +0800 Jason Wang wrote:
> On Thu, Mar 24, 2022 at 8:54 AM Hillf Danton  wrote:
> >
> > On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> > >
> > > Yes, there will be no "infinite" loop, but since the loop is triggered
> > > by userspace. It looks to me it will delay the flush/drain of the
> > > workqueue forever which is still suboptimal.
> >
> > Usually it is barely possible to shoot two birds using a stone.
> >
> > Given the "forever", I am inclined to not running faster, hehe, though
> > another cobble is to add another line in the loop checking if mvdev is
> > unregistered, and for example make mvdev->cvq unready before destroying
> > workqueue.
> >
> > static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
> > {
> > 	struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
> > 	struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
> > 	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> >
> > 	mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
> > 	destroy_workqueue(mvdev->wq);
> > 	_vdpa_unregister_device(dev);
> > 	mgtdev->ndev = NULL;
> > }
> >
> 
> Yes, so we had
> 
> 1) using a quota for re-requeue
> 2) using something like
> 
> while (READ_ONCE(cvq->ready)) {
> ...
> cond_resched();
> }
> 
> There should not be too much difference except we need to use
> cancel_work_sync() instead of flush_work for 1).
> 
> I would keep the code as is but if you stick I can change.

No Sir I would not - I am simply not a fan of work requeue.

Hillf


Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-23 Thread Hillf Danton
On Tue, 22 Mar 2022 09:59:14 +0800 Jason Wang wrote:
> 
> Yes, there will be no "infinite" loop, but since the loop is triggered
> by userspace. It looks to me it will delay the flush/drain of the
> workqueue forever which is still suboptimal.

Usually it is barely possible to shoot two birds with one stone.

Given the "forever", I am inclined not to run faster, hehe, though
another cobble is to add a line in the loop that checks whether mvdev is
unregistered, and for example make mvdev->cvq unready before destroying
the workqueue.

static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device *dev)
{
	struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
	struct mlx5_vdpa_dev *mvdev = to_mvdev(dev);
	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);

	mlx5_notifier_unregister(mvdev->mdev, &ndev->nb);
	destroy_workqueue(mvdev->wq);
	_vdpa_unregister_device(dev);
	mgtdev->ndev = NULL;
}
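
For illustration only, a sketch of the loop-side check hinted at above; the
use of cvq->ready and the exact placement are assumptions of mine, not
something taken from the posted patches:

	/* mlx5_vdpa_dev_del(): stop the handler before tearing down the wq */
	mvdev->cvq.ready = false;
	destroy_workqueue(mvdev->wq);

	/* mlx5_cvq_kick_handler(): bail out once the cvq has been shut down */
	while (true) {
		if (!READ_ONCE(cvq->ready))
			break;

		err = vringh_getdesc_iotlb(&cvq->vring, &cvq->riov, &cvq->wiov,
					   &cvq->head, GFP_ATOMIC);
		if (err <= 0)
			break;
		/* ... handle the control command as before ... */
	}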


Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-21 Thread Hillf Danton
On Mon, 21 Mar 2022 17:00:09 +0800 Jason Wang wrote:
> 
> Ok, speak too fast.

Frankly I have fun running faster five days a week.

> So you meant to add a cond_resched() in the loop?

Yes, it is one liner.
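
For reference, the one line would sit at the bottom of the handler's loop,
roughly like this (a sketch against the loop in the patch quoted below; only
the cond_resched() is new):

	while (true) {
		err = vringh_getdesc_iotlb(&cvq->vring, &cvq->riov, &cvq->wiov,
					   &cvq->head, GFP_ATOMIC);
		if (err <= 0)
			break;

		/* ... handle the control command as before ... */

		/* yield the CPU here instead of re-queueing the work */
		cond_resched();
	}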

Hillf


Re: [PATCH 1/2] vdpa: mlx5: prevent cvq work from hogging CPU

2022-03-21 Thread Hillf Danton
On Mon, 21 Mar 2022 14:04:28 +0800 Jason Wang wrote:
> A userspace triggerable infinite loop could happen in
> mlx5_cvq_kick_handler() if userspace keeps sending a huge amount of
> cvq requests.
> 
> Fixing this by introducing a quota and re-queue the work if we're out
> of the budget. While at it, using a per device workqueue to avoid on
> demand memory allocation for cvq.
> 
> Fixes: 5262912ef3cfc ("vdpa/mlx5: Add support for control VQ and MAC setting")
> Signed-off-by: Jason Wang 
> ---
>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 28 +++-
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index d0f91078600e..d5a6fb3f9c41 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -163,6 +163,7 @@ struct mlx5_vdpa_net {
>   u32 cur_num_vqs;
>   struct notifier_block nb;
>   struct vdpa_callback config_cb;
> + struct mlx5_vdpa_wq_ent cvq_ent;
>  };
>  
>  static void free_resources(struct mlx5_vdpa_net *ndev);
> @@ -1600,6 +1601,8 @@ static virtio_net_ctrl_ack handle_ctrl_mq(struct 
> mlx5_vdpa_dev *mvdev, u8 cmd)
>   return status;
>  }
>  
> +#define MLX5_CVQ_BUDGET 16
> +

This is not needed as given a single thread workqueue, a cond_resched()
can do the job in the worker context instead of requeue of work.

Hillf

>  static void mlx5_cvq_kick_handler(struct work_struct *work)
>  {
>   virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
> @@ -1609,17 +1612,17 @@ static void mlx5_cvq_kick_handler(struct work_struct 
> *work)
>   struct mlx5_control_vq *cvq;
>   struct mlx5_vdpa_net *ndev;
>   size_t read, write;
> - int err;
> + int err, n = 0;
>  
>   wqent = container_of(work, struct mlx5_vdpa_wq_ent, work);
>   mvdev = wqent->mvdev;
>   ndev = to_mlx5_vdpa_ndev(mvdev);
>   cvq = &mvdev->cvq;
>   if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_CTRL_VQ)))
> - goto out;
> + return;
>  
>   if (!cvq->ready)
> - goto out;
> + return;
>  
>   while (true) {
>   err = vringh_getdesc_iotlb(&cvq->vring, &cvq->riov, &cvq->wiov, &cvq->head,
> @@ -1653,9 +1656,13 @@ static void mlx5_cvq_kick_handler(struct work_struct 
> *work)
>  
>   if (vringh_need_notify_iotlb(&cvq->vring))
>   vringh_notify(&cvq->vring);
> +
> + n++;
> + if (n > MLX5_CVQ_BUDGET) {
> + queue_work(mvdev->wq, &wqent->work);
> + break;
> + }
>   }
> -out:
> - kfree(wqent);
>  }
>  
>  static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx)
> @@ -1663,7 +1670,6 @@ static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, 
> u16 idx)
>   struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>   struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>   struct mlx5_vdpa_virtqueue *mvq;
> - struct mlx5_vdpa_wq_ent *wqent;
>  
>   if (!is_index_valid(mvdev, idx))
>   return;
> @@ -1672,13 +1678,7 @@ static void mlx5_vdpa_kick_vq(struct vdpa_device 
> *vdev, u16 idx)
>   if (!mvdev->cvq.ready)
>   return;
>  
> - wqent = kzalloc(sizeof(*wqent), GFP_ATOMIC);
> - if (!wqent)
> - return;
> -
> - wqent->mvdev = mvdev;
> - INIT_WORK(&wqent->work, mlx5_cvq_kick_handler);
> - queue_work(mvdev->wq, >work);
> + queue_work(mvdev->wq, &ndev->cvq_ent.work);
>   return;
>   }
>  
> @@ -2668,6 +2668,8 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev 
> *v_mdev, const char *name,
>   if (err)
>   goto err_mr;
>  
> + ndev->cvq_ent.mvdev = mvdev;
> + INIT_WORK(&ndev->cvq_ent.work, mlx5_cvq_kick_handler);
>   mvdev->wq = create_singlethread_workqueue("mlx5_vdpa_wq");
>   if (!mvdev->wq) {
>   err = -ENOMEM;
> -- 
> 2.18.1


Re: [PATCH 7/8] vhost: use kernel_copy_process to check RLIMITs and inherit cgroups

2021-09-19 Thread Hillf Danton
On Thu, 16 Sep 2021 16:20:50 -0500 Mike Christie wrote:
>  
>  static int vhost_worker_create(struct vhost_dev *dev)
>  {
> + DECLARE_COMPLETION_ONSTACK(start_done);

Nit, cut it.

>   struct vhost_worker *worker;
>   struct task_struct *task;
> + char buf[TASK_COMM_LEN];
>   int ret;
>  
>   worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT);
> @@ -603,27 +613,30 @@ static int vhost_worker_create(struct vhost_dev *dev)
>   return -ENOMEM;
>  
>   dev->worker = worker;
> - worker->dev = dev;
>   worker->kcov_handle = kcov_common_handle();
>   init_llist_head(&worker->work_list);
>  
> - task = kthread_create(vhost_worker, worker, "vhost-%d", current->pid);
> - if (IS_ERR(task)) {
> - ret = PTR_ERR(task);
> + /*
> +  * vhost used to use the kthread API which ignores all signals by
> +  * default and the drivers expect this behavior. So we do not want to
> +  * ineherit the parent's signal handlers and set our worker to ignore
> +  * everything below.
> +  */
> + task = kernel_copy_process(vhost_worker, worker, NUMA_NO_NODE,
> +CLONE_FS|CLONE_CLEAR_SIGHAND, 0, 1);
> + if (IS_ERR(task))
>   goto free_worker;
> - }
>  
>   worker->task = task;
> - wake_up_process(task); /* avoid contributing to loadavg */
>  
> - ret = vhost_attach_cgroups(dev);
> - if (ret)
> - goto stop_worker;
> + snprintf(buf, sizeof(buf), "vhost-%d", current->pid);
> + set_task_comm(task, buf);
> +
> + ignore_signals(task);
>  
> + wake_up_new_task(task);
>   return 0;
>  
> -stop_worker:
> - kthread_stop(worker->task);
>  free_worker:
>   kfree(worker);
>   dev->worker = NULL;
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 102ce25e4e13..09748694cb66 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -25,11 +25,16 @@ struct vhost_work {
>   unsigned long   flags;
>  };
>  
> +enum {
> + VHOST_WORKER_FLAG_STOP,
> +};
> +
>  struct vhost_worker {
>   struct task_struct  *task;
> + struct completion   *exit_done;
>   struct llist_head   work_list;
> - struct vhost_dev*dev;
>   u64 kcov_handle;
> + unsigned long   flags;
>  };
>  
>  /* Poll a file (eventfd or socket) */
> -- 
> 2.25.1


Re: INFO: task hung in lock_sock_nested (2)

2020-02-24 Thread Hillf Danton


On Mon, 24 Feb 2020 11:08:53 +0100 Stefano Garzarella wrote:
> On Sun, Feb 23, 2020 at 03:50:25PM +0800, Hillf Danton wrote:
> > 
> > Seems like vsock needs a word to track lock owner in an attempt to
> > avoid trying to lock sock while the current is the lock owner.
> 
> Thanks for this possible solution.
> What about using sock_owned_by_user()?
> 
No need for vsock_locked() if that works.

> We should fix also hyperv_transport, because it could suffer from the same
> problem.
> 
You're right. My diff is at most for introducing vsk's lock owner.

> At this point, it might be better to call vsk->transport->release(vsk)
> always with the lock taken and remove it in the transports as in the
> following patch.
> 
> What do you think?
> 
Yes and ... please take a look at the output of grep

grep -n lock_sock linux/net/vmw_vsock/af_vsock.c

as it drove me mad.

> 
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index 9c5b2a91baad..a073d8efca33 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -753,20 +753,18 @@ static void __vsock_release(struct sock *sk, int level)
>   vsk = vsock_sk(sk);
>   pending = NULL; /* Compiler warning. */
>  
> - /* The release call is supposed to use lock_sock_nested()
> -  * rather than lock_sock(), if a sock lock should be acquired.
> -  */
> - if (vsk->transport)
> - vsk->transport->release(vsk);
> - else if (sk->sk_type == SOCK_STREAM)
> - vsock_remove_sock(vsk);
> -
>   /* When "level" is SINGLE_DEPTH_NESTING, use the nested
>* version to avoid the warning "possible recursive locking
>* detected". When "level" is 0, lock_sock_nested(sk, level)
>* is the same as lock_sock(sk).
>*/
>   lock_sock_nested(sk, level);
> +
> + if (vsk->transport)
> + vsk->transport->release(vsk);
> + else if (sk->sk_type == SOCK_STREAM)
> + vsock_remove_sock(vsk);
> +
>   sock_orphan(sk);
>   sk->sk_shutdown = SHUTDOWN_MASK;
>  
> diff --git a/net/vmw_vsock/hyperv_transport.c 
> b/net/vmw_vsock/hyperv_transport.c
> index 3492c021925f..510f25f4a856 100644
> --- a/net/vmw_vsock/hyperv_transport.c
> +++ b/net/vmw_vsock/hyperv_transport.c
> @@ -529,9 +529,7 @@ static void hvs_release(struct vsock_sock *vsk)
>   struct sock *sk = sk_vsock(vsk);
>   bool remove_sock;
>  
> - lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>   remove_sock = hvs_close_lock_held(vsk);
> - release_sock(sk);
>   if (remove_sock)
>   vsock_remove_sock(vsk);
>  }
> diff --git a/net/vmw_vsock/virtio_transport_common.c 
> b/net/vmw_vsock/virtio_transport_common.c
> index d9f0c9c5425a..f3c4bab2f737 100644
> --- a/net/vmw_vsock/virtio_transport_common.c
> +++ b/net/vmw_vsock/virtio_transport_common.c
> @@ -829,7 +829,6 @@ void virtio_transport_release(struct vsock_sock *vsk)
>   struct sock *sk = &vsk->sk;
>   bool remove_sock = true;
>  
> - lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>   if (sk->sk_type == SOCK_STREAM)
>   remove_sock = virtio_transport_close(vsk);
>  
> @@ -837,7 +836,6 @@ void virtio_transport_release(struct vsock_sock *vsk)
>   list_del(&pkt->list);
>   virtio_transport_free_pkt(pkt);
>   }
> - release_sock(sk);
>  
>   if (remove_sock)
>   vsock_remove_sock(vsk);
> 
> Thanks,
> Stefano



Re: INFO: task hung in lock_sock_nested (2)

2020-02-23 Thread Hillf Danton


On Sat, 22 Feb 2020 10:58:12 -0800
> syzbot found the following crash on:
> 
> HEAD commit:2bb07f4e tc-testing: updated tdc tests for basic filter
> git tree:   net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=122efdede0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=768cc3d3e277cc16
> dashboard link: https://syzkaller.appspot.com/bug?extid=731710996d79d0d58fbc
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14887de9e0
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=149eec81e0
> 
> The bug was bisected to:
> 
> commit 408624af4c89989117bb2c6517bd50b7708a2fcd
> Author: Stefano Garzarella 
> Date:   Tue Dec 10 10:43:06 2019 +
> 
> vsock: use local transport when it is loaded
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1011e27ee0
> final crash:https://syzkaller.appspot.com/x/report.txt?x=1211e27ee0
> console output: https://syzkaller.appspot.com/x/log.txt?x=1411e27ee0
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+731710996d79d0d58...@syzkaller.appspotmail.com
> Fixes: 408624af4c89 ("vsock: use local transport when it is loaded")
> 
> INFO: task syz-executor280:9768 blocked for more than 143 seconds.
>   Not tainted 5.6.0-rc1-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor280 D27912  9768   9766 0x
> Call Trace:
>  context_switch kernel/sched/core.c:3386 [inline]
>  __schedule+0x934/0x1f90 kernel/sched/core.c:4082
>  schedule+0xdc/0x2b0 kernel/sched/core.c:4156
>  __lock_sock+0x165/0x290 net/core/sock.c:2413
>  lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
>  virtio_transport_release+0xc4/0xd60 
> net/vmw_vsock/virtio_transport_common.c:832
>  vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
>  vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
>  __sys_connect_file+0x161/0x1c0 net/socket.c:1857
>  __sys_connect+0x174/0x1b0 net/socket.c:1874
>  __do_sys_connect net/socket.c:1885 [inline]
>  __se_sys_connect net/socket.c:1882 [inline]
>  __x64_sys_connect+0x73/0xb0 net/socket.c:1882
>  do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x440209
> Code: Bad RIP value.
> RSP: 002b:7ffdb9f67718 EFLAGS: 0246 ORIG_RAX: 002a
> RAX: ffda RBX: 004002c8 RCX: 00440209
> RDX: 0010 RSI: 2440 RDI: 0003
> RBP: 006ca018 R08: 004002c8 R09: 004002c8
> R10: 004002c8 R11: 0246 R12: 00401a90
> R13: 00401b20 R14:  R15: 
> 
> Showing all locks held in the system:
> 1 lock held by khungtaskd/951:
>  #0: 89bac240 (rcu_read_lock){}, at: 
> debug_show_all_locks+0x5f/0x279 kernel/locking/lockdep.c:5333
> 1 lock held by rsyslogd/9652:
>  #0: 8880a6533120 (>f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110 
> fs/file.c:821
> 2 locks held by getty/9742:
>  #0: 8880a693f090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061bb2e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9743:
>  #0: 88809f7a1090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061b72e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9744:
>  #0: 88809be3e090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061632e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9745:
>  #0: 88808eb1e090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061bf2e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9746:
>  #0: 88808d33a090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061732e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9747:
>  #0: 8880a6a0c090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061c32e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 2 locks held by getty/9748:
>  #0: 8880a6e4d090 (>ldisc_sem){}, at: ldsem_down_read+0x33/0x40 
> drivers/tty/tty_ldsem.c:340
>  #1: c900061332e0 (>atomic_read_lock){+.+.}, at: 
> n_tty_read+0x220/0x1bf0 drivers/tty/n_tty.c:2156
> 1 lock held by syz-executor280/9768:
>  #0: 8880987cb8d0 (sk_lock-AF_VSOCK){+.+.}, at: lock_sock 
> include/net/sock.h:1516 [inline]
>  #0: 8880987cb8d0 

Re: [Spice-devel] Xorg indefinitely hangs in kernelspace

2019-10-03 Thread Hillf Danton


On Thu, 3 Oct 2019 09:45:55 +0300 Jaak Ristioja wrote:
> On 30.09.19 16:29, Frediano Ziglio wrote:
> >   Why didn't you update bug at 
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1813620?
> > I know it can seem tedious but would help tracking it.
> 
> I suppose the lack on centralized tracking and handling of Linux kernel
> bugs is a delicate topic, so I don't want to rant much more on that.
> Updating that bug would tedious and time-consuming indeed, which is why
> I haven't done that. To be honest, I don't have enough time and motivation.

Give the diff below a go only when it is convenient and only if it makes
a bit of sense to you.

--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -110,6 +110,7 @@ int ttm_eu_reserve_buffers(struct ww_acq
ww_acquire_init(ticket, &reservation_ww_class);
 
list_for_each_entry(entry, list, head) {
+   bool lockon = false;
struct ttm_buffer_object *bo = entry->bo;
 
ret = __ttm_bo_reserve(bo, intr, (ticket == NULL), ticket);
@@ -150,6 +151,7 @@ int ttm_eu_reserve_buffers(struct ww_acq
dma_resv_lock_slow(bo->base.resv, ticket);
ret = 0;
}
+   lockon = !ret;
}
 
if (!ret && entry->num_shared)
@@ -157,6 +159,8 @@ int ttm_eu_reserve_buffers(struct ww_acq

entry->num_shared);
 
if (unlikely(ret != 0)) {
+   if (lockon)
+   dma_resv_unlock(bo->base.resv);
if (ret == -EINTR)
ret = -ERESTARTSYS;
if (ticket) {



Re: Xorg indefinitely hangs in kernelspace

2019-09-09 Thread Hillf Danton


On Mon, 9 Sep 2019 from Gerd Hoffmann 
>
> Hmm, I think the patch is wrong.

Hmm... it should have changed only the error path, leaving the locks
for drivers to release when the job is done with no error returned.

> As far I know it is the qxl drivers's
> job to call ttm_eu_backoff_reservation().

Like other drivers, qxl is currently doing the right thing.

> Doing that automatically in
> ttm will most likely break other ttm users.
>
You are right. They are responsible for doing the backoff if an error
happens while validating buffers afterwards.


--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -111,8 +111,10 @@ int ttm_eu_reserve_buffers(struct ww_acq
 
list_for_each_entry(entry, list, head) {
struct ttm_buffer_object *bo = entry->bo;
+   bool lockon;
 
ret = __ttm_bo_reserve(bo, intr, (ticket == NULL), ticket);
+   lockon = !ret;
if (!ret && unlikely(atomic_read(&bo->cpu_writers) > 0)) {
reservation_object_unlock(bo->resv);
 
@@ -151,6 +153,7 @@ int ttm_eu_reserve_buffers(struct ww_acq
ret = 0;
}
}
+   lockon = !ret;
 
if (!ret && entry->num_shared)
ret = reservation_object_reserve_shared(bo->resv,
@@ -163,6 +166,8 @@ int ttm_eu_reserve_buffers(struct ww_acq
ww_acquire_done(ticket);
ww_acquire_fini(ticket);
}
+   if (lockon)
+   ttm_eu_backoff_reservation_reverse(list, entry);
return ret;
}
 
--



Re: Xorg indefinitely hangs in kernelspace

2019-09-09 Thread Hillf Danton
Hi,

On Mon, 9 Sep 2019 from Gerd Hoffmann 
>
> Hmm, I think the patch is wrong.  As far I know it is the qxl drivers's
> job to call ttm_eu_backoff_reservation().  Doing that automatically in
> ttm will most likely break other ttm users.
>
Perhaps.

>So I guess the call is missing in the qxl driver somewhere, most likely
>in some error handling code path given that this bug is a relatively
>rare event.
>
>There is only a single ttm_eu_reserve_buffers() call in qxl.
>So how about this?
>
No preference either way if it is the right cure.

BTW a quick peek at the mainline tree shows that not every
ttm_eu_reserve_buffers() pairs with a ttm_eu_backoff_reservation(),
even with qxl left out of account.

Hillf

Re: [Spice-devel] Xorg indefinitely hangs in kernelspace

2019-09-06 Thread Hillf Danton
>From Frediano Ziglio 
>
> Where does it came this patch?

My fingers tapping the keyboard.

> Is it already somewhere?

No idea yet.

> Is it supposed to fix this issue?

It may do nothing else as far as I can tell.

> Does it affect some other card beside QXL?

Perhaps.



Re: Xorg indefinitely hangs in kernelspace

2019-09-05 Thread Hillf Danton


On Tue, 6 Aug 2019 21:00:10 +0300 From:   Jaak Ristioja 
> Hello!
> 
> I'm writing to report a crash in the QXL / DRM code in the Linux kernel.
> I originally filed the issue on LaunchPad and more details can be found
> there, although I doubt whether these details are useful.
> 
>   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1813620
> 
> I first experienced these issues with:
> 
> * Ubuntu 18.04 (probably kernel 4.15.something)
> * Ubuntu 18.10 (kernel 4.18.0-13)
> * Ubuntu 19.04 (kernel 5.0.0-13-generic)
> * Ubuntu 19.04 (mainline kernel 5.1-rc7)
> * Ubuntu 19.04 (mainline kernel 5.2.0-050200rc1-generic)
> 
> Here is the crash output from dmesg:
> 
> [354073.713350] INFO: task Xorg:920 blocked for more than 120 seconds.
> [354073.717755]   Not tainted 5.2.0-050200rc1-generic #201905191930
> [354073.722277] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [354073.738332] XorgD0   920854 0x00404004
> [354073.738334] Call Trace:
> [354073.738340]  __schedule+0x2ba/0x650
> [354073.738342]  schedule+0x2d/0x90
> [354073.738343]  schedule_preempt_disabled+0xe/0x10
> [354073.738345]  __ww_mutex_lock.isra.11+0x3e0/0x750
> [354073.738346]  __ww_mutex_lock_slowpath+0x16/0x20
> [354073.738347]  ww_mutex_lock+0x34/0x50
> [354073.738352]  ttm_eu_reserve_buffers+0x1f9/0x2e0 [ttm]
> [354073.738356]  qxl_release_reserve_list+0x67/0x150 [qxl]
> [354073.738358]  ? qxl_bo_pin+0xaa/0x190 [qxl]
> [354073.738359]  qxl_cursor_atomic_update+0x1b0/0x2e0 [qxl]
> [354073.738367]  drm_atomic_helper_commit_planes+0xb9/0x220 [drm_kms_helper]
> [354073.738371]  drm_atomic_helper_commit_tail+0x2b/0x70 [drm_kms_helper]
> [354073.738374]  commit_tail+0x67/0x70 [drm_kms_helper]
> [354073.738378]  drm_atomic_helper_commit+0x113/0x120 [drm_kms_helper]
> [354073.738390]  drm_atomic_commit+0x4a/0x50 [drm]
> [354073.738394]  drm_atomic_helper_update_plane+0xe9/0x100 [drm_kms_helper]
> [354073.738402]  __setplane_atomic+0xd3/0x120 [drm]
> [354073.738410]  drm_mode_cursor_universal+0x142/0x270 [drm]
> [354073.738418]  drm_mode_cursor_common+0xcb/0x220 [drm]
> [354073.738425]  ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
> [354073.738432]  drm_mode_cursor2_ioctl+0xe/0x10 [drm]
> [354073.738438]  drm_ioctl_kernel+0xb0/0x100 [drm]
> [354073.738440]  ? ___sys_recvmsg+0x16c/0x200
> [354073.738445]  drm_ioctl+0x233/0x410 [drm]
> [354073.738452]  ? drm_mode_cursor_ioctl+0x60/0x60 [drm]
> [354073.738454]  ? timerqueue_add+0x57/0x90
> [354073.738456]  ? enqueue_hrtimer+0x3c/0x90
> [354073.738458]  do_vfs_ioctl+0xa9/0x640
> [354073.738459]  ? fput+0x13/0x20
> [354073.738461]  ? __sys_recvmsg+0x88/0xa0
> [354073.738462]  ksys_ioctl+0x67/0x90
> [354073.738463]  __x64_sys_ioctl+0x1a/0x20
> [354073.738465]  do_syscall_64+0x5a/0x140
> [354073.738467]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [354073.738468] RIP: 0033:0x7ffad14d3417
> [354073.738472] Code: Bad RIP value.
> [354073.738472] RSP: 002b:7ffdd5679978 EFLAGS: 3246 ORIG_RAX:
> 0010
> [354073.738473] RAX: ffda RBX: 56428a474610 RCX:
> 7ffad14d3417
> [354073.738474] RDX: 7ffdd56799b0 RSI: c02464bb RDI:
> 000e
> [354073.738474] RBP: 7ffdd56799b0 R08: 0040 R09:
> 0010
> [354073.738475] R10: 003f R11: 3246 R12:
> c02464bb
> [354073.738475] R13: 000e R14:  R15:
> 56428a4721d0
> [354073.738511] INFO: task kworker/1:0:27625 blocked for more than 120 
> seconds.
> [354073.745154]   Not tainted 5.2.0-050200rc1-generic #201905191930
> [354073.751900] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [354073.762197] kworker/1:0 D0 27625  2 0x80004000
> [354073.762205] Workqueue: events qxl_client_monitors_config_work_func [qxl]
> [354073.762206] Call Trace:
> [354073.762211]  __schedule+0x2ba/0x650
> [354073.762214]  schedule+0x2d/0x90
> [354073.762215]  schedule_preempt_disabled+0xe/0x10
> [354073.762216]  __ww_mutex_lock.isra.11+0x3e0/0x750
> [354073.762217]  ? __switch_to_asm+0x34/0x70
> [354073.762218]  ? __switch_to_asm+0x40/0x70
> [354073.762219]  ? __switch_to_asm+0x40/0x70
> [354073.762220]  __ww_mutex_lock_slowpath+0x16/0x20
> [354073.762221]  ww_mutex_lock+0x34/0x50
> [354073.762235]  drm_modeset_lock+0x35/0xb0 [drm]
> [354073.762243]  drm_modeset_lock_all_ctx+0x5d/0xe0 [drm]
> [354073.762251]  drm_modeset_lock_all+0x5e/0xb0 [drm]
> [354073.762252]  qxl_display_read_client_monitors_config+0x1e1/0x370 [qxl]
> [354073.762254]  qxl_client_monitors_config_work_func+0x15/0x20 [qxl]
> [354073.762256]  process_one_work+0x20f/0x410
> [354073.762257]  worker_thread+0x34/0x400
> [354073.762259]  kthread+0x120/0x140
> [354073.762260]  ? process_one_work+0x410/0x410
> [354073.762261]  ? __kthread_parkme+0x70/0x70
> [354073.762262]  ret_from_fork+0x35/0x40
> 

--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -97,8 

Re: [PATCH V5 3/5] iommu/dma-iommu: Handle deferred devices

2019-08-16 Thread Hillf Danton


On Thu, 15 Aug 2019 12:09:41 +0100 Tom Murphy wrote:
> 
> Handle devices which defer their attach to the iommu in the dma-iommu api
> 
> Signed-off-by: Tom Murphy 
> ---
>  drivers/iommu/dma-iommu.c | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2712fbc68b28..906b7fa14d3c 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/crash_dump.h>
>  
>  struct iommu_dma_msi_page {
>   struct list_headlist;
> @@ -351,6 +352,21 @@ static int iommu_dma_init_domain(struct iommu_domain 
> *domain, dma_addr_t base,
>   return iova_reserve_iommu_regions(dev, domain);
>  }
>  
> +static int handle_deferred_device(struct device *dev,
> + struct iommu_domain *domain)
> +{
> + const struct iommu_ops *ops = domain->ops;
> +
> + if (!is_kdump_kernel())
> + return 0;
> +
> + if (unlikely(ops->is_attach_deferred &&
> + ops->is_attach_deferred(domain, dev)))
> + return iommu_attach_device(domain, dev);
> +
> + return 0;
> +}
> +
>  /**
>   * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU 
> API
>   *page flags.
> @@ -463,6 +479,9 @@ static dma_addr_t __iommu_dma_map(struct device *dev, 
> phys_addr_t phys,
>   size_t iova_off = iova_offset(iovad, phys);
>   dma_addr_t iova;
>  
> + if (unlikely(handle_deferred_device(dev, domain)))
> + return DMA_MAPPING_ERROR;
> +
>   size = iova_align(iovad, size + iova_off);
>  
>   iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);

iommu_map_atomic() is applied to __iommu_dma_map() in 2/5.
Is it an atomic context currently given the mutex_lock() in
iommu_attach_device()?
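
To make the concern concrete, one way to catch it at runtime would be a
might_sleep() annotation in the new helper - a sketch only, assuming the
mapping path really can be entered from atomic context:

	static int handle_deferred_device(struct device *dev,
					  struct iommu_domain *domain)
	{
		const struct iommu_ops *ops = domain->ops;

		if (!is_kdump_kernel())
			return 0;

		if (unlikely(ops->is_attach_deferred &&
			     ops->is_attach_deferred(domain, dev))) {
			/* iommu_attach_device() ends up in mutex_lock() */
			might_sleep();
			return iommu_attach_device(domain, dev);
		}

		return 0;
	}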



Re: INFO: rcu detected stall in vhost_worker

2019-07-27 Thread Hillf Danton



Fri, 26 Jul 2019 08:26:01 -0700 (PDT)

syzbot has bisected this bug to:

commit 0ecfebd2b52404ae0c54a878c872bb93363ada36
Author: Linus Torvalds 
Date:   Sun Jul 7 22:41:56 2019 +

 Linux 5.2

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=118810bfa0
start commit:   13bf6d6a Add linux-next specific files for 20190725
git tree:   linux-next
kernel config:  https://syzkaller.appspot.com/x/.config?x=8ae987d803395886
dashboard link: https://syzkaller.appspot.com/bug?extid=36e93b425cd6eb54fcc1
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=15112f3fa0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=131ab57860

Reported-by: syzbot+36e93b425cd6eb54f...@syzkaller.appspotmail.com
Fixes: 0ecfebd2b524 ("Linux 5.2")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -787,7 +787,6 @@ static void vhost_setup_uaddr(struct vho
  size_t size, bool write)
{
struct vhost_uaddr *addr = &vq->uaddrs[index];
-   spin_lock(&vq->mmu_lock);

addr->uaddr = uaddr;
addr->size = size;
@@ -797,7 +796,10 @@ static void vhost_setup_uaddr(struct vho
static void vhost_setup_vq_uaddr(struct vhost_virtqueue *vq)
{
spin_lock(&vq->mmu_lock);
-
+   /*
+* deadlock if managing to take mmu_lock again while
+* setting up uaddr
+*/
vhost_setup_uaddr(vq, VHOST_ADDR_DESC,
  (unsigned long)vq->desc,
  vhost_get_desc_size(vq, vq->num),
--



Re: Reminder: 2 open syzbot bugs in vhost subsystem

2019-07-24 Thread Hillf Danton


On Tue, 2 Jul 2019 13:30:07 +0800 Jason Wang wrote:
> On 2019/7/2 Eric Biggers wrote:
> > [This email was generated by a script.  Let me know if you have any 
> > suggestions
> > to make it better, or if you want it re-generated with the latest status.]
> >
> > Of the currently open syzbot reports against the upstream kernel, I've 
> > manually
> > marked 2 of them as possibly being bugs in the vhost subsystem.  I've listed
> > these reports below, sorted by an algorithm that tries to list first the 
> > reports
> > most likely to be still valid, important, and actionable.
> >
> > Of these 2 bugs, 1 was seen in mainline in the last week.
> >
> > If you believe a bug is no longer valid, please close the syzbot report by
> > sending a '#syz fix', '#syz dup', or '#syz invalid' command in reply to the
> > original thread, as explained at https://goo.gl/tpsmEJ#status
> >
> > If you believe I misattributed a bug to the vhost subsystem, please let me 
> > know,
> > and if possible forward the report to the correct people or mailing list.
> >
> > Here are the bugs:
> >
> > 
> > Title:  memory leak in vhost_net_ioctl
> > Last occurred:  0 days ago
> > Reported:   26 days ago
> > Branches:   Mainline
> > Dashboard link: 
> > https://syzkaller.appspot.com/bug?id=12ba349d7e26ccfe95317bc376e812ebbae2ee0f
> > Original thread:
> > https://lkml.kernel.org/lkml/188da1058a9c2...@google.com/T/#u
> >
> > This bug has a C reproducer.
> >
> > The original thread for this bug has received 4 replies; the last was 17 
> > days
> > ago.
> >
> > If you fix this bug, please add the following tag to the commit:
> >  Reported-by: syzbot+0789f0c7e45efd7bb...@syzkaller.appspotmail.com
> >
> > If you send any email or patch for this bug, please consider replying to the
> > original thread.  For the git send-email command to use, or tips on how to 
> > reply
> > if the thread isn't in your mailbox, see the "Reply instructions" at
> > https://lkml.kernel.org/r/188da1058a9c2...@google.com
> > 
> Cc Hillf who should had a fix for this.
> 
It could not have become a fix in any form without the great idea you
shared while reviewing the first version, Jason :)

> Hillf, would you please post a formal patch for this? (for -net)
> 
And feel free to do whatever is appropriate for fixing the reported
memory leak before I can earn a Tested-by.

--
Hillf



Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton


Hello Syzbot

On Fri, 14 Jun 2019 11:04:03 +0800 syzbot wrote:
>
>Hello,
>
>syzbot has tested the proposed patch but the reproducer still triggered crash:
>memory leak in batadv_tvlv_handler_register
>
>   484.626788][  T156] bond0 (unregistering): Releasing backup interface 
> bond_slave_1
>Warning: Permanently added '10.128.0.87' (ECDSA) to the list of known hosts.
>BUG: memory leak
>unreferenced object 0x88811d25c4c0 (size 64):
>   comm "softirq", pid 0, jiffies 4294943668 (age 434.830s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 e0 fc 5b 20 81 88 ff ff  ..[ 
> 00 00 00 00 00 00 00 00 20 91 15 83 ff ff ff ff   ...
>   backtrace:
> [<0045bc9d>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 
> [inline]
> [<0045bc9d>] slab_post_alloc_hook mm/slab.h:439 [inline]
> [<0045bc9d>] slab_alloc mm/slab.c:3326 [inline]
> [<0045bc9d>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
> [<197d773e>] kmalloc include/linux/slab.h:547 [inline]
> [<197d773e>] kzalloc include/linux/slab.h:742 [inline]
> [<197d773e>] batadv_tvlv_handler_register+0xae/0x140 
> net/batman-adv/tvlv.c:529
> [] batadv_tt_init+0x78/0x180 
> net/batman-adv/translation-table.c:4411
> [<8c50839d>] batadv_mesh_init+0x196/0x230 
> net/batman-adv/main.c:208
> [<1c5a74a3>] batadv_softif_init_late+0x1ca/0x220 
> net/batman-adv/soft-interface.c:861
> [<4e676cd1>] register_netdevice+0xbf/0x600 net/core/dev.c:8635
> [<5601497b>] __rtnl_newlink+0xaca/0xb30 net/core/rtnetlink.c:3199
> [] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3245
> [] rtnetlink_rcv_msg+0x178/0x4b0 
> net/core/rtnetlink.c:5214
> [<140451f6>] netlink_rcv_skb+0x61/0x170 
> net/netlink/af_netlink.c:2482
> [<237e38f7>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5232
> [<0d47c000>] netlink_unicast_kernel net/netlink/af_netlink.c:1307 
> [inline]
> [<0d47c000>] netlink_unicast+0x1ec/0x2d0 
> net/netlink/af_netlink.c:1333
> [<98503d79>] netlink_sendmsg+0x26a/0x480 
> net/netlink/af_netlink.c:1922
> [<9263e868>] sock_sendmsg_nosec net/socket.c:646 [inline]
> [<9263e868>] sock_sendmsg+0x54/0x70 net/socket.c:665
> [<7791ad47>] __sys_sendto+0x148/0x1f0 net/socket.c:1958
> [] __do_sys_sendto net/socket.c:1970 [inline]
> [] __se_sys_sendto net/socket.c:1966 [inline]
> [] __x64_sys_sendto+0x2a/0x30 net/socket.c:1966
>
>BUG: memory leak
>unreferenced object 0x8881024a3340 (size 64):
>   comm "softirq", pid 0, jiffies 4294943678 (age 434.730s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 e0 2c 66 04 81 88 ff ff  .,f.
> 00 00 00 00 00 00 00 00 20 91 15 83 ff ff ff ff   ...
>   backtrace:
> [<0045bc9d>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 
> [inline]
> [<0045bc9d>] slab_post_alloc_hook mm/slab.h:439 [inline]
> [<0045bc9d>] slab_alloc mm/slab.c:3326 [inline]
> [<0045bc9d>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
> [<197d773e>] kmalloc include/linux/slab.h:547 [inline]
> [<197d773e>] kzalloc include/linux/slab.h:742 [inline]
> [<197d773e>] batadv_tvlv_handler_register+0xae/0x140 
> net/batman-adv/tvlv.c:529
> [] batadv_tt_init+0x78/0x180 
> net/batman-adv/translation-table.c:4411
> [<8c50839d>] batadv_mesh_init+0x196/0x230 
> net/batman-adv/main.c:208
> [<1c5a74a3>] batadv_softif_init_late+0x1ca/0x220 
> net/batman-adv/soft-interface.c:861
> [<4e676cd1>] register_netdevice+0xbf/0x600 net/core/dev.c:8635
> [<5601497b>] __rtnl_newlink+0xaca/0xb30 net/core/rtnetlink.c:3199
> [] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3245
> [] rtnetlink_rcv_msg+0x178/0x4b0 
> net/core/rtnetlink.c:5214
> [<140451f6>] netlink_rcv_skb+0x61/0x170 
> net/netlink/af_netlink.c:2482
> [<237e38f7>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5232
> [<0d47c000>] netlink_unicast_kernel net/netlink/af_netlink.c:1307 
> [inline]
> [<0d47c000>] netlink_unicast+0x1ec/0x2d0 
> net/netlink/af_netlink.c:1333
> [<98503d79>] netlink_sendmsg+0x26a/0x480 
> net/netlink/af_netlink.c:1922
> [<9263e868>] sock_sendmsg_nosec net/socket.c:646 [inline]
> [<9263e868>] sock_sendmsg+0x54/0x70 net/socket.c:665
> [<7791ad47>] __sys_sendto+0x148/0x1f0 net/socket.c:1958
> [] __do_sys_sendto net/socket.c:1970 [inline]
> [] __se_sys_sendto net/socket.c:1966 [inline]
> [] __x64_sys_sendto+0x2a/0x30 net/socket.c:1966
>

Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton


Hello Syzbot

On Fri, 14 Jun 2019 11:04:03 +0800 syzbot wrote:
>
>Hello,
>
>syzbot has tested the proposed patch but the reproducer still triggered crash:
>memory leak in batadv_tvlv_handler_register
>
It is not the ubuf leak addressed in this thread - good news.
I will look into this new leak soon.

>   484.626788][  T156] bond0 (unregistering): Releasing backup interface 
> bond_slave_1
>Warning: Permanently added '10.128.0.87' (ECDSA) to the list of known hosts.
>BUG: memory leak
>unreferenced object 0x88811d25c4c0 (size 64):
>   comm "softirq", pid 0, jiffies 4294943668 (age 434.830s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 e0 fc 5b 20 81 88 ff ff  ..[ 
> 00 00 00 00 00 00 00 00 20 91 15 83 ff ff ff ff   ...
>   backtrace:
> [<0045bc9d>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 
> [inline]
> [<0045bc9d>] slab_post_alloc_hook mm/slab.h:439 [inline]
> [<0045bc9d>] slab_alloc mm/slab.c:3326 [inline]
> [<0045bc9d>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
> [<197d773e>] kmalloc include/linux/slab.h:547 [inline]
> [<197d773e>] kzalloc include/linux/slab.h:742 [inline]
> [<197d773e>] batadv_tvlv_handler_register+0xae/0x140 
> net/batman-adv/tvlv.c:529
> [] batadv_tt_init+0x78/0x180 
> net/batman-adv/translation-table.c:4411
> [<8c50839d>] batadv_mesh_init+0x196/0x230 
> net/batman-adv/main.c:208
> [<1c5a74a3>] batadv_softif_init_late+0x1ca/0x220 
> net/batman-adv/soft-interface.c:861
> [<4e676cd1>] register_netdevice+0xbf/0x600 net/core/dev.c:8635
> [<5601497b>] __rtnl_newlink+0xaca/0xb30 net/core/rtnetlink.c:3199
> [] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3245
> [] rtnetlink_rcv_msg+0x178/0x4b0 
> net/core/rtnetlink.c:5214
> [<140451f6>] netlink_rcv_skb+0x61/0x170 
> net/netlink/af_netlink.c:2482
> [<237e38f7>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5232
> [<0d47c000>] netlink_unicast_kernel net/netlink/af_netlink.c:1307 
> [inline]
> [<0d47c000>] netlink_unicast+0x1ec/0x2d0 
> net/netlink/af_netlink.c:1333
> [<98503d79>] netlink_sendmsg+0x26a/0x480 
> net/netlink/af_netlink.c:1922
> [<9263e868>] sock_sendmsg_nosec net/socket.c:646 [inline]
> [<9263e868>] sock_sendmsg+0x54/0x70 net/socket.c:665
> [<7791ad47>] __sys_sendto+0x148/0x1f0 net/socket.c:1958
> [] __do_sys_sendto net/socket.c:1970 [inline]
> [] __se_sys_sendto net/socket.c:1966 [inline]
> [] __x64_sys_sendto+0x2a/0x30 net/socket.c:1966
>
>BUG: memory leak
>unreferenced object 0x8881024a3340 (size 64):
>   comm "softirq", pid 0, jiffies 4294943678 (age 434.730s)
>   hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 e0 2c 66 04 81 88 ff ff  .,f.
> 00 00 00 00 00 00 00 00 20 91 15 83 ff ff ff ff   ...
>   backtrace:
> [<0045bc9d>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 
> [inline]
> [<0045bc9d>] slab_post_alloc_hook mm/slab.h:439 [inline]
> [<0045bc9d>] slab_alloc mm/slab.c:3326 [inline]
> [<0045bc9d>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
> [<197d773e>] kmalloc include/linux/slab.h:547 [inline]
> [<197d773e>] kzalloc include/linux/slab.h:742 [inline]
> [<197d773e>] batadv_tvlv_handler_register+0xae/0x140 
> net/batman-adv/tvlv.c:529
> [] batadv_tt_init+0x78/0x180 
> net/batman-adv/translation-table.c:4411
> [<8c50839d>] batadv_mesh_init+0x196/0x230 
> net/batman-adv/main.c:208
> [<1c5a74a3>] batadv_softif_init_late+0x1ca/0x220 
> net/batman-adv/soft-interface.c:861
> [<4e676cd1>] register_netdevice+0xbf/0x600 net/core/dev.c:8635
> [<5601497b>] __rtnl_newlink+0xaca/0xb30 net/core/rtnetlink.c:3199
> [] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3245
> [] rtnetlink_rcv_msg+0x178/0x4b0 
> net/core/rtnetlink.c:5214
> [<140451f6>] netlink_rcv_skb+0x61/0x170 
> net/netlink/af_netlink.c:2482
> [<237e38f7>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5232
> [<0d47c000>] netlink_unicast_kernel net/netlink/af_netlink.c:1307 
> [inline]
> [<0d47c000>] netlink_unicast+0x1ec/0x2d0 
> net/netlink/af_netlink.c:1333
> [<98503d79>] netlink_sendmsg+0x26a/0x480 
> net/netlink/af_netlink.c:1922
> [<9263e868>] sock_sendmsg_nosec net/socket.c:646 [inline]
> [<9263e868>] sock_sendmsg+0x54/0x70 net/socket.c:665
> [<7791ad47>] __sys_sendto+0x148/0x1f0 net/socket.c:1958
> [] __do_sys_sendto net/socket.c:1970 [inline]
> [] __se_sys_sendto 

Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton


Hello Syzbot

On Fri, 14 Jun 2019 02:26:02 +0800 syzbot wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer still triggered crash:
> memory leak in vhost_net_ioctl
>
Oh sorry for my poor patch.

> ANGE): hsr_slave_1: link becomes ready
> 2019/06/13 18:24:57 executed programs: 18
> BUG: memory leak
> unreferenced object 0x88811cbc6ac0 (size 64):
>comm "syz-executor.0", pid 7196, jiffies 4294943804 (age 14.770s)
>hex dump (first 32 bytes):
>  01 00 00 00 81 88 ff ff 00 00 00 00 82 88 ff ff  
>  d0 6a bc 1c 81 88 ff ff d0 6a bc 1c 81 88 ff ff  .j...j..
>backtrace:
>  [<6c752978>] kmemleak_alloc_recursive 
> include/linux/kmemleak.h:43 [inline]
>  [<6c752978>] slab_post_alloc_hook mm/slab.h:439 [inline]
>  [<6c752978>] slab_alloc mm/slab.c:3326 [inline]
>  [<6c752978>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
>  [] kmalloc include/linux/slab.h:547 [inline]
>  [] vhost_net_ubuf_alloc drivers/vhost/net.c:241 
> [inline]
>  [] vhost_net_set_backend drivers/vhost/net.c:1535 
> [inline]
>  [] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1717
>  [<700f02d7>] vfs_ioctl fs/ioctl.c:46 [inline]
>  [<700f02d7>] file_ioctl fs/ioctl.c:509 [inline]
>  [<700f02d7>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
>  [<9a0ec0a7>] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
>  [] __do_sys_ioctl fs/ioctl.c:720 [inline]
>  [] __se_sys_ioctl fs/ioctl.c:718 [inline]
>  [] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
>  [] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
>  [<8715c149>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> BUG: memory leak
> unreferenced object 0x88810b1365c0 (size 64):
>comm "syz-executor.2", pid 7193, jiffies 4294943823 (age 14.580s)
>hex dump (first 32 bytes):
>  01 00 00 00 81 88 ff ff 00 00 00 00 81 88 ff ff  
>  d0 65 13 0b 81 88 ff ff d0 65 13 0b 81 88 ff ff  .e...e..
>backtrace:
>  [<6c752978>] kmemleak_alloc_recursive 
> include/linux/kmemleak.h:43 [inline]
>  [<6c752978>] slab_post_alloc_hook mm/slab.h:439 [inline]
>  [<6c752978>] slab_alloc mm/slab.c:3326 [inline]
>
>  [] kmalloc include/linux/slab.h:547 [inline]
>  [] vhost_net_ubuf_alloc drivers/vhost/net.c:241 
> [inline]
>  [] vhost_net_set_backend drivers/vhost/net.c:1535 
> [inline]
>  [] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1717
>  [<700f02d7>] vfs_ioctl fs/ioctl.c:46 [inline]
>  [<700f02d7>] file_ioctl fs/ioctl.c:509 [inline]
>  [<700f02d7>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
>  [<9a0ec0a7>] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
>  [] __do_sys_ioctl fs/ioctl.c:720 [inline]
>  [] __se_sys_ioctl fs/ioctl.c:718 [inline]
>  [] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
>  [] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
>  [<8715c149>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> BUG: memory leak
> unreferenced object 0x88810be23700 (size 64):
>comm "syz-executor.3", pid 7194, jiffies 4294943823 (age 14.580s)
>hex dump (first 32 bytes):
>  01 00 00 00 00 00 00 00 00 00 00 00 00 c9 ff ff  
>  10 37 e2 0b 81 88 ff ff 10 37 e2 0b 81 88 ff ff  .7...7..
>backtrace:
>  [<6c752978>] kmemleak_alloc_recursive 
> include/linux/kmemleak.h:43 [inline]
>  [<6c752978>] slab_post_alloc_hook mm/slab.h:439 [inline]
>  [<6c752978>] slab_alloc mm/slab.c:3326 [inline]
>  [<6c752978>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
>  [] kmalloc include/linux/slab.h:547 [inline]
>  [] vhost_net_ubuf_alloc drivers/vhost/net.c:241 
> [inline]
>  [] vhost_net_set_backend drivers/vhost/net.c:1535 
> [inline]
>  [] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1717
>  [<700f02d7>] vfs_ioctl fs/ioctl.c:46 [inline]
>  [<700f02d7>] file_ioctl fs/ioctl.c:509 [inline]
>  [<700f02d7>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
>  [<9a0ec0a7>] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
>  [] __do_sys_ioctl fs/ioctl.c:720 [inline]
>  [] __se_sys_ioctl fs/ioctl.c:718 [inline]
>  [] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
>  [] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
>  [<8715c149>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> BUG: memory leak
> unreferenced 

Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton

Hello Dmitry

On Thu, 13 Jun 2019 20:12:06 +0800 Dmitry Vyukov wrote:
> On Thu, Jun 13, 2019 at 2:07 PM Hillf Danton  wrote:
> >
> > Hello Jason
> >
> > On Thu, 13 Jun 2019 17:10:39 +0800 Jason Wang wrote:
> > >
> > > This is basically a kfree(ubuf) after the second vhost_net_flush() in
> > > vhost_net_release().
> > >
> > Fairly good catch.
> >
> > > Could you please post a formal patch?
> > >
> > I'd like very much to do that; but I wont, I am afraid, until I collect a
> > Tested-by because of reproducer without a cutting edge.
>
> You can easily collect Tested-by from syzbot for any bug with a reproducer;)
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#testing-patches
>
Thank you for the light you are casting.

Here it goes.
--->8
From: Hillf Danton 
Subject: [PATCH] vhost: fix memory leak in vhost_net_release

syzbot found the following crash on:

HEAD commit:788a0249 Merge tag 'arc-5.2-rc4' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15dc9ea6a0
kernel config:  https://syzkaller.appspot.com/x/.config?x=d5c73825cbdc7326
dashboard link: https://syzkaller.appspot.com/bug?extid=0789f0c7e45efd7bb643
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10b31761a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=124892c1a0


audit: type=1400 audit(1559768703.229:36): avc:  denied  { map } for
pid=7116 comm="syz-executor330" path="/root/syz-executor330334897"
dev="sda1" ino=16461 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
executing program
executing program

BUG: memory leak
unreferenced object 0x88812421fe40 (size 64):
   comm "syz-executor330", pid 7117, jiffies 4294949245 (age 13.030s)
   hex dump (first 32 bytes):
 01 00 00 00 20 69 6f 63 00 00 00 00 64 65 76 2f   iocdev/
 50 fe 21 24 81 88 ff ff 50 fe 21 24 81 88 ff ff  P.!$P.!$
   backtrace:
 [<ae0c4ae0>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 
[inline]
 [<ae0c4ae0>] slab_post_alloc_hook mm/slab.h:439 [inline]
 [<ae0c4ae0>] slab_alloc mm/slab.c:3326 [inline]
 [<ae0c4ae0>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
 [<79ebab38>] kmalloc include/linux/slab.h:547 [inline]
 [<79ebab38>] vhost_net_ubuf_alloc drivers/vhost/net.c:241 [inline]
 [<79ebab38>] vhost_net_set_backend drivers/vhost/net.c:1534 
[inline]
 [<79ebab38>] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1716
 [<9f6204a2>] vfs_ioctl fs/ioctl.c:46 [inline]
 [<9f6204a2>] file_ioctl fs/ioctl.c:509 [inline]
 [<9f6204a2>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
 [<b45866de>] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
 [<dfb41eb8>] __do_sys_ioctl fs/ioctl.c:720 [inline]
 [<dfb41eb8>] __se_sys_ioctl fs/ioctl.c:718 [inline]
 [<dfb41eb8>] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
 [<49c1f547>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
 [<29cc8ca7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x88812421fa80 (size 64):
   comm "syz-executor330", pid 7130, jiffies 4294949755 (age 7.930s)
   hex dump (first 32 bytes):
 01 00 00 00 01 00 00 00 00 00 00 00 2f 76 69 72  /vir
 90 fa 21 24 81 88 ff ff 90 fa 21 24 81 88 ff ff  ..!$..!$
   backtrace:
 [<ae0c4ae0>] kmemleak_alloc_recursive  include/linux/kmemleak.h:55 
[inline]
 [<ae0c4ae0>] slab_post_alloc_hook mm/slab.h:439 [inline]
 [<ae0c4ae0>] slab_alloc mm/slab.c:3326 [inline]
 [<ae0c4ae0>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
 [<79ebab38>] kmalloc include/linux/slab.h:547 [inline]
 [<79ebab38>] vhost_net_ubuf_alloc drivers/vhost/net.c:241  [inline]
 [<79ebab38>] vhost_net_set_backend drivers/vhost/net.c:1534  
[inline]
 [<79ebab38>] vhost_net_ioctl+0xb43/0xc10  drivers/vhost/net.c:1716
 [<9f6204a2>] vfs_ioctl fs/ioctl.c:46 [inline]
 [<9f6204a2>] file_ioctl fs/ioctl.c:509 [inline]
 [<9f6204a2>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
 [<b45866de>] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
 [<dfb41eb8>] __do_sys_ioctl fs/ioctl.c:720 [inline]
 [<dfb41eb8>] __se_sys_ioctl fs/ioctl.c:718 [inline]
 [<dfb41eb8>] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:71

Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton


Hello Jason

On Thu, 13 Jun 2019 17:10:39 +0800 Jason Wang wrote:
> 
> This is basically a kfree(ubuf) after the second vhost_net_flush() in
> vhost_net_release().
> 
Fairly good catch.

> Could you please post a formal patch?
> 
I'd like very much to do that; but I won't, I am afraid, until I collect a
Tested-by, because the reproducer is without a cutting edge.

Thanks
Hillf



Re: memory leak in vhost_net_ioctl

2019-07-24 Thread Hillf Danton



On Wed, 05 Jun 2019 16:42:05 -0700 (PDT) syzbot wrote:

Hello,

syzbot found the following crash on:

HEAD commit:788a0249 Merge tag 'arc-5.2-rc4' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15dc9ea6a0
kernel config:  https://syzkaller.appspot.com/x/.config?x=d5c73825cbdc7326
dashboard link: https://syzkaller.appspot.com/bug?extid=0789f0c7e45efd7bb643
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=10b31761a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=124892c1a0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0789f0c7e45efd7bb...@syzkaller.appspotmail.com

audit: type=1400 audit(1559768703.229:36): avc:  denied  { map } for
pid=7116 comm="syz-executor330" path="/root/syz-executor330334897"  
dev="sda1" ino=16461 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023  
tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1

executing program
executing program
BUG: memory leak
unreferenced object 0x88812421fe40 (size 64):
   comm "syz-executor330", pid 7117, jiffies 4294949245 (age 13.030s)
   hex dump (first 32 bytes):
 01 00 00 00 20 69 6f 63 00 00 00 00 64 65 76 2f   iocdev/
 50 fe 21 24 81 88 ff ff 50 fe 21 24 81 88 ff ff  P.!$P.!$
   backtrace:
 [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 
[inline]
 [] slab_post_alloc_hook mm/slab.h:439 [inline]
 [] slab_alloc mm/slab.c:3326 [inline]
 [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
 [<79ebab38>] kmalloc include/linux/slab.h:547 [inline]
 [<79ebab38>] vhost_net_ubuf_alloc drivers/vhost/net.c:241 [inline]
 [<79ebab38>] vhost_net_set_backend drivers/vhost/net.c:1534 
[inline]
 [<79ebab38>] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1716
 [<9f6204a2>] vfs_ioctl fs/ioctl.c:46 [inline]
 [<9f6204a2>] file_ioctl fs/ioctl.c:509 [inline]
 [<9f6204a2>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
 [] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
 [] __do_sys_ioctl fs/ioctl.c:720 [inline]
 [] __se_sys_ioctl fs/ioctl.c:718 [inline]
 [] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
 [<49c1f547>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
 [<29cc8ca7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0x88812421fa80 (size 64):
   comm "syz-executor330", pid 7130, jiffies 4294949755 (age 7.930s)
   hex dump (first 32 bytes):
 01 00 00 00 01 00 00 00 00 00 00 00 2f 76 69 72  /vir
 90 fa 21 24 81 88 ff ff 90 fa 21 24 81 88 ff ff  ..!$..!$
   backtrace:
 [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 
[inline]
 [] slab_post_alloc_hook mm/slab.h:439 [inline]
 [] slab_alloc mm/slab.c:3326 [inline]
 [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
 [<79ebab38>] kmalloc include/linux/slab.h:547 [inline]
 [<79ebab38>] vhost_net_ubuf_alloc drivers/vhost/net.c:241 [inline]
 [<79ebab38>] vhost_net_set_backend drivers/vhost/net.c:1534 
[inline]
 [<79ebab38>] vhost_net_ioctl+0xb43/0xc10 drivers/vhost/net.c:1716
 [<9f6204a2>] vfs_ioctl fs/ioctl.c:46 [inline]
 [<9f6204a2>] file_ioctl fs/ioctl.c:509 [inline]
 [<9f6204a2>] do_vfs_ioctl+0x62a/0x810 fs/ioctl.c:696
 [] ksys_ioctl+0x86/0xb0 fs/ioctl.c:713
 [] __do_sys_ioctl fs/ioctl.c:720 [inline]
 [] __se_sys_ioctl fs/ioctl.c:718 [inline]
 [] __x64_sys_ioctl+0x1e/0x30 fs/ioctl.c:718
 [<49c1f547>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
 [<29cc8ca7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9



---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


Ignore my noise if you have no interest in the syzbot report.

After commit c38e39c378f46f ("vhost-net: fix use-after-free in
vhost_net_flush"), flush no longer frees the ubufs; it only waits until
the ubuf users disappear.
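
For reference, the two helpers in drivers/vhost/net.c differ only in the
final kfree(); this is from memory of the 5.2-era code, so treat it as a
sketch rather than an exact quote:

	static void vhost_net_ubuf_put_and_wait(struct vhost_net_ubuf_ref *ubufs)
	{
		vhost_net_ubuf_put(ubufs);
		wait_event(ubufs->wait, !atomic_read(&ubufs->refcount));
	}

	static void vhost_net_ubuf_put_wait_and_free(struct vhost_net_ubuf_ref *ubufs)
	{
		vhost_net_ubuf_put_and_wait(ubufs);
		kfree(ubufs);
	}

Only the second helper frees the ubufs, so a release path that stops at
put_and_wait leaks them.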

The following diff, in the hope that it may help you handle the memory leak,
makes flush able to free the ubufs in the file-release path.

Thanks
Hillf
---
drivers/vhost/net.c | 8 +++-
1 file changed, 7 insertions(+), 1