Re: [Qemu-devel] [PATCH v9 2/7] virtio-pmem: Add virtio pmem driver

2019-05-17 Thread Jakub Staroń via Qemu-devel
On 5/16/19 10:35 PM, Pankaj Gupta wrote:
> Can I take it your reviewed/acked-by? or tested-by tag? for the virtio patch :)

I don't feel that I have enough expertise to give the reviewed-by tag, but
you can take my acked-by + tested-by.

Acked-by: Jakub Staron 
Tested-by: Jakub Staron 

No kernel panics/stalls were encountered while testing these patches (v9) with
QEMU + xfstests. Some CPU stalls were encountered while testing with crosvm
instead of QEMU with xfstests (test generic/464), but there was no reproduction
with QEMU, so the fault may be on the side of crosvm.


The dump for the crosvm/xfstests stall:
[ 2504.175276] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2504.176681] rcu: 0-...!: (1 GPs behind) idle=9b2/1/0x4000 softirq=1089198/1089202 fqs=0 
[ 2504.178270] rcu: 2-...!: (1 ticks this GP) idle=cfe/1/0x4002 softirq=1055108/1055110 fqs=0 
[ 2504.179802] rcu: 3-...!: (1 GPs behind) idle=1d6/1/0x4002 softirq=1046798/1046802 fqs=0 
[ 2504.181215] rcu: 4-...!: (2 ticks this GP) idle=522/1/0x4002 softirq=1249063/1249064 fqs=0 
[ 2504.182625] rcu: 5-...!: (1 GPs behind) idle=6da/1/0x4000 softirq=1131036/1131047 fqs=0 
[ 2504.183955]  (detected by 3, t=0 jiffies, g=1232529, q=1370)
[ 2504.184762] Sending NMI from CPU 3 to CPUs 0:
[ 2504.186400] NMI backtrace for cpu 0
[ 2504.186401] CPU: 0 PID: 6670 Comm: 464 Not tainted 5.1.0+ #1
[ 2504.186401] Hardware name: ChromiumOS crosvm, BIOS 0 
[ 2504.186402] RIP: 0010:queued_spin_lock_slowpath+0x1c/0x1e0
[ 2504.186402] Code: e7 89 c8 f0 44 0f b1 07 39 c1 75 dc f3 c3 0f 1f 44 00 00 ba 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 f3 c3 f3 90  ec 81 fe 00 01 00 00 0f 84 ab 00 00 00 81 e6 00 ff ff ff 75 44
[ 2504.186403] RSP: 0018:c9003ee8 EFLAGS: 0002
[ 2504.186404] RAX: 0001 RBX: 0246 RCX: 00404044
[ 2504.186404] RDX: 0001 RSI: 0001 RDI: 8244a280
[ 2504.186405] RBP: 8244a280 R08: 000f4200 R09: 024709ed6c32
[ 2504.186405] R10:  R11: 0001 R12: 8244a280
[ 2504.186405] R13: 0009 R14: 0009 R15: 
[ 2504.186406] FS:  () GS:8880cc60() knlGS:
[ 2504.186406] CS:  0010 DS:  ES:  CR0: 80050033
[ 2504.186406] CR2: 7efd6b0f15d8 CR3: 0260a006 CR4: 00360ef0
[ 2504.186407] DR0:  DR1:  DR2: 
[ 2504.186407] DR3:  DR6: fffe0ff0 DR7: 0400
[ 2504.186407] Call Trace:
[ 2504.186408]  
[ 2504.186408]  _raw_spin_lock_irqsave+0x1d/0x30
[ 2504.186408]  rcu_core+0x3b6/0x740
[ 2504.186408]  ? __hrtimer_run_queues+0x133/0x280
[ 2504.186409]  ? recalibrate_cpu_khz+0x10/0x10
[ 2504.186409]  __do_softirq+0xd8/0x2e4
[ 2504.186409]  irq_exit+0xa3/0xb0
[ 2504.186410]  smp_apic_timer_interrupt+0x67/0x120
[ 2504.186410]  apic_timer_interrupt+0xf/0x20
[ 2504.186410]  
[ 2504.186410] RIP: 0010:unmap_page_range+0x47a/0x9b0
[ 2504.186411] Code: 0f 46 46 10 49 39 6e 18 49 89 46 10 48 89 e8 49 0f 43 46 18 41 80 4e 20 08 4d 85 c9 49 89 46 18 0f 84 68 ff ff ff 49 8b 51 08 <48> 8d 42 ff 83 e2 01 49 0f 44 c1 f6 40 18 01 75 38 48 ba ff 0f 00
[ 2504.186411] RSP: 0018:c900036cbcc8 EFLAGS: 0282 ORIG_RAX: ff13
[ 2504.186412] RAX:  RBX: 80003751d045 RCX: 0001
[ 2504.186413] RDX: ea0002e09288 RSI: 0269b000 RDI: 8880b6525e40
[ 2504.186413] RBP: 0269c000 R08:  R09: eadd4740
[ 2504.186413] R10: ea0001755700 R11: 8880cc62d120 R12: 02794000
[ 2504.186414] R13: 0269b000 R14: c900036cbdf0 R15: 8880572434d8
[ 2504.186414]  ? unmap_page_range+0x420/0x9b0
[ 2504.186414]  ? release_pages+0x175/0x390
[ 2504.186414]  unmap_vmas+0x7c/0xe0
[ 2504.186415]  exit_mmap+0xa4/0x190
[ 2504.186415]  mmput+0x3b/0x100
[ 2504.186415]  do_exit+0x276/0xc10
[ 2504.186415]  do_group_exit+0x35/0xa0
[ 2504.186415]  __x64_sys_exit_group+0xf/0x10
[ 2504.186416]  do_syscall_64+0x43/0x120
[ 2504.186416]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2504.186416] RIP: 0033:0x7efd6ae10618
[ 2504.186416] Code: Bad RIP value.
[ 2504.186417] RSP: 002b:7ffcac9bde38 EFLAGS: 0246 ORIG_RAX: 00e7
[ 2504.186417] RAX: ffda RBX:  RCX: 7efd6ae10618
[ 2504.186418] RDX:  RSI: 003c RDI: 
[ 2504.186418] RBP: 7efd6b0ed8e0 R08: 00e7 R09: ff98
[ 2504.186418] R10: 7ffcac9bddb8 R11: 0246 R12: 7efd6b0ed8e0
[ 2504.186419] R13: 7efd6b0f2c20 R14: 0060 R15: 0070e705
[ 2504.186421] NMI backtrace for cpu 3
[ 2504.226980] CPU: 3 PID: 6596 Comm: xfs_io Not tainted 5.1.0+ #1
[ 2504.227661] Hardware name: ChromiumOS crosvm, BIOS 0 
[ 2504.228261] Call Trace:
[ 

Re: [Qemu-devel] [PATCH v9 2/7] virtio-pmem: Add virtio pmem driver

2019-05-16 Thread Jakub Staroń via Qemu-devel
On 5/14/19 7:54 AM, Pankaj Gupta wrote:
> + if (!list_empty(&vpmem->req_list)) {
> + req_buf = list_first_entry(&vpmem->req_list,
> + struct virtio_pmem_request, list);
> + req_buf->wq_buf_avail = true;
> + wake_up(&req_buf->wq_buf);
> + list_del(&req_buf->list);
Yes, this change is the right one, thank you!

> +  /*
> +   * If virtqueue_add_sgs returns -ENOSPC then req_vq virtual
> +   * queue does not have free descriptor. We add the request
> +   * to req_list and wait for host_ack to wake us up when free
> +   * slots are available.
> +   */
> + while ((err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req,
> + GFP_ATOMIC)) == -ENOSPC) {
> +
> + dev_err(&vdev->dev, "failed to send command to virtio pmem" \
> + "device, no free slots in the virtqueue\n");
> + req->wq_buf_avail = false;
> + list_add_tail(&req->list, &vpmem->req_list);
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +
> + /* A host response results in "host_ack" getting called */
> + wait_event(req->wq_buf, req->wq_buf_avail);
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + }
> + err1 = virtqueue_kick(vpmem->req_vq);
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +
> + /*
> +  * virtqueue_add_sgs failed with error different than -ENOSPC, we can't
> +  * do anything about that.
> +  */
> + if (err || !err1) {
> + dev_info(&vdev->dev, "failed to send command to virtio pmem device\n");
> + err = -EIO;
> + } else {
> + /* A host response results in "host_ack" getting called */
> + wait_event(req->host_acked, req->done);
> + err = req->ret;

I confirm that the failures I was facing with the `-ENOSPC` error path are
not present in v9.

Best,
Jakub Staron



Re: [Qemu-devel] [PATCH v7 2/6] virtio-pmem: Add virtio pmem driver

2019-05-08 Thread Jakub Staroń via Qemu-devel
On 5/8/19 4:12 AM, Pankaj Gupta wrote:
> 
>>
>> On 4/25/19 10:00 PM, Pankaj Gupta wrote:
>>
>>> +void host_ack(struct virtqueue *vq)
>>> +{
>>> +   unsigned int len;
>>> +   unsigned long flags;
>>> +   struct virtio_pmem_request *req, *req_buf;
>>> +   struct virtio_pmem *vpmem = vq->vdev->priv;
>>> +
>>> +   spin_lock_irqsave(&vpmem->pmem_lock, flags);
>>> +   while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
>>> +   req->done = true;
>>> +   wake_up(&req->host_acked);
>>> +
>>> +   if (!list_empty(&vpmem->req_list)) {
>>> +   req_buf = list_first_entry(&vpmem->req_list,
>>> +   struct virtio_pmem_request, list);
>>> +   list_del(&vpmem->req_list);
>>
>> Shouldn't it rather be `list_del(vpmem->req_list.next)`? We are trying to
>> unlink the first element of the list, and `vpmem->req_list` is just the
>> list head.
> 
> This looks correct. We are not deleting the head but the first entry in
> 'req_list', which is the device's corresponding list of pending requests.
> 
> Please see below:
> 
> /**
>  * Retrieve the first list entry for the given list pointer.
>  *
>  * Example:
>  * struct foo *first;
>  * first = list_first_entry(&foo->list_of_foos, struct foo, list_of_foos);
>  *
>  * @param ptr The list head
>  * @param type Data type of the list element to retrieve
>  * @param member Member name of the struct list_head field in the list element.
>  * @return A pointer to the first list element.
>  */
> #define list_first_entry(ptr, type, member) \
> list_entry((ptr)->next, type, member)

Please look at this StackOverflow question:
https://stackoverflow.com/questions/19675419/deleting-first-element-of-a-list-h-list

The author asks about deleting the first element of a queue. In our case
(as in that question), `vpmem->req_list` is not embedded in any request
struct and is not itself an element of the list. It is just a list head
whose `next` and `prev` pointers point to the first and last elements of
the list, respectively. Since we want to unlink the first element of the
list, we need to pass a pointer to that first element to `list_del` - that
is, `vpmem->req_list.next`.
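
To illustrate what I mean, here is a minimal stand-alone sketch of the
distinction (the struct, variable and function names are made up for the
example; this is not code from the patch):

#include <linux/list.h>

struct my_request {
	int id;
	struct list_head list;		/* links this request into req_list */
};

static LIST_HEAD(req_list);		/* stand-alone head, embedded in no request */

static void drop_first_request(void)
{
	struct my_request *first;

	if (list_empty(&req_list))
		return;

	first = list_first_entry(&req_list, struct my_request, list);

	/*
	 * &first->list and req_list.next refer to the same node here, so
	 * either form unlinks the first element. Passing &req_list itself
	 * would unlink the head and corrupt the list.
	 */
	list_del(&first->list);
}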

Thank you,
Jakub Staron



Re: [Qemu-devel] [PATCH v7 2/6] virtio-pmem: Add virtio pmem driver

2019-05-07 Thread Jakub Staroń via Qemu-devel
On 4/25/19 10:00 PM, Pankaj Gupta wrote:

> +void host_ack(struct virtqueue *vq)
> +{
> + unsigned int len;
> + unsigned long flags;
> + struct virtio_pmem_request *req, *req_buf;
> + struct virtio_pmem *vpmem = vq->vdev->priv;
> +
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + while ((req = virtqueue_get_buf(vq, &len)) != NULL) {
> + req->done = true;
> + wake_up(&req->host_acked);
> +
> + if (!list_empty(&vpmem->req_list)) {
> + req_buf = list_first_entry(&vpmem->req_list,
> + struct virtio_pmem_request, list);
> + list_del(&vpmem->req_list);

Shouldn't it rather be `list_del(vpmem->req_list.next)`? We are trying to unlink
the first element of the list, and `vpmem->req_list` is just the list head.

> +int virtio_pmem_flush(struct nd_region *nd_region)
> +{
> + int err;
> + unsigned long flags;
> + struct scatterlist *sgs[2], sg, ret;
> + struct virtio_device *vdev = nd_region->provider_data;
> + struct virtio_pmem *vpmem = vdev->priv;
> + struct virtio_pmem_request *req;
> +
> + might_sleep();
> + req = kmalloc(sizeof(*req), GFP_KERNEL);
> + if (!req)
> + return -ENOMEM;
> +
> + req->done = req->wq_buf_avail = false;
> + strcpy(req->name, "FLUSH");
> + init_waitqueue_head(&req->host_acked);
> + init_waitqueue_head(&req->wq_buf);
> + sg_init_one(&sg, req->name, strlen(req->name));
> + sgs[0] = &sg;
> + sg_init_one(&ret, &req->ret, sizeof(req->ret));
> + sgs[1] = &ret;
> +
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
> + if (err) {
> + dev_err(&vdev->dev, "failed to send command to virtio pmem device\n");
> +
> + list_add_tail(&vpmem->req_list, &req->list);
> + spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
> +
> + /* When host has read buffer, this completes via host_ack */
> + wait_event(req->wq_buf, req->wq_buf_avail);
> + spin_lock_irqsave(&vpmem->pmem_lock, flags);
> + }

Aren't the arguments in `list_add_tail` swapped? The element we are adding
should come first and the list head second. Also, shouldn't we resubmit the
request after waking up from `wait_event(req->wq_buf, req->wq_buf_avail)`?
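
For reference, a minimal stand-alone sketch of the argument order I mean
(the `item` and `pending` names below are made up for the example):

#include <linux/list.h>

struct item {
	int val;
	struct list_head node;
};

static LIST_HEAD(pending);

static void enqueue(struct item *it)
{
	/*
	 * list_add_tail(new, head): the entry being added comes first,
	 * the list head it is appended to comes second.
	 */
	list_add_tail(&it->node, &pending);
}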

I propose rewriting it like that:

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 66b582f751a3..ff0556b04e86 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -25,7 +25,7 @@ void host_ack(struct virtqueue *vq)
 		if (!list_empty(&vpmem->req_list)) {
 			req_buf = list_first_entry(&vpmem->req_list,
 					struct virtio_pmem_request, list);
-			list_del(&vpmem->req_list);
+			list_del(vpmem->req_list.next);
 			req_buf->wq_buf_avail = true;
 			wake_up(&req_buf->wq_buf);
 		}
@@ -59,17 +59,33 @@ int virtio_pmem_flush(struct nd_region *nd_region)
 	sgs[1] = &ret;
 
 	spin_lock_irqsave(&vpmem->pmem_lock, flags);
-	err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC);
-	if (err) {
-		dev_err(&vdev->dev, "failed to send command to virtio pmem device\n");
+	/*
+	 * If virtqueue_add_sgs returns -ENOSPC then req_vq virtual queue does not
+	 * have free descriptor slots. We add the request to req_list and wait
+	 * for host_ack to wake us up when free slots are available.
+	 */
+	while ((err = virtqueue_add_sgs(vpmem->req_vq, sgs, 1, 1, req, GFP_ATOMIC)) == -ENOSPC) {
+		dev_err(&vdev->dev, "failed to send command to virtio pmem device, no free slots in the virtqueue, postponing request\n");
+		req->wq_buf_avail = false;
 
-		list_add_tail(&vpmem->req_list, &req->list);
+		list_add_tail(&req->list, &vpmem->req_list);
 		spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
 
 		/* When host has read buffer, this completes via host_ack */
 		wait_event(req->wq_buf, req->wq_buf_avail);
 		spin_lock_irqsave(&vpmem->pmem_lock, flags);
 	}
+
+	/*
+	 * virtqueue_add_sgs failed with error different than -ENOSPC, we can't
+	 * do anything about that.
+	 */
+	if (err) {
+		dev_info(&vdev->dev, "failed to send command to virtio pmem device, error code %d\n", err);
+		spin_unlock_irqrestore(&vpmem->pmem_lock, flags);
+		err = -EIO;
+		goto ret;
+	}
 	err = virtqueue_kick(vpmem->req_vq);
 	spin_unlock_irqrestore(&vpmem->pmem_lock, flags);


Let me know if it looks reasonable to you.

Thank you,
Jakub Staron




Re: [Qemu-devel] [PATCH v7 4/6] dax: check synchronous mapping is supported

2019-05-07 Thread Jakub Staroń via Qemu-devel
From: Pankaj Gupta 
Date: Thu, Apr 25, 2019 at 10:00 PM

> +static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
> +   struct dax_device *dax_dev)
> +{
> +   return !(vma->flags & VM_SYNC);
> +}

Shouldn't it rather be `return !(vma->vm_flags & VM_SYNC);`? There is
no field named `flags` in `struct vm_area_struct`.
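
For clarity, the version I had in mind would be just (a sketch of the
suggested one-line fix, nothing else changed):

static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
					struct dax_device *dax_dev)
{
	/* struct vm_area_struct names the field vm_flags, not flags */
	return !(vma->vm_flags & VM_SYNC);
}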

Thank you,
Jakub