Re: [PATCH v3 4/4] scsi: provide UAPI version of scsi/sg.h and scsi/scsi_ioctl.h

2015-09-25 Thread Bart Van Assche

On 09/25/2015 02:28 AM, Paolo Bonzini wrote:

Provide a UAPI version of the header in the kernel, making it easier
for interested projects to use an up-to-date version of the header.

The new headers are placed under uapi/linux/ so as not to conflict
with the glibc-provided headers in /usr/include/scsi.

/dev/sgN default values are implementation aspects, and are moved to
drivers/scsi/sg.c instead (together with e.g. SG_ALLOW_DIO_DEF).
However, SG_SCATTER_SZ is used by Wine so it is kept in linux/sg.h.
SG_MAX_QUEUE could also be useful.

struct scsi_ioctl_command and struct scsi_idlun used to be under
"#ifdef __KERNEL__", but they are actually useful for userspace as
well.  Add them to the new header.
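
As background, the sketch below shows the kind of userspace consumer the
commit message has in mind: it issues a TEST UNIT READY through the sg
driver using only definitions from the relocated header. It assumes the
UAPI copy is installed as <linux/sg.h> as described above; the device path
is illustrative and error handling is kept minimal.

/* Minimal sketch, not part of the patch: exercise SG_IO via the UAPI header. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/sg.h>   /* proposed UAPI location; glibc ships <scsi/sg.h> */

int main(void)
{
        unsigned char cdb[6] = { 0 };        /* TEST UNIT READY */
        unsigned char sense[32];
        struct sg_io_hdr hdr;
        int fd = open("/dev/sg0", O_RDWR);   /* illustrative device path */

        if (fd < 0) {
                perror("open");
                return 1;
        }

        memset(&hdr, 0, sizeof(hdr));
        hdr.interface_id = 'S';
        hdr.cmd_len = sizeof(cdb);
        hdr.cmdp = cdb;
        hdr.mx_sb_len = sizeof(sense);
        hdr.sbp = sense;
        hdr.dxfer_direction = SG_DXFER_NONE; /* no data transfer */
        hdr.timeout = 5000;                  /* milliseconds */

        if (ioctl(fd, SG_IO, &hdr) < 0)
                perror("SG_IO");
        else
                printf("SCSI status: 0x%x\n", hdr.status);

        close(fd);
        return 0;
}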

Cc: James Bottomley <jbottom...@parallels.com>
Cc: Christoph Hellwig <h...@lst.de>
Cc: linux-s...@vger.kernel.org
Cc: Bart Van Assche <bart.vanass...@sandisk.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>


Reviewed-by: Bart Van Assche <bart.vanass...@sandisk.com>



Re: [PATCH 1/2] vhost: Reduce vhost_work_flush() wakeup latency

2013-08-15 Thread Bart Van Assche

On 08/15/13 03:30, Asias He wrote:

On Wed, Aug 14, 2013 at 08:58:25PM +0300, Michael S. Tsirkin wrote:

On Wed, Aug 14, 2013 at 05:22:36PM +0200, Bart Van Assche wrote:

On 08/14/13 13:37, Michael S. Tsirkin wrote:

On Wed, Aug 14, 2013 at 09:02:32AM +0200, Bart Van Assche wrote:

If the TIF_NEED_RESCHED task flag is set, wake up any vhost_work_flush()
waiters before rescheduling instead of after rescheduling.

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Asias He <as...@redhat.com>


Why exactly? It's not like flush needs to be extra fast ...


I'm not worried about how fast a flush is processed either. But I
think this patch is a nice cleanup of the vhost_worker() function.
It eliminates the uninitialized_var() construct and moves the
assignment of done_seq below the read of queue_seq.

Bart.


I'm not worried about uninitialized_var - it's just a compiler bug.
done_seq below read is nice, but this is at the cost of sprinkling
lock/unlock, lock/unlock all over the code.  Maybe we can come up with
something that doesn't have this property.


The extra locking introduced here does not look good to me either.


Please note that although there are additional locking statements in the 
code, there are no additional locking calls at runtime.


Bart.



[PATCH 0/2] vhost_worker fixes

2013-08-14 Thread Bart Van Assche

There are two patches in this series:
* A patch that reduces vhost_work_flush() latency on busy systems.
* A patch that avoids that vhost_work_flush() returns early.

I found these issues via code inspection.


[PATCH 1/2] vhost: Reduce vhost_work_flush() wakeup latency

2013-08-14 Thread Bart Van Assche
If the TIF_NEED_RESCHED task flag is set, wake up any vhost_work_flush()
waiters before rescheduling instead of after rescheduling.

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Asias He <as...@redhat.com>
---
 drivers/vhost/vhost.c |   42 +++---
 1 file changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e58cf00..e7ffc10 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -201,47 +201,43 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 static int vhost_worker(void *data)
 {
         struct vhost_dev *dev = data;
-        struct vhost_work *work = NULL;
-        unsigned uninitialized_var(seq);
+        struct vhost_work *work;
+        unsigned seq;
         mm_segment_t oldfs = get_fs();
 
         set_fs(USER_DS);
         use_mm(dev->mm);
 
-        for (;;) {
+        spin_lock_irq(&dev->work_lock);
+        while (!kthread_should_stop()) {
                 /* mb paired w/ kthread_stop */
                 set_current_state(TASK_INTERRUPTIBLE);
-
-                spin_lock_irq(&dev->work_lock);
-                if (work) {
-                        work->done_seq = seq;
-                        if (work->flushing)
-                                wake_up_all(&work->done);
-                }
-
-                if (kthread_should_stop()) {
-                        spin_unlock_irq(&dev->work_lock);
-                        __set_current_state(TASK_RUNNING);
-                        break;
-                }
                 if (!list_empty(&dev->work_list)) {
                         work = list_first_entry(&dev->work_list,
                                                 struct vhost_work, node);
                         list_del_init(&work->node);
                         seq = work->queue_seq;
-                } else
-                        work = NULL;
-                spin_unlock_irq(&dev->work_lock);
+                        spin_unlock_irq(&dev->work_lock);
 
-                if (work) {
                         __set_current_state(TASK_RUNNING);
                         work->fn(work);
-                        if (need_resched())
-                                schedule();
-                } else
+
+                        spin_lock_irq(&dev->work_lock);
+                        work->done_seq = seq;
+                        if (work->flushing)
+                                wake_up_all(&work->done);
+                }
+                if (list_empty(&dev->work_list) || need_resched()) {
+                        spin_unlock_irq(&dev->work_lock);
+
                         schedule();
 
+                        spin_lock_irq(&dev->work_lock);
+                }
         }
+        spin_unlock_irq(&dev->work_lock);
+
+        __set_current_state(TASK_RUNNING);
         unuse_mm(dev->mm);
         set_fs(oldfs);
         return 0;
-- 
1.7.10.4



[PATCH 2/2] vhost: Avoid that vhost_work_flush() returns early

2013-08-14 Thread Bart Van Assche
If two or more items are queued on dev->work_list before
vhost_worker() starts processing them, the value of
work->done_seq will be set to the sequence number of a work item
that has not yet been processed. Avoid this by letting
vhost_worker() count the number of items that have already been
processed.
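
For context, the sketch below paraphrases (from memory, not verbatim) how
the flush side of drivers/vhost/vhost.c used queue_seq and done_seq at the
time; it shows why the value published in done_seq decides when a flush may
return.

/* Paraphrased sketch of the flush path (not verbatim kernel code). */
static int vhost_work_seq_done(struct vhost_dev *dev, struct vhost_work *work,
                               unsigned seq)
{
        int left;

        spin_lock_irq(&dev->work_lock);
        left = seq - work->done_seq;    /* signed math copes with wrap-around */
        spin_unlock_irq(&dev->work_lock);
        return left <= 0;
}

void vhost_work_flush(struct vhost_dev *dev, struct vhost_work *work)
{
        unsigned seq;
        int flushing;

        spin_lock_irq(&dev->work_lock);
        seq = work->queue_seq;          /* sequence the flusher waits for */
        work->flushing++;
        spin_unlock_irq(&dev->work_lock);
        wait_event(work->done, vhost_work_seq_done(dev, work, seq));
        spin_lock_irq(&dev->work_lock);
        flushing = --work->flushing;
        spin_unlock_irq(&dev->work_lock);
        BUG_ON(flushing < 0);
}

A flush therefore returns as soon as done_seq catches up with the queue_seq
sampled at flush time, which is why the value the worker publishes in
done_seq matters.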

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Asias He <as...@redhat.com>
---
 drivers/vhost/vhost.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e7ffc10..11d668a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -202,7 +202,7 @@ static int vhost_worker(void *data)
 {
         struct vhost_dev *dev = data;
         struct vhost_work *work;
-        unsigned seq;
+        unsigned seq = 0;
         mm_segment_t oldfs = get_fs();
 
         set_fs(USER_DS);
@@ -216,14 +216,13 @@ static int vhost_worker(void *data)
                         work = list_first_entry(&dev->work_list,
                                                 struct vhost_work, node);
                         list_del_init(&work->node);
-                        seq = work->queue_seq;
                         spin_unlock_irq(&dev->work_lock);
 
                         __set_current_state(TASK_RUNNING);
                         work->fn(work);
 
                         spin_lock_irq(&dev->work_lock);
-                        work->done_seq = seq;
+                        work->done_seq = ++seq;
                         if (work->flushing)
                                 wake_up_all(&work->done);
                 }
-- 
1.7.10.4



Re: [PATCH 2/2] vhost: Avoid that vhost_work_flush() returns early

2013-08-14 Thread Bart Van Assche

On 08/14/13 13:46, Michael S. Tsirkin wrote:

I'm confused by this explanation.
done_seq is set here:
        if (work) {
                work->done_seq = seq;
                if (work->flushing)
                        wake_up_all(&work->done);
        }

and work is set here:

        if (!list_empty(&dev->work_list)) {
                work = list_first_entry(&dev->work_list,
                                        struct vhost_work, node);
                list_del_init(&work->node);
                seq = work->queue_seq;
        }

this work is processed on the next line:

        if (work) {
                __set_current_state(TASK_RUNNING);
                work->fn(work);
                if (need_resched())
                        schedule();
        }

so how do we end up with a sequence of a work item
that isn't processed?


I was wondering what would happen if a work item got requeued before 
done_seq is set. I think you are right and this should work fine.


Bart.


Re: [PATCH 1/2] vhost: Reduce vhost_work_flush() wakeup latency

2013-08-14 Thread Bart Van Assche

On 08/14/13 13:37, Michael S. Tsirkin wrote:

On Wed, Aug 14, 2013 at 09:02:32AM +0200, Bart Van Assche wrote:

If the TIF_NEED_RESCHED task flag is set, wake up any vhost_work_flush()
waiters before rescheduling instead of after rescheduling.

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Asias He <as...@redhat.com>


Why exactly? It's not like flush needs to be extra fast ...


I'm not worried about how fast a flush is processed either. But I think 
this patch is a nice cleanup of the vhost_worker() function. It 
eliminates the uninitialized_var() construct and moves the assignment of 
done_seq below the read of queue_seq.


Bart.


[PATCH] vhost: Avoid that vhost_work_flush() locks up

2013-07-05 Thread Bart Van Assche
Wake up work->done waiters even if the TIF_NEED_RESCHED task flag
has been set. This patch fixes a regression introduced in commit
d550dda (kernel v3.4).

Signed-off-by: Bart Van Assche <bvanass...@acm.org>
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60505
Cc: Michael S. Tsirkin <m...@redhat.com>
Cc: Asias He <as...@redhat.com>
Cc: Nadav Har'El <n...@math.technion.ac.il>
Cc: Abel Gordon <ab...@il.ibm.com>
Cc: sta...@vger.kernel.org # v3.4+
---
 drivers/vhost/vhost.c |   10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 60aa5ad..cd544ae 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -227,8 +227,16 @@ static int vhost_worker(void *data)
                 if (work) {
                         __set_current_state(TASK_RUNNING);
                         work->fn(work);
-                        if (need_resched())
+                        if (need_resched()) {
+                                spin_lock_irq(&dev->work_lock);
+                                work->done_seq = seq;
+                                if (work->flushing)
+                                        wake_up_all(&work->done);
+                                spin_unlock_irq(&dev->work_lock);
+
+                                work = NULL;
                                 schedule();
+                        }
                 } else
                         schedule();
 
-- 
1.7.10.4



Re: [PATCH] vhost: Avoid that vhost_work_flush() locks up

2013-07-05 Thread Bart Van Assche

On 07/05/13 10:36, Bart Van Assche wrote:

Wake up work->done waiters even if the TIF_NEED_RESCHED task flag
has been set. This patch fixes a regression introduced in commit
d550dda (kernel v3.4).

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=60505


(replying to my own e-mail)

Please ignore this patch. Although it might help to wake up 
vhost_work_flush() earlier, the patch description does not match the 
patch itself and the patch does not fix the vhost_work_flush() lockup.


Bart.


Re: [PATCH V7 4/5] virtio-scsi: introduce multiqueue support

2013-03-25 Thread Bart Van Assche

On 03/23/13 12:28, Wanlong Gao wrote:

+static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
+                                               struct virtio_scsi_target_state *tgt)
+{
+        struct virtio_scsi_vq *vq;
+        unsigned long flags;
+        u32 queue_num;
+
+        spin_lock_irqsave(&tgt->tgt_lock, flags);
+
+        /*
+         * The memory barrier after atomic_inc_return matches
+         * the smp_read_barrier_depends() in virtscsi_req_done.
+         */
+        if (atomic_inc_return(&tgt->reqs) > 1)
+                vq = ACCESS_ONCE(tgt->req_vq);
+        else {
+                queue_num = smp_processor_id();
+                while (unlikely(queue_num >= vscsi->num_queues))
+                        queue_num -= vscsi->num_queues;
+
+                tgt->req_vq = vq = &vscsi->req_vqs[queue_num];
+        }
+
+        spin_unlock_irqrestore(&tgt->tgt_lock, flags);
+        return vq;
+}


Is there any reason why the smp_processor_id() % vscsi->num_queues 
computation in virtscsi_pick_vq() has been implemented as a loop instead 
of as an arithmetic operation ? If so, I would appreciate it if that 
could be explained in a comment.
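
As a side note, here is a tiny standalone check (illustrative only, not
taken from the patch) that the subtraction loop quoted above and a plain
modulo compute the same value:

/* Standalone illustration (not from the patch): the subtraction loop in
 * virtscsi_pick_vq() computes the same result as cpu % num_queues. */
#include <assert.h>
#include <stdio.h>

static unsigned pick_by_loop(unsigned cpu, unsigned num_queues)
{
        while (cpu >= num_queues)
                cpu -= num_queues;
        return cpu;
}

int main(void)
{
        unsigned cpu, num_queues;

        for (num_queues = 1; num_queues <= 8; num_queues++)
                for (cpu = 0; cpu < 256; cpu++)
                        assert(pick_by_loop(cpu, num_queues) ==
                               cpu % num_queues);
        printf("loop and modulo agree for all tested inputs\n");
        return 0;
}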


Thanks,

Bart.


Re: [PATCH 4/6] tcm_vhost: Initial merge for vhost level target fabric driver

2012-07-05 Thread Bart Van Assche
On 07/04/12 04:24, Nicholas A. Bellinger wrote:

 +/* Fill in status and signal that we are done processing this command
 + *
 + * This is scheduled in the vhost work queue so we are called with the owner
 + * process mm and can access the vring.
 + */
 +static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 +{


As far as I can see vhost_scsi_complete_cmd_work() runs in the context
of a work queue kernel thread and hence doesn't have an mm context. Did
I misunderstand something ?

Bart.


Re: [PATCH 0/6] tcm_vhost/virtio-scsi WIP code for-3.6

2012-07-05 Thread Bart Van Assche
On 07/05/12 01:52, Nicholas A. Bellinger wrote:

 fio randrw workload | virtio-scsi-raw | virtio-scsi+tcm_vhost | bare-metal raw block

 25 Write / 75 Read  | ~15K            | ~45K                  | ~70K
 75 Write / 25 Read  | ~20K            | ~55K                  | ~60K


These numbers are interesting. To me these numbers mean that there is a
huge performance bottleneck in the virtio-scsi-raw storage path. Why is
the virtio-scsi-raw bandwidth only one third of the bare-metal raw block
bandwidth ?

Bart.


Re: [PATCH 4/6] tcm_vhost: Initial merge for vhost level target fabric driver

2012-07-05 Thread Bart Van Assche
On 07/05/12 17:47, Bart Van Assche wrote:

 On 07/04/12 04:24, Nicholas A. Bellinger wrote:
 +/* Fill in status and signal that we are done processing this command
 + *
 + * This is scheduled in the vhost work queue so we are called with the owner
 + * process mm and can access the vring.
 + */
 +static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
 +{
 
 As far as I can see vhost_scsi_complete_cmd_work() runs in the context
 of a work queue kernel thread and hence doesn't have an mm context. Did
 I misunderstand something ?


Please ignore the above - I've found the answer in vhost_dev_ioctl() and
vhost_dev_set_owner().

Bart.


Re: [PATCH 0/6] tcm_vhost/virtio-scsi WIP code for-3.6

2012-07-05 Thread Bart Van Assche
On 07/05/12 17:53, Bart Van Assche wrote:

 On 07/05/12 01:52, Nicholas A. Bellinger wrote:
 fio randrw workload | virtio-scsi-raw | virtio-scsi+tcm_vhost | bare-metal raw block

 25 Write / 75 Read  | ~15K            | ~45K                  | ~70K
 75 Write / 25 Read  | ~20K            | ~55K                  | ~60K
 
 These numbers are interesting. To me these numbers mean that there is a
 huge performance bottleneck in the virtio-scsi-raw storage path. Why is
 the virtio-scsi-raw bandwidth only one third of the bare-metal raw block
 bandwidth ?


(replying to my own e-mail)

Or maybe the above numbers mean that in the virtio-scsi-raw test I/O was
serialized (I/O depth 1) while the other two tests use a large I/O depth
(64) ? It can't be a coincidence that the virtio-scsi-raw results are
close to the bare-metal results for I/O depth 1.

Another question: which functionality does tcm_vhost provide that is not
yet provided by the SCSI emulation code in qemu + tcm_loop ?

Bart.


Re: Pe: [PATCH v5 1/3] virtio-scsi: first version

2012-02-13 Thread Bart Van Assche
On Mon, Feb 13, 2012 at 8:05 AM, Christian Borntraeger
borntrae...@de.ibm.com wrote:
 On 12/02/12 21:16, James Bottomley wrote:
  Could someone please explain to me why you can't simply fix virtio-blk?

 I don't think that virtio-scsi will replace virtio-blk everywhere. For non-scsi
 block devices, image files or logical volumes, virtio-blk seems to be the right
 approach, I think.

  Or would virtio-blk maintainers give a reason why they're unwilling to
  have it fixed?

 I don't consider virtio-blk broken. It just doesn't cover everything.

Although I'm not sure whether that helps here: for about a year now
the upstream kernel has included software that allows any block device
or even a file to be used as a SCSI device.

Bart.


Re: [Scst-devel] BUG: Process hang when kvm crashed, stacktrace cfs+kvm

2012-01-22 Thread Bart Van Assche
On Sat, Jan 21, 2012 at 9:19 PM, kordex kor...@gmail.com wrote:
 I hit sysrq+t sysrq+p in order to print the tasks with stacktraces and
 registers. Using ps -e -o  pid,state,command will hang ps output,
 while ps -e -o pid,state will not.

Sorry, I haven't seen such a hang before. Does it also occur with
kernels older than 3.1 ? Is it reproducible with 3.3-rc1 ?

Bart.


Re: [Scst-devel] BUG: Process hang when kvm crashed, stacktrace cfs+kvm

2012-01-21 Thread Bart Van Assche
On Sat, Jan 21, 2012 at 2:58 PM, kordex kor...@gmail.com wrote:
 On Thu, Jan 19, 2012 at 21:12, kordex kor...@gmail.com wrote:
  On Thu, Jan 19, 2012 at 20:34, kordex kor...@gmail.com wrote:
  Full dmesg of the case attached.
  On Thu, Jan 19, 2012 at 13:12, Bart Van Assche bvanass...@acm.org wrote:
  On Thu, Jan 19, 2012 at 11:54 AM, kordex kor...@gmail.com wrote:
   root@teletex:/usr/src/linux# dmesg
   +0x12e7/0x1c50
    [810dc322] ? page_add_new_anon_rmap+0x52/0xa0
 
  The above output is truncated - the most interesting part is missing.

 Do I need to give you anything else? Any leads on what the problem
 might be? The machine is still running and in this state.

In the /proc/meminfo output I see that the system has 16 GB RAM and 33
GB swap memory. Also, MemFree + Buffers = 400 KB, which makes me
assume the system was under heavy memory pressure ?

Some essential information is still missing - the reason why the
kernel started spewing out all these call traces. Did someone hit the
SysRq key ? Is it because a task was hanging ? Was a BUG or WARNING
hit ?

Bart.


Re: [Scst-devel] BUG: Process hang when kvm crashed, stacktrace cfs+kvm

2012-01-19 Thread Bart Van Assche
On Thu, Jan 19, 2012 at 11:54 AM, kordex kor...@gmail.com wrote:
  root@teletex:/usr/src/linux# dmesg
  +0x12e7/0x1c50
   [810dc322] ? page_add_new_anon_rmap+0x52/0xa0

The above output is truncated - the most interesting part is missing.

Bart.