Re: [PATCH RFC 0/2] percpu_tags: Prototype implementation

2014-08-11 Thread Alexander Gordeev
On Fri, Jul 18, 2014 at 12:20:56PM +0200, Alexander Gordeev wrote:
 The performance test is not representative, though. I used fio random
 read against a null_blk device sitting on top of percpu_tags,
 which is not exactly how percpu_ida is used. This is another
 reason I am posting: advice on how to test this properly would
 be much appreciated.

Hi Nicholas et al,

I expect the best possible performance test for percpu_ida/percpu_tags
would be to stress the vhost_scsi_get_tag() function in drivers/vhost/scsi.c.

I tried to set up such a test by attaching a ramdisk to a virtual machine
(similar to https://lkml.org/lkml/2012/8/10/347) but ultimately failed
to configure the necessary environment: the stock qemu does not have
a -vhost-scsi parameter.

Could you please advise how to expose this configuration to guests?

o- / . [...]
  o- backstores .. [...]
  | o- block .. [Storage Objects: 0]
  | o- fileio . [Storage Objects: 0]
  | o- pscsi .. [Storage Objects: 0]
  | o- ramdisk  [Storage Objects: 1]
  |   o- rda .. [(1.0GiB) activated]
  o- iscsi  [Targets: 0]
  o- loopback . [Targets: 0]
  o- vhost  [Targets: 1]
    o- naa.5001405b171ee405 .. [TPGs: 1]
      o- tpg1 .. [naa.5001405983a5b1a4, no-gen-acls]
        o- acls .. [ACLs: 0]
        o- luns .. [LUNs: 1]
          o- lun0  [ramdisk/rda]

Thanks!
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 0/2] percpu_tags: Prototype implementation

2014-08-11 Thread Nicholas A. Bellinger
(Responding again without gmail, as the last email hit a failure when
responding to the lists..)

On Mon, 2014-08-11 at 16:17 -0400, Alexander Gordeev wrote:
 On Fri, Jul 18, 2014 at 12:20:56PM +0200, Alexander Gordeev wrote:
  The performance test is not representative, though. I used fio random
  read against a null_blk device sitting on top of percpu_tags,
  which is not exactly how percpu_ida is used. This is another
  reason I am posting: advice on how to test this properly would
  be much appreciated.
 
 Hi Nicholas et al,
 
 I expect the best possible performance test for percpu_ida/percpu_tags
 would be to stress the vhost_scsi_get_tag() function in drivers/vhost/scsi.c.
 
 I tried to set up such a test by attaching a ramdisk to a virtual machine
 (similar to https://lkml.org/lkml/2012/8/10/347) but ultimately failed
 to configure the necessary environment: the stock qemu does not have
 a -vhost-scsi parameter.
 
 Could you please advise how to expose this configuration to guests?
 
 o- / . [...]
   o- backstores .. [...]
   | o- block .. [Storage Objects: 0]
   | o- fileio . [Storage Objects: 0]
   | o- pscsi .. [Storage Objects: 0]
   | o- ramdisk  [Storage Objects: 1]
   |   o- rda .. [(1.0GiB) activated]
   o- iscsi  [Targets: 0]
   o- loopback . [Targets: 0]
   o- vhost  [Targets: 1]
     o- naa.5001405b171ee405 .. [TPGs: 1]
       o- tpg1 .. [naa.5001405983a5b1a4, no-gen-acls]
         o- acls .. [ACLs: 0]
         o- luns .. [LUNs: 1]
           o- lun0  [ramdisk/rda]
 

So qemu expects '-device vhost-scsi-pci' with the following syntax:

   -device vhost-scsi-pci,wwpn=naa.5001405b171ee405,num_queues=1,cmd_per_lun=64

For best results I'd recommend setting the IRQ affinity of each of the
virtio*_request MSI-X vectors to a dedicated vCPU in the KVM guest.
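
The affinity step could be scripted inside the guest roughly as follows.
This is only a sketch under assumptions: it assumes the request vectors
appear in /proc/interrupts under names matching virtio<N>-request (or
virtio<N>_request), and the helper names find_request_irqs(),
plan_affinity() and apply_affinity() are made up for illustration. Check
the actual vector names in your guest before relying on it.

```python
import re
from typing import Dict, List

# Assumption: request vectors are named like "virtio0-request" in
# /proc/interrupts; the separator may differ between kernels.
REQUEST_IRQ = re.compile(r"^\s*(\d+):.*\bvirtio\d+[-_]request\b")


def find_request_irqs(proc_interrupts: str) -> List[int]:
    """Return the IRQ numbers of all virtio request vectors found in the
    given /proc/interrupts text."""
    irqs = []
    for line in proc_interrupts.splitlines():
        match = REQUEST_IRQ.match(line)
        if match:
            irqs.append(int(match.group(1)))
    return irqs


def plan_affinity(irqs: List[int], cpus: List[int]) -> Dict[int, int]:
    """Map each IRQ to one CPU, round-robin over the given CPUs."""
    if not cpus:
        raise ValueError("need at least one CPU")
    return {irq: cpus[i % len(cpus)] for i, irq in enumerate(irqs)}


def apply_affinity(plan: Dict[int, int]) -> None:
    """Write the plan to /proc/irq/<irq>/smp_affinity_list (needs root)."""
    for irq, cpu in plan.items():
        with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as f:
            f.write(str(cpu))
```

To use it, read /proc/interrupts, pass the text to find_request_irqs(),
build a plan with one vCPU per vector, and call apply_affinity() as root.
Keeping the parsing and planning separate from the write makes it possible
to inspect the plan before touching /proc.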

Also, I've been using the scsi-mq prototype for small block I/O
performance testing in order to push vhost-scsi and avoid the legacy
scsi_request_fn() bottleneck(s) with virtio-scsi, and now that hch's
scsi-mq work (CC'ed) has been merged upstream in v3.17-rc0, it would be
a good time for a scsi-mq + virtio-scsi + vhost-scsi performance
checkpoint.  ;)

--nab



[PATCH RFC 0/2] percpu_tags: Prototype implementation

2014-07-18 Thread Alexander Gordeev
Hello Gentlemen,

This is a further development of the percpu_ida library. I named it
percpu_tags to simplify review, since most of the percpu_ida
code is gone and a diff would not be informative.

While the concept of per-cpu arrays is preserved, the
implementation is heavily reworked. The main objective is to
improve the percpu_ida locking scheme.

Here is the list of major differences between percpu_ida and
percpu_tags:

* The global freelist is gone. As a result, CPUs do not compete
  for the global lock;

* Long-running operations (scanning through a cpumask) are executed
  with local interrupts enabled;

* The percpu_ida::percpu_max_size limit is eliminated. Instead, the
  limit is determined dynamically: depending on how many CPUs
  are requesting tags, each CPU gets a fair share of the tag space;

* On free, an attempt is made to return a tag to the CPU it was
  allocated on. As a result, the locality of data associated with
  the tag is expected to benefit.
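
To make the points above concrete, here is a userspace toy model. It is
not the percpu_tags code and ignores locking and interrupts entirely. The
names fair_share() and ToyPercpuTags, the even initial split of tags, and
the "steal from the first non-empty CPU" scan are all assumptions made for
illustration only.

```python
from typing import Dict, List


def fair_share(nr_tags: int, nr_requesting_cpus: int) -> int:
    """Assumed policy: split the tag space evenly among the CPUs that are
    currently requesting tags, but never go below one tag per CPU."""
    if nr_tags <= 0 or nr_requesting_cpus <= 0:
        raise ValueError("both arguments must be positive")
    return max(1, nr_tags // nr_requesting_cpus)


class ToyPercpuTags:
    """Per-CPU tag pools with no global freelist (toy model)."""

    def __init__(self, nr_tags: int, nr_cpus: int) -> None:
        # Every tag starts out in some CPU's local pool.
        self.pools: List[List[int]] = [[] for _ in range(nr_cpus)]
        for tag in range(nr_tags):
            self.pools[tag % nr_cpus].append(tag)
        # Remembers which CPU each in-flight tag was allocated on.
        self.alloc_cpu: Dict[int, int] = {}

    def alloc(self, cpu: int) -> int:
        """Take a tag from the local pool; if it is empty, scan the other
        CPUs and take one from the first non-empty pool."""
        order = [cpu] + [c for c in range(len(self.pools)) if c != cpu]
        for victim in order:
            if self.pools[victim]:
                tag = self.pools[victim].pop()
                self.alloc_cpu[tag] = cpu
                return tag
        return -1  # tag space exhausted

    def free(self, tag: int) -> None:
        """Return the tag to the pool of the CPU it was allocated on."""
        cpu = self.alloc_cpu.pop(tag)
        self.pools[cpu].append(tag)
```

In this model a tag taken from another CPU's pool migrates to the
allocating CPU when it is freed, which is one way to read the last bullet.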

The code is largely raw and untested. The reason I am posting
is that the prototype implementation performs 2-3 times faster than
percpu_ida, so I would like to find out whether it is worth going
further with this approach or whether there is a no-go.

The performance test is not representative, though. I used fio random
read against a null_blk device sitting on top of percpu_tags,
which is not exactly how percpu_ida is used. This is another
reason I am posting: advice on how to test this properly would
be much appreciated.
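
For anyone wanting to run a similar test, a fio random-read invocation
could be assembled as sketched below. The exact job parameters used for
the numbers above are not stated here, so the device path /dev/nullb0,
block size, queue depth, job count and runtime are all placeholder
assumptions, and fio_randread_argv() is a made-up helper name.

```python
from typing import List


def fio_randread_argv(device: str = "/dev/nullb0",
                      jobs: int = 4,
                      iodepth: int = 32,
                      block_size: str = "4k",
                      runtime_s: int = 30) -> List[str]:
    """Build an fio command line for a random-read test against a block
    device. All default values are illustrative assumptions."""
    return [
        "fio",
        "--name=randread",
        f"--filename={device}",
        "--rw=randread",          # random reads only
        f"--bs={block_size}",
        "--ioengine=libaio",      # asynchronous I/O engine
        "--direct=1",             # bypass the page cache
        f"--iodepth={iodepth}",
        f"--numjobs={jobs}",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]
```

The returned list can be passed to subprocess.run() as root once the
null_blk module is loaded.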

The source code can be found at
https://github.com/a-gordeev/linux.git  percpu_tags-v0

Thanks!

Cc: linux-scsi@vger.kernel.org
Cc: qla2xxx-upstr...@qlogic.com
Cc: Nicholas Bellinger n...@daterainc.com
Cc: Kent Overstreet k...@daterainc.com
Cc: Michael S. Tsirkin m...@redhat.com

Alexander Gordeev (2):
  percpu_tags: Prototype implementation
  percpu_tags: Use percpu_tags instead of percpu_ida

 drivers/scsi/qla2xxx/qla_target.c        |    6 +-
 drivers/target/iscsi/iscsi_target_util.c |    6 +-
 drivers/target/target_core_transport.c   |    4 +-
 drivers/target/tcm_fc/tfc_cmd.c          |    8 +-
 drivers/vhost/scsi.c                     |    6 +-
 include/linux/percpu_tags.h              |   37 ++
 include/target/target_core_base.h        |    4 +-
 lib/Makefile                             |    2 +-
 lib/percpu_tags.c                        |  556 ++
 9 files changed, 611 insertions(+), 18 deletions(-)
 create mode 100644 include/linux/percpu_tags.h
 create mode 100644 lib/percpu_tags.c

-- 
1.7.7.6
