Re: [ewg] initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)

2013-05-03 Thread Or Gerlitz
On Thu, May 2, 2013 at 10:31 PM, Nicholas A. Bellinger
n...@linux-iscsi.org wrote:

 We used RAMDISK_MCP backend which was patched to act as NULL device, so
 we can test the raw iSER wire performance.

 Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be
 configured as a NULL device mode.

yep, that would be very helpful, so people can do that sort of testing
without hacks...

Or.


Re: [ewg] initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)

2013-05-02 Thread Or Gerlitz

On 30/04/2013 05:59, Nicholas A. Bellinger wrote:

Hello Linus!

Here are the target pending changes for the v3.10-rc1 merge window.

Please go ahead and pull from:

   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge

The highlights this round include:

  - Add fileio support for WRITE_SAME w/ UNMAP=1 discard (asias)
  - Add fileio support for UNMAP discard (asias)
  - Add tcm_vhost hotplug support to work with upstream QEMU
    vhost-scsi-pci code (asias + mst)
  - Check for aborted sequence in tcm_fc response path (mdr)
  - Add initial iscsit_transport support into iscsi-target code (nab)
  - Refactor iscsi-target RX PDU logic + export request PDU handling  (nab)
  - Refactor iscsi-target TX queue logic + export response PDU creation (nab)
  - Add new iSCSI Extensions for RDMA (iSER) target driver (Or + nab)

The biggest changes revolve around iscsi-target refactoring in order to
support the iser-target driver.  This includes the conversion of the
iscsi-target data-path to use modern se_cmd->cmd_kref counting, and
allowing transport independent aspects of RX/TX PDU request/response
handling to be shared across the existing traditional iscsi-target code
and the new iser-target code.


Hi Nic, everyone,

So the LIO iser target code is now merged into Linus' tree, and will be in 
kernel 3.10, exciting!


Here's some data on raw performance numbers we were able to get with the 
LIO iser code.


For a single initiator and a single LUN, with block sizes varying over the 
range 1KB, 2KB, ... 128KB, doing random reads:

1KB 227,870K
2KB 458,099K
4KB 909,761K
8KB 1,679,922K
16KB 3,233,753K
32KB 4,905,139K
64KB 5,294,873K
128KB 5,565,235K
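
As a rough cross-check, the figures above can be converted to I/O rates. This is only a sketch: it assumes the right-hand column is throughput in KB/s, which the message does not state, and the helper name `implied_iops` is made up for illustration. Under that assumption the 1KB row works out to about 228k IOPS, in line with the 230k IOPS quoted for one LUN.

```python
# Assumption: the "K" column is throughput in KB/s (unit not stated in the message).
RANDOM_READ_KBPS = {
    1: 227_870,
    2: 458_099,
    4: 909_761,
    8: 1_679_922,
    16: 3_233_753,
    32: 4_905_139,
    64: 5_294_873,
    128: 5_565_235,
}


def implied_iops(block_kb, throughput_kbps):
    """I/O operations per second implied by a throughput at a given block size."""
    return throughput_kbps / block_kb


for block_kb, kbps in RANDOM_READ_KBPS.items():
    print(f"{block_kb:>4}KB  {implied_iops(block_kb, kbps):>10,.0f} IOPS")
```

Read this way, the I/O rate stays roughly flat up to 4KB and then falls as the block size grows, while the throughput levels off at the large block sizes.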

When increasing the number of LUNs, still with a single initiator, for 
1KB random reads we get:


1 LUN  = 230k IOPS
2 LUNs = 420k IOPS
4 LUNs = 740k IOPS

When increasing the number of initiators, each having four LUNs, we get 
for 1KB random reads:


1 initiator  x 4 LUNs = 740k  IOPS
2 initiators x 4 LUNs = 1480k IOPS
3 initiators x 4 LUNs = 1570k IOPS

So all in all, things scale pretty nicely, and we observe a bottleneck
in the IOPS rate at around 1.6 million IOPS, so that's where there is room to improve...
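
To put a number on "scale pretty nicely", here is a small sketch that compares each measured rate with perfect linear scaling from the smallest configuration. The function name `scaling_efficiency` is made up for illustration; the inputs are the kIOPS figures listed above.

```python
def scaling_efficiency(results):
    """Map each count n to measured / (n * per-unit rate of the smallest count)."""
    base_n = min(results)
    per_unit = results[base_n] / base_n
    return {n: rate / (n * per_unit) for n, rate in results.items()}


lun_scaling = {1: 230, 2: 420, 4: 740}          # kIOPS, single initiator
initiator_scaling = {1: 740, 2: 1480, 3: 1570}  # kIOPS, four LUNs per initiator

for label, data in (("LUN(s)", lun_scaling), ("initiator(s)", initiator_scaling)):
    for n, efficiency in scaling_efficiency(data).items():
        print(f"{n} {label}: {efficiency:.0%} of linear")
```

With these inputs, two and four LUNs reach about 91% and 80% of linear, two initiators reach 100%, and three initiators drop to about 71%, which is where the ceiling near 1.6 million IOPS shows up.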

Here's the fio command line used by the initiators:

$ fio --cpumask=0xfc --rw=randread --bs=1k --numjobs=2 --iodepth=128 \
    --runtime=62 --time_based --size=1073741824k --loops=1 --ioengine=libaio \
    --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 \
    --norandommap --group_reporting --exitall \
    --name dev-sdb-randread-1k-2thr-libaio-128iodepth-62sec --filename=/dev/sdb
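
For anyone wanting to repeat the block-size sweep, the following sketch builds the same command line for each block size. It assumes that only `--bs` and the job name changed between runs (the message shows just the 1KB command), and the helpers `fio_randread_args` and `cpus_in_mask` are hypothetical names. It prints the commands and does not run fio.

```python
import shlex


def cpus_in_mask(mask):
    """CPUs selected by a bit mask, assuming bit 0 stands for CPU 0."""
    return [cpu for cpu in range(mask.bit_length()) if mask >> cpu & 1]


def fio_randread_args(block_kb, device="/dev/sdb", jobs=2, iodepth=128, runtime=62):
    """Argument list matching the command above, for one block size in KB."""
    dev_tag = device.strip("/").replace("/", "-")  # "/dev/sdb" -> "dev-sdb"
    name = (f"{dev_tag}-randread-{block_kb}k-{jobs}thr-"
            f"libaio-{iodepth}iodepth-{runtime}sec")
    return [
        "fio", "--cpumask=0xfc", "--rw=randread", f"--bs={block_kb}k",
        f"--numjobs={jobs}", f"--iodepth={iodepth}", f"--runtime={runtime}",
        "--time_based", "--size=1073741824k", "--loops=1", "--ioengine=libaio",
        "--direct=1", "--invalidate=1", "--fsync_on_close=1", "--randrepeat=1",
        "--norandommap", "--group_reporting", "--exitall",
        "--name", name, f"--filename={device}",
    ]


print("CPUs allowed by 0xfc:", cpus_in_mask(0xFC))
for block_kb in (1, 2, 4, 8, 16, 32, 64, 128):
    print(shlex.join(fio_randread_args(block_kb)))
```

Under the usual one-bit-per-CPU reading, the mask 0xfc leaves CPUs 0 and 1 out and allows CPUs 2 through 7.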


And some details on the setup:

The nodes are HP ProLiant DL380p Gen8 servers with the following CPU: Intel(R) 
Xeon(R) CPU E5-2650 0 @ 2.00GHz. Each has two NUMA nodes with eight cores each, 
32GB RAM and PCI Express gen3 x8; the HCA is a Mellanox ConnectX-3 with 
firmware 2.11.500.


The target node was running an upstream kernel and the initiators a RHEL 6.3 
kernel, all x86_64.


We used the RAMDISK_MCP backend, which was patched to act as a NULL device, so 
we can test the raw iSER wire performance.


Or.



Thanks to Or Gerlitz + Mellanox for supporting the iser-target development 
effort!

Thank you,

--nab

Andy Grover (2):
   target/iscsi: Remove chap_set_random()
   target/iscsi: Use ISCSI_LOGIN_CURRENT/NEXT_STAGE macros

Asias He (10):
   target/file: Add WRITE_SAME w/ UNMAP=1 emulation support
   target/file: Add UNMAP emulation support
   target/file: Add fd_do_unmap() helper
   target/iblock: Add iblock_do_unmap() helper
   target: Add sbc_execute_unmap() helper
   target/file: Set is_nonrot attribute
   tcm_vhost: Refactor the lock nesting rule
   tcm_vhost: Add hotplug/hotunplug support
   tcm_vhost: Add ioctl to get and set events missed flag
   tcm_vhost: Enable VIRTIO_SCSI_F_HOTPLUG

Jörn Engel (2):
   qla2xxx: Remove unused function
   target: Change default sense key of NOT_READY

Mark Rustad (1):
   tcm_fc: Check for aborted sequence

Nicholas Bellinger (9):
   target: Add export of target_get_sess_cmd symbol
   iscsi-target: Add iscsit_transport API template
   iscsi-target: Initial traditional TCP conversion to iscsit_transport
   iscsi-target: Add iser-target parameter keys + setup during login
   iscsi-target: Add per transport iscsi_cmd alloc/free
   iscsi-target: Refactor RX PDU logic + export request PDU handling
   iscsi-target: Refactor TX queue logic + export response PDU creation
   iscsi-target: Add iser network portal attribute
   iser-target: Add iSCSI Extensions for RDMA (iSER) target driver

Wei Yongjun (1):
   tcm_fc: using kfree_rcu() to simplify the code

 drivers/infiniband/Kconfig                 |    1 +
 drivers/infiniband/Makefile                |    1 +
 drivers/infiniband/ulp/isert/Kconfig       |    5 +
 drivers/infiniband/ulp/isert/Makefile      |    2 +
 drivers/infiniband/ulp/isert/ib_isert.c    | 2281
 drivers/infiniband/ulp/isert/ib_isert.h    |  138 ++
 drivers/infiniband/ulp/isert/isert_proto.h |   47 +