Re: initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)
On Fri, May 03, 2013 at 09:57:20AM +0300, Or Gerlitz wrote:
> On Thu, May 2, 2013 at 10:31 PM, Nicholas A. Bellinger wrote:
>
> >> We used RAMDISK_MCP backend which was patched to act as NULL device, so
> >> we can test the raw iSER wire performance.
>
> > Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be
> > configured as a NULL device mode.
>
> yep, that would be very helpful, so people can do that sort of testing
> without hacks...

+1, I used to hack drivers/block/brd.c to get a NULL ramdisk.

> Or.

--
Asias
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)
On Thu, May 2, 2013 at 10:31 PM, Nicholas A. Bellinger wrote:

>> We used RAMDISK_MCP backend which was patched to act as NULL device, so
>> we can test the raw iSER wire performance.

> Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be
> configured as a NULL device mode.

yep, that would be very helpful, so people can do that sort of testing
without hacks...

Or.
Re: initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)
On Thu, 2013-05-02 at 16:58 +0300, Or Gerlitz wrote:
> On 30/04/2013 05:59, Nicholas A. Bellinger wrote:
> > Hello Linus!
> >
> > Here are the target pending changes for the v3.10-rc1 merge window.
> >
> > Please go ahead and pull from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge
> >
> Hi Nic, everyone,
>
> So LIO iser target code is now merged into Linus tree, and will be in
> kernel 3.10, exciting!
>
> Here's some data on raw performance numbers we were able to get with the
> LIO iser code.
>
> For single initiator and single lun, block sizes varying over the range
> 1KB,2KB... 128KB
> doing random read:
>
>   1KB      227,870K
>   2KB      458,099K
>   4KB      909,761K
>   8KB    1,679,922K
>  16KB    3,233,753K
>  32KB    4,905,139K
>  64KB    5,294,873K
> 128KB    5,565,235K
>
> When enlarging the number of luns and still with single initiator, for
> 1KB random reads we get:
>
> 1 LUN  = 230k IOPS
> 2 LUNs = 420k IOPS
> 4 LUNs = 740k IOPS
>
> When enlarging the number of initiators, and each having four luns we get
> for 1KB random reads:
>
> 1 initiator  x 4 LUNs =  740k IOPS
> 2 initiators x 4 LUNs = 1480k IOPS
> 3 initiators x 4 LUNs = 1570k IOPS
>
> So all in all, things scale pretty nicely, and we observe a bottleneck
> in the IOPS rate around 1.6 Million IOPS, so there's where to improve...
>

Excellent. Thanks for posting these initial performance results.

> Here's the fio command line used by the initiators
>
> $ fio --cpumask=0xfc --rw=randread --bs=1k --numjobs=2 --iodepth=128
> --runtime=62 --time_based --size=1073741824k --loops=1 --ioengine=libaio
> --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1
> --norandommap --group_reporting --exitall --name
> dev-sdb-randread-1k-2thr-libaio-128iodepth-62sec --filename=/dev/sdb
>
> And some details on the setup:
>
> The nodes are HP ProLiant DL380p Gen8 with the following CPU: Intel(R)
> Xeon(R) CPU E5-2650 0 @ 2.00GHz
> two NUMA nodes with eight cores each, 32GB RAM, PCI express gen3 8x, the
> HCA being Mellanox ConnectX3 with firmware 2.11.500
>
> The target node was running upstream kernel and the initiators RHEL 6.3
> kernel, all X86_64
>
> We used RAMDISK_MCP backend which was patched to act as NULL device, so
> we can test the raw iSER wire performance.
>

Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be
configured as a NULL device mode.

Thanks Or & Co!

--nab
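
For anyone wanting to repeat the block-size sweep, here is a minimal sketch that rebuilds the quoted fio invocation for each block size. It assumes the sweep changed only --bs and the job name between runs (the thread shows just the 1k command), and the helper name fio_argv and its defaults are made up for illustration. The flags themselves are copied from the command above, with --name written in its --name=value form.

```python
import shlex

# Block sizes from the reported sweep (1KB, 2KB, ... 128KB).
BLOCK_SIZES_KB = [1, 2, 4, 8, 16, 32, 64, 128]


def fio_argv(bs_kb, device="/dev/sdb", numjobs=2, iodepth=128, runtime=62):
    """Build the fio argument list for one random-read run at bs_kb KB.

    Hypothetical helper: only the flag values come from the posted command.
    """
    dev_name = device.rsplit("/", 1)[-1]
    # Job name follows the pattern of the one posted for the 1k run.
    name = (f"dev-{dev_name}-randread-{bs_kb}k-"
            f"{numjobs}thr-libaio-{iodepth}iodepth-{runtime}sec")
    return [
        "fio", "--cpumask=0xfc", "--rw=randread", f"--bs={bs_kb}k",
        f"--numjobs={numjobs}", f"--iodepth={iodepth}",
        f"--runtime={runtime}", "--time_based", "--size=1073741824k",
        "--loops=1", "--ioengine=libaio", "--direct=1", "--invalidate=1",
        "--fsync_on_close=1", "--randrepeat=1", "--norandommap",
        "--group_reporting", "--exitall", f"--name={name}",
        f"--filename={device}",
    ]


if __name__ == "__main__":
    # Print one shell-quoted command line per block size; nothing is executed.
    for bs in BLOCK_SIZES_KB:
        print(shlex.join(fio_argv(bs)))
```

The script only prints the command lines, so it is safe to run on a machine without fio or the test LUN; pass the list to subprocess.run() if you actually want to launch the runs.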
Re: initial LIO iSER performance numbers [was: GIT PULL] target updates for v3.10-rc1)
On 30/04/2013 05:59, Nicholas A. Bellinger wrote:
> Hello Linus!
>
> Here are the target pending changes for the v3.10-rc1 merge window.
>
> Please go ahead and pull from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge
>
> The highlights this round include:
>
> - Add fileio support for WRITE_SAME w/ UNMAP=1 discard (asias)
> - Add fileio support for UNMAP discard (asias)
> - Add tcm_vhost hotplug support to work with upstream QEMU
>   vhost-scsi-pci code (asias + mst)
> - Check for aborted sequence in tcm_fc response path (mdr)
> - Add initial iscsit_transport support into iscsi-target code (nab)
> - Refactor iscsi-target RX PDU logic + export request PDU handling (nab)
> - Refactor iscsi-target TX queue logic + export response PDU creation (nab)
> - Add new iSCSI Extensions for RDMA (ISER) target driver (Or + nab)
>
> The biggest changes revolve around iscsi-target refactoring in order to
> support the iser-target driver. This includes the conversion of the
> iscsi-target data-path to use modern se_cmd->cmd_kref counting, and
> allowing transport independent aspects of RX/TX PDU request/response
> handling to be shared across existing traditional iscsi-target code, and
> the new iser-target code.

Hi Nic, everyone,

So LIO iser target code is now merged into Linus tree, and will be in
kernel 3.10, exciting!

Here's some data on raw performance numbers we were able to get with the
LIO iser code.

For single initiator and single lun, block sizes varying over the range
1KB,2KB... 128KB
doing random read:

  1KB      227,870K
  2KB      458,099K
  4KB      909,761K
  8KB    1,679,922K
 16KB    3,233,753K
 32KB    4,905,139K
 64KB    5,294,873K
128KB    5,565,235K

When enlarging the number of luns and still with single initiator, for
1KB random reads we get:

1 LUN  = 230k IOPS
2 LUNs = 420k IOPS
4 LUNs = 740k IOPS

When enlarging the number of initiators, and each having four luns we get
for 1KB random reads:

1 initiator  x 4 LUNs =  740k IOPS
2 initiators x 4 LUNs = 1480k IOPS
3 initiators x 4 LUNs = 1570k IOPS

So all in all, things scale pretty nicely, and we observe a bottleneck
in the IOPS rate around 1.6 Million IOPS, so there's where to improve...

Here's the fio command line used by the initiators

$ fio --cpumask=0xfc --rw=randread --bs=1k --numjobs=2 --iodepth=128
--runtime=62 --time_based --size=1073741824k --loops=1 --ioengine=libaio
--direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1
--norandommap --group_reporting --exitall --name
dev-sdb-randread-1k-2thr-libaio-128iodepth-62sec --filename=/dev/sdb

And some details on the setup:

The nodes are HP ProLiant DL380p Gen8 with the following CPU: Intel(R)
Xeon(R) CPU E5-2650 0 @ 2.00GHz
two NUMA nodes with eight cores each, 32GB RAM, PCI express gen3 8x, the
HCA being Mellanox ConnectX3 with firmware 2.11.500

The target node was running upstream kernel and the initiators RHEL 6.3
kernel, all X86_64

We used RAMDISK_MCP backend which was patched to act as NULL device, so
we can test the raw iSER wire performance.

Or.

> Thanks to Or Gerlitz + Mellanox for supporting the iser-target
> development effort!
>
> Thank you,
>
> --nab
>
> Andy Grover (2):
>       target/iscsi: Remove chap_set_random()
>       target/iscsi: Use ISCSI_LOGIN_CURRENT/NEXT_STAGE macros
>
> Asias He (10):
>       target/file: Add WRITE_SAME w/ UNMAP=1 emulation support
>       target/file: Add UNMAP emulation support
>       target/file: Add fd_do_unmap() helper
>       target/iblock: Add iblock_do_unmap() helper
>       target: Add sbc_execute_unmap() helper
>       target/file: Set is_nonrot attribute
>       tcm_vhost: Refactor the lock nesting rule
>       tcm_vhost: Add hotplug/hotunplug support
>       tcm_vhost: Add ioctl to get and set events missed flag
>       tcm_vhost: Enable VIRTIO_SCSI_F_HOTPLUG
>
> Jörn Engel (2):
>       qla2xxx: Remove unused function
>       target: Change default sense key of NOT_READY
>
> Mark Rustad (1):
>       tcm_fc: Check for aborted sequence
>
> Nicholas Bellinger (9):
>       target: Add export of target_get_sess_cmd symbol
>       iscsi-target: Add iscsit_transport API template
>       iscsi-target: Initial traditional TCP conversion to iscsit_transport
>       iscsi-target: Add iser-target parameter keys + setup during login
>       iscsi-target: Add per transport iscsi_cmd alloc/free
>       iscsi-target: Refactor RX PDU logic + export request PDU handling
>       iscsi-target: Refactor TX queue logic + export response PDU creation
>       iscsi-target: Add iser network portal attribute
>       iser-target: Add iSCSI Extensions for RDMA (iSER) target driver
>
> Wei Yongjun (1):
>       tcm_fc: using kfree_rcu() to simplify the code
>
>  drivers/infiniband/Kconfig                     |    1 +
>  drivers/infiniband/Makefile                    |    1 +
>  drivers/infiniband/ulp/isert/Kconfig           |    5 +
>  drivers/infiniband/ulp/isert/Makefile          |    2 +
>  drivers/infiniband/ulp/isert/ib_isert.c        | 2281
>  drivers/infiniband/ulp/isert/ib_isert.h        |  138 ++
>  drivers/infiniband/ulp/isert/isert_proto.h     |   47 +
>  drivers/sc
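
To put the reported figures side by side, here is a small sketch that converts the per-block-size numbers to IOPS and computes how close the multi-initiator results come to linear scaling. It assumes the "K" column is aggregate bandwidth in KB/s; the post does not state the unit, but that reading is consistent with the 1KB row (227,870K) matching the separately reported "1 LUN = 230k IOPS". The function names are made up for illustration.

```python
# Reported single-initiator, single-LUN random-read results, keyed by
# block size in KB. ASSUMPTION: values are aggregate bandwidth in KB/s.
BANDWIDTH_KBPS = {
    1: 227_870, 2: 458_099, 4: 909_761, 8: 1_679_922,
    16: 3_233_753, 32: 4_905_139, 64: 5_294_873, 128: 5_565_235,
}

# Reported 1KB random-read results in thousands of IOPS, keyed by the
# number of initiators (each with 4 LUNs).
INITIATOR_KIOPS = {1: 740, 2: 1480, 3: 1570}


def iops(bs_kb, kb_per_sec):
    """IOPS implied by a bandwidth figure at a fixed block size."""
    return kb_per_sec / bs_kb


def scaling_efficiency(results):
    """Fraction of ideal linear scaling, relative to the single-unit result."""
    base = results[1]
    return {n: value / (n * base) for n, value in results.items()}


if __name__ == "__main__":
    for bs, bw in BANDWIDTH_KBPS.items():
        print(f"{bs:>4} KB  {bw:>10,} KB/s  {iops(bs, bw):>10,.0f} IOPS")
    for n, eff in scaling_efficiency(INITIATOR_KIOPS).items():
        print(f"{n} initiator(s) x 4 LUNs: {eff:.0%} of linear scaling")
```

Under that assumption the IOPS rate falls from about 228k at 1KB to about 43k at 128KB while bandwidth flattens out above 32KB, and the third initiator reaches roughly 71% of linear scaling, which lines up with the bottleneck around 1.6 million IOPS mentioned in the post.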
[GIT PULL] target updates for v3.10-rc1
Hello Linus!

Here are the target pending changes for the v3.10-rc1 merge window.

Please go ahead and pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge

The highlights this round include:

- Add fileio support for WRITE_SAME w/ UNMAP=1 discard (asias)
- Add fileio support for UNMAP discard (asias)
- Add tcm_vhost hotplug support to work with upstream QEMU
  vhost-scsi-pci code (asias + mst)
- Check for aborted sequence in tcm_fc response path (mdr)
- Add initial iscsit_transport support into iscsi-target code (nab)
- Refactor iscsi-target RX PDU logic + export request PDU handling (nab)
- Refactor iscsi-target TX queue logic + export response PDU creation (nab)
- Add new iSCSI Extensions for RDMA (ISER) target driver (Or + nab)

The biggest changes revolve around iscsi-target refactoring in order to
support the iser-target driver. This includes the conversion of the
iscsi-target data-path to use modern se_cmd->cmd_kref counting, and
allowing transport independent aspects of RX/TX PDU request/response
handling to be shared across existing traditional iscsi-target code, and
the new iser-target code.

Thanks to Or Gerlitz + Mellanox for supporting the iser-target
development effort!

Thank you,

--nab

Andy Grover (2):
      target/iscsi: Remove chap_set_random()
      target/iscsi: Use ISCSI_LOGIN_CURRENT/NEXT_STAGE macros

Asias He (10):
      target/file: Add WRITE_SAME w/ UNMAP=1 emulation support
      target/file: Add UNMAP emulation support
      target/file: Add fd_do_unmap() helper
      target/iblock: Add iblock_do_unmap() helper
      target: Add sbc_execute_unmap() helper
      target/file: Set is_nonrot attribute
      tcm_vhost: Refactor the lock nesting rule
      tcm_vhost: Add hotplug/hotunplug support
      tcm_vhost: Add ioctl to get and set events missed flag
      tcm_vhost: Enable VIRTIO_SCSI_F_HOTPLUG

Jörn Engel (2):
      qla2xxx: Remove unused function
      target: Change default sense key of NOT_READY

Mark Rustad (1):
      tcm_fc: Check for aborted sequence

Nicholas Bellinger (9):
      target: Add export of target_get_sess_cmd symbol
      iscsi-target: Add iscsit_transport API template
      iscsi-target: Initial traditional TCP conversion to iscsit_transport
      iscsi-target: Add iser-target parameter keys + setup during login
      iscsi-target: Add per transport iscsi_cmd alloc/free
      iscsi-target: Refactor RX PDU logic + export request PDU handling
      iscsi-target: Refactor TX queue logic + export response PDU creation
      iscsi-target: Add iser network portal attribute
      iser-target: Add iSCSI Extensions for RDMA (iSER) target driver

Wei Yongjun (1):
      tcm_fc: using kfree_rcu() to simplify the code

 drivers/infiniband/Kconfig                     |    1 +
 drivers/infiniband/Makefile                    |    1 +
 drivers/infiniband/ulp/isert/Kconfig           |    5 +
 drivers/infiniband/ulp/isert/Makefile          |    2 +
 drivers/infiniband/ulp/isert/ib_isert.c        | 2281
 drivers/infiniband/ulp/isert/ib_isert.h        |  138 ++
 drivers/infiniband/ulp/isert/isert_proto.h     |   47 +
 drivers/scsi/qla2xxx/qla_target.c              |   19 -
 drivers/scsi/qla2xxx/qla_target.h              |    1 -
 drivers/target/iscsi/Makefile                  |    3 +-
 drivers/target/iscsi/iscsi_target.c            | 1184 -
 drivers/target/iscsi/iscsi_target.h            |    3 +-
 drivers/target/iscsi/iscsi_target_auth.c       |   28 +-
 drivers/target/iscsi/iscsi_target_configfs.c   |   98 +-
 drivers/target/iscsi/iscsi_target_core.h       |   26 +-
 drivers/target/iscsi/iscsi_target_device.c     |    7 +-
 drivers/target/iscsi/iscsi_target_erl1.c       |   13 +-
 drivers/target/iscsi/iscsi_target_login.c      |  472 --
 drivers/target/iscsi/iscsi_target_login.h      |    6 +
 drivers/target/iscsi/iscsi_target_nego.c       |  194 +--
 drivers/target/iscsi/iscsi_target_nego.h       |   11 +-
 drivers/target/iscsi/iscsi_target_parameters.c |   87 +-
 drivers/target/iscsi/iscsi_target_parameters.h |   16 +-
 drivers/target/iscsi/iscsi_target_tmr.c        |    4 +-
 drivers/target/iscsi/iscsi_target_tpg.c        |    6 +-
 drivers/target/iscsi/iscsi_target_transport.c  |   55 +
 drivers/target/iscsi/iscsi_target_util.c       |   53 +-
 drivers/target/iscsi/iscsi_target_util.h       |    1 +
 drivers/target/target_core_file.c              |  122 ++-
 drivers/target/target_core_iblock.c            |  108 +-
 drivers/target/target_core_sbc.c               |   85 +
 drivers/target/target_core_transport.c         |   13 +-
 drivers/target/tcm_fc/tfc_io.c                 |    9 +-
 drivers/target/tcm_fc/tfc_sess.c               |    9 +-
 drivers/vhost/tcm_vhost.c                      |  262 +++-
 drivers/vhost/tcm_vhost.h                      |   13 +
 include/target/iscsi/iscsi_transport.h         |   83 +
 include/target/target_core_backend.h           |    4 +
 include/target/target_core_fabric.h            |    2 +-
 39 files changed, 4470 insertions(+), 1002 de