Re: [PATCH 1/2] libibverbs: Use autoreconf in autogen.sh
On May 1, 2013, at 11:30 AM, Doug Ledford dledf...@redhat.com wrote:

This is fine with me; however, I think you also need to bump the autotools version to the latest upstream. The automated checkers in our build environment are spitting out errors about a number of upstream packages where the autotools used to configure the package does not include proper arm support. The latest autotools bring in all of the forthcoming arm variants. So I would like to see both of these things done.

Are you referring to the version of Autotools that Roland uses to create his tarballs? Because I have no control over that. :-)

On Apr 25, 2013, at 11:38 AM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote:

Bump.

On Apr 22, 2013, at 1:41 PM, Jeff Squyres jsquy...@cisco.com wrote:

The old sequence of Autotools commands listed in autogen.sh is no longer correct. Instead, just use the single autoreconf command, which will invoke all the Right Autotools commands in the correct order.

Signed-off-by: Jeff Squyres jsquy...@cisco.com
---
 autogen.sh | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/autogen.sh b/autogen.sh
index fd47839..6c9233e 100755
--- a/autogen.sh
+++ b/autogen.sh
@@ -1,8 +1,4 @@
 #! /bin/sh
 
 set -x
-aclocal -I config
-libtoolize --force --copy
-autoheader
-automake --foreign --add-missing --copy
-autoconf
+autoreconf -ifv -I config
--
1.8.1.1

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] Add IB_MTU_1500|9000 enums.
On Apr 22, 2013, at 4:00 PM, Doug Ledford dledf...@redhat.com wrote:

2. Change all instances of ib_mtu/ibv_mtu to an int. Code such as

    switch(mtu)
    case IBV_MTU_1024: ...

will need to be updated to

    switch(mtu)
    case 1024:

I was actually thinking that an ibverbs API version 2.0 might be an interesting way to go. The proliferation of non-IB link layers providing the verbs API makes some of the original assumptions of IB link layer in the original API obsolete. But, if we were to do that, I'd take some time to really think the issue over and try to catch all of the needed updates in one go.

In addition to the MTU, another obvious issue is the active_speed attribute on the ibv_port_attr. On the kernel side, it's an enum (IB_SPEED_SDR through IB_SPEED_EDR), but there are no corresponding enum names in libibverbs. It would be good to make this value a non-enum int, too.

--
Jeff Squyres
jsquy...@cisco.com
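To make the enum-versus-int point concrete, here is a minimal sketch in C. The names sketch_mtu, mtu_enum_to_bytes and mtu_is_ib_standard are hypothetical and exist only for illustration; the enum is a local mirror of the values used by enum ibv_mtu in libibverbs (IBV_MTU_256 = 1 through IBV_MTU_4096 = 5) so that the sketch compiles without the verbs headers. It is not a proposed patch.

```c
/*
 * Local mirror of the libibverbs MTU enum values (hypothetical names),
 * so this sketch builds without <infiniband/verbs.h>.
 */
enum sketch_mtu {
	SKETCH_MTU_256 = 1,
	SKETCH_MTU_512 = 2,
	SKETCH_MTU_1024 = 3,
	SKETCH_MTU_2048 = 4,
	SKETCH_MTU_4096 = 5
};

/* Current style: callers switch on the enum constant. */
int mtu_enum_to_bytes(enum sketch_mtu mtu)
{
	switch (mtu) {
	case SKETCH_MTU_256:
		return 256;
	case SKETCH_MTU_512:
		return 512;
	case SKETCH_MTU_1024:
		return 1024;
	case SKETCH_MTU_2048:
		return 2048;
	case SKETCH_MTU_4096:
		return 4096;
	}
	return -1;	/* value outside the enum */
}

/*
 * Discussed alternative: the MTU is a plain int holding the byte count,
 * so callers switch on the number itself and sizes such as 1500 or 9000
 * need no new enum constant.
 */
int mtu_is_ib_standard(int mtu_bytes)
{
	switch (mtu_bytes) {
	case 256:
	case 512:
	case 1024:
	case 2048:
	case 4096:
		return 1;
	default:
		return 0;
	}
}
```

With the int form, an Ethernet-style MTU of 1500 or 9000 is just another value that falls through to the default case, whereas the enum form has no constant to express it.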
[PATCH 1/1] OpenSM: dfsssp - add support for multicast
Recent tests on a large system revealed a problem with loops in the multicast routing. Using DFSSSP together with the default mcast routing algorithm of OpenSM can produce loops in the fabric. This patch adds the mcast_build_stree function to the DFSSSP routing algorithm, so that DFSSSP is able to calculate the correct mcast forwarding tables for the subnet. It performs almost the same steps as the default mcast routing, except that it uses the Dijkstra algorithm to generate the spanning tree instead of using the hop count information given by the unicast routing.

General overview of the algorithm in pseudo-code:
1) identify the ports, which are part of the multicast group
2) find the 'best' switch (depending on the hop count) for the mcast group, which can be used as a root of the spanning tree
3) perform a dijkstra step with the root switch as starting point to generate a spanning tree to all other switches in the subnet
4) build the mcast forwarding tables for relevant switches:
   4.1) select a switch which has mcast member ports connected to it
   4.2) set the downstream ports for the mcast member ports in the mcft
   4.3) traverse towards the root of the spanning tree and set up-/downstream ports on this path for all involved switches
   4.4) goto 4.1 until all switches have been processed

The same mcast algorithm will be used for SSSP, because SSSP has the potential to produce loops in the mcast forwarding table as well.

Signed-off-by: Jens Domke domke.j...@m.titech.ac.jp
---
 include/opensm/osm_mcast_mgr.h |  72 +++
 opensm/Makefile.am             |   1 +
 opensm/osm_mcast_mgr.c         |  35
 opensm/osm_ucast_dfsssp.c      | 194
 4 files changed, 283 insertions(+), 19 deletions(-)
 create mode 100644 include/opensm/osm_mcast_mgr.h

diff --git a/include/opensm/osm_mcast_mgr.h b/include/opensm/osm_mcast_mgr.h
new file mode 100644
index 0000000..291a478
--- /dev/null
+++ b/include/opensm/osm_mcast_mgr.h
@@ -0,0 +1,72 @@
+/*
+ * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved.
+ * Copyright (c) 2002-2009 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
+ * Copyright (c) 2009-2011 ZIH, TU Dresden, Federal Republic of Germany. All rights reserved.
+ * Copyright (C) 2012-2013 Tokyo Institute of Technology. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+/*
+ * Abstract:
+ * 	Declaration of osm_mcast_work_obj_t.
+ * 	Provide access to a mcast function which searches the root switch for
+ * 	a spanning tree.
+ */
+
+#ifndef _OSM_MCAST_MGR_H_
+#define _OSM_MCAST_MGR_H_
+
+#ifdef __cplusplus
+#  define BEGIN_C_DECLS extern "C" {
+#  define END_C_DECLS   }
+#else /* !__cplusplus */
+#  define BEGIN_C_DECLS
+#  define END_C_DECLS
+#endif /* __cplusplus */
+
+BEGIN_C_DECLS
+
+typedef struct osm_mcast_work_obj {
+	cl_list_item_t list_item;
+	osm_port_t *p_port;
+	cl_map_item_t map_item;
+} osm_mcast_work_obj_t;
+
+int osm_mcast_make_port_list_and_map(cl_qlist_t * list, cl_qmap_t * map,
+				     osm_mgrp_box_t * mbox);
+
+void osm_mcast_drop_port_list(cl_qlist_t * list);
+
+osm_switch_t * osm_mcast_mgr_find_root_switch(osm_sm_t * sm, cl_qlist_t * list);
+
+END_C_DECLS
+#endif /* _OSM_MCAST_MGR_H_ */
diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 7fd6bc6..20318cc 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -116,6 +116,7 @@ opensminclude_HEADERS = \
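Steps 3 and 4.3 of the overview in the patch description can be illustrated with a small, self-contained sketch. This is not the OpenSM code from the patch: the function names, the fixed NUM_SWITCHES, the adjacency matrix and the link weights are all assumptions made for illustration. It runs Dijkstra from a chosen root switch to record each switch's parent in the spanning tree, then walks from a switch with member ports towards the root, marking the switches on that path.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NUM_SWITCHES 5		/* hypothetical fabric size */
#define NO_LINK 0		/* weight 0 means "no link" in this sketch */
#define NO_PARENT (-1)

/*
 * Step 3: Dijkstra from `root` over a symmetric weight matrix. On return,
 * parent[i] is the next switch on the shortest path from i to the root.
 */
void build_spanning_tree(int weight[NUM_SWITCHES][NUM_SWITCHES], int root,
			 int parent[NUM_SWITCHES])
{
	int dist[NUM_SWITCHES];
	bool done[NUM_SWITCHES];
	int i, step;

	assert(root >= 0 && root < NUM_SWITCHES);
	for (i = 0; i < NUM_SWITCHES; i++) {
		dist[i] = INT_MAX;
		done[i] = false;
		parent[i] = NO_PARENT;
	}
	dist[root] = 0;

	for (step = 0; step < NUM_SWITCHES; step++) {
		int u = -1;

		/* pick the closest reachable switch not yet finalized */
		for (i = 0; i < NUM_SWITCHES; i++)
			if (!done[i] && dist[i] != INT_MAX &&
			    (u < 0 || dist[i] < dist[u]))
				u = i;
		if (u < 0)
			break;	/* remaining switches are unreachable */
		done[u] = true;

		/* relax every link leaving u */
		for (i = 0; i < NUM_SWITCHES; i++) {
			if (weight[u][i] == NO_LINK || done[i])
				continue;
			if (dist[u] + weight[u][i] < dist[i]) {
				dist[i] = dist[u] + weight[u][i];
				parent[i] = u;
			}
		}
	}
}

/*
 * Step 4.3: walk from a switch with member ports up to the root and mark
 * every switch on the way. Stops early at a switch that is already marked.
 */
void mark_path_to_root(const int parent[NUM_SWITCHES], int sw,
		       bool in_tree[NUM_SWITCHES])
{
	while (sw != NO_PARENT && !in_tree[sw]) {
		in_tree[sw] = true;
		sw = parent[sw];
	}
}
```

Because every switch has exactly one parent, the marked paths always form a tree, which is why forwarding tables derived from them cannot contain the loops described above. A real implementation would also record which port leads to the parent; the sketch only tracks switch indices.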
RE: decent performance drop for SCSI LLD / SAN initiator when iommu is turned on
-----Original Message-----
From: Michael S. Tsirkin [mailto:m...@redhat.com]
Sent: Thursday, May 02, 2013 04:56
To: Or Gerlitz
Cc: Roland Dreier; io...@lists.linux-foundation.org; Yan Burman; linux-r...@vger.kernel.org
Subject: Re: decent performance drop for SCSI LLD / SAN initiator when iommu is turned on

On Thu, May 02, 2013 at 02:11:15AM +0300, Or Gerlitz wrote:

Hi Roland, IOMMU folks,

So we've noted that when configuring the kernel to boot with intel iommu set to on on a physical node (non VM, and without enabling SRIOV by the HW device driver), raw performance of the iSER (iSCSI RDMA) SAN initiator is reduced notably, e.g. in the testbed we looked at today we had ~260K 1KB random IOPS and 5.5GB/s BW for 128KB IOs with iommu turned off for a single LUN, and ~150K IOPS and 4GB/s BW with iommu turned on. No change on the target node between runs.

That's why we have iommu=pt. See definition of iommu_pass_through in arch/x86/kernel/pci-dma.c.

I tried passing intel_iommu=on iommu=pt to the 3.8.11 kernel and I still get performance degradation. I get the same numbers with iommu=pt as without it. I wanted to send perf output, but currently I seem to have some problem with its output. Will try to get perf differences next week.

Yan
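To put the reported figures in relative terms, the following is a minimal sketch in C. The helper name relative_drop_percent is hypothetical; the inputs are the approximate numbers quoted in the report above (~260K versus ~150K IOPS, 5.5 versus 4 GB/s).

```c
#include <assert.h>

/* Percentage by which `measured` falls short of `baseline`. */
double relative_drop_percent(double baseline, double measured)
{
	assert(baseline > 0.0);
	return (baseline - measured) / baseline * 100.0;
}
```

Applied to the quoted numbers, relative_drop_percent(260.0, 150.0) gives roughly 42% for 1KB random IOPS and relative_drop_percent(5.5, 4.0) roughly 27% for 128KB bandwidth, so the small-IO case is hit noticeably harder than the large-IO case.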
Re: initial LIO iSER performance numbers (was: [GIT PULL] target updates for v3.10-rc1)
On Thu, 2013-05-02 at 16:58 +0300, Or Gerlitz wrote:

On 30/04/2013 05:59, Nicholas A. Bellinger wrote:

Hello Linus!

Here are the target pending changes for the v3.10-rc1 merge window. Please go ahead and pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git for-next-merge

SNIP

Hi Nic, everyone,

So LIO iser target code is now merged into Linus' tree, and will be in kernel 3.10, exciting!

Here's some data on raw performance numbers we were able to get with the LIO iser code.

For single initiator and single lun, block sizes varying over the range 1KB, 2KB ... 128KB doing random read:

  1KB     227,870K
  2KB     458,099K
  4KB     909,761K
  8KB   1,679,922K
 16KB   3,233,753K
 32KB   4,905,139K
 64KB   5,294,873K
128KB   5,565,235K

When enlarging the number of luns and still with single initiator, for 1KB random reads we get:

1 LUN  = 230k IOPS
2 LUNs = 420k IOPS
4 LUNs = 740k IOPS

When enlarging the number of initiators, and each having four luns, we get for 1KB random reads:

1 initiator  x 4 LUNs =  740k IOPS
2 initiators x 4 LUNs = 1480k IOPS
3 initiators x 4 LUNs = 1570k IOPS

So all in all, things scale pretty nicely, and we observe some bottleneck in the IOPS rate around 1.6 Million IOPS, so there's where to improve...

Excellent. Thanks for posting these initial performance results.

Here's the fio command line used by the initiators:

$ fio --cpumask=0xfc --rw=randread --bs=1k --numjobs=2 --iodepth=128 \
      --runtime=62 --time_based --size=1073741824k --loops=1 \
      --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 \
      --randrepeat=1 --norandommap --group_reporting --exitall \
      --name dev-sdb-randread-1k-2thr-libaio-128iodepth-62sec \
      --filename=/dev/sdb

And some details on the setup:

The nodes are HP ProLiant DL380p Gen8 with the following CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz, two NUMA nodes with eight cores each, 32GB RAM, PCI express gen3 8x, the HCA being Mellanox ConnectX3 with firmware 2.11.500.

The target node was running upstream kernel and the initiators RHEL 6.3 kernel, all X86_64.

We used RAMDISK_MCP backend which was patched to act as NULL device, so we can test the raw iSER wire performance.

Btw, I'll be including a similar patch to allow for RAMDISK_NULL to be configured as a NULL device mode.

Thanks Or & Co!

--nab
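The scaling claim can be checked with a little arithmetic, sketched here in C. The function names are hypothetical, and the reading of the "K" column in the block-size results as KB/s is an assumption; it is at least consistent with the 1KB row (about 228K) matching the separately reported 230k IOPS for one LUN.

```c
#include <assert.h>

/* IOPS implied by a throughput in KB/s at a given block size in KB. */
long iops_from_throughput(long throughput_kb_per_s, long block_kb)
{
	assert(block_kb > 0);
	return throughput_kb_per_s / block_kb;
}

/*
 * Fraction of ideal linear scaling reached by n initiators, given the
 * single-initiator rate and the combined rate (both in kIOPS).
 */
double scaling_efficiency(double single_kiops, double combined_kiops, int n)
{
	assert(single_kiops > 0.0 && n > 0);
	return combined_kiops / (single_kiops * n);
}
```

Under that assumption, the 128KB row corresponds to iops_from_throughput(5565235, 128), about 43K IOPS. For the initiator counts, scaling_efficiency(740.0, 1480.0, 2) is 1.0 (perfectly linear), while scaling_efficiency(740.0, 1570.0, 3) is about 0.71, which is where the bottleneck near 1.6 million IOPS shows up.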