[PATCH] opensm: Clean up event subscriptions if a port goes away
Event subscriptions needs to be cleaned up if a port goes away. If the port comes online again later it may no longer want to receive the events on the same QPN. If the old QPN is used for something else the SM forwarding events may cause QKey violations. Signed-off-by: Line Holen line.ho...@oracle.com --- diff --git a/include/opensm/osm_inform.h b/include/opensm/osm_inform.h index f737441..8cefc20 100644 --- a/include/opensm/osm_inform.h +++ b/include/opensm/osm_inform.h @@ -2,6 +2,7 @@ * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. + * Copyright (c) 2013 Oracle and/or its affiliates. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -200,6 +201,35 @@ void osm_infr_insert_to_db(IN osm_subn_t * p_subn, IN osm_log_t * p_log, void osm_infr_remove_from_db(IN osm_subn_t * p_subn, IN osm_log_t * p_log, IN osm_infr_t * p_infr); +/f* OpenSM: Inform Record/osm_infr_remove_subscriptions +* NAME +* osm_infr_remove_subscriptions +* +* DESCRIPTION +* Remove all event subscriptions of a port +* +* SYNOPSIS +*/ +ib_api_status_t +osm_infr_remove_subscriptions(IN osm_subn_t * p_subn, IN osm_log_t * p_log, + IN ib_net64_t port_guid); +/* +* PARAMETERS +* p_subn +* [in] Pointer to the subnet object +* +* p_log +* [in] Pointer to the log object +* +* port_guid +* [in] PortGUID of the subscriber that should be removed +* +* RETURN +* CL_SUCCESS if port_guid had any subscriptions being removed +* CL_NOT_FOUND if port_guid did not have any active subscriptions +* SEE ALSO +*/ + /f* OpenSM: Inform Record/osm_report_notice * NAME * osm_report_notice diff --git a/opensm/osm_drop_mgr.c b/opensm/osm_drop_mgr.c index b309273..5f9181a 100644 --- a/opensm/osm_drop_mgr.c +++ b/opensm/osm_drop_mgr.c @@ -3,6 +3,7 @@ * Copyright (c) 2002-2012 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2008 Xsigo Systems Inc. All rights reserved. + * Copyright (c) 2013 Oracle and/or its affiliates. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU @@ -276,6 +277,13 @@ static void drop_mgr_remove_port(osm_sm_t * sm, IN osm_port_t * p_port) drop_mgr_clean_physp(sm, p_port-p_physp); + /* Delete event forwarding subscriptions */ + if (osm_infr_remove_subscriptions(sm-p_subn, sm-p_log, port_guid) + == CL_SUCCESS) + OSM_LOG(sm-p_log, OSM_LOG_DEBUG, + Removed event subscriptions for port 0x%016 PRIx64 \n, + cl_ntoh64(port_guid)); + /* initialize the p_node - may need to get node_desc later */ p_node = p_port-p_node; diff --git a/opensm/osm_inform.c b/opensm/osm_inform.c index 804c414..6318700 100644 --- a/opensm/osm_inform.c +++ b/opensm/osm_inform.c @@ -282,6 +282,41 @@ void osm_infr_remove_from_db(IN osm_subn_t * p_subn, IN osm_log_t * p_log, OSM_LOG_EXIT(p_log); } +ib_api_status_t osm_infr_remove_subscriptions(IN osm_subn_t * p_subn, + IN osm_log_t * p_log, + IN ib_net64_t port_guid) +{ + ib_gid_t gid; + cl_list_item_t *p_list_item; + osm_infr_t *p_infr; + ib_api_status_t status = CL_NOT_FOUND; + + OSM_LOG_ENTER(p_log); + + gid.unicast.interface_id = port_guid; + gid.unicast.prefix = p_subn-opt.subnet_prefix; + + /* go over all inform info available at the subnet */ + /* match to the given GID and delete subscriptions if match */ + p_list_item = cl_qlist_head(p_subn-sa_infr_list); + while (p_list_item != cl_qlist_end(p_subn-sa_infr_list)) { + + p_infr = (osm_infr_t *)p_list_item; + p_list_item = cl_qlist_next(p_list_item); + + if (memcmp((p_infr-inform_record.subscriber_gid), gid, sizeof(ib_gid_t))) + continue; + + /* Remove this event subscription */ + osm_infr_remove_from_db(p_subn, p_log, p_infr); + + status = CL_SUCCESS; + } + + OSM_LOG_EXIT(p_log); + return (status); +} + /** * Send a report: * Given a target address to send to and the notice. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][MINOR] osm_sa_informinfo.c Add attribute info to log messages
On 9/3/2013 7:31 AM, Line Holen wrote: Signed-off-by: Line Holen line.ho...@oracle.com Thanks. Applied with minor change noted below. --- diff --git a/opensm/osm_sa_informinfo.c b/opensm/osm_sa_informinfo.c index f32b88b..bf1fe97 100644 --- a/opensm/osm_sa_informinfo.c +++ b/opensm/osm_sa_informinfo.c @@ -598,7 +598,8 @@ void osm_infr_rcv_process(IN void *context, IN void *data) CL_ASSERT(p_sa_mad-attr_id == IB_MAD_ATTR_INFORM_INFO); if (p_sa_mad-method != IB_MAD_METHOD_SET) { - OSM_LOG(sa-p_log, OSM_LOG_DEBUG, Unsupported Method (%s)\n, + OSM_LOG(sa-p_log, OSM_LOG_DEBUG, + Unsupported Method (%s) for InformInfo\n, ib_get_sa_method_str(p_sa_mad-method)); osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; @@ -626,7 +627,8 @@ void osm_infir_rcv_process(IN void *context, IN void *data) if (p_sa_mad-method != IB_MAD_METHOD_GET p_sa_mad-method != IB_MAD_METHOD_GETTABLE) { - OSM_LOG(sa-p_log, OSM_LOG_DEBUG, Unsupported Method (%s)\n, + OSM_LOG(sa-p_log, OSM_LOG_DEBUG, + Unsupported Method (%s) for InformInfo\n, InformInfoRecord r.t. InformInfo -- Hal ib_get_sa_method_str(p_sa_mad-method)); osm_sa_send_error(sa, p_madw, IB_MAD_STATUS_UNSUP_METHOD_ATTR); goto Exit; -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] osm_sa_guidinfo_record.c False duplicate GUID error messages
On 9/3/2013 7:30 AM, Line Holen wrote: If the same request is received twice then the second one will cause error messages indicating duplicate alias GUIDs. In this case this is a false warning. The second request should be treated as a void and return success to the requester. Signed-off-by: Line Holen line.ho...@oracle.com --- diff --git a/opensm/osm_sa_guidinfo_record.c b/opensm/osm_sa_guidinfo_record.c index 8323b38..cfaf6f3 100644 --- a/opensm/osm_sa_guidinfo_record.c +++ b/opensm/osm_sa_guidinfo_record.c @@ -638,6 +638,14 @@ static void set_guidinfo(IN osm_sa_t *sa, IN osm_madw_t *p_madw, } add_alias_guid: + /* Check whether the port already contain this alias guid +at this index. If so we're done, continue to next */ + if (set_alias_guid == (*p_port-p_physp-p_guids)[i]) { + OSM_LOG(sa-p_log, OSM_LOG_DEBUG, + The alias GUID is already correctly assigned, continue\n); + continue; + } + I don't think this handles all the cases properly. Please see modified patch to follow this. -- Hal /* allocate alias guid and add to alias guid table */ p_alias_guid = osm_alias_guid_new(set_alias_guid, p_port); if (!p_alias_guid) { @@ -656,9 +664,12 @@ add_alias_guid: /* alias GUID is a duplicate */ OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 5108: Duplicate alias port GUID 0x% PRIx64 - index %d base port GUID 0x% PRIx64 \n, + index %d base port GUID 0x% PRIx64 + , alias GUID already assigned to + base port GUID 0x% PRIx64 \n, cl_ntoh64(p_alias_guid-alias_guid), i, - cl_ntoh64(p_alias_guid-p_base_port-guid)); + cl_ntoh64(p_alias_guid-p_base_port-guid), + cl_ntoh64(p_alias_guid_check-p_base_port-guid)); osm_alias_guid_delete(p_alias_guid); /* clear response guid at index to indicate duplicate */ p_rcvd_rec-guid_info.guid[i % 8] = 0; -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH opensm] osm_sa_guidinfo_record.c: False duplicate GUID error messages
If the same request is received twice then the second one will cause error messages indicating duplicate alias GUIDs. In this case this is a false warning. The second request should be treated as a void and return success to the requester. Signed-off-by: Line Holen line.ho...@oracle.com Signed-off-by: Hal Rosenstock h...@mellanox.com --- diff --git a/opensm/osm_sa_guidinfo_record.c b/opensm/osm_sa_guidinfo_record.c index 8323b38..28f6c0f 100644 --- a/opensm/osm_sa_guidinfo_record.c +++ b/opensm/osm_sa_guidinfo_record.c @@ -653,15 +653,21 @@ add_alias_guid: p_alias_guid-alias_guid, p_alias_guid-map_item); if (p_alias_guid_check != p_alias_guid) { - /* alias GUID is a duplicate */ - OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 5108: - Duplicate alias port GUID 0x% PRIx64 -index %d base port GUID 0x% PRIx64 \n, - cl_ntoh64(p_alias_guid-alias_guid), i, - cl_ntoh64(p_alias_guid-p_base_port-guid)); - osm_alias_guid_delete(p_alias_guid); - /* clear response guid at index to indicate duplicate */ - p_rcvd_rec-guid_info.guid[i % 8] = 0; + /* alias GUID is a duplicate if it exists on another port or on the same port but at another index */ + if (p_alias_guid_check-p_base_port != p_port || + (*p_port-p_physp-p_guids)[i] != set_alias_guid) { + OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 5108: + Duplicate alias port GUID 0x% PRIx64 +index %d base port GUID 0x% PRIx64 + , alias GUID already assigned to + base port GUID 0x% PRIx64 \n, + cl_ntoh64(p_alias_guid-alias_guid), i, + cl_ntoh64(p_alias_guid-p_base_port-guid), + cl_ntoh64(p_alias_guid_check-p_base_port-guid)); + osm_alias_guid_delete(p_alias_guid); + /* clear response guid at index to indicate duplicate */ + p_rcvd_rec-guid_info.guid[i % 8] = 0; + } } else { del_alias_guid = (*p_port-p_physp-p_guids)[i]; if (del_alias_guid) { -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rtnl_lock deadlock on 3.10
On 9/5/2013 5:02 AM, Bart Van Assche wrote: On 07/30/13 14:54, Steve Wise wrote: On 7/29/2013 6:02 PM, Shawn Bohrer wrote: On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: On 03/07/2013 20:22, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: On Tue, Jul 02, 2013 at 01:38:26PM +, Cong Wang wrote: On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa han...@stressinduktion.org wrote: On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: I've managed to hit a deadlock at boot a couple times while testing the 3.10 rc kernels. It seems to always happen when my network devices are initializing. This morning I updated to v3.10 and made a few config tweaks and so far I've hit it 4 out of 5 reboots. It looks like most processes are getting stuck on rtnl_lock. Below is a boot log with the soft lockup prints. Please let know if there is any other information I can provide: Could you try a build with CONFIG_LOCKDEP enabled? The problem is clear: ib_register_device() is called with rtnl_lock, but itself needs device_mutex, however, ib_register_client() first acquires device_mutex, then indirectly calls register_netdev() which takes rtnl_lock. Deadlock! One possible fix is always taking rtnl_lock before taking device_mutex, something like below: diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 18c1ece..890870b 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) { struct ib_device *device; +rtnl_lock(); mutex_lock(device_mutex); list_add_tail(client-list, client_list); @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) client-add(device); mutex_unlock(device_mutex); +rtnl_unlock(); return 0; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b6e049a..5a7a048 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } -result = register_netdev(priv-dev); +result = register_netdevice(priv-dev); if (result) { printk(KERN_WARNING %s: couldn't register ipoib port %d; error %d\n, hca-name, port, result); Looks good to me. Shawn, could you test this patch? ib_unregister_device/ib_unregister_client would need the same change, too. I have not checked the other -add() and -remove() functions. Also cc'ed linux-rdma@vger.kernel.org, Roland Dreier. Cong's patch is missing the #include linux/rtnetlink.h but otherwise I've had 34 successful reboots with no deadlocks which is a good sign. It sounds like there are more paths that need to be audited and a proper patch submitted. I can do more testing later if needed. Thanks, Shawn Guys, I was a bit busy today looking into that, but I don't think we want the IB core layer (core/device.c) to use rtnl locking which is something that belongs to the network stack. Has anymore thought been put into a proper fix for this issue? I'm no expert in this area but I'm having a hard time seeing a different solution than the one Cong suggested. Just to be clear the deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd Steve Wise in case he has a better solution from the Chelsio side. I don't know of another way to resolve this. The rtnl lock is used in ipoib and mlx4 already. I think we should go forward with the proposed patch. (replying to an e-mail of one month ago) Hello, It would be appreciated if anyone could report what the current status of this issue is. I think a deadlock I ran into with kernels 3.10 and 3.11 and PCI pass-through is related to this issue. See also http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the lockdep report. Thanks, Bart. Roland, what do you think? As I've said, I think we should go ahead with using the rtnl lock in the core. Is there a complete patch available for review? looks like the original was a partial fix. Steve. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: rtnl_lock deadlock on 3.10
On Thu, Sep 05, 2013 at 10:14:51AM -0500, Steve Wise wrote: On 9/5/2013 5:02 AM, Bart Van Assche wrote: On 07/30/13 14:54, Steve Wise wrote: On 7/29/2013 6:02 PM, Shawn Bohrer wrote: On Mon, Jul 15, 2013 at 09:38:19AM -0500, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 08:26:11PM +0300, Or Gerlitz wrote: On 03/07/2013 20:22, Shawn Bohrer wrote: On Wed, Jul 03, 2013 at 07:33:07AM +0200, Hannes Frederic Sowa wrote: On Wed, Jul 03, 2013 at 07:11:52AM +0200, Hannes Frederic Sowa wrote: On Tue, Jul 02, 2013 at 01:38:26PM +, Cong Wang wrote: On Tue, 02 Jul 2013 at 08:28 GMT, Hannes Frederic Sowa han...@stressinduktion.org wrote: On Mon, Jul 01, 2013 at 09:54:56AM -0500, Shawn Bohrer wrote: I've managed to hit a deadlock at boot a couple times while testing the 3.10 rc kernels. It seems to always happen when my network devices are initializing. This morning I updated to v3.10 and made a few config tweaks and so far I've hit it 4 out of 5 reboots. It looks like most processes are getting stuck on rtnl_lock. Below is a boot log with the soft lockup prints. Please let know if there is any other information I can provide: Could you try a build with CONFIG_LOCKDEP enabled? The problem is clear: ib_register_device() is called with rtnl_lock, but itself needs device_mutex, however, ib_register_client() first acquires device_mutex, then indirectly calls register_netdev() which takes rtnl_lock. Deadlock! One possible fix is always taking rtnl_lock before taking device_mutex, something like below: diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c index 18c1ece..890870b 100644 --- a/drivers/infiniband/core/device.c +++ b/drivers/infiniband/core/device.c @@ -381,6 +381,7 @@ int ib_register_client(struct ib_client *client) { struct ib_device *device; +rtnl_lock(); mutex_lock(device_mutex); list_add_tail(client-list, client_list); @@ -389,6 +390,7 @@ int ib_register_client(struct ib_client *client) client-add(device); mutex_unlock(device_mutex); +rtnl_unlock(); return 0; } diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c index b6e049a..5a7a048 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -1609,7 +1609,7 @@ static struct net_device *ipoib_add_port(const char *format, goto event_failed; } -result = register_netdev(priv-dev); +result = register_netdevice(priv-dev); if (result) { printk(KERN_WARNING %s: couldn't register ipoib port %d; error %d\n, hca-name, port, result); Looks good to me. Shawn, could you test this patch? ib_unregister_device/ib_unregister_client would need the same change, too. I have not checked the other -add() and -remove() functions. Also cc'ed linux-rdma@vger.kernel.org, Roland Dreier. Cong's patch is missing the #include linux/rtnetlink.h but otherwise I've had 34 successful reboots with no deadlocks which is a good sign. It sounds like there are more paths that need to be audited and a proper patch submitted. I can do more testing later if needed. Thanks, Shawn Guys, I was a bit busy today looking into that, but I don't think we want the IB core layer (core/device.c) to use rtnl locking which is something that belongs to the network stack. Has anymore thought been put into a proper fix for this issue? I'm no expert in this area but I'm having a hard time seeing a different solution than the one Cong suggested. Just to be clear the deadlock I hit was between cxgb3 and the ipoib module, so I've Cc'd Steve Wise in case he has a better solution from the Chelsio side. I don't know of another way to resolve this. The rtnl lock is used in ipoib and mlx4 already. I think we should go forward with the proposed patch. (replying to an e-mail of one month ago) Hello, It would be appreciated if anyone could report what the current status of this issue is. I think a deadlock I ran into with kernels 3.10 and 3.11 and PCI pass-through is related to this issue. See also http://bugzilla.kernel.org/show_bug.cgi?id=60856 for the lockdep report. Thanks, Bart. Roland, what do you think? As I've said, I think we should go ahead with using the rtnl lock in the core. Is there a complete patch available for review? looks like the original was a partial fix. I've been running with Cong's partial fix for the past couple of months, and I'm pretty sure no complete patch has been posted. I may be able to look at the missing pieces tomorrow and see if I can put together a patch but if someone else wants to run with this feel free. -- Shawn -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview
Hello Steve, Thank you for the feedback. I think there are lots of #defines and enums replicated in both nes_netlink.h and c4iw_netlink.h. For example, the IWPM_* defines and enums. It seems like we could put all this in a common header file to be included by all iwarp providers? Say include/rdma/iw_portmap.h or something. Having a common rdma/iw_portmap.h is a good idea. (and maybe even a common core module if there is enough common functions) At this early stage of the project, I think that having a separate core module for the iwarp port mapper could be overhead. That is the reason we initially decided to keep the changes locally in the drivers using the service. Tatyana -Original Message- From: Steve Wise [mailto:sw...@opengridcomputing.com] Sent: Tuesday, September 03, 2013 10:34 AM To: Nikolova, Tatyana E Cc: Roland Dreier; Sharp, Robert O; Lacombe, John S; vi...@chelsio.com; linux-rdma@vger.kernel.org Subject: Re: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview On 8/31/2013 2:41 PM, Tatyana Nikolova wrote: Hello All, This patch series adds iWARP Port Mapper (IWPM) service support in RDMA/core, RDMA/nes driver and RDMA/cxgb4 driver. The iWARP Port Mapper implementation is based on the port mapper specification section in the Sockets Direct Protocol paper - http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf Hey Tatyana, I'm replying to 0/4 because these comments are for both iwarp providers. I think there are lots of #defines and enums replicated in both nes_netlink.h and c4iw_netlink.h. For example, the IWPM_* defines and enums. It seems like we could put all this in a common header file to be included by all iwarp providers? Say include/rdma/iw_portmap.h or something. Basically the interface to the user mode daemon is the same for all providers, no? Those bits should be in a common header. (and maybe even a common core module if there is enough common functions). Thoughts? Steve. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Strange NFS client ACK behaviour
CC linux-nfs .. maybe this is obvious to someone there ... Two comments inlined below. On Tue, Sep 3, 2013 at 11:28 AM, Markus Stockhausen stockhau...@collogia.de wrote: Hello, we observed a performance drop in our IPoIB NFS backup infrastructure since we switched to machines with newer kernels. As I do not know where to start I hope someone on this list can give me hint where to dig for more details. In case of no other reply, I would start w/ a socket program (or a network performance measuring tool) on the interface that does similar logic as dd you described below; that is, send a 256K message in a fixed number of loops (so total transfer size somewhere close to your file size) between client and server, followed by comparing the interrupt counters (cat /proc/interrtups) on both kernels. If the interrupt count differs as you described, the problem is most likely with the IB driver, not NFS layer. To make a long story short. We use ConnectX cards with the standard kernel drivers on version 2.6.32 (Ubuntu 10.04), 3.5 (Ubuntu 12.04) and 3.10 (Fedora 19). The very simple and not scientific test consists of mounting a NFS share using IPoIB UD network interfaces at MTU of 2044. Afterwards read a large file on the client side with dd if=file of=/dev/null bs=256K. During the transfer we run a tcpdump on the ibX interface on the NFS server side. No special settings for kernel parameters until now. I don't know much about ConnectX. Not sure what IPoIB UD means ? Datagram vs. CM or TCP vs. UDP ? When doing the test with a 2.6.32 kernel based client we see the following packet sequence. More or less a lot of transferd blocks from the NFS server to the client with sometimes an ACK package from the client to the server: 16:16:45.050930 IP server.nfs cli_2_6_32.896: Flags [.], seq 8909853:8913837, ack 1154149, win 604, options [nop,nop,TS val 1640401415 ecr 3881919089], length 3984 16:16:45.050936 IP server.nfs cli_2_6_32.896: Flags [.], seq 8913837:8917821, ack 1154149, win 604, options [nop,nop,TS val 1640401415 ecr 3881919089], length 3984 ... 8 more ... 16:16:45.050976 IP cli_2_6_32.896 server.nfs: Flags [.], ack 8909853, win 24574, options [nop,nop,TS val 3881919089 ecr 1640401415], length 0 ... After switchng to a client with a newer kernel (3.5 or 3.10) the sequence all of a sudden gives just the opposite behaviour. One should note that this is the same server as in the test above. The server sends bigger packets (I guess TSO is doing the rest of the work). After each packet the client sends several ACK packages back. 16:15:21.038782 IP server.nfs cli_3_5_0.928: Flags [.], seq 9612429:9652269, ack 372776, win 5815, options [nop,nop,TS val 1640380412 ecr 560111379], length 39840 16:15:21.038806 IP cli_3_5_0.928 server.nfs: Flags [.], ack 9542205, win 16384, options [nop,nop,TS val 560111379 ecr 1640380412], length 0 16:15:21.038812 IP cli_3_5_0.928 server.nfs: Flags [.], ack 9546077, win 16384, options [nop,nop,TS val 560111379 ecr 1640380412], length 0 ... 6-8 more ... The visible side effects of this changed processing include: - NIC interrupts on the NFS servers raise by a factor of 8. - Transfer speed lowers by 50% (400-200 MB/sec) Best regards. Markus -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview
Hello Or, Does this series follows the netconf 2011session e.g from this link http://vger.kernel.org/netconf2011_slides/pj_netconf2011.ppt? Yes Thanks, Tatyana -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Or Gerlitz Sent: Wednesday, September 04, 2013 5:19 AM To: Nikolova, Tatyana E Cc: Roland Dreier; Sharp, Robert O; Lacombe, John S; Vipul Pandya; Steve Wise; linux-rdma Subject: Re: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview On Sat, Aug 31 2013, Tatyana Nikolova tatyana.e.nikol...@intel.com wrote: [...] The patches are built against Roland's infiniband tree for-next branch. We would like this series to get in the linux-3.12 merge window. So why submitting it only 48 hours before the merge window opens up? Does this series follows the netconf 2011session e.g from this link http://vger.kernel.org/netconf2011_slides/pj_netconf2011.ppt ? Personally, I will not be able to review the concept and core patches before next week, but if others here can and are OK for this to 3.12, let it be. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH opensm] osm_sa_guidinfo_record.c: False duplicate GUID error messages
Looks good to me. Thanks, Line On 09/05/13 16:18, Hal Rosenstock wrote: If the same request is received twice then the second one will cause error messages indicating duplicate alias GUIDs. In this case this is a false warning. The second request should be treated as a void and return success to the requester. Signed-off-by: Line Holenline.ho...@oracle.com Signed-off-by: Hal Rosenstockh...@mellanox.com --- diff --git a/opensm/osm_sa_guidinfo_record.c b/opensm/osm_sa_guidinfo_record.c index 8323b38..28f6c0f 100644 --- a/opensm/osm_sa_guidinfo_record.c +++ b/opensm/osm_sa_guidinfo_record.c @@ -653,15 +653,21 @@ add_alias_guid: p_alias_guid-alias_guid, p_alias_guid-map_item); if (p_alias_guid_check != p_alias_guid) { - /* alias GUID is a duplicate */ - OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 5108: - Duplicate alias port GUID 0x% PRIx64 -index %d base port GUID 0x% PRIx64 \n, - cl_ntoh64(p_alias_guid-alias_guid), i, - cl_ntoh64(p_alias_guid-p_base_port-guid)); - osm_alias_guid_delete(p_alias_guid); - /* clear response guid at index to indicate duplicate */ - p_rcvd_rec-guid_info.guid[i % 8] = 0; + /* alias GUID is a duplicate if it exists on another port or on the same port but at another index */ + if (p_alias_guid_check-p_base_port != p_port || + (*p_port-p_physp-p_guids)[i] != set_alias_guid) { + OSM_LOG(sa-p_log, OSM_LOG_ERROR, ERR 5108: + Duplicate alias port GUID 0x% PRIx64 +index %d base port GUID 0x% PRIx64 + , alias GUID already assigned to + base port GUID 0x% PRIx64 \n, + cl_ntoh64(p_alias_guid-alias_guid), i, + cl_ntoh64(p_alias_guid-p_base_port-guid), + cl_ntoh64(p_alias_guid_check-p_base_port-guid)); + osm_alias_guid_delete(p_alias_guid); + /* clear response guid at index to indicate duplicate */ + p_rcvd_rec-guid_info.guid[i % 8] = 0; + } } else { del_alias_guid = (*p_port-p_physp-p_guids)[i]; if (del_alias_guid) { -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview
Hello Sean, + RDMA_NL_NES, + RDMA_NL_C4IW Why is the communication with the daemon device specific? These enums are necessary. The communication with the IWPM daemon is device specific, because each device is a different netlink client and needs to register with netlink a set of callback functions. The drivers nes and cxgb4 are consuming two netlink clients, using the following ibnl_add_client(RDMA_NL_NES, RDMA_NL_IWPM_NUM_OPS, nes_nl_cb_table) ibnl_add_client(RDMA_NL_C4IW, RDMA_NL_IWPM_NUM_OPS, c4iw_nl_cb_table) The current implementation will work with two iwarp providers (nes and cxgb4) on the same system and the user space iwarp port mapper will be servicing two clients at the same time. Can the iwarp cm fully encapsulate these changes? This needs more consideration, because it may not be a trivial change. Thank you, Tatyana -Original Message- From: Hefty, Sean Sent: Tuesday, September 03, 2013 12:22 PM To: Steve Wise; Nikolova, Tatyana E Cc: Roland Dreier; Sharp, Robert O; Lacombe, John S; vi...@chelsio.com; linux-rdma@vger.kernel.org Subject: RE: [PATCH 0/4] RDMA/nes and RDMA/cxgb4: iWARP Port Mapper Overview I think there are lots of #defines and enums replicated in both nes_netlink.h and c4iw_netlink.h. For example, the IWPM_* defines and enums. It seems like we could put all this in a common header file to be included by all iwarp providers? Say include/rdma/iw_portmap.h or something. Basically the interface to the user mode daemon is the same for all providers, no? Those bits should be in a common header. (and maybe even a common core module if there is enough common functions). Thoughts? I agree. This enum change looks off: + RDMA_NL_NES, + RDMA_NL_C4IW Why is the communication with the daemon device specific? Can the iwarp cm fully encapsulate these changes? - Sean -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html