Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] IB_CM: Limit the MRA timeout

> Ishai Rabinovitz wrote:
> > There is a bug in the SRP Engenio target: it sends a large value as the service timeout. (The value received is 30, which means a timeout of 2^(30-8) ms, about 4194 sec.) Such a long timeout is not reasonable: it may leave the kernel module waiting in wait_for_completion and may leave many processes stuck. The following patch allows loading the ib_cm module with a limit on the timeout.
>
> There are several timeout values transferred and used by the cm, most notably the remote cm response timeout and packet life time. Does it make more sense to have a single, generic timeout maximum instead?

Hmm. I'm not sure - we are working around an actual broken implementation here - what do you think?

> Would it make more sense to enable the maximum(s) by default, since we're dependent upon values received over the network?

I think it would.

-- 
MST
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
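For reference, the arithmetic behind the timeout figure can be sketched as follows. This assumes the usual IB CM encoding (timeout = 4.096 us x 2^exp) and approximates 4.096 us as 2^-8 ms, which is where the "30-8" above comes from. The names cm_timeout_ms and cm_timeout_ms_capped, and the max_exp parameter, are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate 4.096 us * 2^exp in milliseconds by treating 4.096 us
 * as 2^-8 ms, so the result is 2^(exp - 8) ms (minimum 1 ms).
 */
static uint64_t cm_timeout_ms(unsigned exp)
{
	assert(exp <= 31);	/* assumption: the wire field is 5 bits wide */
	return (uint64_t)1 << (exp > 8 ? exp - 8 : 0);
}

/* Hypothetical cap: clamp the received exponent before converting. */
static uint64_t cm_timeout_ms_capped(unsigned exp, unsigned max_exp)
{
	return cm_timeout_ms(exp < max_exp ? exp : max_exp);
}
```

With a received value of 30 the uncapped result is 4194304 ms (roughly 70 minutes); capping the exponent at, say, 21 would bring that down to 8192 ms.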
[openib-general] IB/ipath - initialize diagpkt file on device init only
Hi folks,

Here's a new spin of the patch to fix the problem of the ipath module causing errors if no ipath hardware is present. This new version of the patch should fix the potential problem Roland spotted if the kernel is doing multithreaded probes.

Roland: please review and queue for 2.6.19 if you're satisfied with this approach. I still don't have an answer about why modprobe hangs when this patch isn't applied - I'll get to that in the next day or so when I have a moment.

Michael: please consider replacing the last patch we sent to OFED for this with this new version. I suspect that, once again, you will be required to modify the patch to get it to apply cleanly. I'd like to avoid having you do this, but I don't have a clear idea how to get hold of the OFED-next-release-in-progress stuff. Bryan handled this previously, but he's on vacation for the next several weeks. Do you have some instructions written down somewhere you could point me at on how to submit patches that would make your life a little easier in this regard?

Regards,
Robert.

IB/ipath - initialize diagpkt file on device init only

Don't attempt to set up the diagpkt device in the module init code. Instead, wait until a piece of hardware is initialized. Fixes a problem when loading the ib_ipath module when no InfiniPath hardware is present: modprobe would go into the D state and stay there.
Signed-off-by: Robert Walsh [EMAIL PROTECTED]

diff -r d168d78758ca drivers/infiniband/hw/ipath/ipath_diag.c
--- a/drivers/infiniband/hw/ipath/ipath_diag.c	Tue Oct 03 15:01:29 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_diag.c	Tue Oct 03 15:04:44 2006 -0700
@@ -286,17 +286,20 @@ static struct file_operations diagpkt_fi
 
 static struct cdev *diagpkt_cdev;
 static struct class_device *diagpkt_class_dev;
-
-int __init ipath_diagpkt_add(void)
-{
-	return ipath_cdev_init(IPATH_DIAGPKT_MINOR,
-			       "ipath_diagpkt", &diagpkt_file_ops,
-			       &diagpkt_cdev, &diagpkt_class_dev);
-}
-
-void __exit ipath_diagpkt_remove(void)
-{
-	ipath_cdev_cleanup(&diagpkt_cdev, &diagpkt_class_dev);
+static atomic_t diagpkt_count = ATOMIC_INIT(0);
+
+void ipath_diagpkt_add(void)
+{
+	if (atomic_inc_return(&diagpkt_count) == 1)
+		ipath_cdev_init(IPATH_DIAGPKT_MINOR,
+				"ipath_diagpkt", &diagpkt_file_ops,
+				&diagpkt_cdev, &diagpkt_class_dev);
+}
+
+void ipath_diagpkt_remove(void)
+{
+	if (atomic_dec_and_test(&diagpkt_count))
+		ipath_cdev_cleanup(&diagpkt_cdev, &diagpkt_class_dev);
 }
 
 /**
diff -r d168d78758ca drivers/infiniband/hw/ipath/ipath_driver.c
--- a/drivers/infiniband/hw/ipath/ipath_driver.c	Tue Oct 03 15:01:29 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c	Tue Oct 03 15:01:29 2006 -0700
@@ -559,6 +559,7 @@ static int __devinit ipath_init_one(stru
 	ipathfs_add_device(dd);
 	ipath_user_add(dd);
 	ipath_diag_add(dd);
+	ipath_diagpkt_add();
 	ipath_register_ib_device(dd);
 
 	/* Check that we have a LID in LID_TIMEOUT seconds. */
@@ -700,6 +701,7 @@ static void __devexit ipath_remove_one(s
 	if (dd->verbs_dev)
 		ipath_unregister_ib_device(dd->verbs_dev);
 
+	ipath_diagpkt_remove();
 	ipath_diag_remove(dd);
 	ipath_user_remove(dd);
 	ipathfs_remove_device(dd);
@@ -2183,17 +2185,7 @@ static int __init infinipath_init(void)
 		goto bail_group;
 	}
 
-	ret = ipath_diagpkt_add();
-	if (ret < 0) {
-		printk(KERN_ERR IPATH_DRV_NAME ": Unable to create "
-		       "diag data device: error %d\n", -ret);
-		goto bail_ipathfs;
-	}
-
 	goto bail;
-
-bail_ipathfs:
-	ipath_exit_ipathfs();
 
 bail_group:
 	ipath_driver_remove_group(&ipath_driver.driver);
diff -r d168d78758ca drivers/infiniband/hw/ipath/ipath_kernel.h
--- a/drivers/infiniband/hw/ipath/ipath_kernel.h	Tue Oct 03 15:01:29 2006 -0700
+++ b/drivers/infiniband/hw/ipath/ipath_kernel.h	Tue Oct 03 15:01:29 2006 -0700
@@ -889,7 +889,7 @@ void ipath_device_remove_group(struct de
 void ipath_device_remove_group(struct device *, struct ipath_devdata *);
 int ipath_expose_reset(struct device *);
 
-int ipath_diagpkt_add(void);
+void ipath_diagpkt_add(void);
 void ipath_diagpkt_remove(void);
 
 int ipath_init_ipathfs(void);
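The counting scheme in the patch (set up the shared device when the first piece of hardware appears, tear it down when the last one goes away) can be sketched in user space roughly as below. This is only an illustration using C11 atomics: shared_get, shared_put and shared_ready are made-up names standing in for ipath_diagpkt_add, ipath_diagpkt_remove and the diagpkt cdev, and the sketch does not model what happens if a second caller proceeds before the first caller's setup has finished.

```c
#include <stdatomic.h>

static atomic_int users;	/* hypothetical: number of probed devices */
static int shared_ready;	/* hypothetical stand-in for the shared cdev */

/* Called once per device that is initialized. */
static void shared_get(void)
{
	/* atomic_fetch_add returns the previous value: 0 means we are first. */
	if (atomic_fetch_add(&users, 1) == 0)
		shared_ready = 1;	/* set up the shared resource */
}

/* Called once per device that is removed. */
static void shared_put(void)
{
	/* A previous value of 1 means we were the last user. */
	if (atomic_fetch_sub(&users, 1) == 1)
		shared_ready = 0;	/* tear the shared resource down */
}
```

If no device is ever probed, shared_get is never called and the shared resource is never created, which is the behaviour the patch description asks for.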
[openib-general] Fwd: Re: Problems with OFED IPoIB HA on SLES10
BTW, any idea? The ipoib_ha is just a script that ups/downs and configures interfaces, so this crash it seems coul also happen on systems without it. -- MST ---BeginMessage--- If I fail back and forth between ib0 and ib1 every 30 seconds or so for several hours, while IPoIB traffic is running, IPoIB host gets an Oops: and IPoIB stops working. ib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed 
to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib1: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetib0: dev_queue_xmit failed to requeue packetgeneral protection fault: [1] SMPlast sysfs file: /devices/pci:00/:00:00.0/irqCPU 7Modules linked in: af_packet ib_sdp rdma_ucm rdma_cm ib_addr ib_cm ib_ipoib ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core nls_utf8 st ipv6 nfs lockd nfs_acl sunrpc button battery ac apparmor aamatch_pcre loop usbhid dm_mod hw_random ide_cd ehci_hcd uhci_hcd cdrom i8xx_tco ide_floppy usbcore shpchp e1000 pci_hotplug floppy reiserfs edd fan thermal processor siimage sg mptspi mptscsih mptbase scsi_transport_spi piix sd_mod scsi_mod ide_disk ide_corePid: 23541, comm: ib_mad1 Tainted: G U 2.6.16.21-0.8-smp #1RIP: 0010:[802cffea] 
802cffea{_spin_lock_irqsave+3}
RSP: 0018:810132a4fc20 EFLAGS: 00010086
RAX: 0286 RBX: RCX: 883324ee
RDX: 810128d5e380 RSI: RDI: 1b6017ff
RBP: fffc R08: 803d3260 R09: 810140333800
R10: 81000107d400 R11: 0292 R12: 810128d5e380
R13: 810132a4fc78 R14: 1b6017ff R15: 0003
FS: () GS:810142d19740() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 2b0b5e6ae180 CR3: 000128cbc000 CR4: 06e0
Process ib_mad1 (pid: 23541, threadinfo 810132a4e000, task 810142b56100)
Stack: 8833c5f5 8101302b3000 1b6012ff 0002 0296 8101302b3500 8027753e 810128d5e3a0 81012bce1680 810128d5e380
Call Trace:
Re: [openib-general] [openfabrics-ewg] Problems with OFED IPoIB HA on SLES10
Hi Scott,

You have an old version of the ipoibtools package (ipoib_ha.pl). All the issues you are talking about were fixed in the new version, which will be available in OFED-1.1-rc7. You can also download it from SVN:
https://openib.org/svn/gen2/branches/1.1/src/userspace/ipoibtools

Thanks,
Regards,
Vladimir

On Tue, 2006-10-03 at 14:53 -0700, Scott Weitzenkamp (sweitzen) wrote:
> Vlad, thanks for the fast response. I have some followup questions about configuring IPoIB HA, see below.
>
> > > 3) I got IPoIB HA working on SLES 10, but the documentation is a little lacking. Looks like I have to put the same IP address in ifcfg-ib0 and ifcfg-ib1, is this correct?
> >
> > Yes, IP address should be the same. Actually the configuration of the secondary interface does not matter. The High Availability daemon reads the configuration of the primary interface and migrates it between the interfaces in case of failure.
>
> If I don't have an ifcfg-ib1 file, then ipoib_ha.pl won't start. I would prefer to not configure ifcfg-ib1 since I don't plan to use it.
>
> # ipoib_ha.pl --with-arping --with-multicast -v
> Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
> Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
> Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
> Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
> Can't open conf /etc/sysconfig/network/ifcfg-ib1: No such file or directory
> ...
>
> If I put different IP addresses in ifcfg-ib0 and ifcfg-ib1, then the ifcfg-ib1 IP address is used for both ib0 and ib1!
>
> # pwd
> /etc/sysconfig/network
> # cat ifcfg-ib0
> DEVICE=ib0
> BOOTPROTO=static
> IPADDR=192.168.2.46
> NETMASK=255.255.255.0
>
> # cat ifcfg-ib1
> DEVICE=ib1
> BOOTPROTO=static
> IPADDR=192.168.6.46
> NETMASK=255.255.255.0
>
> # /etc/init.d/openibd start
> Loading HCA driver and Access Layer: [ OK ]
> Setting up InfiniBand network interfaces:
> ib0 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
> ib0 configuration: ib1
> Bringing up interface ib0: [ OK ]
> ib1 device: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
> Bringing up interface ib1: [ OK ]
> Setting up service network . . . [ done ]
> # ifconfig ib0
> ib0 Link encap:UNSPEC HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
>     inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
>     inet6 addr: fe80::202:c902:21:700d/64 Scope:Link
>     UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
>     RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>     TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
>     collisions:0 txqueuelen:128
>     RX bytes:0 (0.0 b) TX bytes:224 (224.0 b)
> # ifconfig ib1
> ib1 Link encap:UNSPEC HWaddr 00-00-04-05-FE-80-00-00-00-00-00-00-00-00-00-00
>     inet addr:192.168.6.46 Bcast:192.168.6.255 Mask:255.255.255.0
>     inet6 addr: fe80::202:c902:21:700e/64 Scope:Link
>     UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
>     RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>     TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
>     collisions:0 txqueuelen:128
>     RX bytes:0 (0.0 b) TX bytes:304 (304.0 b)
>
> Notice how both ib0 and ib1 have the IP address from ifcfg-ib1. This contradicts this info from ipoib_release_notes.txt:
>
> b. The ib1 interface uses the configuration script of ib0.
>
> Scott
Re: [openib-general] [PATCH] ib_cm: fix module unload race with timewait
On Saturday 30 September 2006 02:52, Sean Hefty wrote:
> If the ib_cm module is unloaded while id's are still in timewait, the CM will destroy the work queue used to process timewait. Once the id's exit timewait, their timers will fire, leading to a crash trying to access the destroyed work queue.
>
> We need to track id's that are in timewait, and cancel their deferred work on module unload.
>
> Signed-off-by: Sean Hefty [EMAIL PROTECTED]

Erez, have you tried out the patch (with or without Roland's suggested modifications)? If so, did it solve the problem? (We think it most likely did, but we would like to know.)

- Jack
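The idea in the quoted description (keep a list of everything that is in timewait, and cancel it all before the work queue goes away) can be sketched as below. This is not the ib_cm code: tw_entry, tw_enter and tw_cancel_all are hypothetical names, the timer is reduced to a flag, and locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connection id sitting in timewait. */
struct tw_entry {
	struct tw_entry *next;
	int timer_armed;	/* 1 while the timewait timer may still fire */
};

static struct tw_entry *tw_list;	/* every id currently in timewait */

/* Record an id as it enters timewait and arm its (simulated) timer. */
static void tw_enter(struct tw_entry *e)
{
	e->timer_armed = 1;
	e->next = tw_list;
	tw_list = e;
}

/*
 * Unload path: cancel every outstanding timer so that none can fire
 * after the work queue has been destroyed. Returns how many were cancelled.
 */
static int tw_cancel_all(void)
{
	int cancelled = 0;

	while (tw_list) {
		struct tw_entry *e = tw_list;

		tw_list = e->next;
		e->next = NULL;
		if (e->timer_armed) {
			e->timer_armed = 0;
			cancelled++;
		}
	}
	return cancelled;
}
```

In this sketch the unload path would call tw_cancel_all() first and destroy the work queue only afterwards.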
Re: [openib-general] rdma_cm branch
Sean Hefty wrote:
> Or Gerlitz wrote:
> > Can you clarify what you mean by (ABI) conflict with OFED releases? Is it an issue with someone wishing to work with OFED user space and IB code from the upstream kernel?
>
> Yes - there could be issues there.
>
> > As long as OFED provides kernel IB code you need not support the above config. The approach I suggest is: it makes sense to take some care not to create too many non-working scenarios... however the upstream push process must **not** be restricted by the existence of OFED.
>
> I agree with this.

cool.

> > Specifically, can you push the rdma_establish() ***kernel*** API support, which was integrated into OFED 1.1, as a bug fix for 2.6.19?
>
> Yes, but I'd like a user of it to go in at the same time.

I don't think this is possible, nor is it required. The thing is that the only in-tree consumer of the cma code is the iser initiator, which implements the active side of an rdma connection. As such it does not call rdma_accept(), nor can it be modified to call rdma_establish(), so the _establish() call can be merged during the bug fixes window, just as the _accept() call was merged during the feature window.

If you find it problematic to merge it for 2.6.19, I think you should demand ***removing*** the rdma_establish() call from OFED 1.1, as this puts the kernel code in second place relative to OFED and violates another guideline: OFED uses ***kernel IB code***, where kernel IB code stands for code that has been merged into Linus' tree, or is at some branch of Roland's tree (or your tree when you have such...), or at the -mm tree etc.

Or.
Re: [openib-general] RHEL 4 U3 - lost completions
Roland Dreier wrote:
>     Or> Roland - If indeed, does it make sense that the problem does
>     Or> not reproduce with single threaded runs?
>
> Sorry, I can't parse the question. However, the problem here seems to be that the CQ buffer pages end up being marked for copy-on-write, and I don't know of any reason why that would happen other than a fork() happening somewhere (possibly behind the scenes in a system() call or something like that).

My question was: assuming there is some fork() (e.g. behind the scenes of daemonize()) in the app, does it make sense that everything works as long as the app is single threaded, but things break when there are multiple threads (e.g. COW is applied on the page used to hold the CQ etc.)?

Or.
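To make the copy-on-write point concrete, here is a small user-space sketch showing that after fork() a write to a formerly shared page is no longer visible to the other process. It says nothing about the HCA or the CQ itself; it only shows the mechanism by which two views of "the same" buffer can diverge, which is what Roland's explanation relies on. The function name parent_view_after_child_write is made up for this example.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the parent's view of buf[0] after a forked child has written
 * to its own view of the same buffer, or -1 on error.
 */
static int parent_view_after_child_write(void)
{
	unsigned char *buf = malloc(4096);
	pid_t pid;
	int status, seen;

	if (!buf)
		return -1;
	buf[0] = 1;

	pid = fork();
	if (pid < 0) {
		free(buf);
		return -1;
	}
	if (pid == 0) {
		buf[0] = 2;	/* the write lands in the child's private copy */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0) {
		free(buf);
		return -1;
	}

	seen = buf[0];		/* the parent's copy is unchanged */
	free(buf);
	return seen;
}
```

The parent still reads 1 even though the child wrote 2 to the same virtual address: after the fork the two processes no longer share the page once one of them writes to it.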
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
rick wrote:
> For what it's worth: As a customer who is using the SS stack - we were more than pleased that we could achieve IPOIB (and RDS) failover without using the bonding driver. I believe this is a direct result of the Virtual NIC approach SS is using.

Were you pleased because it gave you a solution for Oracle/IPoIB/RDS/HA at a time when the bonding driver has no support for IPoIB? Or has the VNIC provided you with some feature that differentiates it from the active-backup mode of the bonding driver?

Or.
Re: [openib-general] rdma_cm branch
Sean Hefty wrote:
> > Can you clarify what you mean by (ABI) conflict with OFED releases? Is it an issue with someone wishing to work with OFED user space and IB code from the upstream kernel?
>
> Yes - there could be issues there.
>
> To clarify the major issue: currently when a connection request is received, the connection data specified by the active side through the rdma_conn_param is NOT given to the user. This includes the responder_resources and initiator_depth. There's no easy way to obtain this information.

And when getting the established event, the connection data specified by the passive side through the rdma_conn_param provided to rdma_accept is also not given to the user. Is that an issue?

Or.
Re: [openib-general] 2.6.18 kernel support in the main trunk.
On Tue, 2006-10-03 at 09:25 -0500, Steve Wise wrote:
> > Someday soon I hear, OFA will be able to host git repositories, so my preference is to delay any svn to git transition until then. (I cannot host git from inside Intel's firewall, nor can I access a git repository which isn't hosted at kernel.org.) How would you handle merging in changes from the main branch to side branches?
>
> Can OFA give us a date on when this will happen?

We just got approval to spend OFA money on a new hosted server. The arrangements are being made but we don't have a date for when we will get access to this new machine or when it will be set up. If I had to guess I'd say we will start setting up the server in the next couple of weeks.

Thanks,
- Matt
Re: [openib-general] Infiniband Fedora Core5
No, Fedora Core 5 is not part of the OFED OS matrix.

Marsh, Scott wrote:
> Good day,
>
> My name is Scott Marsh. I am an Engineer for Analogic Corporation and I have a few questions regarding OFED.
>
> Is there any current development towards OFED for use with Fedora Core 5? If so, is there a timeline for working towards Fedora Core 5?
>
> Thank you.
>
> Regards,
> Scott Marsh
>
> The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [EMAIL PROTECTED] - and destroy all copies of this information, including any attachments, without reading or disclosing them. Thank you.
Re: [openib-general] rdma_cm branch
Quoting r. Sean Hefty [EMAIL PROTECTED]:
> To clarify the major issue: currently when a connection request is received, the connection data specified by the active side through the rdma_conn_param is NOT given to the user. This includes the responder_resources and initiator_depth. There's no easy way to obtain this information.
>
> The ideal fix for this is to include rdma_conn_param as part of the rdma_cm_event.

BTW, wouldn't it be cleaner to just pass it up in the request event?

> However, this breaks every userspace app that's been coded to OFED / SVN. An alternative is to add another call to retrieve the data, but that's not a very clean alternative for new kernel submission.

Another alternative is to version the create ID call.

-- 
MST
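One way to picture the "version the create ID call" alternative: the application states which ABI version it speaks when it creates its ID, and events are then written in the matching layout, so old binaries keep receiving the layout they were built against. The sketch below is purely illustrative. The struct layouts, the version numbers and fill_event are all invented for this example and are not the real rdma_cm or rdma_ucm ABI.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts; not the real rdma_cm ABI. */
struct conn_param {
	uint8_t responder_resources;
	uint8_t initiator_depth;
};

struct event_v1 {		/* what existing applications expect */
	int type;
};

struct event_v2 {		/* extended event carrying the peer's parameters */
	int type;
	struct conn_param param;
};

/*
 * Write an event in the layout matching the ABI version the application
 * declared when it created its ID. Returns the number of bytes written,
 * or 0 if the version is unknown or the buffer is too small.
 */
static size_t fill_event(unsigned abi_version, int type,
			 const struct conn_param *param,
			 void *out, size_t out_len)
{
	if (abi_version == 1 && out_len >= sizeof(struct event_v1)) {
		struct event_v1 ev = { .type = type };

		memcpy(out, &ev, sizeof(ev));
		return sizeof(ev);
	}
	if (abi_version == 2 && out_len >= sizeof(struct event_v2)) {
		struct event_v2 ev = { .type = type, .param = *param };

		memcpy(out, &ev, sizeof(ev));
		return sizeof(ev);
	}
	return 0;
}
```

The trade-off is that both layouts have to be maintained for as long as version 1 applications exist.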
Re: [openib-general] [PATCH 4/13] osm: port to WinIB stack : osmtest/osmtest.c
Hi Eitan,

On Sun, 2006-09-17 at 11:59, Eitan Zahavi wrote:
> Hi Hal
>
> Explicit cast required for the win compiler to handle this...

Applied to trunk only. Accepted to be consistent with other patches applied for Windows which currently accepted casts.

-- Hal

> Thanks
>
> Eitan
Re: [openib-general] [PATCH] osm: port to WinIB stack
Hi Eitan,

On Tue, 2006-10-03 at 17:49, Hal Rosenstock wrote:
> Hi Eitan,
>
> Aside from the varargs handling (relative to 2 patches) and the osmtest.c question, the osmtest.c patch has been applied. The question is more general as to why the casts were needed for Windows.
>
> -- Hal

Also pending is a patch to remove the WIN defines just added in multiple places and move them to config.h for the Windows build. Can you prepare a patch for this, and if so, when?

Thanks!

-- Hal
[openib-general] [PATCH repost] IB/mthca: query port fix
Fill in max_vl_num (encoded according to VLCap field in the PortInfo MAD), and init_type_reply values in the ib_query_port verb.

Signed-off-by: Jack Morgenstein [EMAIL PROTECTED]

---

This was posted a while ago - could the fix go into 2.6.19?

Index: ofed_1_1/drivers/infiniband/hw/mthca/mthca_provider.c
===
--- ofed_1_1.orig/drivers/infiniband/hw/mthca/mthca_provider.c	2006-08-03 14:30:21.0 +0300
+++ ofed_1_1/drivers/infiniband/hw/mthca/mthca_provider.c	2006-08-20 09:37:10.647839000 +0300
@@ -179,6 +179,8 @@ static int mthca_query_port(struct ib_de
 	props->max_mtu           = out_mad->data[41] & 0xf;
 	props->active_mtu        = out_mad->data[36] >> 4;
 	props->subnet_timeout    = out_mad->data[51] & 0x1f;
+	props->max_vl_num        = out_mad->data[37] >> 4;
+	props->init_type_reply   = out_mad->data[41] >> 4;
 
  out:
 	kfree(in_mad);

-- 
MST
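The shifts and masks in the hunk above pick 4-bit and 5-bit fields out of single bytes of the MAD payload. A stand-alone sketch of the same extraction follows, using the byte offsets from the patch; struct port_fields and decode_port_info are made-up names for this example, and the input is just a plain byte array rather than a real MAD.

```c
#include <stdint.h>

/* Hypothetical container for the fields decoded in the patch. */
struct port_fields {
	uint8_t max_mtu;
	uint8_t active_mtu;
	uint8_t subnet_timeout;
	uint8_t max_vl_num;
	uint8_t init_type_reply;
};

/* data must point to at least 52 bytes of payload. */
static struct port_fields decode_port_info(const uint8_t *data)
{
	struct port_fields f;

	f.max_mtu         = data[41] & 0xf;	/* low nibble of byte 41 */
	f.active_mtu      = data[36] >> 4;	/* high nibble of byte 36 */
	f.subnet_timeout  = data[51] & 0x1f;	/* low 5 bits of byte 51 */
	f.max_vl_num      = data[37] >> 4;	/* high nibble of byte 37 */
	f.init_type_reply = data[41] >> 4;	/* high nibble of byte 41 */
	return f;
}
```

Note that byte 41 carries two fields: init_type_reply in its high nibble and max_mtu in its low nibble.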
[openib-general] [PATCH repost] IB/srp: re-create QP and CQ on reconnect
From: Ishai Rabinovitz [EMAIL PROTECTED]

Make srp destroy/re-create QP and CQ on each reconnect. This makes SRP more robust in the presence of hardware errors and is closer to the behaviour suggested by the IB spec, reducing the chance of stale packets.

Signed-off-by: Ishai Rabinovitz [EMAIL PROTECTED]
Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

---

Roland, this was posted a while ago, and still applies to for-2.6.19 with a small offset. Looks like a good idea - could this go into 2.6.19?

A description from the original mail below:

For some reason (could be a firmware problem) I got a CQ overrun in SRP. Because of that there was a QP FATAL. Since in srp_reconnect_target we are not destroying the QP, the QP FATAL persists after the reconnect. In order to be able to recover from such a situation I suggest we destroy the CQ and the QP on every reconnect.

This also corrects a minor spec non-compliance: when srp_reconnect_target is called, srp destroys the CM ID and resets the QP, so the new connection will be retried with the same QPN, which could theoretically lead to stale packets (for strict spec compliance I think the QPN should not be reused until all stale packets are flushed out of the network).
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:23:52.0 +0300
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-08-31 12:30:48.0 +0300
@@ -495,10 +495,10 @@
 static int srp_reconnect_target(struct srp_target_port *target)
 {
 	struct ib_cm_id *new_cm_id;
-	struct ib_qp_attr qp_attr;
 	struct srp_request *req, *tmp;
-	struct ib_wc wc;
 	int ret;
+	struct ib_cq *old_cq;
+	struct ib_qp *old_qp;
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	if (target->state != SRP_TARGET_LIVE) {
@@ -522,17 +522,17 @@
 	ib_destroy_cm_id(target->cm_id);
 	target->cm_id = new_cm_id;
 
-	qp_attr.qp_state = IB_QPS_RESET;
-	ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE);
-	if (ret)
-		goto err;
-
-	ret = srp_init_qp(target, target->qp);
-	if (ret)
+	old_qp = target->qp;
+	old_cq = target->cq;
+	ret = srp_create_target_ib(target);
+	if (ret) {
+		target->qp = old_qp;
+		target->cq = old_cq;
 		goto err;
+	}
 
-	while (ib_poll_cq(target->cq, 1, &wc) > 0)
-		; /* nothing */
+	ib_destroy_qp(old_qp);
+	ib_destroy_cq(old_cq);
 
 	spin_lock_irq(target->scsi_host->host_lock);
 	list_for_each_entry_safe(req, tmp, &target->req_queue, list)

-- 
MST
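The ordering used in the second hunk (remember the old resources, try to create new ones, and destroy the old ones only once creation has succeeded) can be sketched outside the kernel as follows. Everything here is a stand-in: struct res replaces the QP and CQ, create_target_res plays the role of srp_create_target_ib, and fail_next_create is a hook that exists only so the failure path can be exercised.

```c
#include <stdlib.h>

struct res { int id; };			/* hypothetical stand-in for a QP or CQ */
struct target { struct res *qp, *cq; };

static int next_id = 1;
static int fail_next_create;		/* test hook: force the next creation to fail */

/* Allocate a fresh CQ and QP; on failure the target is left untouched. */
static int create_target_res(struct target *t)
{
	struct res *cq, *qp;

	if (fail_next_create)
		return -1;
	cq = malloc(sizeof(*cq));
	qp = malloc(sizeof(*qp));
	if (!cq || !qp) {
		free(cq);
		free(qp);
		return -1;
	}
	cq->id = next_id++;
	qp->id = next_id++;
	t->cq = cq;
	t->qp = qp;
	return 0;
}

/* Replace both resources, keeping the old ones if creation fails. */
static int reconnect(struct target *t)
{
	struct res *old_qp = t->qp;
	struct res *old_cq = t->cq;
	int ret = create_target_res(t);

	if (ret) {
		t->qp = old_qp;		/* restore, as the patch does */
		t->cq = old_cq;
		return ret;
	}
	free(old_qp);			/* old resources go away only on success */
	free(old_cq);
	return 0;
}
```

Because the new pair is allocated while the old pair still exists, the new QP cannot get the same identifier as the old one, which is the property the description wants in order to avoid stale packets.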
[openib-general] [PATCH] IB/SRP set initiator_extention from user space
There is a need for an initiator to connect to the same target several times, e.g., once from each IB port of the target. Some targets do not support multichannel. In order to work with them as well:

1) Use port_guid instead of node_guid.
2) Allow the user to set the identifier_extension when providing the target attributes.

Signed-off-by: Ishai Rabinovitz [EMAIL PROTECTED]

---

Roland, Madhu and MST, I think this summarizes our discussion.

Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.c
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.c	2006-10-03 15:38:16.0 +0200
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.c	2006-10-03 18:10:34.0 +0200
@@ -329,25 +329,29 @@ static int srp_send_req(struct srp_targe
 	req->priv.req_it_iu_len = cpu_to_be32(srp_max_iu_len);
 	req->priv.req_buf_fmt 	= cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
 					      SRP_BUF_FORMAT_INDIRECT);
+
 	/*
 	 * In the published SRP specification (draft rev. 16a), the
 	 * port identifier format is 8 bytes of ID extension followed
-	 * by 8 bytes of GUID.  Older drafts put the two halves in the
-	 * opposite order, so that the GUID comes first.
+	 * by 8 bytes of port GUID.  Older drafts put the two halves in the
+	 * opposite order, so that the port GUID comes first.
 	 *
 	 * Targets conforming to these obsolete drafts can be
 	 * recognized by the I/O Class they report.
 	 */
+
 	if (target->io_class == SRP_REV10_IB_IO_CLASS) {
 		memcpy(req->priv.initiator_port_id,
-		       target->srp_host->initiator_port_id + 8, 8);
+		       &target->path.sgid.global.interface_id, 8);
 		memcpy(req->priv.initiator_port_id + 8,
-		       target->srp_host->initiator_port_id, 8);
+		       &target->initiator_ext, 8);
 		memcpy(req->priv.target_port_id,     &target->ioc_guid, 8);
 		memcpy(req->priv.target_port_id + 8, &target->id_ext, 8);
 	} else {
 		memcpy(req->priv.initiator_port_id,
-		       target->srp_host->initiator_port_id, 16);
+		       &target->initiator_ext, 8);
+		memcpy(req->priv.initiator_port_id + 8,
+		       &target->path.sgid.global.interface_id, 8);
 		memcpy(req->priv.target_port_id,     &target->id_ext, 8);
 		memcpy(req->priv.target_port_id + 8, &target->ioc_guid, 8);
 	}
@@ -1557,6 +1561,7 @@ enum {
 	SRP_OPT_MAX_SECT	= 1 << 5,
 	SRP_OPT_MAX_CMD_PER_LUN	= 1 << 6,
 	SRP_OPT_IO_CLASS	= 1 << 7,
+	SRP_OPT_INITIATOR_EXT	= 1 << 8,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -1573,6 +1578,7 @@ static match_table_t srp_opt_tokens = {
 	{ SRP_OPT_MAX_SECT,		"max_sect=%d" 		},
 	{ SRP_OPT_MAX_CMD_PER_LUN,	"max_cmd_per_lun=%d" 	},
 	{ SRP_OPT_IO_CLASS,		"io_class=%x"		},
+	{ SRP_OPT_INITIATOR_EXT,	"initiator_ext=%s"	},
 	{ SRP_OPT_ERR,			NULL 			}
 };
 
@@ -1672,6 +1678,12 @@ static int srp_parse_options(const char 
 			target->io_class = token;
 			break;
 
+		case SRP_OPT_INITIATOR_EXT:
+			p = match_strdup(args);
+			target->initiator_ext = cpu_to_be64(simple_strtoull(p, NULL, 16));
+			kfree(p);
+			break;
+
 		default:
 			printk(KERN_WARNING PFX "unknown parameter or missing value "
 			       "'%s' in target creation request\n", p);
@@ -1820,9 +1832,6 @@ static struct srp_host *srp_add_port(str
 	host->dev  = device;
 	host->port = port;
 
-	host->initiator_port_id[7] = port;
-	memcpy(host->initiator_port_id + 8, &device->dev->node_guid, 8);
-
 	host->class_dev.class = &srp_class;
 	host->class_dev.dev   = device->dev->dma_device;
 	snprintf(host->class_dev.class_id, BUS_ID_SIZE, "srp-%s-%d",
Index: last_stable/drivers/infiniband/ulp/srp/ib_srp.h
===
--- last_stable.orig/drivers/infiniband/ulp/srp/ib_srp.h	2006-10-03 15:38:16.0 +0200
+++ last_stable/drivers/infiniband/ulp/srp/ib_srp.h	2006-10-03 18:05:50.0 +0200
@@ -91,7 +91,6 @@ struct srp_device {
 };
 
 struct srp_host {
-	u8			initiator_port_id[16];
 	struct srp_device      *dev;
 	u8			port;
 	struct class_device	class_dev;
@@ -122,6 +121,7 @@ struct srp_target_port {
 	__be64			id_ext;
 	__be64
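The two layouts of the 16-byte initiator port identifier described in the patch comment (ID extension first for rev. 16a, port GUID first for the older drafts) can be sketched as below. This is a user-space illustration only: put_be64 and build_initiator_port_id are made-up helper names, the values are passed in host byte order and converted here, and the GUID and extension used in the usage note are arbitrary example values.

```c
#include <stdint.h>

/* Store a 64-bit value at dst in big-endian (wire) byte order. */
static void put_be64(uint8_t *dst, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * Build the 16-byte initiator port identifier.
 * rev10 == 0: 8 bytes of ID extension followed by 8 bytes of port GUID.
 * rev10 != 0: the older layout, with the port GUID first.
 */
static void build_initiator_port_id(uint8_t out[16], uint64_t initiator_ext,
				    uint64_t port_guid, int rev10)
{
	if (rev10) {
		put_be64(out, port_guid);
		put_be64(out + 8, initiator_ext);
	} else {
		put_be64(out, initiator_ext);
		put_be64(out + 8, port_guid);
	}
}
```

For example, with initiator_ext = 0x1 and port_guid = 0x0002c9020021700d, the current layout puts 0x01 in byte 7 and 0x0d in byte 15, and the older layout swaps the two halves. Giving each connection to the same target a different initiator_ext yields a distinct identifier per connection, which is what the description is after.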
Re: [openib-general] Fwd: Re: Problems with OFED IPoIB HA on SLES10
Another point: this seems to be crashing while we are requeueing the packet through dev_start_xmit upon path record completion. It looks like this could try to requeue even though the interface is going down - could this trigger some problems? Quoting r. Michael S. Tsirkin [EMAIL PROTECTED]: Subject: Fwd: Re: Problems with OFED IPoIB HA on SLES10 BTW, any idea? The ipoib_ha is just a script that ups/downs and configures interfaces, so this crash it seems coul also happen on systems without it. -- MST Date: Tue, 3 Oct 2006 22:39:54 -0700 From: Scott Weitzenkamp (sweitzen) [EMAIL PROTECTED] Subject: Re: [openib-general] Problems with OFED IPoIB HA on SLES10 If I fail back and forth between ib0 and ib1 every 30 seconds or so for several hours, while IPoIB traffic is running, IPoIB host gets an Oops: and IPoIB stops working. ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue 
packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib1: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet general protection fault: [1] SMP last sysfs file: 
/devices/pci:00/:00:00.0/irq CPU 7 Modules linked in: af_packet ib_sdp rdma_ucm rdma_cm ib_addr ib_cm ib_ipoib ib_s a ib_uverbs ib_umad ib_mthca ib_mad ib_core nls_utf8 st ipv6 nfs lockd nfs_acl s unrpc button battery ac apparmor aamatch_pcre loop usbhid dm_mod hw_random ide_c d ehci_hcd uhci_hcd cdrom i8xx_tco ide_floppy usbcore shpchp e1000 pci_hotplug f loppy reiserfs edd fan thermal processor siimage sg mptspi mptscsih mptbase scsi _transport_spi piix sd_mod scsi_mod ide_disk ide_core Pid: 23541, comm: ib_mad1 Tainted: G U 2.6.16.21-0.8-smp #1 RIP: 0010:[802cffea] 802cffea{_spin_lock_irqsave+3} RSP: 0018:810132a4fc20 EFLAGS: 00010086 RAX: 0286 RBX: RCX: 883324ee RDX: 810128d5e380 RSI: RDI: 1b6017ff RBP: fffc R08: 803d3260 R09: 810140333800 R10: 81000107d400 R11: 0292 R12: 810128d5e380 R13: 810132a4fc78 R14: 1b6017ff R15:
Re: [openib-general] 2.6.18 kernel support in the main trunk.
Quoting r. Matt Leininger [EMAIL PROTECTED]:
> We just got approval to spend OFA money on a new hosted server. The arrangements are being made but we don't have a date for when we will get access to this new machine or when it will be set up. If I had to guess I'd say we will start setting up the server in the next couple weeks.
>
> Thanks,
> - Matt

Thanks. A couple more requests while you are working on the infrastructure:
- an updated svn server enables fast mirroring, better web access and other goodies
- add a bugzilla email gateway (as seen e.g. at kernel.org) that supports accepting Cc mail where you put [Bug ] in the subject (where is the bug number) and cc [EMAIL PROTECTED]

Could these be addressed?

-- 
MST
___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] RFC: potential race in ipoib
Not related to the recently discussed oops, but I think I see an oopsable race in path_rec_completion. We do:

	if (dev_queue_xmit(skb))
		ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n");

If the device is going away (e.g. hotplug remove) and the skb is the last one, the priv pointer might not exist anymore after dev_queue_xmit, so the attempt to read the name in ipoib_warn will then lead to a crash.

Do we even need the ipoib_warn? It's not too hard to trigger it by downing the device while a path record query is in progress. Maybe just remove the message?

-- 
MST
[openib-general] [PATCH fixed] IB/srp: enable multiple connections to the same target
Enable multiple concurrent connections to the same SRP target:

1) Use port guid instead of node guid in the initiator port identifier.
2) Let the user specify the identifier extension when adding the device.

Signed-off-by: Ishai Rabinovitz [EMAIL PROTECTED]
Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

---

Looks like the last patch Ishai posted didn't apply to the upstream srp. Here's the version that does. Comments?

 drivers/infiniband/ulp/srp/ib_srp.c |   19 +++++++++++++------
 drivers/infiniband/ulp/srp/ib_srp.h |    2 +-
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 44b9e5b..273a688 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -343,14 +343,16 @@ static int srp_send_req(struct srp_targe
 	 */
 	if (target->io_class == SRP_REV10_IB_IO_CLASS) {
 		memcpy(req->priv.initiator_port_id,
-		       target->srp_host->initiator_port_id + 8, 8);
+		       &target->path.sgid.global.interface_id, 8);
 		memcpy(req->priv.initiator_port_id + 8,
-		       target->srp_host->initiator_port_id, 8);
+		       &target->initiator_ext, 8);
 		memcpy(req->priv.target_port_id,     &target->ioc_guid, 8);
 		memcpy(req->priv.target_port_id + 8, &target->id_ext, 8);
 	} else {
 		memcpy(req->priv.initiator_port_id,
-		       target->srp_host->initiator_port_id, 16);
+		       &target->initiator_ext, 8);
+		memcpy(req->priv.initiator_port_id + 8,
+		       &target->path.sgid.global.interface_id, 8);
 		memcpy(req->priv.target_port_id,     &target->id_ext, 8);
 		memcpy(req->priv.target_port_id + 8, &target->ioc_guid, 8);
 	}
@@ -1553,6 +1555,7 @@ enum {
 	SRP_OPT_MAX_SECT	= 1 << 5,
 	SRP_OPT_MAX_CMD_PER_LUN	= 1 << 6,
 	SRP_OPT_IO_CLASS	= 1 << 7,
+	SRP_OPT_INITIATOR_EXT	= 1 << 8,
 	SRP_OPT_ALL		= (SRP_OPT_ID_EXT	|
 				   SRP_OPT_IOC_GUID	|
 				   SRP_OPT_DGID		|
@@ -1569,6 +1572,7 @@ static match_table_t srp_opt_tokens = {
 	{ SRP_OPT_MAX_SECT,		"max_sect=%d"		},
 	{ SRP_OPT_MAX_CMD_PER_LUN,	"max_cmd_per_lun=%d"	},
 	{ SRP_OPT_IO_CLASS,		"io_class=%x"		},
+	{ SRP_OPT_INITIATOR_EXT,	"initiator_ext=%s"	},
 	{ SRP_OPT_ERR,			NULL			}
 };
 
@@ -1668,6 +1672,12 @@ static int srp_parse_options(const char
 			target->io_class = token;
 			break;
 
+		case SRP_OPT_INITIATOR_EXT:
+			p = match_strdup(args);
+			target->initiator_ext = cpu_to_be64(simple_strtoull(p, NULL, 16));
+			kfree(p);
+			break;
+
 		default:
 			printk(KERN_WARNING PFX "unknown parameter or missing value "
 			       "'%s' in target creation request\n", p);
@@ -1815,9 +1825,6 @@ static struct srp_host *srp_add_port(str
 	host->dev = device;
 	host->port = port;
 
-	host->initiator_port_id[7] = port;
-	memcpy(host->initiator_port_id + 8, &device->dev->node_guid, 8);
-
 	host->class_dev.class = &srp_class;
 	host->class_dev.dev   = device->dev->dma_device;
 	snprintf(host->class_dev.class_id, BUS_ID_SIZE, "srp-%s-%d",
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 5b581fb..d4e35ef 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -91,7 +91,6 @@ struct srp_device {
 };
 
 struct srp_host {
-	u8			initiator_port_id[16];
 	struct srp_device      *dev;
 	u8			port;
 	struct class_device	class_dev;
@@ -122,6 +121,7 @@ struct srp_target_port {
 	__be64			id_ext;
 	__be64			ioc_guid;
 	__be64			service_id;
+	__be64			initiator_ext;
 	u16			io_class;
 	struct srp_host	       *srp_host;
 	struct Scsi_Host       *scsi_host;

-- 
MST
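The identifier layout that the patch builds can be sketched in userspace C. This is only an illustration of the byte ordering described in the diff, not the driver code: the helper names (put_be64, build_initiator_port_id, parse_initiator_ext) and the rev10 flag are hypothetical, and strtoull stands in for the kernel's simple_strtoull.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Store a 64-bit value in big-endian (wire) byte order. */
static void put_be64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * Fill the 16-byte initiator port identifier (hypothetical helper).
 * rev10 != 0: port GUID first, then the extension, as in the
 *             SRP_REV10_IB_IO_CLASS branch of the patch.
 * rev10 == 0: extension first, then the port GUID.
 */
static void build_initiator_port_id(uint8_t id[16], int rev10,
				    uint64_t port_guid, uint64_t initiator_ext)
{
	uint8_t guid[8], ext[8];

	assert(id != NULL);
	put_be64(guid, port_guid);
	put_be64(ext, initiator_ext);

	if (rev10) {
		memcpy(id, guid, 8);
		memcpy(id + 8, ext, 8);
	} else {
		memcpy(id, ext, 8);
		memcpy(id + 8, guid, 8);
	}
}

/* Parse the initiator_ext option value as hexadecimal (base 16). */
static uint64_t parse_initiator_ext(const char *s)
{
	return strtoull(s, NULL, 16);
}
```

Because the port GUID differs per port and the extension is chosen by the user, two logins through the same port with different initiator_ext values yield different 16-byte identifiers, which is what lets a target tell the connections apart.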
Re: [openib-general] [PATCH fixed] IB/srp: enable multiple connections to the same target
> Enable multiple concurrent connections to the same SRP target
> 1) Use port guid instead of node guid in the initiator port identifier.
> 2) Let the user specify the identifier extension when adding the device.
>
> Signed-off-by: Ishai Rabinovitz [EMAIL PROTECTED]
> Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]
> ---
> Looks like the last patch Ishai posted didn't apply to the upstream srp. Here's the version that does. Comments?

I had some trouble applying the patch as well. I'll try again and let you know soon. From reviewing the code, it appears to fulfill the requirements we agreed upon.

Madhu
Re: [openib-general] FW: FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD
Quoting r. Moshe Kazir [EMAIL PROTECTED]:
Subject: FW: [openib-general] FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD

> Michael,
>
> I received the attached files from Frank. They look small, easy to understand, and change almost nothing in the code. The patch solves the ppc64 problems.
>
> Please approve the patch and integrate it into OFED-1.1-rc7.
>
> I tested it. It's working o.k. on JS21 ppc64 sles 10, JS21 ppc64 sles9, redhat as4 u3 x86_64, redhat as4 u3 i386.
>
> Frank also tested it on AMD and JS21 PPC and MAC PPC64.
>
> Best regards,
> Moshe

OK, not sure what's in a tarball, but the patch looks small and safe enough to go in. But we need the Signed-off-by line from the patch author, certifying to the Developer's Certificate of Origin 1.1.

The rules are pretty simple: if you can certify the below:

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or

(c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.

(d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.

then you just add a line saying

	Signed-off-by: Random J Developer [EMAIL PROTECTED]

-- 
MST
Re: [openib-general] Infiniband Fedora Core5
Hi Scott,

While OFED might not target FC5, you can use the OFA stack on FC5. In fact, FC5 comes with the OFA drivers (the code in drivers/infiniband) pre-compiled. You can also download the latest development tree and compile the current drivers/libraries/apps. Of course, neither of these options has received as much testing as the OFED distribution.

james

On Wed, 4 Oct 2006, Aviram Gutman wrote:
> No, Fedora Core 5 is not part of the OFED OS matrix.
>
> Marsh, Scott wrote:
>> Good day,
>>
>> My name is Scott Marsh. I am an Engineer for Analogic Corporation and I have a few questions regarding OFED. Is there any current development towards OFED for use with Fedora Core 5? If so, is there a timeline for working towards Fedora Core 5? Thank you.
>>
>> Regards,
>> Scott Marsh
Re: [openib-general] [PATCH v2] ib_cm: fix module unload race with timewait
Sean Hefty wrote:
> Updated patch based on Roland's feedback - converted a couple uses of spinlock_irqsave to spinlock_irq, and used list manipulation routine for cleanup.

Sean,

Your patch seems to work fine. I ran the same test several times (after applying the patch) and didn't see any oops.

Thanks

-- 
Erez Zilber | 972-9-971-7689
Software Engineer, Storage Team
Voltaire – The Grid Backbone
www.voltaire.com
Re: [openib-general] RHEL 4 U3 - lost completions
Michael S. Tsirkin wrote:
> Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]:
>> AFAIR there is a bug in kernel 2.6.9 that makes it possible for a page to be changed in a process's VM even though it is locked by get_user_pages(). That is why the Mellanox driver used mlock() in addition to get_user_pages(). I think this bug was fixed somewhere around 2.6.11.
>
> I think it got fixed around 2.6.7. RHEL4 U3 has this fix, and AFAIK last SLES9 update has backported that to 2.6.7 too.

Another data point here. On gen1 stacks + RHEL 4 U3, the app I'm working on mlock()s a region from user space and also does get_user_pages() on the same region from a kernel piece of the app. When the adapter was closed or the registration was freed, the region was munlock()ed by the IB stack and the page structs changed from under us, even though the app still had get_user_pages() on the region. Is this an indication that get_user_pages() does not guarantee that a page does not move on RHEL 4 U3?

I created a test case using pthreads that simulates what the real app does, but I cannot recreate the problem. I will continue to debug the app. I will also verify no forks take place.

-Bill
[openib-general] [PATCH] mstflint not working on ppc64 and when driver is not loaded on AMD
mmap() does not work on ppc64. The 64-bit machines with 32-bit I/O need ioremap in device driver to allow mmap access to the I/O memory. This patch checks the above situations and try to use PCI config to do the firmware update when mmap() failed. Signed-off-by: Tseng-Hui (Frank) Lin [EMAIL PROTECTED] === diff -uPr mstflint.ofed-1.1r6/mtcr.h mstflint/mtcr.h --- mstflint.ofed-1.1r6/mtcr.h 2006-09-17 10:46:21.0 -0500 +++ mstflint/mtcr.h 2006-10-03 10:29:38.0 -0500 @@ -294,6 +294,9 @@ int err; char buf[]=:00:00.0; char path[]=/sys/bus/pci/devices/:00:00.0/resource0; + unsigned domain, bus, dev, func; + struct stat dummybuf; + char file_name[]=/proc/bus/pci/:00/00.0; mf=(mfile*)malloc(sizeof(mfile)); if (!mf) return 0; @@ -338,13 +341,14 @@ mf-ptr = mmap(NULL, 0x10, PROT_READ | PROT_WRITE, MAP_SHARED, mf-fd, 0); -if ( (! mf-ptr) || (mf-ptr == MAP_FAILED) ) goto map_failed; +if ( (! mf-ptr) || (mf-ptr == MAP_FAILED) || +(__be32_to_cpu(*((u_int32_t *) ((char *) mf-ptr + 0xF0014))) == 0x) ) +goto map_failed_try_pciconf; } #endif else { #if CONFIG_ENABLE_MMAP -unsigned bus, dev, func; if (mfind(name,offset,bus,dev,func)) goto find_failed; #if CONFIG_USE_DEV_MEM @@ -352,8 +356,6 @@ if (mf-fd0) goto open_failed; #else { - struct stat dummybuf; - char file_name[]=/proc/bus/pci/:00/00.0; sprintf(file_name,/proc/bus/pci/%2.2x/%2.2x.%1.1x, bus,dev,func); if (stat(file_name,dummybuf)) @@ -369,7 +371,9 @@ mf-ptr = mmap(NULL, 0x10, PROT_READ | PROT_WRITE, MAP_SHARED, mf-fd, offset); -if ( (! mf-ptr) || (mf-ptr == MAP_FAILED) ) goto map_failed; +if ( (! 
mf-ptr) || (mf-ptr == MAP_FAILED) || +(__be32_to_cpu(*((u_int32_t *) ((char *) mf-ptr + 0xF0014))) == 0x) ) +goto map_failed_try_pciconf; #else goto open_failed; @@ -379,6 +383,20 @@ #if CONFIG_ENABLE_MMAP +map_failed_try_pciconf: +#if CONFIG_ENABLE_PCICONF + mf-ptr = NULL; + close(mf-fd); + if (sscanf(name, %x:%x:%x.%x, domain, bus, dev, func) != 4) { + domain = 0; + if (sscanf(name, %x:%x.%x, bus, dev, func) != 3) goto map_failed; + } + snprintf(file_name, sizeof file_name, /proc/bus/pci/%2.2x/%2.2x.%1.1x, bus, dev, func); + if (stat(file_name,dummybuf)) + snprintf(file_name, sizeof file_name, /proc/bus/pci/%4.4x:%2.2x/%2.2x.%1.1x, domain, bus,dev,func); + if ((mf-fd = open(file_name, O_RDWR | O_SYNC)) = 0) return mf; +#endif + map_failed: #if !CONFIG_USE_DEV_MEM ioctl_failed: ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
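The fallback in the patch parses the device name either as domain:bus:dev.func or as bus:dev.func before building a /proc/bus/pci path. A minimal userspace sketch of that parsing step follows; struct pci_addr, parse_pci_name and format_proc_path are hypothetical names used only for illustration, and the sketch does not touch mmap or the config-space access itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for a parsed PCI address. */
struct pci_addr {
	unsigned domain, bus, dev, func;
};

/*
 * Accept "domain:bus:dev.func" or "bus:dev.func" (all fields hex).
 * Returns 0 on success, -1 if the name matches neither form.
 */
static int parse_pci_name(const char *name, struct pci_addr *a)
{
	assert(name != NULL && a != NULL);

	if (sscanf(name, "%x:%x:%x.%x",
		   &a->domain, &a->bus, &a->dev, &a->func) == 4)
		return 0;

	/* Short form: no domain given, so assume domain 0. */
	a->domain = 0;
	if (sscanf(name, "%x:%x.%x", &a->bus, &a->dev, &a->func) == 3)
		return 0;

	return -1;
}

/*
 * Build the /proc/bus/pci path, with or without the domain prefix.
 * Returns the snprintf result (length of the formatted path).
 */
static int format_proc_path(char *buf, size_t len,
			    const struct pci_addr *a, int with_domain)
{
	if (with_domain)
		return snprintf(buf, len, "/proc/bus/pci/%4.4x:%2.2x/%2.2x.%1.1x",
				a->domain, a->bus, a->dev, a->func);
	return snprintf(buf, len, "/proc/bus/pci/%2.2x/%2.2x.%1.1x",
			a->bus, a->dev, a->func);
}
```

A caller would try the path without the domain first and fall back to the domain-prefixed form if stat() fails on it, mirroring the order used in the patch.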
Re: [openib-general] [PATCH v2] ib_cm: fix module unload race with timewait
> Your patch seems to work fine. I ran the same test several times (after applying the patch) and didn't see any oops.

Thanks - I will commit to svn and resubmit for 2.6.19 inclusion.

- Sean
Re: [openib-general] rdma_cm branch
> And when getting established event, the connection data specified by the passive side through the rdma_conn_param provided to rdma_accept is also not given to the user, is that an issue?

Correct - the connection parameter data disappears into the rdma_cm and is not directly given to the remote side. The data can be obtained by querying the QP after it's been connected, or by calling rdma_init_qp_attr(), but neither of these methods is very clean.

- Sean
Re: [openib-general] 2.6.18 kernel support in the main trunk.
On Wed, 2006-10-04 at 14:47 +0200, Michael S. Tsirkin wrote:
> Quoting r. Matt Leininger [EMAIL PROTECTED]:
>> We just got approval to spend OFA money on a new hosted server. The arrangements are being made but we don't have a date for when we will get access to this new machine or when it will be set up. If I had to guess I'd say we will start setting up the server in the next couple weeks.
>>
>> Thanks,
>> - Matt
>
> Thanks. A couple of more requests as far as you are working on the infrastructure
> - updated svn server enables fast mirroring better web access and other goodies

Are you referring to svn 1.4? Our plan is to upgrade to 1.4.

> - add bugzilla email gateway (as seen e.g. at kernel.org) that supports accepting Cc mail where you put [Bug ] in the subject (where is the bug number) and cc [EMAIL PROTECTED]

I'll add that to the list.

- Matt

> Could these be addressed?
Re: [openib-general] rdma_cm branch
The ideal fix for this is to include rdma_conn_param as part of the rdma_cm_event.

BTW, wouldn't it be cleaner to just pass it up in the request event?

Yes - this is what I meant by including it in the rdma_cm_event structure. However, this breaks every userspace app that's been coded to OFED / SVN.

An alternative is to add another call to retrieve the data, but that's not a very clean alternative for new kernel submission.

Another alternative is to version the create ID call.

Hmm... I need to think about the implementation of this more, but this sounds like a possibility. Can you provide any details on how you're envisioning this working?

- Sean
Re: [openib-general] rdma_cm branch
On Wed, 2006-10-04 at 10:00 -0700, Sean Hefty wrote:
> The ideal fix for this is to include rdma_conn_param as part of the rdma_cm_event.
>
> BTW, wouldn't it be cleaner to just pass it up in the request event?
>
> Yes - this is what I meant by including it in the rdma_cm_event structure. However, this breaks every userspace app that's been coded to OFED / SVN.
>
> An alternative is to add another call to retrieve the data, but that's not a very clean alternative for new kernel submission.
>
> Another alternative is to version the create ID call.
>
> Hmm... I need to think about the implementation of this more, but this sounds like a possibility. Can you provide any details on how you're envisioning this working?
>
> - Sean

Guys, I must be confused. I thought the private data _was_ passed up in the ESTABLISHED event on the active side. We have tools in the perftools directory that utilize this. What am I missing here?

Thanks,

Steve.
Re: [openib-general] rdma_cm branch
> Guys, I must be confused. I thought the private data _was_ passed up in the ESTABLISHED event on the active side. We have tools in the perftools directory that utilize this. What am I missing here?

When a user calls rdma_connect(), they specify connection parameters (like responder_resources and initiator_depth) through a struct rdma_conn_param. These parameters are NOT given to the user when the connect request event is reported.

The issue is: are these values needed by the user during connection establishment? If yes, then how do we export them to the user?

- Sean
Re: [openib-general] rdma_cm branch
On Wed, 2006-10-04 at 10:17 -0700, Sean Hefty wrote:
>> Guys, I must be confused. I thought the private data _was_ passed up in the ESTABLISHED event on the active side. We have tools in the perftools directory that utilize this. What am I missing here?
>
> When a user calls rdma_connect(), they specify connection parameters (like responder_resources and initiator_depth) through a struct rdma_conn_param. These parameters are NOT given to the user when the connect request event is reported.
>
> The issue is: are these values needed by the user during connection establishment? If yes, then how do we export them to the user?

I understand now. For iWARP, the key parameter is setting your local QP's ORD (initiator resources) to <= your peer's IRD (responder resources) to avoid overflowing the peer's incoming rdma read queue. I think the iWARP devices must support setting ORD even after the connection is set up and the QP is in RTS, so the connection _could_ be set up (qp moved to RTS) and then the QP modified to the appropriate settings after querying to get the peer's params.

But I think it seems more natural to deal with this at connection setup time. It would be nice, IMO, for the RDMA CM to handle this under the covers and set up the QP appropriately. Thus the parameters need not be passed to the consumer...

My 2 cents.
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Michael S. Tsirkin wrote:
>> There's several timeout values transferred and used by the cm, most notably the remote cm response timeout and packet life time. Does it make more sense to have a single, generic timeout maximum instead?
>
> Hmm. I'm not sure - we are working around an actual broken implementation here - what do you think?

I wasn't sure either. The MRA timeout is a combination of the packet life time + service timeout, which made me bring this up. The patch only handles the service timeout portion, so we end up in the same situation if a large packet life time is ever used.

>> Would it make more sense to enable the maximum(s) by default, since we're dependent upon values received over the network?
>
> I think it would.

So do I. The CM has checks to bring out of range values into range, but at the maximum, we get a timeout of about 2.5 hours. Multiply that by 15 retries, and the cm can literally spend all day retrying a request. I was considering dropping the default maximum down to around 4-8 seconds, which with retries still gives us about a minute to time out a request. The default maximum would apply to local and remote cm timeouts, packet life time, and service timeout, but could be overridden by the user.

(Basically, with Ishai's patch: rename mra_timeout_limit to timeout_limit, set to a default of 20, and replace occurrences of '31' in the code with timeout_limit.)

- Sean
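The numbers in this message can be checked with a small userspace sketch. It assumes the usual InfiniBand encoding of 4.096 us * 2^n and the 2^(n-8) millisecond approximation quoted earlier in the thread; the function names (iba_time_to_ms, clamp_iba_time, worst_case_ms) are illustrative and not taken from the CM source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate an IB-encoded time value n (meaning 4.096 us * 2^n)
 * in milliseconds as 2^(n - 8), with a floor of 1 ms.
 */
static uint64_t iba_time_to_ms(unsigned n)
{
	assert(n <= 71);	/* keep the shift below 64 bits */
	return 1ULL << (n > 8 ? n - 8 : 0);
}

/* Bring an encoded value received from the network under a local limit. */
static unsigned clamp_iba_time(unsigned n, unsigned limit)
{
	return n > limit ? limit : n;
}

/* Total time spent if every one of 'retries' attempts times out. */
static uint64_t worst_case_ms(unsigned n, unsigned retries)
{
	return iba_time_to_ms(n) * retries;
}
```

Under this approximation an encoded value of 31 is 8388608 ms, a little over 2.3 hours per attempt (about 2.4 hours with the exact 4.096 us unit), while the proposed default of 20 is 4096 ms, so 15 retries take roughly 61 seconds.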
Re: [openib-general] rdma_cm branch
On 10/4/06 12:40 PM, Steve Wise [EMAIL PROTECTED] wrote:
> On Wed, 2006-10-04 at 10:17 -0700, Sean Hefty wrote:
>>> Guys, I must be confused. I thought the private data _was_ passed up in the ESTABLISHED event on the active side. We have tools in the perftools directory that utilize this. What am I missing here?
>>
>> When a user calls rdma_connect(), they specify connection parameters (like responder_resources and initiator_depth) through a struct rdma_conn_param. These parameters are NOT given to the user when the connect request event is reported.
>>
>> The issue is: are these values needed by the user during connection establishment? If yes, then how do we export them to the user?
>
> I understand now. For iWARP, the key parameter is setting your local QP's ORD (initiator resources) to <= your peer's IRD (responder resources) to avoid overflowing the peer's incoming rdma read queue. I think the iWARP devices must support setting ORD even after the connection is set up and the QP is in RTS, so the connection _could_ be set up (qp moved to RTS) and then the QP modified to the appropriate settings after querying to get the peer's params.
>
> But I think it seems more natural to deal with this at connection setup time. It would be nice, IMO, for the RDMA CM to handle this under the covers and set up the QP appropriately. Thus the parameters need not be passed to the consumer...

Actually, I think how the IRD/ORD parameters are exchanged and negotiated by the CM in private data is a separate issue from whether or not the end result of the negotiation is provided to the app in the established event. I think some apps would like to know the end result of the negotiation so that they can throttle RDMA_READ submissions on the SQ and avoid stalling outbound RDMA_WRITE/RDMA_SEND behind the last RDMA_READ.

So I guess that's a long way of saying that I advocate adding the negotiated value/values to the event.

> My 2 cents.
Re: [openib-general] rdma_cm connection parameters (was: rdma_cm branch)
Steve Wise wrote:
> It would be nice, IMO, for the RDMA CM to handle this under the covers and setup the QP appropriately. Thus the parameters need not be passed to the consumer...

The same parameters are also specified when calling rdma_accept(). I think these are the values that are used for the connection. (I need to trace through the code to be sure.) There's no easy way for the passive side to know what was requested without exporting the values.

We could drop to the lower of the two values, and let users that really care what the values are call ib_query_qp() after the connection has been established. This has the disadvantage that you couldn't just reject the connection if the values weren't what you needed.

- Sean
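One reading of "drop to the lower of the two values" can be sketched as below. This is an assumption about how such a negotiation might look, not the rdma_cm implementation: struct conn_limits and negotiate() are hypothetical, and only the field names responder_resources and initiator_depth come from the thread.

```c
#include <stdint.h>

/* Hypothetical subset of the connection parameters under discussion. */
struct conn_limits {
	uint8_t responder_resources;	/* inbound RDMA reads this side can service (IRD) */
	uint8_t initiator_depth;	/* outbound RDMA reads this side will issue (ORD) */
};

static uint8_t min_u8(uint8_t a, uint8_t b)
{
	return a < b ? a : b;
}

/*
 * Passive side: combine the peer's request with the local accept
 * parameters by taking the lower value in each direction.
 */
static struct conn_limits negotiate(struct conn_limits peer_request,
				    struct conn_limits local)
{
	struct conn_limits out;

	/* Never issue more reads than the peer says it can service. */
	out.initiator_depth = min_u8(local.initiator_depth,
				     peer_request.responder_resources);

	/* The peer will not issue more than its own initiator depth. */
	out.responder_resources = min_u8(local.responder_resources,
					 peer_request.initiator_depth);
	return out;
}
```

With this scheme the connection always comes up with values both sides can honor, which is also why the passive side loses the chance to reject a request whose values it considers too low, as noted above.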
[openib-general] [PATCH 1/2] 2.6.19 ib_cm: fix timewait crash after module unload
From: Sean Hefty [EMAIL PROTECTED] If the ib_cm module is unloaded while id's are still in timewait, the CM will destroy the work queue used to process timewait. Once the id's exit timewait, their timers will fire, leading to a crash trying to access the destroyed work queue. We need to track id's that are in timewait, and cancel their deferred work on module unload. Signed-off-by: Sean Hefty [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index f35fcc4..470c482 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -75,6 +75,7 @@ static struct ib_cm { struct rb_root remote_sidr_table; struct idr local_id_table; __be32 random_id_operand; + struct list_head timewait_list; struct workqueue_struct *wq; } cm; @@ -112,6 +113,7 @@ struct cm_work { struct cm_timewait_info { struct cm_work work;/* Must be first. */ + struct list_head list; struct rb_node remote_qp_node; struct rb_node remote_id_node; __be64 remote_ca_guid; @@ -647,13 +649,6 @@ static inline int cm_convert_to_ms(int i static void cm_cleanup_timewait(struct cm_timewait_info *timewait_info) { - unsigned long flags; - - if (!timewait_info-inserted_remote_id - !timewait_info-inserted_remote_qp) - return; - - spin_lock_irqsave(cm.lock, flags); if (timewait_info-inserted_remote_id) { rb_erase(timewait_info-remote_id_node, cm.remote_id_table); timewait_info-inserted_remote_id = 0; @@ -663,7 +658,6 @@ static void cm_cleanup_timewait(struct c rb_erase(timewait_info-remote_qp_node, cm.remote_qp_table); timewait_info-inserted_remote_qp = 0; } - spin_unlock_irqrestore(cm.lock, flags); } static struct cm_timewait_info * cm_create_timewait_info(__be32 local_id) @@ -684,8 +678,12 @@ static struct cm_timewait_info * cm_crea static void cm_enter_timewait(struct cm_id_private *cm_id_priv) { int wait_time; + unsigned long flags; + spin_lock_irqsave(cm.lock, flags); cm_cleanup_timewait(cm_id_priv-timewait_info); + 
list_add_tail(cm_id_priv-timewait_info-list, cm.timewait_list); + spin_unlock_irqrestore(cm.lock, flags); /* * The cm_id could be destroyed by the user before we exit timewait. @@ -701,9 +699,13 @@ static void cm_enter_timewait(struct cm_ static void cm_reset_to_idle(struct cm_id_private *cm_id_priv) { + unsigned long flags; + cm_id_priv-id.state = IB_CM_IDLE; if (cm_id_priv-timewait_info) { + spin_lock_irqsave(cm.lock, flags); cm_cleanup_timewait(cm_id_priv-timewait_info); + spin_unlock_irqrestore(cm.lock, flags); kfree(cm_id_priv-timewait_info); cm_id_priv-timewait_info = NULL; } @@ -1307,6 +1309,7 @@ static struct cm_id_private * cm_match_r if (timewait_info) { cur_cm_id_priv = cm_get_id(timewait_info-work.local_id, timewait_info-work.remote_id); + cm_cleanup_timewait(cm_id_priv-timewait_info); spin_unlock_irqrestore(cm.lock, flags); if (cur_cm_id_priv) { cm_dup_req_handler(work, cur_cm_id_priv); @@ -1315,7 +1318,8 @@ static struct cm_id_private * cm_match_r cm_issue_rej(work-port, work-mad_recv_wc, IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ, NULL, 0); - goto error; + listen_cm_id_priv = NULL; + goto out; } /* Find matching listen request. 
*/ @@ -1323,21 +1327,20 @@ static struct cm_id_private * cm_match_r req_msg-service_id, req_msg-private_data); if (!listen_cm_id_priv) { + cm_cleanup_timewait(cm_id_priv-timewait_info); spin_unlock_irqrestore(cm.lock, flags); cm_issue_rej(work-port, work-mad_recv_wc, IB_CM_REJ_INVALID_SERVICE_ID, CM_MSG_RESPONSE_REQ, NULL, 0); - goto error; + goto out; } atomic_inc(listen_cm_id_priv-refcount); atomic_inc(cm_id_priv-refcount); cm_id_priv-id.state = IB_CM_REQ_RCVD; atomic_inc(cm_id_priv-work_count); spin_unlock_irqrestore(cm.lock, flags); +out: return listen_cm_id_priv; - -error: cm_cleanup_timewait(cm_id_priv-timewait_info); - return NULL; } static int cm_req_handler(struct cm_work *work) @@ -2601,28 +2604,29 @@ static int cm_timewait_handler(struct cm { struct cm_timewait_info *timewait_info; struct cm_id_private *cm_id_priv; - unsigned long
[openib-general] [PATCH 2/2] 2.6.19 ib_cm: send DREP in response to unmatched DREQ
From: Sean Hefty [EMAIL PROTECTED]

Currently a DREP is only sent in response to a DREQ if a connection has been found matching the DREQ, and it is in the proper state. Once a DREP is sent, the local connection moves into timewait. Duplicate DREQs received while in this state result in re-sending the DREP. However, it's likely that the local connection will enter and exit timewait before the remote side times out a lost DREP and resends a DREQ.

To handle this, we send a DREP in response to a DREQ, even if a local connection is not found. This avoids maintaining disconnected id's in timewait states for excessively long times, just to handle a lost DREP.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]
---
This addresses a problem experienced by MPI during scale-up testing.

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 470c482..25b1018 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1902,6 +1902,32 @@ out: spin_unlock_irqrestore(cm_id_priv-
 }
 EXPORT_SYMBOL(ib_send_cm_drep);
 
+static int cm_issue_drep(struct cm_port *port,
+			 struct ib_mad_recv_wc *mad_recv_wc)
+{
+	struct ib_mad_send_buf *msg = NULL;
+	struct cm_dreq_msg *dreq_msg;
+	struct cm_drep_msg *drep_msg;
+	int ret;
+
+	ret = cm_alloc_response_msg(port, mad_recv_wc, &msg);
+	if (ret)
+		return ret;
+
+	dreq_msg = (struct cm_dreq_msg *) mad_recv_wc->recv_buf.mad;
+	drep_msg = (struct cm_drep_msg *) msg->mad;
+
+	cm_format_mad_hdr(&drep_msg->hdr, CM_DREP_ATTR_ID, dreq_msg->hdr.tid);
+	drep_msg->remote_comm_id = dreq_msg->local_comm_id;
+	drep_msg->local_comm_id = dreq_msg->remote_comm_id;
+
+	ret = ib_post_send_mad(msg, NULL);
+	if (ret)
+		cm_free_msg(msg);
+
+	return ret;
+}
+
 static int cm_dreq_handler(struct cm_work *work)
 {
 	struct cm_id_private *cm_id_priv;
@@ -1913,8 +1939,10 @@ static int cm_dreq_handler(struct cm_wor
 	dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
 	cm_id_priv = cm_acquire_id(dreq_msg->remote_comm_id,
 				   dreq_msg->local_comm_id);
-	if (!cm_id_priv)
+	if (!cm_id_priv) {
+		cm_issue_drep(work->port, work->mad_recv_wc);
 		return -EINVAL;
+	}
 
 	work->cm_event.private_data = &dreq_msg->private_data;
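The key step in cm_issue_drep() is that the reply is built purely from the received DREQ, with the two communication IDs swapped, so no local connection state is needed. A minimal user-space sketch of that swap follows. The struct comm_ids type and the drep_ids_from_dreq() function are hypothetical names made up for illustration; byte order and the MAD header are ignored here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two comm ID fields carried by DREQ/DREP. */
struct comm_ids {
	uint32_t local_comm_id;   /* ID chosen by the sender of the message */
	uint32_t remote_comm_id;  /* ID the sender believes its peer uses */
};

/*
 * Derive the IDs for a DREP from a received DREQ. The reply is sent from
 * the other end of the connection, so the two fields trade places.
 */
struct comm_ids drep_ids_from_dreq(struct comm_ids dreq)
{
	struct comm_ids drep;

	drep.remote_comm_id = dreq.local_comm_id;
	drep.local_comm_id = dreq.remote_comm_id;
	return drep;
}
```

Because nothing but the incoming message is consulted, the same reply can be produced whether or not a matching cm_id still exists, which is the behavior the patch description asks for.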
Re: [openib-general] rdma_cm connection parameters (was: rdma_cm branch)
On 10/4/06 1:03 PM, Sean Hefty [EMAIL PROTECTED] wrote:

> Steve Wise wrote:
>> It would be nice, IMO, for the RDMA CM to handle this under the covers and setup the QP appropriately. Thus the parameters need not be passed to the consumer...
>
> The same parameters are also specified when calling rdma_accept(). I think these are the values that are used for the connection. (I need to trace through the code to be sure.) There's no easy way for the passive side to know what was requested without exporting the values.
>
> We could drop to the lower of the two values, and let users that really care what the values are call ib_query_qp() after the connection has been established. This has the disadvantage that you couldn't just reject the connection if the values weren't what you needed.

Can't the passive side receive the active side's ORD/IRD in the rdma_cm_event? Is providing the values in the rdma_cm_event what you mean by 'exporting' the values? The passive side could then call either rdma_accept or rdma_reject based on these values.

My assumption is that the typical behavior, however, would be to limit itself to whatever the active side requested, or what it was capable of, and then return these values in its own call to rdma_accept. The service provided by the CM is to marshal and unmarshal these values from reserved private data into the rdma_cm_event structure.

I would think the limitation is that the active side could effectively overprovision its QP if the passive side couldn't honor its request.

Am I confused?

> - Sean
[openib-general] [PATCH] opensm: setup function for 'null' routing engine
This defines setup function for 'null' routing engine. Currently there is only log message and fallback to default behavior, so the function is needed to prevent opensm crash when '-R null' is used.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 osm/opensm/osm_opensm.c | 12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/osm/opensm/osm_opensm.c b/osm/opensm/osm_opensm.c
index ac9d462..0c5450d 100644
--- a/osm/opensm/osm_opensm.c
+++ b/osm/opensm/osm_opensm.c
@@ -76,8 +76,10 @@ struct routing_engine_module {
 extern int osm_ucast_updn_setup(osm_opensm_t *p_osm);
 extern int osm_ucast_file_setup(osm_opensm_t *p_osm);
 
+static int osm_ucast_null_setup(osm_opensm_t *p_osm);
+
 const static struct routing_engine_module routing_modules[] = {
-	{ "null", NULL },
+	{ "null", osm_ucast_null_setup },
 	{ "updn", osm_ucast_updn_setup },
 	{ "file", osm_ucast_file_setup },
 	{ NULL, NULL }
@@ -102,6 +104,14 @@ static int setup_routing_engine(osm_open
 	return -1;
 }
 
+static int osm_ucast_null_setup(osm_opensm_t *p_osm)
+{
+	osm_log(&p_osm->log, OSM_LOG_VERBOSE,
+		"osm_ucast_null_setup: nothing yet - "
+		"will use default routing engine\n");
+	return 0;
+}
+
 /**
  **/
 void
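The crash this patch avoids comes from a name-to-callback table in which one entry had a NULL setup pointer that the lookup loop then called. A minimal stand-alone sketch of the same table-driven dispatch is shown below, assuming every named entry must carry a callable setup function. All names here (engine_ctx, engine_entry, null_setup, setup_engine) are hypothetical and are not the opensm identifiers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical context; opensm passes its own osm_opensm_t instead. */
struct engine_ctx {
	const char *engine_name;
};

typedef int (*engine_setup_fn)(struct engine_ctx *ctx);

struct engine_entry {
	const char *name;
	engine_setup_fn setup;
};

/* A "do nothing" engine still needs a real function to call. */
int null_setup(struct engine_ctx *ctx)
{
	(void)ctx;
	return 0;
}

static const struct engine_entry engines[] = {
	{ "null", null_setup },
	{ NULL, NULL }	/* terminator, never dispatched */
};

/* Returns 0 on success, -1 if the engine is unknown or its setup fails. */
int setup_engine(struct engine_ctx *ctx, const char *name)
{
	const struct engine_entry *e;

	for (e = engines; e->name && *e->name; e++) {
		if (strcmp(e->name, name) == 0) {
			if (e->setup(ctx))
				return -1;
			ctx->engine_name = e->name;
			return 0;
		}
	}
	return -1;
}
```

With a NULL in the setup column for "null", the call through e->setup would dereference a null function pointer, so giving the entry a trivial function is the smallest fix that keeps the loop unchanged.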
Re: [openib-general] rdma_cm connection parameters (was: rdma_cm branch)
Tom Tucker wrote:
> Can't the passive side receive the active side's ORD/IRD in the rdma_cm_event. Is providing the values in the rdma_cm_event what you mean by 'exporting' the values?

That along with copying the values up to userspace.

- Sean
Re: [openib-general] [PATCH] opensm: setup function for 'null' routing engine
On Wed, 2006-10-04 at 15:18, Sasha Khapyorsky wrote:
> This defines setup function for 'null' routing engine. Currently there is only log message and fallback to default behavior, so the function is needed to prevent opensm crash when '-R null' is used.
>
> Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]

Thanks. Applied.

-- Hal
[openib-general] [PATCH] opensm: verbose message about fallback to default routing engine.
This provides a verbose message for cases when the specified routing engine (with -R) was not found or its setup failed.

Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]
---
 osm/opensm/osm_opensm.c | 15 ---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/osm/opensm/osm_opensm.c b/osm/opensm/osm_opensm.c
index 0c5450d..00cb0f6 100644
--- a/osm/opensm/osm_opensm.c
+++ b/osm/opensm/osm_opensm.c
@@ -92,8 +92,12 @@ static int setup_routing_engine(osm_open
 	for (r = routing_modules; r->name && *r->name; r++) {
 		if(!strcmp(r->name, name)) {
 			p_osm->routing_engine.name = r->name;
-			if (r->setup(p_osm))
-				break;
+			if (r->setup(p_osm)) {
+				osm_log(&p_osm->log, OSM_LOG_VERBOSE,
+					"setup of routing engine \'%s\' "
+					"failed\n", name);
+				return -2;
+			}
 			osm_log (&p_osm->log, OSM_LOG_DEBUG,
 				 "setup_routing_engine: \'%s\' routing engine set up\n",
@@ -299,8 +303,13 @@ #endif
 		goto Exit;
 
 	if( p_opt->routing_engine_name &&
-	    setup_routing_engine(p_osm, p_opt->routing_engine_name))
+	    setup_routing_engine(p_osm, p_opt->routing_engine_name)) {
+		osm_log( &p_osm->log, OSM_LOG_VERBOSE,
+			 "osm_opensm_init: cannot find or setup routing engine "
+			 "\'%s\'. Default will be used instead.\n",
+			 p_opt->routing_engine_name);
 		goto Exit;
+	}
 
 Exit:
 	osm_log( &p_osm->log, OSM_LOG_FUNCS, "osm_opensm_init: ]\n" ); /* Format Waived */
Re: [openib-general] [PATCH] opensm: verbose message about fallback to default routing engine.
On Wed, 2006-10-04 at 15:34, Sasha Khapyorsky wrote:
> This provides a verbose message for cases when the specified routing engine (with -R) was not found or its setup failed.
>
> Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED]

Thanks. Applied.

-- Hal
Re: [openib-general] 2.6.18 kernel support in the main trunk.
Quoting r. Matt Leininger [EMAIL PROTECTED]:
>> A couple of more requests as far as you are working on the infrastructure
>> - updated svn server enables fast mirroring better web access and other goodies
>
> Are you referring to svn 1.4? our plan is to upgrade to 1.4.

Yes, thanks. I was generally saying svn should be kept up to date in some way.

>> - add bugzilla email gateway (as seen e.g. at kernel.org) that supports accepting Cc mail where you put [Bug ] in the subject (where is the bug number) and cc [EMAIL PROTECTED]
>
> I'll add that to the list.

Thanks.

--
MST
Re: [openib-general] rdma_cm branch
Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: rdma_cm branch

> Michael S. Tsirkin wrote:
>> Quoting r. Sean Hefty [EMAIL PROTECTED]:
>>> 1. We need to add rdma_establish() and expose the rdma_conn_param values as part of the connection event. I'm working on a patch for the latter.
>>
>> I have both patches as part of OFED. Should I post them for review?
>
> I have a patch for rdma_establish(), but please post both.

Here's the rdma_establish patch from OFED. Seems to even still apply to 2.6.19. I expect just replacing the id->device->node_type test you'll get what you want for upstream.

I know we don't have an in-tree user yet, but it *is* necessary for passive-side completeness, so maybe a case can be still made to have it in 2.6.19?

===

Make it possible for ULPs on the passive side to handle RTU loss by calling rdma_establish upon completion or qp event.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]
Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]

---

Index: a/include/rdma/rdma_cm.h
===
--- a/include/rdma/rdma_cm.h	(revision 8822)
+++ a/include/rdma/rdma_cm.h	(working copy)
@@ -256,6 +256,16 @@ int rdma_listen(struct rdma_cm_id *id, i
 int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param);
 
 /**
+ * rdma_establish - Forces a connection state to established.
+ * @id: Connection identifier to transition to established.
+ *
+ * This routine should be invoked by users who receive messages on a
+ * QP before being notified that the connection has been established by the
+ * RDMA CM.
+ */
+int rdma_establish(struct rdma_cm_id *id);
+
+/**
  * rdma_reject - Called to reject a connection request or response.
  */
 int rdma_reject(struct rdma_cm_id *id, const void *private_data,
Index: a/drivers/infiniband/core/cm.c
===
--- a/drivers/infiniband/core/cm.c	(revision 8823)
+++ a/drivers/infiniband/core/cm.c	(working copy)
@@ -3207,6 +3207,10 @@ static int cm_init_qp_rts_attr(struct cm
 	spin_lock_irqsave(&cm_id_priv->lock, flags);
 	switch (cm_id_priv->id.state) {
+	/* Allow transition to RTS before sending REP */
+	case IB_CM_REQ_RCVD:
+	case IB_CM_MRA_REQ_SENT:
 	case IB_CM_REP_RCVD:
 	case IB_CM_MRA_REP_SENT:
 	case IB_CM_REP_SENT:
Index: a/drivers/infiniband/core/cma.c
===
--- a/drivers/infiniband/core/cma.c	(revision 8822)
+++ a/drivers/infiniband/core/cma.c	(working copy)
@@ -840,22 +840,6 @@ static int cma_verify_rep(struct rdma_id
 	return 0;
 }
 
-static int cma_rtu_recv(struct rdma_id_private *id_priv)
-{
-	int ret;
-
-	ret = cma_modify_qp_rts(&id_priv->id);
-	if (ret)
-		goto reject;
-
-	return 0;
-reject:
-	cma_modify_qp_err(&id_priv->id);
-	ib_send_cm_rej(id_priv->cm_id.ib, IB_CM_REJ_CONSUMER_DEFINED,
-		       NULL, 0, NULL, 0);
-	return ret;
-}
-
 static int cma_ib_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
 {
 	struct rdma_id_private *id_priv = cm_id->context;
@@ -886,9 +870,8 @@ static int cma_ib_handler(struct ib_cm_i
 		private_data_len = IB_CM_REP_PRIVATE_DATA_SIZE;
 		break;
 	case IB_CM_RTU_RECEIVED:
-		status = cma_rtu_recv(id_priv);
-		event = status ? RDMA_CM_EVENT_CONNECT_ERROR :
-				 RDMA_CM_EVENT_ESTABLISHED;
+	case IB_CM_USER_ESTABLISHED:
+		event = RDMA_CM_EVENT_ESTABLISHED;
 		break;
 	case IB_CM_DREQ_ERROR:
 		status = -ETIMEDOUT; /* fall through */
@@ -1981,11 +1964,25 @@ static int cma_accept_ib(struct rdma_id_
 			 struct rdma_conn_param *conn_param)
 {
 	struct ib_cm_rep_param rep;
-	int ret;
+	struct ib_qp_attr qp_attr;
+	int qp_attr_mask, ret;
 
-	ret = cma_modify_qp_rtr(&id_priv->id);
-	if (ret)
-		return ret;
+	if (id_priv->id.qp) {
+		ret = cma_modify_qp_rtr(&id_priv->id);
+		if (ret)
+			goto out;
+
+		qp_attr.qp_state = IB_QPS_RTS;
+		ret = ib_cm_init_qp_attr(id_priv->cm_id.ib, &qp_attr,
+					 &qp_attr_mask);
+		if (ret)
+			goto out;
+
+		qp_attr.max_rd_atomic = conn_param->initiator_depth;
+		ret = ib_modify_qp(id_priv->id.qp, &qp_attr, qp_attr_mask);
+		if (ret)
+			goto out;
+	}
 
 	memset(&rep, 0, sizeof rep);
 	rep.qp_num = id_priv->qp_num;
@@ -2000,7 +1997,9 @@ static int cma_accept_ib(struct rdma_id_
 	rep.rnr_retry_count = conn_param->rnr_retry_count;
 	rep.srq = id_priv->srq ? 1 : 0;
 
-	return
Re: [openib-general] rdma_cm branch
Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: rdma_cm branch

>>> The ideal fix for this is to include rdma_conn_param as part of the rdma_cm_event.
>>
>> BTW, wouldn't it be cleaner to just pass it up in the request event?
>
> Yes - this is what I meant by including it in the rdma_cm_event structure. However, this breaks every userspace app that's been coded to OFED / SVN. An alternative is to add another call to retrieve the data, but that's not a very clean alternative for new kernel submission.
>
>> Another alternative is to version the create ID call.
>
> Hmm... I need to think about the implementation of this more, but this sounds like a possibility. Can you provide any details on how you're envisioning this working?

Well, I have not thought this through yet, but suppose you extend struct rdma_ucm_create_id, and check the length parameter in ucma_create_id to figure out which format was used. If length is small, you know you have userspace with old ABI. An extra field could be a userspace ABI version number, which you then carry around and use to figure out how to decode the rest of the stuff that comes from userspace.

--
MST
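The length-based versioning idea described above could look roughly like the following. This is only a sketch of the technique, not the ucma code: the struct layouts, the field names other than the idea of an ABI version number, and the decode_create_id() helper are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "old ABI" command layout. */
struct create_id_v1 {
	uint64_t uid;
	uint64_t response;
};

/* Hypothetical extended layout: same prefix plus a version field. */
struct create_id_v2 {
	uint64_t uid;
	uint64_t response;
	uint32_t abi_version;
	uint32_t reserved;
};

/*
 * Decode a command buffer of 'len' bytes into the extended layout.
 * A short buffer is treated as the old ABI (version 1); a full-size
 * buffer carries its own version number. Returns the ABI version,
 * or -1 if the buffer is too short even for the old layout.
 */
int decode_create_id(const void *buf, size_t len, struct create_id_v2 *out)
{
	memset(out, 0, sizeof(*out));

	if (len < sizeof(struct create_id_v1))
		return -1;

	if (len < sizeof(struct create_id_v2)) {
		/* Old userspace: copy the shared prefix only. */
		memcpy(out, buf, sizeof(struct create_id_v1));
		out->abi_version = 1;
		return 1;
	}

	memcpy(out, buf, sizeof(*out));
	return (int)out->abi_version;
}
```

The version returned here would then be stored with the per-file context and consulted when decoding later commands, which is the "carry around" part of the proposal.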
Re: [openib-general] Infiniband Fedora Core5
Aviram No, Fedora Core 5 is not part of the OFED OS matrix. However, it is the case that FC5 already includes very up-to-date kernel IB drivers. And at least libibverbs and libmthca are available through Fedora Extras -- so on a default FC5 install you should be able to do

    yum install libibverbs libmthca

and get at least that much IB support.

 - R.
Re: [openib-general] rdma_cm branch
> Here's the rdma_establish patch from OFED. Seems to even still apply to 2.6.19. I expect just replacing the id->device->node_type test you'll get what you want for upstream.

Thanks - this matches what I have queued in my local git tree against 2.6.19.

- Sean
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] IB_CM: Limit the MRA timeout

> Michael S. Tsirkin wrote:
>>> There's several timeout values transferred and used by the cm, most notably the remote cm response timeout and packet life time. Does it make more sense to have a single, generic timeout maximum instead?
>>
>> Hmm. I'm not sure - we are working around an actual broken implementation here - what do you think?
>
> I wasn't sure either. The MRA timeout is a combination of the packet life time + service timeout, which made me bring this up. The patch only handles the service timeout portion, so we end up in the same situation if a large packet life time is ever used.

But that comes from the SA, does it not?

>>> Would it make more sense to enable the maximum(s) by default, since we're dependent upon values received over the network?
>>
>> I think it would.
>
> So do I. The CM has checks to bring out of range values into range, but at the maximum, we get a timeout of about 2.5 hours. Multiply that by 15 retries, and the cm can literally spend all day retrying a request.
>
> I was considering dropping the default maximum down to around 4-8 seconds, which with retries still gives us about a minute to timeout a request. The default maximum would apply to local and remote cm timeouts, packet life time, and service timeout, but could be overridden by the user. (Basically, with Ishai's patch: rename mra_timeout_limit to timeout_limit, set to a default of 20, and replace occurrences of '31' in the code with timeout_limit.)

For remote cm timeout and service timeout this makes sense - they seem currently mostly taken out of the blue on implementations I've seen. But since the packet lifetime comes from the SM, it actually has a chance to reflect some knowledge about the network topology. And since we haven't seen any practical issues with packet life time yet - maybe a different parameter for that, with a higher limit?

--
MST
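The numbers in this thread follow from the CM timeout encoding, in which a value n stands for 4.096 us * 2^n. The sketch below assumes the 2^(n-8) millisecond approximation that Ishai's report uses for the value 30, and shows the effect of clamping to a limit such as the proposed default of 20. The helper names iba_time_to_ms() and clamp_iba_time() are hypothetical, not functions from ib_cm.

```c
#include <stdint.h>

/*
 * Approximate 4.096 us * 2^iba_time in milliseconds as 2^(iba_time - 8).
 * Values of 8 or below round to 1 ms. Intended for encoded values 0..31.
 */
uint64_t iba_time_to_ms(unsigned int iba_time)
{
	unsigned int shift = iba_time > 8 ? iba_time - 8 : 0;

	return (uint64_t)1 << shift;
}

/* Cap an encoded timeout received from the network at a local limit. */
unsigned int clamp_iba_time(unsigned int iba_time, unsigned int timeout_limit)
{
	return iba_time > timeout_limit ? timeout_limit : iba_time;
}
```

Under this approximation an encoded value of 30 gives 4194304 ms (about 70 minutes), the maximum of 31 gives 8388608 ms (about 2.3 hours, or about 2.4 hours using the exact 4.096 us unit), and a limit of 20 gives 4096 ms, so 15 retries add up to roughly a minute.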
Re: [openib-general] FW: FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD
Quoting r. Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: FW: [openib-general] FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD

> Quoting r. Moshe Kazir [EMAIL PROTECTED]:
> Subject: FW: [openib-general] FW: Mstflint - not working on ppc64 andwhendriver is not loaded on AMD
>
>> Michael,
>>
>> I received the attached files from Frank. They look small, easy to understand, and change almost nothing in the code. The patch solves the ppc64 problems. Please approve the patch and integrate it into OFED-1.1-rc7.
>>
>> I tested it. It's working o.k. on JS21 ppc64 sles 10, JS21 ppc64 sles9, redhat as4 u3 x86_64, redhat as4 u3 i386. Frank also tested it on AMD and JS21 PPC and MAC PPC64.
>>
>> Best regards,
>> Moshe
>
> OK, not sure what's in a tarball, but the patch looks small and safe enough to go in. But, we need the Signed-off-by line from the patch author, certifying to the Developer's Certificate of Origin 1.1:

Please note RC7 is closing tomorrow, so we need to get the signature stuff out of the way by then if the patch is to make it in OFED 1.1.

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
    Ishai> There is a bug in SRP Engenio target that send a large
    Ishai> value as service timeout. (It gets 30 which mean timeout of
    Ishai> (2^(30-8))=4195 sec.) Such a long timeout is not
    Ishai> reasonable and it may leave the kernel module waiting on
    Ishai> wait_for_completion and may stuck a lot of processes.

OK, that's a problem, I guess...

    Ishai> The following patch allows the load of ib_cm module with a
    Ishai> limit on the timeout.

...but adding yet another knob that has to be set correctly can't be the right way to fix this. Should we just chop off too-big timeout values unconditionally? Or make Engenio fix their broken target and tell everybody to upgrade their firmware?

 - R.
Re: [openib-general] [PATCH] IB/SRP: Remove redundant memset of the target
Thanks, queued for 2.6.19
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Michael S. Tsirkin wrote:
> For remote cm timeout and service timeout this makes sense - they seem currently mostly taken out of the blue on implementations I've seen. But since the packet lifetime comes from the SM, it actually has a chance to reflect some knowledge about the network topology. And since we haven't seen any practical issues with packet life time yet - maybe a different parameter for that, with a higher limit?

I guess the question is how much do we trust the timeout values sent in a CM MAD. (It's hard for me to imagine a network that requires a 2.5 hour packet life time. IB to space?)

Having separate timeout values may make sense, but my expectation is for the remote cm timeout and service timeout values to be greater than the packet life time. If we go with separate values, then is there a reason not to have separate defaults for each one? My preference is to try to limit the number of values.

- Sean
[openib-general] [Bug 266] New: IPoIB multicast does not work with RHEL4 U4
http://openib.org/bugzilla/show_bug.cgi?id=266

           Summary: IPoIB multicast does not work with RHEL4 U4
           Product: OpenFabrics Linux
           Version: 1.1rc6
          Platform: All
        OS/Version: RHEL 4
            Status: NEW
          Severity: major
          Priority: P2
         Component: IPoIB
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

I'm opening a bug on this so customers can find this info more easily.

Email thread on issue:

On Tue, 2006-09-19 at 14:44 +0300, Eli cohen wrote:
> Hi,
> while testing ipoib multicast on RHEL4.0 u4, I noticed that setsockopt() succeeds to add a multicast group to an interface but actually the multicast group is not added to the net_device. This means that an application cannot join a multicast group as a full member. When I examined the differences between the kernel sources for u3 and u4 I noticed that essential code was removed:
>
> diff -ru net/ipv4/arp.c ../linux-2.6.9-42.ELsmp/net/ipv4/arp.c
> --- net/ipv4/arp.c	2006-09-18 15:35:03.0 +0300
> +++ ../linux-2.6.9-42.ELsmp/net/ipv4/arp.c	2006-09-19 10:08:06.0 +0300
> @@ -213,9 +213,6 @@
>  	case ARPHRD_IEEE802_TR:
>  		ip_tr_mc_map(addr, haddr);
>  		return 0;
> -	case ARPHRD_INFINIBAND:
> -		ip_ib_mc_map(addr, haddr);
> -		return 0;
>  	default:
>  		if (dir) {
>  			memcpy(haddr, dev->broadcast, dev->addr_len);
>
> Can anyone suggest a workaround to this issue?

Short of spinning a kernel, it's going to be hard to work around. Thanks for finding this, I'll track down how this got left out of the U4 kernel when it was in the U3 kernel :-/

--
Doug Ledford [EMAIL PROTECTED]
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
[openib-general] ipoib question when running on the same node as opensm
We just brought another cluster up and had an issue with our management node (node running opensm) not coming up on ipoib. Here is what happened and how I got it working and I had some questions.

1) We had both opensm running and a switch based Voltaire SM running. This caused problems.

2) We stopped the Voltaire SM and restarted all the nodes. This got all of the nodes except the one with opensm running to work.

3) I had to unload all the modules, load only those needed by opensm, start opensm, and then bring up the ipoib interface. At this point the node seemed to be in the multicast group and ipoib worked fine.

Does this seem like proper behavior? I would think that on boot if ipoib does not find a SM running it will delay setting up a connection until the SM comes on-line? (ie when the opensm init script gets run.)

It seems like the card saves some information (from the Voltaire SM) across a soft reboot? I know that it was not coming up in the multicast group with the opensm. Is this by design?

At this point ipoib seems to work fine after a reboot even though the interface is brought up before opensm. Do I need to ensure that opensm is up before all ipoib requests in the future?

Thanks,
Ira Weiny
[EMAIL PROTECTED]
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] IB_CM: Limit the MRA timeout

> Michael S. Tsirkin wrote:
>> For remote cm timeout and service timeout this makes sense - they seem currently mostly taken out of the blue on implementations I've seen. But since the packet lifetime comes from the SM, it actually has a chance to reflect some knowledge about the network topology. And since we haven't seen any practical issues with packet life time yet - maybe a different parameter for that, with a higher limit?
>
> I guess the question is how much do we trust the timeout values sent in a CM MAD. (It's hard for me to imagine a network that requires a 2.5 hour packet life time. IB to space?)
>
> Having separate timeout values may make sense, but my expectation is for the remote cm timeout and service timeout values to be greater than the packet life time. If we go with separate values, then is there a reason not to have separate defaults for each one? My preference is to try to limit the number of values.

The way I see it, we trust e.g. the SRP target anyway. So I'm not sure there's much value in range-checking everything. The only reason we are touching this is because we see a target reporting an obviously broken service timeout value in MRA - in the hours range.

So, maybe start small and just use Ishai's patch (with default set to several seconds or so), and wait with the fix till a problem surfaces?

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
From: Michael S. Tsirkin
Sent: Wednesday, October 04, 2006 4:37 PM
To: Sean Hefty
Cc: Ishai Rabinovitz; openib-general@openib.org
Subject: Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout

Quoting r. Sean Hefty [EMAIL PROTECTED]:
Subject: Re: [PATCH] IB_CM: Limit the MRA timeout

> Michael S. Tsirkin wrote:
>>> There's several timeout values transferred and used by the cm, most notably the remote cm response timeout and packet life time. Does it make more sense to have a single, generic timeout maximum instead?
>>
>> Hmm. I'm not sure - we are working around an actual broken implementation here - what do you think?
>
> I wasn't sure either. The MRA timeout is a combination of the packet life time + service timeout, which made me bring this up. The patch only handles the service timeout portion, so we end up in the same situation if a large packet life time is ever used.

But that comes from the SA, does it not?

>>> Would it make more sense to enable the maximum(s) by default, since we're dependent upon values received over the network?
>>
>> I think it would.
>
> So do I. The CM has checks to bring out of range values into range, but at the maximum, we get a timeout of about 2.5 hours. Multiply that by 15 retries, and the cm can literally spend all day retrying a request.
>
> I was considering dropping the default maximum down to around 4-8 seconds, which with retries still gives us about a minute to timeout a request. The default maximum would apply to local and remote cm timeouts, packet life time, and service timeout, but could be overridden by the user. (Basically, with Ishai's patch: rename mra_timeout_limit to timeout_limit, set to a default of 20, and replace occurrences of '31' in the code with timeout_limit.)

For remote cm timeout and service timeout this makes sense - they seem currently mostly taken out of the blue on implementations I've seen. But since the packet lifetime comes from the SM, it actually has a chance to reflect some knowledge about the network topology. And since we haven't seen any practical issues with packet life time yet - maybe a different parameter for that, with a higher limit?

--

I recommend sticking with the IB spec for the various timeouts. In our products we carefully implemented the timeouts and computations as defined by the spec. The SM controls the pkt lifetime and should base it on a knowledge of the fabric topology and configuration. Many of the CA specific base timers are specific to the HCA/TCA itself (hence we provided this information as part of queries to the CA verbs driver). We permitted configuration in the individual verbs drivers to override the reasonable estimates which we provided as defaults for each HCA model we support.

It's a little tricky to work out the details defined in the spec (a summary section on timers would have made it easier), however I did that effort a few years ago and here is a summary of all the HCA/TCA related IB timers below. Notice many of these must be uncomputed from information in the CM REQ and REP to get the base level values (such as pkt lifetime which is not directly specified in CM REQ):

3.1 Base Timers

CA Ack Delay - time from receipt of IB transport packet to sending of ACK. Hardware and VlArb dependent.

CA inbound processing time - time from receipt of IB transport packet to delivery and processing in CA's transport state machine. Hardware dependent.

CA outbound processing time - time from entry of packet to QP until transmit packet on wire. Hardware and VlArb dependent.

Class turnaround time(class) - processing time from delivery of request on QP to posting of response on QP

3.2 Derived Timers

Ack Timeout - timeout for QP ACK/NAK before QP resends up to RetryCount
    = 2*(PktLifeTime) + Remote CA Ack Delay + local CA inbound processing Time

RNR NAK Delay - Appl protocol must be prepared to replenish Recv Q of QP within RNR NAK Delay + 2*(PktLifeTime), can set this to low bound and RNRNakDelay*RNRRetryLimit must be upper bound

PortInfo:SubnetTimeout
    = max(PktLifeTime for all pathsRecords within subnet)

PortInfo:RespTimeout - SMA max time between receipt to response within Node, includes CA delays in receive and Send.
    = ClassTurnaroundTime(SMA) + CA inbound (QP0) + CA outbound (QP0)

ClassPortInfo:RespTimeout - GSA class max time between receipt to response within Node, includes CA delays in receive and Send.
    = ClassTurnaroundTime(class) + CA inbound (QP1) + CA outbound (QP1)

PathRecord:PacketLifeTime - reasonable estimate of worst case time through path for packet to traverse fabric in 1 direction. 0 if loopback path from port to itself (CA inbound/outbound and/or ACK delay values should cover)

LocalAckTimeout - QP/CM - 2*PathRecord:PktLifeTime + local CA Ack Delay

QP:AckTimeout - use 2*PathRecord:PktLifeTime + remote CA Ack Delay

Remote CM Resp Timeout - CM - CM server REQ response time (should be based on
Re: [openib-general] ipoib question when running on the same node as opensm
Quoting r. Ira Weiny [EMAIL PROTECTED]:
> Do I need to ensure that opensm is up before all ipoib requests in the future?

Shouldn't be required, things work well for me, anyway.

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Rimmer, Todd [EMAIL PROTECTED]:
> I recommend sticking with the IB spec for the various timeouts.

So what do you suggest, wait a day or so to timeout the MRA?

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Michael S. Tsirkin wrote:
> The way I see it, we trust e.g. the SRP target anyway. So I'm not sure there's much value in range-checking everything. The only reason we are touching this is because we see a target reporting an obviously broken service timeout value in MRA - in the hours range.

The CM is also exposed to userspace, so I think this issue is highlighting a larger potential problem. I'm a little hesitant to set a precedent that we reduce large timeout values by adding separate timer constraints exposed through module parameters.

- Sean
Re: [openib-general] [PATCH fixed] IB/srp: enable multiple connections to the same target
Thanks, queued for 2.6.19
Re: [openib-general] [PATCH repost] IB/mthca: query port fix
thanks, queued for 2.6.19

BTW when forwarding patches (I assume this one was from Jack) please include an extra From: line so I get the right author when I import it back into git...
Re: [openib-general] [PATCH 2/2] 2.6.19 ib_cm: send DREP in response to unmatched DREQ
Thanks, queued 1 and 2 for 2.6.19
Re: [openib-general] [PATCH repost] IB/mthca: query port fix
Quoting r. Roland Dreier [EMAIL PROTECTED]:
> Subject: Re: [PATCH repost] IB/mthca: query port fix
>
> thanks, queued for 2.6.19
>
> BTW when forwarding patches (I assume this one was from Jack) please include an extra From: line so I get the right author when I import it back into git...

Right, missed that, sorry. This one was from Jack, pls fix it up accordingly.

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] IB_CM: Limit the MRA timeout
>
> Michael S. Tsirkin wrote:
>> The way I see it, we trust e.g. the SRP target anyway. So I'm not sure there's much value in range-checking everything. The only reason we are touching this is because we see a target reporting an obviously broken service timeout value in MRA - in the hours range.
>
> The CM is also exposed to userspace, so I think this issue is highlighting a larger potential problem. I'm a little hesitant to set a precedent that we reduce large timeout values by adding separate timer constraints exposed through module parameters.

So, let's just have a #define for now? And maybe print a warning so we can figure out what's wrong ...

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Michael S. Tsirkin wrote:
> So, let's just have a #define for now? And maybe print a warning so we can figure out what's wrong ...

That sounds simple enough for now. (Maybe set it to 21, i.e. about 8 seconds, or about 2 minutes with retries?) Having the maximum apply at least to remote CM timeout + service timeout would be good. (It appears that Intel MPI just ran into this issue after setting the remote CM timeout to 31.)

- Sean
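The "#define plus warning" idea discussed above could look roughly like the following user-space sketch. It is not the patch that was later posted: the macro and function names are hypothetical, fprintf stands in for a kernel warning, and the decoding assumes the usual IB encoding of 4.096 us * 2^n for a 5-bit timeout field.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cap: exponent 21 decodes to roughly 8.6 seconds. */
#define CM_MAX_TIMEOUT_EXP 21

/* Decode an IB timeout exponent (4.096 us * 2^exp) into whole milliseconds. */
static uint64_t cm_exp_to_ms(uint8_t exp)
{
	return (4096ULL << (exp & 0x1f)) / 1000000ULL;
}

/* Clamp a timeout exponent received from the peer and warn when it is cut. */
static uint8_t cm_clamp_timeout(uint8_t exp, const char *what)
{
	if (exp > CM_MAX_TIMEOUT_EXP) {
		fprintf(stderr, "cm: %s timeout %u exceeds maximum %u, clamping\n",
			what, (unsigned)exp, (unsigned)CM_MAX_TIMEOUT_EXP);
		return CM_MAX_TIMEOUT_EXP;
	}
	return exp;
}
```

With these assumptions, a service timeout of 30 decodes to about 4398 seconds, while the clamped value of 21 decodes to about 8.6 seconds. If the CM retried 15 times (an assumed retry count, not stated in the thread), the total would be a little over 2 minutes, which matches the estimate above.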
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] IB_CM: Limit the MRA timeout
>
> Michael S. Tsirkin wrote:
>> So, let's just have a #define for now? And maybe print a warning so we can figure out what's wrong ...
>
> That sounds simple enough for now. (Maybe set it to 21, i.e. about 8 seconds, or about 2 minutes with retries?) Having the maximum apply at least to remote CM timeout + service timeout would be good. (It appears that Intel MPI just ran into this issue after setting the remote CM timeout to 31.)

OK. Patch?

--
MST
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Michael S. Tsirkin wrote:
>> That sounds simple enough for now. (Maybe set it to 21, i.e. about 8 seconds, or about 2 minutes with retries?) Having the maximum apply at least to remote CM timeout + service timeout would be good. (It appears that Intel MPI just ran into this issue after setting the remote CM timeout to 31.)
>
> OK. Patch?

From me or you? I can probably throw something together tomorrow.
Re: [openib-general] [PATCH] IB_CM: Limit the MRA timeout
Quoting r. Sean Hefty [EMAIL PROTECTED]:
> Subject: Re: [PATCH] IB_CM: Limit the MRA timeout
>
> Michael S. Tsirkin wrote:
>>> That sounds simple enough for now. (Maybe set it to 21, i.e. about 8 seconds, or about 2 minutes with retries?) Having the maximum apply at least to remote CM timeout + service timeout would be good. (It appears that Intel MPI just ran into this issue after setting the remote CM timeout to 31.)
>>
>> OK. Patch?
>
> From me or you? I can probably throw something together tomorrow.

Pls go ahead then.

--
MST
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
I see it for all MVAPICH tests, it's 100% consistent.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Pavel Shamis (Pasha) [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 03, 2006 3:37 AM
To: Scott Weitzenkamp (sweitzen)
Cc: Aviram Gutman; OpenFabricsEWG; openib
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Hi Scott,
Unfortunately I was not able to reproduce the failure on our platforms. Do you see the problem with all tests or only with specific ones? Is the problem consistent?

Regards,
Pasha

Scott Weitzenkamp (sweitzen) wrote:

$ uname -a
Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 hostname
svbu-qa1850-4
svbu-qa1850-3
$ /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh -np 2 192.168.2.46 192.168.2.49 /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_benchmarks-2.2/osu_latency

The last command just hangs. Can I try your binary RPMs?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

-----Original Message-----
From: Aviram Gutman [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 01, 2006 2:29 AM
To: Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED]
Subject: Re: [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64

Can you please elaborate on the MVAPICH issues? Can you send the command line? We ran it here on 32 quad-core Opteron nodes, and also ran rigorous tests on many other nodes.

Scott Weitzenkamp (sweitzen) wrote:

We are just getting started with OFED testing on SLES10, first platform is x86_64. IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are working so far. MVAPICH with OSU benchmarks just hangs. This same hardware works fine with OFED and RHEL4 U3. Has anyone else seen this?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg

--
Pavel Shamis (Pasha)
Software Engineer
Mellanox Technologies LTD.
[EMAIL PROTECTED]
Re: [openib-general] IB/ipath - initialize diagpkt file on device init only
I tried loading ib_ipath on one of my systems without an ipath device, and I got the message

    ib_ipath: Could not create class_dev for minor 127, ipath_diagpkt (err 19)

but I couldn't reproduce the hang of modprobe.

Anyway, taking a quick look at what caused that message showed that the problem is that ipath_class is NULL there. And that makes sense, because ipath_class isn't created until ipath_user_add() is called in ipath_init_one() (which is never called if there are no devices).

So I think a correct fix is to move any global initialization like creating ipath_class into the module_init function before probing any devices.

I don't approve of this latest patch because you do this:

-int __init ipath_diagpkt_add(void)
-{
-	return ipath_cdev_init(IPATH_DIAGPKT_MINOR,
-			       ipath_diagpkt, diagpkt_file_ops,
-			       diagpkt_cdev, diagpkt_class_dev);
-}
+void ipath_diagpkt_add(void)
+{
+	if (atomic_inc_return(diagpkt_count) == 1)
+		ipath_cdev_init(IPATH_DIAGPKT_MINOR,
+				ipath_diagpkt, diagpkt_file_ops,
+				diagpkt_cdev, diagpkt_class_dev);
+}

which means that you've stopped checking the return value of ipath_cdev_init(). What will happen if it fails? I would guess everything blows up, right?

 - R.
Re: [openib-general] IB/ipath - initialize diagpkt file on device init only
Roland Dreier wrote:
> I tried loading ib_ipath on one of my systems without an ipath device, and I got the message
>
>     ib_ipath: Could not create class_dev for minor 127, ipath_diagpkt (err 19)
>
> but I couldn't reproduce the hang of modprobe.

This was without the patch, though, right? Cause if you've applied the patch and you're still getting this message, I'm confused.

> Anyway, taking a quick look at what caused that message showed that the problem is that ipath_class is NULL there. And that makes sense, because ipath_class isn't created until ipath_user_add() is called in ipath_init_one() (which is never called if there are no devices).

Ah. Well spotted.

> So I think a correct fix is to move any global initialization like creating ipath_class into the module_init function before probing any devices.
>
> I don't approve of this latest patch because you do this:
>
> -int __init ipath_diagpkt_add(void)
> -{
> -	return ipath_cdev_init(IPATH_DIAGPKT_MINOR,
> -			       ipath_diagpkt, diagpkt_file_ops,
> -			       diagpkt_cdev, diagpkt_class_dev);
> -}
> +void ipath_diagpkt_add(void)
> +{
> +	if (atomic_inc_return(diagpkt_count) == 1)
> +		ipath_cdev_init(IPATH_DIAGPKT_MINOR,
> +				ipath_diagpkt, diagpkt_file_ops,
> +				diagpkt_cdev, diagpkt_class_dev);
> +}
>
> which means that you've stopped checking the return value of ipath_cdev_init(). What will happen if it fails? I would guess everything blows up, right?

Sigh. Yeah. Code go boom. I'll roll it again. We had been ignoring the failure anyway before this patch. I'll just make sure we do a dev_warn and atomic_dec if ipath_cdev_init fails.

Michael: do we have much longer left for RC7? As I said before, it's no skin off my nose if you want to roll RC7 with my hacky workaround patch in place, or even without that. But if you're still working on other stuff, I can probably get a proper fix in tonight.

Regards,
Robert.
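For illustration, the "warn and undo the increment on failure" approach mentioned above can be sketched in user-space C11 as follows. This is not the ipath code: fake_cdev_init() is a hypothetical stand-in for ipath_cdev_init(), fprintf stands in for dev_warn, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Number of users of the shared diagpkt device (sketch only). */
static atomic_int diagpkt_count;

/* Hypothetical stand-in for ipath_cdev_init(): returns 0 or a negative errno. */
static int fake_cdev_init_result;

static int fake_cdev_init(void)
{
	return fake_cdev_init_result;
}

/*
 * Initialize the shared device on first use only. If initialization fails,
 * warn, roll the counter back, and hand the error to the caller.
 */
static int diagpkt_add(void)
{
	int ret = 0;

	/* atomic_fetch_add returns the old value, so old + 1 == 1 means first user. */
	if (atomic_fetch_add(&diagpkt_count, 1) + 1 == 1) {
		ret = fake_cdev_init();
		if (ret) {
			fprintf(stderr, "diagpkt: device init failed (err %d)\n", ret);
			atomic_fetch_sub(&diagpkt_count, 1);
		}
	}
	return ret;
}
```

Note that this sketch does not close the window in which a second caller sees a non-zero count while the first caller's initialization is still failing; handling that would need a lock or moving the setup into module init, as suggested earlier in the thread.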
Re: [openib-general] IB/ipath - initialize diagpkt file on device init only
> Sigh. Yeah. Code go boom. I'll roll it again. We had been ignoring the failure anyway before this patch. I'll just make sure we do a dev_warn and atomic_dec if ipath_cdev_init fails.

Scrub that - I'm going to roll all this into ipath_diag_add and make it all a bit simpler. I'll send out a new patch tomorrow afternoon, after some testing.

Regards,
Robert.
Re: [openib-general] IB/ipath - initialize diagpkt file on device init only
    Robert> This was without the patch, though, right? Cause if
    Robert> you've applied the patch and you're still getting this
    Robert> message, I'm confused.

Yes, I was trying to debug the root cause of the problem, so I was just running the mainline kernel.

 - R.