[openib-general] reason behind locking the WQs while checking the state in modify_qp?
Hello all,

Recently I went through the discussion of how you decided to split the QP lock into separate WQ locks, and of the locking mechanism:

http://openib.org/pipermail/openib-general/2005-February/004491.html

The patch mentions that the only place we will be taking the locks is in modify_qp, while checking the state of the QP, but it does not describe why this is required. My question is: why is it required to lock the WQs? Does the QP state depend on the posting of WRs?

-Mahesh

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
http://openib.org/bugzilla/show_bug.cgi?id=229

--- Comment #3 from [EMAIL PROTECTED] 2006-09-11 23:14 ---
Put email in bugzilla:

Hmm, OK. I'd like to figure out whether this could be something other than a scheduler issue. Could you test on kernel 2.6.18 or 2.6.17 please? If this is a scheduler issue, there's a chance scheduler is more fair there.

Quoting r. Scott Weitzenkamp (sweitzen) <[EMAIL PROTECTED]>:
> Subject: RE: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
>
> I only tested with renice -20.
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
> > Sent: Monday, September 11, 2006 11:02 PM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: openib-general@openib.org
> > Subject: Re: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
> >
> > Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> > > Subject: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
> > >
> > > http://openib.org/bugzilla/show_bug.cgi?id=229
> > >
> > > --- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
> > > Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.
> >
> > Looks like we'll add the workaround for ofed.
> > What renice level are you using?
> >
> > --
> > MST

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
Re: [openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
Hmm, OK. I'd like to figure out whether this could be something other than a scheduler issue. Could you test on kernel 2.6.18 or 2.6.17 please? If this is a scheduler issue, there's a chance scheduler is more fair there.

Quoting r. Scott Weitzenkamp (sweitzen) <[EMAIL PROTECTED]>:
> Subject: RE: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
>
> I only tested with renice -20.
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
> > Sent: Monday, September 11, 2006 11:02 PM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: openib-general@openib.org
> > Subject: Re: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
> >
> > Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> > > Subject: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
> > >
> > > http://openib.org/bugzilla/show_bug.cgi?id=229
> > >
> > > --- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
> > > Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.
> >
> > Looks like we'll add the workaround for ofed.
> > What renice level are you using?
> >
> > --
> > MST

--
MST
Re: [openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
I only tested with renice -20.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

> -----Original Message-----
> From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 11, 2006 11:02 PM
> To: Scott Weitzenkamp (sweitzen)
> Cc: openib-general@openib.org
> Subject: Re: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
>
> Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> > Subject: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
> >
> > http://openib.org/bugzilla/show_bug.cgi?id=229
> >
> > --- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
> > Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.
>
> Looks like we'll add the workaround for ofed.
> What renice level are you using?
>
> --
> MST
Re: [openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Subject: [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
>
> http://openib.org/bugzilla/show_bug.cgi?id=229
>
> --- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
> Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.

Looks like we'll add the workaround for ofed.
What renice level are you using?

--
MST
Re: [openib-general] [openfabrics-ewg] OFED 1.1 status
When will rc4 be available? I'd also like to suggest we not rush the final build, end of this week seems too soon.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Thursday, September 07, 2006 1:02 PM
To: EWG
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 status

Hi,

OFED 1.1 RC4 will be published on Monday 11-Sep.

We currently work on several showstoppers:
223: mthca.so not properly linked to libibverbs - Vlad & Jack
221: SRP on V40Z and Sun T4 gets Kernel BUG at spinlock:118 - Roland
219: OFED 1.1rc3 contains prerelease unstable libibverbs code - Vlad & Jack

Thus final release date will be delayed to end of next week.

Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
[openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
http://openib.org/bugzilla/show_bug.cgi?id=229

--- Comment #2 from [EMAIL PROTECTED] 2006-09-11 21:54 ---
Cisco embedded SM on a switch, thus no SM on a host, only IB drivers.

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
Re: [openib-general] CMA issue: bind selects the same port after close
Hi Michael,

> > The basic problem in the CMA is in cma_alloc_port(). If the port number
> > (passed in as snum) is 0, the first available port starting at
> > sysctl_local_port_range[0] is used. We could instead start our search by
> > adding an increasing counter or a random value to the lower-end of the
> > port range. Then expand the code to handle searching below our starting
> > value if we failed to find one above it.
>
> Sounds good.
>
> > Are the port numbers assigned by TCP sequential or more random?
>
> TCP ports seem to be sequential.

Are you getting sequential port numbers? inet_csk_get_port() actually uses a random number to get the *starting* value between sysctl_local_port_range[0] and sysctl_local_port_range[1]. Once it has this starting number, it goes sequentially all the way to the high limit (sysctl*[1]) and then loops back from the low limit (sysctl*[0]) until all the numbers in between have been looked at. I think we can easily use the same logic. Sean's second option seems to be what is followed here:

> > adding a random value to the lower-end of the port range

Thanks,

- KK
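The search order described above (random starting point, sequential scan up to the high limit, then wrap to the low limit) can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the kernel's inet_csk_get_port() or cma_alloc_port() code: pick_port(), the used[] array and the rnd argument are hypothetical stand-ins for the real bind-table lookup and random source.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: choose a free port in [low, high].
 * used[i] is true when port (low + i) is already bound; rnd is any
 * random number supplied by the caller.
 * Returns the chosen port, or -1 if every port in the range is taken.
 */
static int pick_port(int low, int high, unsigned int rnd, const bool *used)
{
    int remaining;
    int rover;
    int i;

    assert(low <= high);
    remaining = high - low + 1;

    /* Random starting value somewhere inside the range. */
    rover = low + (int)(rnd % (unsigned int)remaining);

    /* Scan upward, wrapping from high back to low, visiting each port once. */
    for (i = 0; i < remaining; i++) {
        if (!used[rover - low])
            return rover;
        if (++rover > high)
            rover = low;
    }
    return -1;
}
```

With ports 100..104 and 101, 102 and 104 in use, a start at 101 walks forward to 103, while a start at 104 wraps around and returns 100. Because the starting point changes from call to call, a bind/close/bind cycle no longer tends to hand back the same port.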
Re: [openib-general] [PATCH] Optimize cma_process_remove()
Hi Sean,

> I don't think that this will work. The issue is that we need to walk a list of
> IDs associated with a particular device to notify the user that the device is
> being removed. While we're doing that, the user could try to destroy the ID,
> which removes the ID from the device list.
>
> The original code takes a reference on the ID before removing it from
> cma_dev's list to ensure that the ID will be valid while we process it. The
> remove list ensures that the user is only notified once of a device removal.
> (We don't know where the thread calling rdma_destroy_id() is at.)

Yes, you are right - I missed the parallel rdma_destroy_id's. How about something like this then (it is cleaner than dropping/re-getting locks):

	mutex_lock(&lock);
	while (!list_empty(&cma_dev->id_list)) {
		id_priv = list_entry(cma_dev->id_list.next,
				     struct rdma_id_private, list);
		if (cma_internal_listen(id_priv)) {
			cma_destroy_listen(id_priv);
		} else {
			atomic_inc(&id_priv->refcount);
			list_del(&id_priv->list);
			list_add_tail(&id_priv->list, &remove_list);
		}
	}
	mutex_unlock(&lock);

	list_for_each_entry_safe(id_priv, tmp, &remove_list, list) {
		ret = cma_remove_id_dev(id_priv);
		cma_deref_id(id_priv);
		if (ret)
			rdma_destroy_id(&id_priv->id);
	}

thanks,

- KK
Re: [openib-general] why sdp connections cost so much memory
--- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:

> Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > Subject: Re: why sdp connections cost so much memory
> >
> > > You should not need this change with the scale patch
> > > I posted - after applying this, and setting the scale
> > > parameter to 0x1, each connection should use around
> > > 128K for RX. Please confirm.
> > Just setting the scale parameter to 0x1, memory
> > reduction is OK. But there occurred one bug,
> > sometimes my kernel crashed.
>
> Shouldn't happen. Backtrace?
>
> > So I think PRE POST buf
> > size should be changed either.
> > zhu
>
> Hmm. I don't really see how this would help.
> Is it true that changing just the RX size fixes the
> crashes for you?
> If yes I'd like to investigate.
>
> --
> MST

(1) With RX_SIZE=0x4 and TX_SIZE=0x4, I ran my testbench 30 times and there was no kernel crash. I found sdp worked more stably and faster when I changed the RX and TX sizes.
(2) With RX_SIZE=0x40 and TX_SIZE=0x40, I could run my testbench only several times before the kernel crashed.

The result is very different for the two cases.

zhu
Re: [openib-general] RFC: mthca: implement timewait by tracking QPNs
> Could be a library function in core so that ipath etc can reuse it.
> But note how there's no dependency between drivers here - no
> reason to block change in mthca until ipath/ehca implement this
> functionality, too.

True. But FWIW, we (QLogic) could probably spin something like this pretty quickly anyway.

Regards,
Robert.
Re: [openib-general] 4 patches in mst-for-2.6.19
OK, I applied

    [PATCH] IB/cm: do not track remote QPN in timewait state

since Sean has acked that already. I'll review the rest in the next day or two.

- R.
Re: [openib-general] RFC: mthca: implement timewait by tracking QPNs
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: RFC: mthca: implement timewait by tracking QPNs
>
> My gut reaction is that it seems pretty ugly.

Hmm. All of it or just some bits?

> I guess we'll also need
> similar patches for ipath and ehca too -- which makes me want to have
> this in common code somehow.

Could be a library function in core so that ipath etc can reuse it. But note how there's no dependency between drivers here - no reason to block change in mthca until ipath/ehca implement this functionality, too.

> Also timewait is really only part of the CM spec

Not entirely correct. Please look at 9.7.1 - search for "stale packets":

    In addition to duplicate packets and invalid packets, there is a third
    condition, called a Stale Packet ("TIME WAIT packet"). If a connection
    to a responder is torn down and a new connection is established while
    packets are in flight, a packet from the old (stale) connection may
    arrive at the responder. The responder, in turn, may interpret this
    stale incoming packet as a valid packet, when in fact it is a remnant
    of a previous connection. There are no transport layer mechanisms to
    guard against this condition; it is the responsibility of connection
    management to avoid re-using QPs until there is no possibility that a
    stale packet could arrive at the responder. This is done by placing
    the requester and responder QPs in a "Time Wait" state long enough to
    ensure that any stale packets left in the fabric have expired before
    re-using those QPs.

So the spec suggests that timewait be implemented in CM, but timewait is needed to solve a problem that affects the transport layer and that is described in Chapter 9.

> -- do we want to
> limit the rate of RC QP creation in general for potential non-CM users
> that know what they're doing?

I don't see how this limits the rate of QP creation. Could you explain?

Second, there's no way I can see for a verbs user to check that there are no stale packets (AKA TimeWait packets). Is there? So the user only *thinks* he knows what he's doing, meanwhile getting silent data corruption. Correct?

> I'm not sure the following is a real concern (since a hostile user can
> currently just create a ton of QPs and hold onto them forever), but
> this also allows someone to create a bunch of QPs with a super-long
> timeout and prevent any other QPs from being created for a few hours
> (until the timewait expires).

Another reason why this might not be an issue is that the QPN space is reasonably big - 2^24. I guess when we start looking at limiting the number of QPs per user, we'll need to limit the max legal packet lifetime too. Might be a good idea anyway.

> Finally one implementation comment: I think you'll want a list in
> addition to QPN + timer, to allow the ib_mthca module to be unloaded
> without having to wait an hour for all timers to expire. This allows
> timewait to be bypassed by unloading + reloading but that's no
> different than rebooting really.

Sure, that's obvious.

> Another good prophylactic measure would probably to randomize initial
> PSNs for RC connections. SRP currently does this.

I agree this also helps.

--
MST
[openib-general] 4 patches in mst-for-2.6.19
I have put the following patches in my mst-for-2.6.19 tree:

$ git log --pretty=short origin..mst-for-2.6.19
commit ddfe6867088167b64962399934d21cf3e37c338b
Author: Jack Morgenstein <[EMAIL PROTECTED]>

    [PATCH] IB/mthca: recover from device errors

commit 4403ad431b139b03a291263be4686363fd04138b
Author: Michael S. Tsirkin <[EMAIL PROTECTED]>

    [PATCH] IB/cm: do not track remote QPN in timewait state

commit 12f4b3b6fabcccf96ca0fa9911e86c1a6d9fc7a4
Author: Ishai Rabinovitz <[EMAIL PROTECTED]>

    [PATCH] IB/srp: don't schedule reconnect from srp, scsi does it for us

commit a6f9624098dada22825d116d104c92bfd34465b2
Author: Ishai Rabinovitz <[EMAIL PROTECTED]>

    [PATCH] IB/srp: destroy and re-create QP and CQ on reconnect

You can get them here:

git://www.mellanox.co.il/~git/infiniband mst-for-2.6.19

This is against Roland's for-2.6.19 001c6b9030233a14fa27795ab3e6a6f45f16a317

These patches have been posted on the list previously, but let me know and I'll repost them if needed. Please comment.

--
MST
Re: [openib-general] RFC: mthca: implement timewait by tracking QPNs
My gut reaction is that it seems pretty ugly. I guess we'll also need similar patches for ipath and ehca too -- which makes me want to have this in common code somehow.

Also timewait is really only part of the CM spec -- do we want to limit the rate of RC QP creation in general for potential non-CM users that know what they're doing?

I'm not sure the following is a real concern (since a hostile user can currently just create a ton of QPs and hold onto them forever), but this also allows someone to create a bunch of QPs with a super-long timeout and prevent any other QPs from being created for a few hours (until the timewait expires).

Finally one implementation comment: I think you'll want a list in addition to QPN + timer, to allow the ib_mthca module to be unloaded without having to wait an hour for all timers to expire. This allows timewait to be bypassed by unloading + reloading but that's no different than rebooting really.

Another good prophylactic measure would probably be to randomize initial PSNs for RC connections. SRP currently does this.

- R.
[openib-general] RFC: mthca: implement timewait by tracking QPNs (was Fwd: Re: [PATCH ] RFC IB/cm do not track remote QPN in timewait state)
Roland, all,

we plan to implement the timewait handling in mthca in time for 2.6.19. For all connected QPs:

- upon QP destroy or move from RTS to reset/error, start a timer for the duration of the packet lifetime
- until the packet lifetime expires, do not reuse this QPN

This must be done to prevent stale packets from corrupting the new connection (see 9.7.1). Could you please let me know if this approach looks sane to you?

This approach has a number of advantages over attempting to implement the same in CM on top of verbs by not destroying the QP:

- Reduces resource usage by freeing the QP (only QPN + timer are tracked)
- Applies to all verbs users even if they bypass CM
- Solves the problem for userspace CM, where we can't rely on CM to enforce timewait

More detail can be found in the thread I'm replying to. Please comment.

--
MST
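The "track only QPN + timer" idea above could look something like the following userspace C sketch. It is a rough illustration under stated assumptions and is not mthca code: the names tw_table, tw_hold() and tw_reusable(), the fixed-size table, and the caller-supplied integer clock are all hypothetical. A real driver would use kernel timers and, as noted elsewhere in the thread, a list so that entries can be dropped on module unload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TW_SLOTS 64 /* illustrative capacity only */

struct tw_entry {
    uint32_t qpn;     /* 24-bit QP number being held back */
    uint64_t expires; /* first time at which the QPN may be reused */
    bool     active;
};

struct tw_table {
    struct tw_entry slot[TW_SLOTS];
};

/*
 * Record that qpn was destroyed (or left RTS) at time now and must not be
 * reused for lifetime time units. Returns false if the table is full.
 */
static bool tw_hold(struct tw_table *t, uint32_t qpn,
                    uint64_t now, uint64_t lifetime)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < TW_SLOTS; i++) {
        struct tw_entry *e = &t->slot[i];

        /* Reuse a slot that is free or whose timewait has expired. */
        if (!e->active || e->expires <= now) {
            e->qpn = qpn & 0xffffff;
            e->expires = now + lifetime;
            e->active = true;
            return true;
        }
    }
    return false;
}

/* True if qpn is not in timewait at time now, so it may be handed out. */
static bool tw_reusable(const struct tw_table *t, uint32_t qpn, uint64_t now)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < TW_SLOTS; i++) {
        const struct tw_entry *e = &t->slot[i];

        if (e->active && e->qpn == (qpn & 0xffffff) && now < e->expires)
            return false;
    }
    return true;
}
```

In this sketch the QP itself can be freed immediately; only the small entry stays around until the packet lifetime has passed, and the QPN allocator would consult tw_reusable() before handing a number out again.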
Re: [openib-general] [PATCH] IB/cma: add rdma_establish
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> > As a side note, reasons for frequent loss of RTU must be investigated.
>
> A lost RTU shouldn't be any more likely than a lost REQ or REP. Is the RTU
> never showing up?

Seems like that. I know for sure I do accept after REP but remote side never gets ESTABLISHED.

> I will look into the ib_cm and see if there's an issue that
> would cause an RTU not to be retried.

--
MST
Re: [openib-general] [PATCH] IB/cma: add rdma_establish
Michael S. Tsirkin wrote:
> Sean, did we decide what to do for upstream yet?
> I would say we need something like the below for 2.6.19 too
> (probably just need to update node type check).
> And, I like it that this approach leaves all matters of policy
> to users (such as whether move QP to RTS after asynchronous event
> or after completion event).

I will go with a patch similar to this one. It seems the most flexible.

> As a side note, reasons for frequent loss of RTU must be investigated.

A lost RTU shouldn't be any more likely than a lost REQ or REP. Is the RTU never showing up? I will look into the ib_cm and see if there's an issue that would cause an RTU not to be retried.

- Sean
Re: [openib-general] CMA issue: bind selects the same port after close
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] CMA issue: bind selects the same port after close
>
> Michael S. Tsirkin wrote:
> > We have encountered an issue in CMA: if
> > I bind to port 0, destroy the id, then bind to port 0 again
> > I often get back the same port from both binds.
> >
> > TCP behaves differently - it seems to assign new port numbers
> > each time.
> > This is an issue for some socket programs that assume that
> > the same port number won't be reused, so that a remote side that
> > connects to the same port after I have closed my socket will get
> > a connection refused message.
> > I also see applications looking for a port number that matches
> > some rule by repeating the create/bind/close cycle.
> > With CMA they always get back the same port number it seems.
> >
> > Is this something that can be fixed in CMA?
>
> I think we can fix this without a huge impact. Is there anything that states
> the way bind is supposed to behave wrt this?

I don't think so. But since that's how it works on linux and other systems, apps assume this.

> Is there some delay between
> releasing a port and it being re-used that needs to be taken into account?

TCP keeps port busy while in timewait state, unless REUSEADDR is given. I have not yet seen any app rely on this, so it might not be important to emulate this.

> The basic problem in the CMA is in cma_alloc_port(). If the port number
> (passed in as snum) is 0, the first available port starting at
> sysctl_local_port_range[0] is used. We could instead start our search by
> adding an increasing counter or a random value to the lower-end of the port
> range. Then expand the code to handle searching below our starting value if
> we failed to find one above it.

Sounds good.

> Are the port numbers assigned by TCP sequential or more random?

TCP ports seem to be sequential.

--
MST
Re: [openib-general] CMA issue: bind selects the same port after close
Michael S. Tsirkin wrote:
> We have encountered an issue in CMA: if
> I bind to port 0, destroy the id, then bind to port 0 again
> I often get back the same port from both binds.
>
> TCP behaves differently - it seems to assign new port numbers
> each time.
> This is an issue for some socket programs that assume that
> the same port number won't be reused, so that a remote side that
> connects to the same port after I have closed my socket will get
> a connection refused message.
> I also see applications looking for a port number that matches
> some rule by repeating the create/bind/close cycle.
> With CMA they always get back the same port number it seems.
>
> Is this something that can be fixed in CMA?

I think we can fix this without a huge impact. Is there anything that states the way bind is supposed to behave wrt this? Is there some delay between releasing a port and it being re-used that needs to be taken into account?

The basic problem in the CMA is in cma_alloc_port(). If the port number (passed in as snum) is 0, the first available port starting at sysctl_local_port_range[0] is used. We could instead start our search by adding an increasing counter or a random value to the lower-end of the port range. Then expand the code to handle searching below our starting value if we failed to find one above it.

Are the port numbers assigned by TCP sequential or more random?

- Sean
Re: [openib-general] [PATCH v3] ib_sa: require SA registration
> - CMA can have a static variable (good to avoid clashes with a global
>   'sa_client' variable name too)

Sounds good - that's a goof on my part.

> - IPoIB does not use multicast module upstream, fix ipoib_multicast.c too.

Okay - As an FYI, I will probably submit the multicast module upstream for 2.6.20, along with some sort of support for userspace access.

> - Simplify sa_query.c changes a little. I don't like the
>   "deref_client" name for a function, since it sounds too much like
>   dereferencing a pointer rather than dropping a reference. And I
>   also didn't like ib_sa_client_get() having a magic side effect of
>   setting query->client. So I just open-coded more stuff.

Those changes sound fine to me.

- Sean
Re: [openib-general] [PATCH v3] ib_sa: require SA registration
OK, I added the following to my for-2.6.19 branch. The differences from your patch are:

- CMA can have a static variable (good to avoid clashes with a global
  'sa_client' variable name too)
- IPoIB does not use multicast module upstream, fix ipoib_multicast.c too.
- Simplify sa_query.c changes a little. I don't like the
  "deref_client" name for a function, since it sounds too much like
  dereferencing a pointer rather than dropping a reference. And I
  also didn't like ib_sa_client_get() having a magic side effect of
  setting query->client. So I just open-coded more stuff.

How does this look?

- R.
[openib-general] CMA issue: bind selects the same port after close
We have encountered an issue in CMA: if I bind to port 0, destroy the id, then bind to port 0 again, I often get back the same port from both binds.

TCP behaves differently - it seems to assign new port numbers each time. This is an issue for some socket programs that assume that the same port number won't be reused, so that a remote side that connects to the same port after I have closed my socket will get a connection refused message. I also see applications looking for a port number that matches some rule by repeating the create/bind/close cycle. With CMA they always get back the same port number, it seems.

Is this something that can be fixed in CMA?

Thanks,

--
MST
Re: [openib-general] [openfabrics-ewg] Goodbye and Transition
Hi everyone. Just wanted to respond and say I'm on the alias, and will prepare a small statement in line with what has been asked below. I should be able to get this out in a day or two. Looking forward to working with you all.

Cheers - jamie

Jamie Riotto
Sr. Director Engineering
Server Virtualization Business Unit (SVBU)
Cisco Communications
408-853-7813
[EMAIL PROTECTED]

-----Original Message-----
From: Ryan, Jim [mailto:[EMAIL PROTECTED]
Sent: Monday, September 11, 2006 9:35 AM
To: Sujal Das; OpenFabricsEWG; openib-general@openib.org
Cc: Jamie Riotto (jriotto)
Subject: RE: [openfabrics-ewg] Goodbye and Transition

Sujal, yes, thanks, makes sense. I got a "no longer there" response from my earlier email, so Shawn won't be around to do a handoff.

Jim

-----Original Message-----
From: Sujal Das [mailto:[EMAIL PROTECTED]
Sent: Monday, September 11, 2006 9:28 AM
To: Ryan, Jim; Shawn Hansen (shahanse); OpenFabricsEWG; openib-general@openib.org
Cc: Jamie Riotto (jriotto)
Subject: RE: [openfabrics-ewg] Goodbye and Transition

Sounds like a good idea. Not sure if the EWG community knows Jamie (I do not, for example) - it might be a good idea if Jamie introduces himself, and specifically highlights his roles and contributions to OFA in the past and what his vision is for OFED and its adoption by OSVs, ISVs, HPC and enterprise customers.

-Sujal

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ryan, Jim
Sent: Monday, September 11, 2006 7:22 AM
To: Shawn Hansen (shahanse); OpenFabricsEWG; openib-general@openib.org
Cc: Jamie Riotto (jriotto)
Subject: Re: [openfabrics-ewg] Goodbye and Transition

Shawn, thanks for the note and best of luck at Microsoft. I suggest we take Shawn's recommendation and ask Jamie to continue Shawn's leadership of the EWG.

Jim

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Hansen (shahanse)
Sent: Friday, September 08, 2006 5:29 PM
To: OpenFabricsEWG; openib-general@openib.org
Cc: Jamie Riotto (jriotto)
Subject: [openfabrics-ewg] Goodbye and Transition

All,

FYI: I've decided to relocate my family to Seattle, and will be leaving Cisco. I plan to join Microsoft's Server and Tools division at the end of this month.

I would like to recommend Jamie Riotto, Senior Director of Engineering, as my EWG replacement. Jamie is responsible for all engineering for Cisco's Server Networking and Virtualization Business Unit, including Cisco's host driver and RDMA development efforts.

Please stay in touch, and I wish the team the best.

Regards,
--Shawn

Shawn Hansen
Director, Product Management
Cisco Systems

___
openfabrics-ewg mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openfabrics-ewg
Re: [openib-general] [PATCH] Optimize cma_process_remove()
Krishna Kumar wrote:
>  static void cma_process_remove(struct cma_device *cma_dev)
>  {
>  	struct list_head remove_list;
> -	struct rdma_id_private *id_priv;
> +	struct rdma_id_private *id_priv, *tmp;
>  	int ret;
>
>  	INIT_LIST_HEAD(&remove_list);
> @@ -2344,22 +2344,20 @@ static void cma_process_remove(struct cm
>
>  		if (cma_internal_listen(id_priv)) {
>  			cma_destroy_listen(id_priv);
> -			continue;
> +		} else {
> +			list_del(&id_priv->list);
> +			list_add_tail(&id_priv->list, &remove_list);
>  		}
> +	}
> +	mutex_unlock(&lock);
>
> -		list_del(&id_priv->list);
> -		list_add_tail(&id_priv->list, &remove_list);
> +	list_for_each_entry_safe(id_priv, tmp, &remove_list, list) {
>  		atomic_inc(&id_priv->refcount);
> -		mutex_unlock(&lock);
> -

I don't think that this will work. The issue is that we need to walk a list of IDs associated with a particular device to notify the user that the device is being removed. While we're doing that, the user could try to destroy the ID, which removes the ID from the device list.

The original code takes a reference on the ID before removing it from cma_dev's list to ensure that the ID will be valid while we process it. The remove list ensures that the user is only notified once of a device removal. (We don't know where the thread calling rdma_destroy_id() is at.)

We can eliminate the remove_list by calling list_del_init().

- Sean
Re: [openib-general] [PATCH] Modify callers of cma_get_net_info for better error handling.
Krishna Kumar wrote: > Re-organize code relating to cma_get_net_info() and rdma_create_id() to > optimize error case handling (no need to alloc memory/etc as part of > rdma_create_id() if input parameters are wrong). Thanks! Committed with a minor adjustment to rename 'out' label 'err'. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] ib_madeye kfree() problem on module unload
When using OFED-1.1-rc3 on an x86_64 system running a 2.6.17.3 debug kernel in a RHEL4 U2 environment, I see the following console warning messages when I unload the ib_madeye kernel module:

modprobe ib_madeye
modprobe -r ib_madeye

console messages:

slab error in cache_free_debugcheck(): cache `size-32': double free, or memory outside object was overwritten

Call Trace: {__slab_error+36} {cache_free_debugcheck+365} {kfree+136} {:ib_madeye:madeye_remove_one+123} {:ib_core:ib_unregister_client+75} {:ib_madeye:ib_madeye_cleanup+16} {sys_delete_module+446} {tracesys+113} {tracesys+209}
81007834bd48: redzone 1:0x170fc2a5, redzone 2:0x8100400929c8
Re: [openib-general] [PATCH] cma_connect_ib leaks memory in failure cases.
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] cma_connect_ib leaks memory in failure cases. > > Michael S. Tsirkin wrote: > >>cma_connect_ib leaks an struct ib_cm_id* in failure cases. > >> > >>Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]> > > > > > > This one looks like it might be good for 2.6.18. Sean? > > The ib_cm_id will be cleaned up if the rdma_cm_id is destroyed, as long as a > second call is not made to rdma_connect after the first call fails. So we're > probably safe deferring this until 2.6.19, unless someone has code which > calls > rdma_connect twice. SDP can do this I think. -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] InfiniBand DevCon Conference
Hello,

Attached is the final reminder for the InfiniBand DevCon conference. If you have any questions, please let me know.

Thank you,
Stephanie

Stephanie Howard
Owen Media
206.322.1167 ext. 102
[EMAIL PROTECTED]

Attachment: InfiniBand DevCon Blast Final Blast - OFA.doc
Re: [openib-general] [PATCH] cma_connect_ib leaks memory in failure cases.
Krishna Kumar wrote: > cma_connect_ib leaks an struct ib_cm_id* in failure cases. Thanks - committed. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
Yep, that patch fixes the bug :-)

Thanks

On Monday 11 September 2006 16:37, Roland Dreier wrote:
> There is definitely a bug in the drivers/infiniband/ulp/iser/Kconfig
> file. ISER only depends on INFINIBAND && SCSI. However it is easily
> possible to enable INFINIBAND and SCSI without enabling INET (in fact
> they can be enabled without NET as in the original config in this thread).
>
> iser does select SCSI_ISCSI_ATTRS, but without selecting NET that it
> depends on, so this alone will result in a broken config. However
> nothing will enable INET (which I think you said iser depends on). So
> something like the below is required, I think. Although it would
> probably be better to make iser depend on INET (as ISCSI_TCP does)
> rather than selecting NET and INET.
>
> Toralf, can you confirm that applying this patch and doing make
> oldconfig and make with your original config works OK?
>
> Thanks,
> Roland
>
> diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig
> index fead87d..a122bb4 100644
> --- a/drivers/infiniband/ulp/iser/Kconfig
> +++ b/drivers/infiniband/ulp/iser/Kconfig
> @@ -1,6 +1,8 @@
>  config INFINIBAND_ISER
>  	tristate "ISCSI RDMA Protocol"
>  	depends on INFINIBAND && SCSI
> +	select NET
> +	select INET
>  	select SCSI_ISCSI_ATTRS
>  	---help---
>  	  Support for the ISCSI RDMA Protocol over InfiniBand. This
>

--
MfG/Sincerely
Toralf Förster
Re: [openib-general] [PATCH] cma_connect_ib leaks memory in failure cases.
Michael S. Tsirkin wrote: >>cma_connect_ib leaks an struct ib_cm_id* in failure cases. >> >>Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]> > > > This one looks like it might be good for 2.6.18. Sean? The ib_cm_id will be cleaned up if the rdma_cm_id is destroyed, as long as a second call is not made to rdma_connect after the first call fails. So we're probably safe deferring this until 2.6.19, unless someone has code which calls rdma_connect twice. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] RDMA CMA and C++
Dotan Barak wrote: >>The user-mode cm header files don't have the C++ stuff to identify all >>the declarations as C. The verbs.h file has it and works fine if you >>wanted to copy it, but all you really need is ... >> > Sean, please add those definitions to the libibcm header as well. I've updated the libibcm and librdmacm header files. Thanks. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] is there a plan for getting SDP into kernel.org?
> Scott> How about just adding netstat before the merge, so we have > Scott> some visibility into what SDP connections are in use? > > That's fine. Merging upstream is somewhat long-term anyway, since > Michael has not even posted a first candidate for review -- I expect > SDP will require several go-arounds to get merged. > > - R. Michael, when do you expect to post a first candidate for review? Scott ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] is there a plan for getting SDP into kernel.org?
Scott> How about just adding netstat before the merge, so we have Scott> some visibility into what SDP connections are in use? That's fine. Merging upstream is somewhat long-term anyway, since Michael has not even posted a first candidate for review -- I expect SDP will require several go-arounds to get merged. - R. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH v3] ib_sa: require SA registration
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH v3] ib_sa: require SA registration > > Sean> Roland, Not sure if you've had a chance to review the SA > Sean> patches, but any comments on any of the SA related patches? > Sean> (SA registration, generic RMPP query support, or userspace > Sean> SA) > > I haven't really read the later patches but I am planning on merging > at least the registration stuff for 2.6.19. Yes, the registration stuff is clearly safe -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Wrong byte order in lid of struct ibv_port_attr reported by ibv_query port!?
Bub Thomas wrote: > with the help of your modified cmpost.c example I found out that the > byte order in the lid your query_for_path in cmpost.c is getting into > the ib_sa_path_rec is the opposite to the one reported by ibv_query_port. The path record defines all fields in network-byte order. The verb calls use host-byte order. Typically, the path record information will come directly from the SA, which defines the fields in network-byte order, which is why it isn't converted to host-order. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
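A minimal sketch of the conversion described above, assuming a path-record-like struct whose LID field is kept in network byte order while values from the verbs calls are in host order. The struct and helper names (example_path_rec, path_rec_set_dlid, path_rec_get_dlid) are hypothetical and are not part of libibverbs or the SA code; only htons() and ntohs() are real library calls.

```c
#include <arpa/inet.h>  /* htons(), ntohs() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a path record: the LID is stored big-endian. */
struct example_path_rec {
	uint16_t dlid_be;   /* network byte order, as it would arrive from the SA */
};

/* Store a host-order LID (e.g. one reported by a port query) in the record. */
static void path_rec_set_dlid(struct example_path_rec *rec, uint16_t host_lid)
{
	assert(rec != NULL);
	rec->dlid_be = htons(host_lid);
}

/* Read the LID back in host order so it can be compared with verbs values. */
static uint16_t path_rec_get_dlid(const struct example_path_rec *rec)
{
	assert(rec != NULL);
	return ntohs(rec->dlid_be);
}
```

On a little-endian host the raw dlid_be value for LID 0x0012 reads as 0x1200, which is the "opposite byte order" effect described in the question. Comparing only after ntohs(), or converting with htons() before filling in the record, avoids the mismatch.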
Re: [openib-general] [PATCH v3] ib_sa: require SA registration
Roland Dreier wrote: > I haven't really read the later patches but I am planning on merging > at least the registration stuff for 2.6.19. I'd like to commit the SA related patches soon. There have been several e-mails recently about using IB multicast and the IB CM directly. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] is there a plan for getting SDP into kernel.org?
> Scott> I would like to see netstat support, zcopy support, and > Scott> ideally AIO support get added first... > > Better to merge first and then add features I think. > > - R. > How about just adding netstat before the merge, so we have some visibility into what SDP connections are in use? Scott ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] Goodbye and Transition
Sujal, yes, thanks, makes sense. I got a "no longer there" response from my earlier email, so Shawn won't be around to do a handoff Jim -Original Message- From: Sujal Das [mailto:[EMAIL PROTECTED] Sent: Monday, September 11, 2006 9:28 AM To: Ryan, Jim; Shawn Hansen (shahanse); OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: RE: [openfabrics-ewg] Goodbye and Transition Sounds like a good idea. Not sure if the EWG community knows Jamie (I do not, for example) - it might be a good idea if Jamie introduces himself, and specifically highlights his roles and contributions to OFA in the past and what his vision is for OFED and its adoption by OSVs, ISVs, HPC and enterprise customers. -Sujal -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ryan, Jim Sent: Monday, September 11, 2006 7:22 AM To: Shawn Hansen (shahanse); OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: Re: [openfabrics-ewg] Goodbye and Transition Shawn, thanks for the note and best of luck at Microsoft. I suggest we take Shawn's recommendation and ask Jamie to continue Shawn's leadership of the EWG. Jim -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Hansen (shahanse) Sent: Friday, September 08, 2006 5:29 PM To: OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: [openfabrics-ewg] Goodbye and Transition All, FYI: I've decided to relocate my family to Seattle, and will be leaving Cisco. I plan to join Microsoft's Server and Tools division at the end of this month. I would like to recommend Jamie Riotto, Senior Director of Engineering, as my EWG replacement. Jamie is responsible for all engineering for Cisco's Server Networking and Virtualization Business Unit, including Cisco's host driver and RDMA development efforts. Please stay in touch, and I wish the team the best. 
Regards, --Shawn Shawn Hansen Director, Product Management Cisco Systems ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] PXE + infiniband?
Eli cohen wrote:
> On Thu, 2006-09-07 at 08:19 +0100, Paul Baxter wrote:
>>> There is an implementation of PXE for Mellanox's HCAs that can be
>>> found here: http://sourceforge.net/forum/forum.php?forum_id=494529
>>
>> Thanks for the tip
>>
>> I, too, am interested in this.
>>
>> Do you have a more direct link as I wandered around etherboot's
>> project site and couldn't find anything IB-specific.
>>
>> Paul Baxter
> Hi,
>
> Please use the following link
> http://kent.dl.sourceforge.net/sourceforge/etherboot/etherboot-5.4.2.tar.bz2
> to download the package. Unpack the package and cd to the src dir.
> Use an x86 arch machine to build the binaries. The infiniband drivers
> are located at src/drivers/net/mlx_ipoib/ where you can find a readme
> file in the doc directory. To build:
>
> cd src
> make bin/MT23108.zrom // for MT23108
> make bin/MT25208.zrom
> make bin/MT25218.zrom
>
> This covers all Mellanox HCAs. Please let me know if you need more
> assistance.

A less involved solution is to use ROM-o-matic http://rom-o-matic.net/ . The Etherboot 5.4.2 image for MT23108 works nicely.
Re: [openib-general] [openfabrics-ewg] Goodbye and Transition
Sounds like a good idea. Not sure if the EWG community knows Jamie (I do not, for example) - it might be a good idea if Jamie introduces himself, and specifically highlights his roles and contributions to OFA in the past and what his vision is for OFED and its adoption by OSVs, ISVs, HPC and enterprise customers. -Sujal -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ryan, Jim Sent: Monday, September 11, 2006 7:22 AM To: Shawn Hansen (shahanse); OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: Re: [openfabrics-ewg] Goodbye and Transition Shawn, thanks for the note and best of luck at Microsoft. I suggest we take Shawn's recommendation and ask Jamie to continue Shawn's leadership of the EWG. Jim -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Hansen (shahanse) Sent: Friday, September 08, 2006 5:29 PM To: OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: [openfabrics-ewg] Goodbye and Transition All, FYI: I've decided to relocate my family to Seattle, and will be leaving Cisco. I plan to join Microsoft's Server and Tools division at the end of this month. I would like to recommend Jamie Riotto, Senior Director of Engineering, as my EWG replacement. Jamie is responsible for all engineering for Cisco's Server Networking and Virtualization Business Unit, including Cisco's host driver and RDMA development efforts. Please stay in touch, and I wish the team the best. Regards, --Shawn Shawn Hansen Director, Product Management Cisco Systems ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
    Erez> Let me make sure that I understand: If INET is disabled and
    Erez> we enable INFINIBAND, INFINIBAND_ADDR_TRANS will not be
    Erez> enabled (because INET is disabled). This results in the
    Erez> scenario that Toralf is in. If this is correct, I agree with
    Erez> your patch.

Yes, that's right.

 - R.
Re: [openib-general] [PATCH 5/5] IB/iser: Do not use FMR for a single dma entry sg
Roland Dreier wrote:
> Thanks, applied 1-5 with this minor fix for a compile warning:
>
> --- a/drivers/infiniband/ulp/iser/iser_memory.c
> +++ b/drivers/infiniband/ulp/iser/iser_memory.c
> @@ -427,9 +427,9 @@ int iser_reg_rdma_mem(struct iscsi_iser_
>  			iser_err("page_vec: data_size = 0x%x, length = %d, offset = 0x%x\n",
>  				 ib_conn->page_vec->data_size, ib_conn->page_vec->length,
>  				 ib_conn->page_vec->offset);
> -			for (i=0 ; i<ib_conn->page_vec->length ; i++) {
> -				iser_err("page_vec[%d] = 0x%lx\n", i, ib_conn->page_vec->pages[i]);
> -			}
> +			for (i=0 ; i<ib_conn->page_vec->length ; i++)
> +				iser_err("page_vec[%d] = 0x%llx\n", i,
> +					 (unsigned long long) ib_conn->page_vec->pages[i]);
>  			return err;
>  		}
>  	}
>

OK, thanks.
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
Roland Dreier wrote:
> There is definitely a bug in the drivers/infiniband/ulp/iser/Kconfig
> file. ISER only depends on INFINIBAND && SCSI. However it is easily
> possible to enable INFINIBAND and SCSI without enabling INET (in fact
> they can be enabled without NET as in the original config in this thread).
>
> iser does select SCSI_ISCSI_ATTRS, but without selecting NET that it
> depends on, so this alone will result in a broken config. However
> nothing will enable INET (which I think you said iser depends on). So
> something like the below is required, I think. Although it would
> probably be better to make iser depend on INET (as ISCSI_TCP does)
> rather than selecting NET and INET.

Let me make sure that I understand: If INET is disabled and we enable INFINIBAND, INFINIBAND_ADDR_TRANS will not be enabled (because INET is disabled). This results in the scenario that Toralf is in. If this is correct, I agree with your patch.

Thanks
Erez
Re: [openib-general] [PATCH 5/5] IB/iser: Do not use FMR for a single dma entry sg
Thanks, applied 1-5 with this minor fix for a compile warning:

--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -427,9 +427,9 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 			iser_err("page_vec: data_size = 0x%x, length = %d, offset = 0x%x\n",
 				 ib_conn->page_vec->data_size, ib_conn->page_vec->length,
 				 ib_conn->page_vec->offset);
-			for (i=0 ; i<ib_conn->page_vec->length ; i++) {
-				iser_err("page_vec[%d] = 0x%lx\n", i, ib_conn->page_vec->pages[i]);
-			}
+			for (i=0 ; i<ib_conn->page_vec->length ; i++)
+				iser_err("page_vec[%d] = 0x%llx\n", i,
+					 (unsigned long long) ib_conn->page_vec->pages[i]);
 			return err;
 		}
 	}
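The compile-warning fix in this patch comes down to matching the printf-style format to the argument type. A 64-bit value printed with %lx is only correct where long is 64 bits wide, while %llx with an explicit cast to unsigned long long works on both 32-bit and 64-bit builds. The following is a small user-space sketch of that pattern, assuming the page entries are 64-bit values; format_page_entry() is a hypothetical helper, and plain snprintf() stands in for iser_err().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one page-vector entry into buf.
 * Returns the number of characters snprintf() would have written.
 */
static int format_page_entry(char *buf, size_t len, int idx, uint64_t page)
{
	assert(buf != NULL && len > 0);
	/* The cast makes the argument match %llx however uint64_t is defined. */
	return snprintf(buf, len, "page_vec[%d] = 0x%llx", idx,
			(unsigned long long) page);
}
```

A value above 32 bits, such as 0x100000000, is where a mismatched %lx would go wrong on a 32-bit build; with the cast the full value is printed on either architecture.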
Re: [openib-general] [openfabrics-ewg] is there a plan for getting SDP into kernel.org?
Scott> I would like to see netstat support, zcopy support, and Scott> ideally AIO support get added first... Better to merge first and then add features I think. - R. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: Fwd: linux- 2.6.18-rc6-git1 issue 46: > drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to > `rdma_create_id' > > There is definitely a bug in the drivers/infiniband/ulp/iser/Kconfig > file. ISER only depends on INFINIBAND && SCSI. However it is easily > possible to enable INFINIBAND and SCSI without enabling INET (in fact > they can be enabled without NET as in the original config in this thread). > > iser does select SCSI_ISCSI_ATTRS, but without selecting NET that it > depends on, so this alone will result in a broken config. However > nothing will enable INET (which I think you said iser depends on). So > something like the below is required, I think. Although it would > probably be better to make iser depend on INET (as ISCSI_TCP does) > rather than selecting NET and INET. Maybe just make iser depend on CMA since that is what it really needs? -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
There is definitely a bug in the drivers/infiniband/ulp/iser/Kconfig file. ISER only depends on INFINIBAND && SCSI. However it is easily possible to enable INFINIBAND and SCSI without enabling INET (in fact they can be enabled without NET as in the original config in this thread). iser does select SCSI_ISCSI_ATTRS, but without selecting NET that it depends on, so this alone will result in a broken config. However nothing will enable INET (which I think you said iser depends on). So something like the below is required, I think. Although it would probably be better to make iser depend on INET (as ISCSI_TCP does) rather than selecting NET and INET. Toralf, can you confirm that applying this patch and doing make oldconfig and make with your original config works OK? Thanks, Roland diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig index fead87d..a122bb4 100644 --- a/drivers/infiniband/ulp/iser/Kconfig +++ b/drivers/infiniband/ulp/iser/Kconfig @@ -1,6 +1,8 @@ config INFINIBAND_ISER tristate "ISCSI RDMA Protocol" depends on INFINIBAND && SCSI + select NET + select INET select SCSI_ISCSI_ATTRS ---help--- Support for the ISCSI RDMA Protocol over InfiniBand. This ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] Goodbye and Transition
Shawn, thanks for the note and best of luck at Microsoft. I suggest we take Shawn's recommendation and ask Jamie to continue Shawn's leadership of the EWG. Jim -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Shawn Hansen (shahanse) Sent: Friday, September 08, 2006 5:29 PM To: OpenFabricsEWG; openib-general@openib.org Cc: Jamie Riotto (jriotto) Subject: [openfabrics-ewg] Goodbye and Transition All, FYI: I've decided to relocate my family to Seattle, and will be leaving Cisco. I plan to join Microsoft's Server and Tools division at the end of this month. I would like to recommend Jamie Riotto, Senior Director of Engineering, as my EWG replacement. Jamie is responsible for all engineering for Cisco's Server Networking and Virtualization Business Unit, including Cisco's host driver and RDMA development efforts. Please stay in touch, and I wish the team the best. Regards, --Shawn Shawn Hansen Director, Product Management Cisco Systems ___ openfabrics-ewg mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openfabrics-ewg ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH][TRIVIAL] OpenSM: Change QoS syntax for CA ports
OpenSM: Change QoS syntax for CA ports Change names from hca_ to ca_ to make it clearer that these are for both HCAs and TCAs. Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]> Index: doc/qos-config.txt === --- doc/qos-config.txt (revision 9347) +++ doc/qos-config.txt (working copy) @@ -28,11 +28,11 @@ values may be stored in OpenSM config fi In addition to the above, we may define separate QoS configuration parameters sets for various target types. As targets, we currently support -HCA, routers, switch external ports, and switch's enhanced port 0. The +CAs, routers, switch external ports, and switch's enhanced port 0. The names of such specialized parameters are prefixed by "qos__" string. Here is a full list of the currently supported sets: - qos_hca_ - QoS configuration parameters set for HCAs. + qos_ca_ - QoS configuration parameters set for CAs. qos_rtr_ - parameters set for routers. qos_sw0_ - parameters set for switches' port 0. qos_swe_ - parameters set for switches' external ports. @@ -40,5 +40,5 @@ string. Here is a full list of the curre Examples: qos_sw0_max_vls=2 - qos_hca_sl2vl=0,1,2,3,5,5,5,12,12,0, + qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0, qos_swe_high_limit=0 Index: man/opensm.8 === --- man/opensm.8(revision 9347) +++ man/opensm.8(working copy) @@ -1,4 +1,4 @@ -.TH OPENSM 8 "Setpember 6, 2006" "OpenIB" "OpenIB Management" +.TH OPENSM 8 "Setpember 11, 2006" "OpenIB" "OpenIB Management" .SH NAME opensm \- InfiniBand subnet manager and administration (SM/SA) @@ -365,18 +365,18 @@ values may be stored in OpenSM config fi In addition to the above, we may define separate QoS configuration parameters sets for various target types. As targets, we currently support -HCA, routers, switch external ports, and switch's enhanced port 0. The +CAs, routers, switch external ports, and switch's enhanced port 0. The names of such specialized parameters are prefixed by "qos__" string. 
Here is a full list of the currently supported sets: - qos_hca_ - QoS configuration parameters set for HCAs. + qos_ca_ - QoS configuration parameters set for CAs. qos_rtr_ - parameters set for routers. qos_sw0_ - parameters set for switches' port 0. qos_swe_ - parameters set for switches' external ports. Examples: qos_sw0_max_vls=2 - qos_hca_sl2vl=0,1,2,3,5,5,5,12,12,0, + qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0, qos_swe_high_limit=0 .SH ROUTING Index: include/opensm/osm_subnet.h === --- include/opensm/osm_subnet.h (revision 9351) +++ include/opensm/osm_subnet.h (working copy) @@ -282,7 +282,7 @@ typedef struct _osm_subn_opt boolean_texit_on_fatal; boolean_thonor_guid2lid_file; osm_qos_options_tqos_options; - osm_qos_options_tqos_hca_options; + osm_qos_options_tqos_ca_options; osm_qos_options_tqos_sw0_options; osm_qos_options_tqos_swe_options; osm_qos_options_tqos_rtr_options; @@ -457,8 +457,8 @@ typedef struct _osm_subn_opt * qos_options * Default set of QoS options * -* qos_hca_options -* QoS options for HCA ports +* qos_ca_options +* QoS options for CA ports * * qos_sw0_options * QoS options for switches' port 0 Index: opensm/osm_subnet.c === --- opensm/osm_subnet.c (revision 9351) +++ opensm/osm_subnet.c (working copy) @@ -495,7 +495,7 @@ osm_subn_set_default_opt( p_opt->updn_guid_file = NULL; p_opt->exit_on_fatal = TRUE; subn_set_default_qos_options(&p_opt->qos_options); - subn_set_default_qos_options(&p_opt->qos_hca_options); + subn_set_default_qos_options(&p_opt->qos_ca_options); subn_set_default_qos_options(&p_opt->qos_sw0_options); subn_set_default_qos_options(&p_opt->qos_swe_options); subn_set_default_qos_options(&p_opt->qos_rtr_options); @@ -737,8 +737,8 @@ osm_subn_rescan_conf_file( subn_parse_qos_options("qos", p_key, p_val, &p_opts->qos_options); - subn_parse_qos_options("qos_hca", -p_key, p_val, &p_opts->qos_hca_options); + subn_parse_qos_options("qos_ca", +p_key, p_val, &p_opts->qos_ca_options); subn_parse_qos_options("qos_sw0", p_key, p_val, 
&p_opts->qos_sw0_options); @@ -967,8 +967,8 @@ osm_subn_parse_conf_file( subn_parse_qos_options("qos", p_key, p_val, &p_opts->qos_options); - subn_parse_qos_options("qos_hca", -p_key, p_val, &p_opts->qos_hca_options); + subn_parse_qos_options("qos_ca", +p_key, p_val, &p_opts->qos_ca_options); subn_parse_qos_options("qos_sw0", p_key, p_val, &p_opts->qos_sw0_options); @@ -1211,7 +1211,7 @@ osm_subn_write_conf_file( "QoS default option
[openib-general] kernel mode
Hi,

A general question. If I write a kernel program (Linux kernel module) to send and receive data using IB, will it perform better than its user-mode counterpart?

Unlike user mode, in kernel mode I think it is possible to allocate physically contiguous memory using kmalloc or alloc_pages, which means HCAs need not do any address translation (i.e. no page table lookup is needed, as I guess in this case the virtual and physical addresses differ only by a fixed offset) for copying data into main memory. Besides, I think traditional DMA gives better performance with contiguous memory and uses a special GFP_DMA zone. Moreover, polling a CQ may be more efficient in the kernel. Is this correct?

Regards,
John T.
Re: [openib-general] why sdp connections cost so much memory
Quoting r. zhu shi song <[EMAIL PROTECTED]>: > Subject: Re: why sdp connections cost so much memory > > > You should not need this change with the scale patch > > I posted - after applying > > this, and setting the scale parameter to 0x1, each > > connection should use around > > 128K for RX. Please confirm. > Just setting the scale parameter to 0x1, memory > reduction is OK. But there occurred one bug, > sometimes my kernel crashed. Shouldn't happen. Backtrace? > So I think PRE POST buf > size should be changed either. > zhu Hmm. I don't really see how this would help. Is it true that changing just the RX size fixes the crashes for you? If yes I'd like to investigate. -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] why sdp connections cost so much memory
> You should not need this change with the scale patch
> I posted - after applying
> this, and setting the scale parameter to 0x1, each
> connection should use around
> 128K for RX. Please confirm.

With just the scale parameter set to 0x1, the memory reduction is OK. But one bug occurred: sometimes my kernel crashed. So I think the PRE POST buf size should be changed as well.

zhu
Re: [openib-general] [PATCH] osm: OSM bug fix with --run-once option
Hi Yevgeny, On Sun, 2006-09-10 at 02:35, Yevgeny Kliteynik wrote: > Hi Hal > > This patch fixes the bug that was occurring when OSM was > running with --run-once option (-o) and the SM port was down. > In that case, OSM would be stuck in cond_wait forever (or until > the port will become active), and could not be terminated, > other than by SIGKILL. > > Yevgeny > > Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]> Thanks. Applied to trunk only. -- Hal ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] why sdp connections cost so much memory
> You should not need this change with the scale patch
> I posted - after applying
> this, and setting the scale parameter to 0x1, each
> connection should use around
> 128K for RX. Please confirm.

I have tested it again and yes, you are right. With just the scale
parameter set to 0x1, each connection costs about 128K of memory.

zhu
Re: [openib-general] [PATCH] ehca for OFED 1.1-rc4
Quoting r. Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] ehca for OFED 1.1-rc4
>
> I guess my email client must have wrapped the lines so that the patch is
> not applicable any more. Sorry for that!

You also want to fix the mail format - you currently send each mail in both
HTML and plain text - make it plaintext only.

> Need little time to fix that problem. For now I'm sending you the patch
> file as attachment that I could apply without errors.
> Thanks,
> Nam Nguyen
>
> (See attached file: ofed_svnehca_0015.patch)

OK, applied and will be pushed out.

--
MST
[openib-general] [PATCH 5/5] IB/iser: Do not use FMR for a single dma entry sg
Fast Memory Registration (fmr) is used to register for rdma an sg whose
elements are not linearly sequential after dma mapping. The IB verbs layer
provides an "all dma memory MR (memory region)" which can be used for
RDMA-ing a dma linearly sequential buffer. Change the code to use the dma mr
instead of doing fmr when dma mapping produces a single dma entry sg.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.h  |    1 +
 drivers/infiniband/ulp/iser/iser_memory.c |   48 +
 drivers/infiniband/ulp/iser/iser_verbs.c  |    6 ++--
 3 files changed, 39 insertions(+), 16 deletions(-)

c403e930977afb2838588523d10819ce586951a2
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 2c8bc67..2cf9ae0 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -175,6 +175,7 @@ struct iser_mem_reg {
 	u64 va;
 	u64 len;
 	void *mem_h;
+	int is_fmr;
 };
 
 struct iser_regd_buf {
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 8fea0bc..e0d4347 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -56,7 +56,7 @@ int iser_regd_buff_release(struct iser_r
 	if ((atomic_read(&regd_buf->ref_count) == 0) ||
 	    atomic_dec_and_test(&regd_buf->ref_count)) {
 		/* if we used the dma mr, unreg is just NOP */
-		if (regd_buf->reg.rkey != 0)
+		if (regd_buf->reg.is_fmr)
 			iser_unreg_mem(&regd_buf->reg);
 
 		if (regd_buf->dma_addr) {
@@ -91,9 +91,9 @@ void iser_reg_single(struct iser_device
 	BUG_ON(dma_mapping_error(dma_addr));
 
 	regd_buf->reg.lkey = device->mr->lkey;
-	regd_buf->reg.rkey = 0; /* indicate there's no need to unreg */
 	regd_buf->reg.len = regd_buf->data_size;
 	regd_buf->reg.va = dma_addr;
+	regd_buf->reg.is_fmr = 0;
 
 	regd_buf->dma_addr = dma_addr;
 	regd_buf->direction = direction;
@@ -379,11 +379,13 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 		      enum iser_data_dir cmd_dir)
 {
 	struct iser_conn *ib_conn = iser_ctask->iser_conn->ib_conn;
+	struct iser_device *device = ib_conn->device;
 	struct iser_data_buf *mem = &iser_ctask->data[cmd_dir];
 	struct iser_regd_buf *regd_buf;
 	int aligned_len;
 	int err;
 	int i;
+	struct scatterlist *sg;
 
 	regd_buf = &iser_ctask->rdma_regd[cmd_dir];
 
@@ -399,19 +401,37 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 		mem = &iser_ctask->data_copy[cmd_dir];
 	}
 
-	iser_page_vec_build(mem, ib_conn->page_vec);
-	err = iser_reg_page_vec(ib_conn, ib_conn->page_vec, &regd_buf->reg);
-	if (err) {
-		iser_data_buf_dump(mem);
-		iser_err("mem->dma_nents = %d (dlength = 0x%x)\n", mem->dma_nents,
-			 ntoh24(iser_ctask->desc.iscsi_header.dlength));
-		iser_err("page_vec: data_size = 0x%x, length = %d, offset = 0x%x\n",
-			 ib_conn->page_vec->data_size, ib_conn->page_vec->length,
-			 ib_conn->page_vec->offset);
-		for (i=0 ; i<ib_conn->page_vec->length ; i++) {
-			iser_err("page_vec[%d] = 0x%lx\n", i, ib_conn->page_vec->pages[i]);
+	/* if there a single dma entry, FMR is not needed */
+	if (mem->dma_nents == 1) {
+		sg = (struct scatterlist *)mem->buf;
+
+		regd_buf->reg.lkey = device->mr->lkey;
+		regd_buf->reg.rkey = device->mr->rkey;
+		regd_buf->reg.len = sg_dma_len(&sg[0]);
+		regd_buf->reg.va = sg_dma_address(&sg[0]);
+		regd_buf->reg.is_fmr = 0;
+
+		iser_dbg("PHYSICAL Mem.register: lkey: 0x%08X rkey: 0x%08X "
+			 "va: 0x%08lX sz: %ld]\n",
+			 (unsigned int)regd_buf->reg.lkey,
+			 (unsigned int)regd_buf->reg.rkey,
+			 (unsigned long)regd_buf->reg.va,
+			 (unsigned long)regd_buf->reg.len);
+	} else { /* use FMR for multiple dma entries */
+		iser_page_vec_build(mem, ib_conn->page_vec);
+		err = iser_reg_page_vec(ib_conn, ib_conn->page_vec, &regd_buf->reg);
+		if (err) {
+			iser_data_buf_dump(mem);
+			iser_err("mem->dma_nents = %d (dlength = 0x%x)\n", mem->dma_nents,
+				 ntoh24(iser_ctask->desc.iscsi_header.dlength));
+			iser_err("page_vec: data_size = 0x%x, length = %d, offset = 0x%x\n",
+				 ib_conn->page_vec->data_size, ib_conn->page_vec->length,
+				 ib_conn->page_vec->offset);
+			for (
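The decision this patch introduces can be sketched outside the kernel as follows. This is a minimal user-space model, not the driver code: `struct dma_entry`, `struct mem_reg`, `fmr_map_fn`, `stub_fmr_map` and the key values are hypothetical stand-ins, and only the control flow (one DMA entry uses the DMA MR keys, several entries go through FMR, and `is_fmr` records which path was taken) follows the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_entry {
	uint64_t addr;
	uint32_t len;
};

/* Models the fields of struct iser_mem_reg that the patch touches. */
struct mem_reg {
	uint32_t lkey;
	uint32_t rkey;
	uint64_t va;
	uint64_t len;
	int is_fmr;	/* non-zero only when an FMR was mapped */
};

/* Hypothetical FMR mapping hook: returns 0 on success and fills *reg. */
typedef int (*fmr_map_fn)(const struct dma_entry *sg, size_t nents,
			  struct mem_reg *reg);

/* Demonstration stub: pretends one FMR covers the whole list. */
static int stub_fmr_map(const struct dma_entry *sg, size_t nents,
			struct mem_reg *reg)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < nents; i++)
		total += sg[i].len;
	reg->lkey = 0xF0;	/* made-up keys, for illustration only */
	reg->rkey = 0xF1;
	reg->va = sg[0].addr;
	reg->len = total;
	return 0;
}

/* Pick the registration method from the number of DMA entries. */
static int reg_rdma_mem(const struct dma_entry *sg, size_t nents,
			uint32_t dma_lkey, uint32_t dma_rkey,
			fmr_map_fn map_fmr, struct mem_reg *reg)
{
	int err;

	if (nents == 0)
		return -1;

	if (nents == 1) {
		/* one contiguous DMA range: the all-memory DMA MR suffices */
		reg->lkey = dma_lkey;
		reg->rkey = dma_rkey;
		reg->va = sg[0].addr;
		reg->len = sg[0].len;
		reg->is_fmr = 0;
		return 0;
	}

	/* several DMA ranges: they have to be mapped through an FMR */
	err = map_fmr(sg, nents, reg);
	if (err)
		return err;
	reg->is_fmr = 1;
	return 0;
}

/* Release path: only an FMR mapping needs to be undone. */
static int needs_unreg(const struct mem_reg *reg)
{
	return reg->is_fmr != 0;
}
```

The explicit flag is needed because, once the single-entry path fills in the DMA MR's rkey, a non-zero rkey no longer implies that an FMR was used, which is presumably why the patch replaces the `rkey != 0` test with `is_fmr`.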
[openib-general] [PATCH 4/5] IB/iser: fix some debug prints
fix and add some debug prints related to iser handling of memory for rdma.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iser_memory.c |   17 ++---
 1 files changed, 14 insertions(+), 3 deletions(-)

00703cf2800ce3ac864b149ce75435b00480d9d2
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index bcef0d3..8fea0bc 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -329,9 +329,9 @@ static void iser_data_buf_dump(struct is
 	struct scatterlist *sg = (struct scatterlist *)data->buf;
 	int i;
 
-	for (i = 0; i < data->size; i++)
+	for (i = 0; i < data->dma_nents; i++)
 		iser_err("sg[%d] dma_addr:0x%lX page:0x%p "
-			 "off:%d sz:%d dma_len:%d\n",
+			 "off:0x%x sz:0x%x dma_len:0x%x\n",
 			 i, (unsigned long)sg_dma_address(&sg[i]),
 			 sg[i].page, sg[i].offset,
 			 sg[i].length,sg_dma_len(&sg[i]));
@@ -383,6 +383,7 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 	struct iser_regd_buf *regd_buf;
 	int aligned_len;
 	int err;
+	int i;
 
 	regd_buf = &iser_ctask->rdma_regd[cmd_dir];
 
@@ -400,8 +401,18 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 
 	iser_page_vec_build(mem, ib_conn->page_vec);
 	err = iser_reg_page_vec(ib_conn, ib_conn->page_vec, &regd_buf->reg);
-	if (err)
+	if (err) {
+		iser_data_buf_dump(mem);
+		iser_err("mem->dma_nents = %d (dlength = 0x%x)\n", mem->dma_nents,
+			 ntoh24(iser_ctask->desc.iscsi_header.dlength));
+		iser_err("page_vec: data_size = 0x%x, length = %d, offset = 0x%x\n",
+			 ib_conn->page_vec->data_size, ib_conn->page_vec->length,
+			 ib_conn->page_vec->offset);
+		for (i=0 ; i<ib_conn->page_vec->length ; i++) {
+			iser_err("page_vec[%d] = 0x%lx\n", i, ib_conn->page_vec->pages[i]);
+		}
 		return err;
+	}
 
 	/* take a reference on this regd buf such that it will not be released *
 	 * (eg in send dto completion) before we get the scsi response */
--
1.2.6
[openib-general] [PATCH 3/5] IB/iser: make FMR "page size" be 4K and not PAGE_SIZE
As iser is able to use at most one rdma operation for the execution of a
scsi command, and registration of the sg associated with scsi command has
its restrictions, the code checks if an sg is "aligned for rdma".

Alignment for rdma is measured in "fmr page" units whose possible resolutions
are different between HCAs and can be smaller than, equal to or bigger than
the system page size. When the system page size is bigger than 4KB (eg the
default with ia64 kernels) there is a bigger chance that an sg would be
aligned for rdma if the fmr page size is 4KB.

Change the code to create FMR whose pages are of size 4KB and to take that
into account when processing the sg.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.h  |    6 +-
 drivers/infiniband/ulp/iser/iser_memory.c |   31 +++--
 drivers/infiniband/ulp/iser/iser_verbs.c  |    4 ++--
 3 files changed, 27 insertions(+), 14 deletions(-)

1f90243f796772fcaea6ad059876a0aad6a06d52
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 7c3d0c9..2c8bc67 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -82,8 +82,12 @@
 			__func__ , ## arg); \
 	} while (0)
 
+#define SHIFT_4K	12
+#define SIZE_4K	(1UL << SHIFT_4K)
+#define MASK_4K	(~(SIZE_4K-1))
+
 /* support upto 512KB in one RDMA */
-#define ISCSI_ISER_SG_TABLESIZE (0x80000 >> PAGE_SHIFT)
+#define ISCSI_ISER_SG_TABLESIZE (0x80000 >> SHIFT_4K)
 #define ISCSI_ISER_MAX_LUN 256
 #define ISCSI_ISER_MAX_CMD_LEN 16
 
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 53af956..bcef0d3 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -42,6 +42,7 @@
 #include "iscsi_iser.h"
 
 #define ISER_KMALLOC_THRESHOLD 0x20000 /* 128K - kmalloc limit */
+
 /**
  * Decrements the reference count for the
  * registered buffer & releases it
@@ -239,7 +240,7 @@ static int iser_sg_to_page_vec(struct is
 	int i;
 
 	/* compute the offset of first element */
-	page_vec->offset = (u64) sg[0].offset;
+	page_vec->offset = (u64) sg[0].offset & ~MASK_4K;
 
 	for (i = 0; i < data->dma_nents; i++) {
 		total_sz += sg_dma_len(&sg[i]);
@@ -247,21 +248,30 @@ static int iser_sg_to_page_vec(struct is
 		first_addr = sg_dma_address(&sg[i]);
 		last_addr = first_addr + sg_dma_len(&sg[i]);
 
-		start_aligned = !(first_addr & ~PAGE_MASK);
-		end_aligned = !(last_addr & ~PAGE_MASK);
+		start_aligned = !(first_addr & ~MASK_4K);
+		end_aligned = !(last_addr & ~MASK_4K);
 
 		/* continue to collect page fragments till aligned or SG ends */
 		while (!end_aligned && (i + 1 < data->dma_nents)) {
 			i++;
 			total_sz += sg_dma_len(&sg[i]);
 			last_addr = sg_dma_address(&sg[i]) + sg_dma_len(&sg[i]);
-			end_aligned = !(last_addr & ~PAGE_MASK);
+			end_aligned = !(last_addr & ~MASK_4K);
 		}
 
-		first_addr = first_addr & PAGE_MASK;
-
-		for (page = first_addr; page < last_addr; page += PAGE_SIZE)
-			page_vec->pages[cur_page++] = page;
+		/* handle the 1st page in the 1st DMA element */
+		if (cur_page == 0) {
+			page = first_addr & MASK_4K;
+			page_vec->pages[cur_page] = page;
+			cur_page++;
+			page += SIZE_4K;
+		} else
+			page = first_addr;
+
+		for (; page < last_addr; page += SIZE_4K) {
+			page_vec->pages[cur_page] = page;
+			cur_page++;
+		}
 	}
 	page_vec->data_size = total_sz;
 
@@ -269,8 +279,7 @@ static int iser_sg_to_page_vec(struct is
 	return cur_page;
 }
 
-#define MASK_4K	((1UL << 12) - 1) /* 0xFFF */
-#define IS_4K_ALIGNED(addr)	((((unsigned long)addr) & MASK_4K) == 0)
+#define IS_4K_ALIGNED(addr)	((((unsigned long)addr) & ~MASK_4K) == 0)
 
 /**
  * iser_data_buf_aligned_len - Tries to determine the maximal correctly aligned
@@ -352,7 +361,7 @@ static void iser_page_vec_build(struct i
 
 	page_vec->length = page_vec_len;
 
-	if (page_vec_len * PAGE_SIZE < page_vec->data_size) {
+	if (page_vec_len * SIZE_4K < page_vec->data_size) {
 		iser_err("page_vec too short to hold this SG\n");
 		iser_data_buf_dump(data);
 		iser_dump_page_vec(page_vec);
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/is
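To illustrate why a fixed 4 KB FMR page helps on large-page systems, here is a small user-space sketch of the masking arithmetic. It is an approximation under stated assumptions, not the driver code: the macro names follow the patch, but `is_aligned`, `offset_in_4k` and `fill_pages_4k` are hypothetical helpers, and `uint64_t` is used in place of the kernel's `unsigned long`.

```c
#include <stddef.h>
#include <stdint.h>

#define SHIFT_4K 12
#define SIZE_4K  (UINT64_C(1) << SHIFT_4K)
#define MASK_4K  (~(SIZE_4K - 1))	/* keeps the page part of an address */

/* True when addr sits on a page boundary; page_size is a power of two. */
static int is_aligned(uint64_t addr, uint64_t page_size)
{
	return (addr & (page_size - 1)) == 0;
}

/* Offset of an address inside its 4 KB FMR page. */
static uint64_t offset_in_4k(uint64_t addr)
{
	return addr & ~MASK_4K;
}

/*
 * Store the address of every 4 KB page touched by [addr, addr + len).
 * Returns the number of pages stored, or 0 when len is zero or when
 * the vector (cap entries) is too short.
 */
static size_t fill_pages_4k(uint64_t addr, uint64_t len,
			    uint64_t *pages, size_t cap)
{
	uint64_t end = addr + len;
	uint64_t page;
	size_t n = 0;

	if (len == 0)
		return 0;

	for (page = addr & MASK_4K; page < end; page += SIZE_4K) {
		if (n == cap)
			return 0;	/* page vector too short */
		pages[n++] = page;
	}
	return n;
}
```

For example, the address 0x5000 is aligned for a 4 KB FMR page but not for a 16 KB system page, so an sg element starting there counts as aligned only when the FMR page size is 4 KB.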
[openib-general] [PATCH 2/5] IB/iser: Limit the max size of a scsi command
Currently, the data length of a command coming down from scsi-ml is limited
only by the size of its sg list (sg_tablesize). The max data length may be
different for different page size values. By setting max_sectors, we limit
the data length to max_sectors*512 bytes.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

522817c2dbb865c98465f3d17978dbdc8c4ff100
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 101e407..2a14fe2 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -545,6 +545,7 @@ static struct scsi_host_template iscsi_i
 	.queuecommand = iscsi_queuecommand,
 	.can_queue = ISCSI_XMIT_CMDS_MAX - 1,
 	.sg_tablesize = ISCSI_ISER_SG_TABLESIZE,
+	.max_sectors = 1024,
 	.cmd_per_lun = ISCSI_MAX_CMD_PER_LUN,
 	.eh_abort_handler = iscsi_eh_abort,
 	.eh_host_reset_handler = iscsi_eh_host_reset,
--
1.2.6
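The arithmetic behind this limit can be checked with a short sketch. The helper names are hypothetical, the 128-entry sg table is an assumed value for illustration (512 KB divided by 4 KB), and the sg bound assumes each sg entry carries at most one page.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u	/* bytes per sector, as in max_sectors*512 */

/* Upper bound on the data length imposed by max_sectors. */
static uint64_t max_bytes_from_sectors(uint32_t max_sectors)
{
	return (uint64_t)max_sectors * SECTOR_SIZE;
}

/*
 * Upper bound imposed by the sg list alone, assuming one page per entry.
 * This bound grows with the system page size, which is why the patch
 * description says it differs between page size values.
 */
static uint64_t max_bytes_from_sg(uint32_t sg_tablesize, uint32_t page_size)
{
	return (uint64_t)sg_tablesize * page_size;
}
```

With `max_sectors = 1024` the limit is 1024 * 512 = 524288 bytes (512 KB) for every page size, whereas the assumed 128-entry sg list would allow 512 KB with 4 KB pages but 2 MB with 16 KB pages.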
[openib-general] [PATCH 1/5] IB/iser: fix a check of SG alignment for RDMA
dma mapping may include a "compaction" of the sg associated with scsi command.
Hence, the size of the maximal prefix of the SG which is aligned for rdma must
be compared against the length of the dma mapped sg (mem->dma_nents) and not
against the size of it before it was mapped (mem->size).

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iser_memory.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

5301a4bb4f73250a93bc0c103839ae527f6b4110
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 31950a5..53af956 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -378,7 +378,7 @@ int iser_reg_rdma_mem(struct iscsi_iser_
 	regd_buf = &iser_ctask->rdma_regd[cmd_dir];
 
 	aligned_len = iser_data_buf_aligned_len(mem);
-	if (aligned_len != mem->size) {
+	if (aligned_len != mem->dma_nents) {
 		iser_err("rdma alignment violation %d/%d aligned\n",
 			 aligned_len, mem->size);
 		iser_data_buf_dump(mem);
--
1.2.6
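The effect of the fix can be sketched with a simplified aligned-prefix check. This is not the driver's `iser_data_buf_aligned_len()`: `struct dma_entry` and `aligned_prefix_len` are hypothetical, and the rule used here (every entry but the first starts on a 4 KB boundary, every entry but the last ends on one) is an assumed simplification of "aligned for rdma".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct dma_entry {
	uint64_t addr;
	uint32_t len;
};

static int is_4k_aligned(uint64_t addr)
{
	return (addr & 0xFFFu) == 0;
}

/*
 * Length of the longest prefix of the mapped list that could be covered
 * by a single registration under the simplified rule described above.
 */
static size_t aligned_prefix_len(const struct dma_entry *sg, size_t nents)
{
	size_t i;

	for (i = 0; i < nents; i++) {
		uint64_t end = sg[i].addr + sg[i].len;

		if (i > 0 && !is_4k_aligned(sg[i].addr))
			return i;	/* entries 0..i-1 form the prefix */
		if (i + 1 < nents && !is_4k_aligned(end))
			return i + 1;	/* entry i fits, nothing after it */
	}
	return nents;
}
```

If four sg entries are compacted into two by dma mapping and both mapped entries are aligned, the prefix length is 2. Comparing it against the pre-mapping count of 4 would report a violation that does not exist, while comparing against `dma_nents` (2) does not.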
[openib-general] [PATCH 0/5] IB/iser: iSER code changes for 2.6.19
Here is a series of patches that fix some bugs that were found in iSER
during testing (some were found while testing iSER on architectures like ia64).
All of them are related to memory registration.

Thanks
Erez
Re: [openib-general] why sdp connections cost so much memory
Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> Subject: Re: why sdp connections cost so much memory
>
> --- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:
>
> > Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > > Subject: Re: why sdp connections cost so much memory
> > >
> > > >> You mean - when only a single socket is open?
> > > Every one connection will cost 2M RAM. So I make the
> > > following changes:
> > > #define SDP_TX_SIZE 0x4
> > > #define SDP_RX_SIZE 0x4
> >
> > You should not need this change with the scale patch
> > I posted - after applying
> > this, and setting the scale parameter to 0x1, each
> > connection should use around
> > 128K for RX. Please confirm.

Could you please confirm that setting scale factor to 1 works for you,
without changing SDP_TX_SIZE/SDP_RX_SIZE?

> can each connection use 64K memory?

SDP_MAX_SEND_SKB_FRAGS controls the number of pages per descriptor.
You need at least 4 of these. I have it at 8 at the moment, try scaling it down.

--
MST
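As a rough illustration of how the number of pages per descriptor feeds into memory use, the following sketch multiplies the quantities out. It is only a back-of-the-envelope model: the helper names, the 4 KB page size and the ring depth of 16 are assumptions made for the example, not values taken from the SDP code.

```c
#include <stddef.h>

/* Buffer bytes behind one descriptor: pages per descriptor * page size. */
static size_t bytes_per_descriptor(size_t frags, size_t page_size)
{
	return frags * page_size;
}

/* Buffer bytes behind a whole ring with the given (assumed) depth. */
static size_t bytes_per_ring(size_t descriptors, size_t frags,
			     size_t page_size)
{
	return descriptors * bytes_per_descriptor(frags, page_size);
}
```

Under these assumptions, 8 pages per descriptor means 32 KB per descriptor and 4 pages means 16 KB, so halving the fragment count halves the buffer memory behind a ring of any given depth.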
Re: [openib-general] Fwd: linux- 2.6.18-rc6-git1 issue 46: drivers/infiniband/ulp/iser/iser_verbs.c:514: undefined reference to `rdma_create_id'
Ah, thanks for clarifying this.

Unfortunately this means that there is a small chance that "make oldconfig"
will not work correctly in all cases, e.g. upgrading a kernel to a newer
version could lead to such a failed compile step :-(

On Monday 11 September 2006 08:33, Erez Zilber wrote:
> Toralf Förster wrote:
> > You're right, sorry,
> >
> > I forgot to say that:
> > - first I created a random .config file using "make rndconfig"
> > - after that I removed all options not fitting my system
> > - then I prepended some options commonly used by me on top of the .config
> > file
> > - and finally ran "make oldconfig" to (hopefully) get a clean config
> >
> > Doesn't "make oldconfig" make a clean .config file ?
> >
>
> Here's what the kernel's README file has to say about it:
> "make oldconfig": Default all questions based on the contents of your
> existing ./.config file and asking about new config symbols.
>
> I guess that 'make rndconfig' selected CONFIG_INFINIBAND=y, but didn't
> select CONFIG_INFINIBAND_ADDR_TRANS=y. Then, 'make oldconfig' asked you
> about new symbols. I guess that running 'make rndconfig' may create
> scenarios like this, but I don't think that there's a bug in iSER's
> Kconfig file. If you still want to use your .config file, reselect
> InfiniBand in 'make menuconfig'. It will set CONFIG_INFINIBAND_ADDR_TRANS=y.
>
> I hope this helps.
>
> Erez

--
Sincerely
Toralf Förster
Re: [openib-general] why sdp connections cost so much memory
--- "Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote:

> Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> > Subject: Re: why sdp connections cost so much memory
> >
> > >> You mean - when only a single socket is open?
> > Every one connection will cost 2M RAM. So I make the
> > following changes:
> > #define SDP_TX_SIZE 0x4
> > #define SDP_RX_SIZE 0x4
>
> You should not need this change with the scale patch
> I posted - after applying
> this, and setting the scale parameter to 0x1, each
> connection should use around
> 128K for RX. Please confirm.

Can each connection use 64K of memory?

> > > The SDP protocol uses ARP over IPoIB for its address
> > > resolution.
> > > So you'd need to find some other way to perform
> > > address resolution.
> > >
> > I'll try pre-resolute the address, So I can remove ARP
> > from ipoib
>
> But you'll still need the ipoib module loaded.

Is it difficult to avoid loading the ipoib module?

zhu
[openib-general] [Bug 229] heavy CPU load can starve ib_mad thread on latest processors
http://openib.org/bugzilla/show_bug.cgi?id=229

--- Comment #1 from [EMAIL PROTECTED] 2006-09-11 00:58 ---
Which SM are you running ?
Re: [openib-general] why sdp connections cost so much memory
Quoting r. zhu shi song <[EMAIL PROTECTED]>:
> Subject: Re: why sdp connections cost so much memory
>
> >> You mean - when only a single socket is open?
> Every one connection will cost 2M RAM. So I make the
> following changes:
> #define SDP_TX_SIZE 0x4
> #define SDP_RX_SIZE 0x4

You should not need this change with the scale patch I posted - after applying
this, and setting the scale parameter to 0x1, each connection should use around
128K for RX. Please confirm.

> > The SDP protocol uses ARP over IPoIB for its address
> > resolution.
> > So you'd need to find some other way to perform
> > address resolution.
> >
> I'll try pre-resolute the address, So I can remove ARP
> from ipoib

But you'll still need the ipoib module loaded.

--
MST
Re: [openib-general] why sdp connections cost so much memory
>> You mean - when only a single socket is open?

Every connection costs 2M of RAM, so I made the following changes:
#define SDP_TX_SIZE 0x4
#define SDP_RX_SIZE 0x4

> The SDP protocol uses ARP over IPoIB for its address
> resolution.
> So you'd need to find some other way to perform
> address resolution.

I'll try to pre-resolve the address so that I can remove ARP from ipoib.

zhu
[openib-general] [Bug 229] New: heavy CPU load can starve ib_mad thread on latest processors
http://openib.org/bugzilla/show_bug.cgi?id=229

           Summary: heavy CPU load can starve ib_mad thread on latest processors
           Product: OpenFabrics Linux
           Version: 1.1rc3
          Platform: All
        OS/Version: RHEL 4
            Status: NEW
          Severity: normal
          Priority: P3
         Component: IB Core
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

RHEL4 U3 x86_64

We have a proprietary test tool that places a very heavy CPU load on the
system. When this test is run on an IB host on Intel Woodcrest, AMD Opteron
(Rev F, I believe, not sure about Rev G), and PowerPC JS21 systems, the IB
port goes from ACTIVE to INIT state.

The workaround is to renice the ib_mad thread to the highest priority. We
recommend changing the OpenIB code to do this when the ib_mad thread is
created.

This does not seem to happen on older Intel or AMD processors, dunno about
PowerPC.
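From user space, the renice workaround amounts to a `setpriority()` call on the thread's ID. The sketch below shows only that call and a read-back helper; `renice_pid`, `read_nice` and `HIGHEST_PRIORITY_NICE` are hypothetical names, finding the ID of the ib_mad thread is left out, and raising priority (lowering the nice value) normally requires root.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>

/* Lowest nice value on Linux, i.e. the highest scheduling priority. */
#define HIGHEST_PRIORITY_NICE (-20)

/*
 * Set the nice value of one process or thread ID (0 means the caller).
 * Returns 0 on success, or -1 with errno set, e.g. EACCES or EPERM when
 * the caller may not raise the priority, ESRCH when the ID is unknown.
 */
static int renice_pid(pid_t pid, int nice_value)
{
	return setpriority(PRIO_PROCESS, (id_t)pid, nice_value);
}

/*
 * Read the nice value back. getpriority() can legitimately return -1,
 * so errno is cleared first and checked afterwards.
 */
static int read_nice(pid_t pid, int *nice_value)
{
	int value;

	errno = 0;
	value = getpriority(PRIO_PROCESS, (id_t)pid);
	if (value == -1 && errno != 0)
		return -1;
	*nice_value = value;
	return 0;
}
```

A call such as `renice_pid(mad_thread_id, HIGHEST_PRIORITY_NICE)`, run as root with the thread ID obtained separately, would correspond to the `renice -20` mentioned in the follow-up comments.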