RE: [openib-general] How about ib_send_page() ?

2005-06-08 Thread Vivek Kashyap
On Tue, 7 Jun 2005, Michael Krause wrote: > At 09:28 AM 6/7/2005, Fab Tillier wrote: > > > From: Roland Dreier [mailto:[EMAIL PROTECTED] > > > Sent: Tuesday, June 07, 2005 8:38 AM > > > > > > Michael> Why not just use the IETF draft for RC / UC based IP over > > > Michael> IB and not worry

RE: [openib-general] How about ib_send_page() ?

2005-06-07 Thread Michael Krause
At 09:28 AM 6/7/2005, Fab Tillier wrote: > From: Roland Dreier [ mailto:[EMAIL PROTECTED]] > Sent: Tuesday, June 07, 2005 8:38 AM > > Michael> Why not just use the IETF draft for RC / UC based IP over > Michael> IB and not worry about creating something new? > > I think we've come full

RE: [openib-general] How about ib_send_page() ?

2005-06-07 Thread Fab Tillier
> From: Roland Dreier [mailto:[EMAIL PROTECTED] > Sent: Tuesday, June 07, 2005 8:38 AM > > Michael> Why not just use the IETF draft for RC / UC based IP over > Michael> IB and not worry about creating something new? > > I think we've come full circle. The original post was a suggestion o

Re: [openib-general] How about ib_send_page() ?

2005-06-07 Thread Roland Dreier
Michael> Why not just use the IETF draft for RC / UC based IP over Michael> IB and not worry about creating something new? I think we've come full circle. The original post was a suggestion on how to handle the fact that the connected-mode IPoIB draft requires a network stack to deal with

Re: [openib-general] How about ib_send_page() ?

2005-06-07 Thread Michael Krause
At 12:13 PM 6/3/2005, Sean Hefty wrote: Fab Tillier wrote: Ok, so this question is from a noob, but here goes anyway.  Why can't IPoIB advertise a larger MTU than the UD MTU, and then just fragment large IP packets up if they need to go over the IB UD transport?  Is there any reason this couldn't

Re: [openib-general] How about ib_send_page() ?

2005-06-03 Thread Sean Hefty
Fab Tillier wrote: Ok, so this question is from a noob, but here goes anyway. Why can't IPoIB advertise a larger MTU than the UD MTU, and then just fragment large IP packets up if they need to go over the IB UD transport? Is there any reason this couldn't work? If it does, it allows IPoIB to e

Re: [openib-general] How about ib_send_page() ?

2005-05-24 Thread Vivek Kashyap
On 19 May 2005, Hal Rosenstock wrote: > Hi Vivek, > > On Thu, 2005-05-19 at 12:41, Vivek Kashyap wrote: > > > > > > > > > > The most interesting optimization available is implementing the IPoIB > > > connected mode draft, although I don't think it's as easy as Vivek > > > indicated -- for exam

RE: [openib-general] How about ib_send_page() ?

2005-05-24 Thread Fab Tillier
> From: Vivek Kashyap [mailto:[EMAIL PROTECTED] > Sent: Tuesday, May 24, 2005 9:57 AM > > On Tue, 24 May 2005, Roland Dreier wrote: > > > Vivek> I should say it depends. One can utilise a setup that sets > > Vivek> the MTU to 2044 or whatever is the UD MTU on the subnet for > > Vivek>

Re: [openib-general] How about ib_send_page() ?

2005-05-24 Thread Vivek Kashyap
On Tue, 24 May 2005, Roland Dreier wrote: > Vivek> I should say it depends. One can utilise a setup that sets > Vivek> the MTU to 2044 or whatever is the UD MTU on the subnet for > Vivek> all modes. The connection will advertise the maximum > Vivek> receive MTU as this value. > >

Re: [openib-general] How about ib_send_page() ?

2005-05-24 Thread Roland Dreier
Vivek> I should say it depends. One can utilise a setup that sets Vivek> the MTU to 2044 or whatever is the UD MTU on the subnet for Vivek> all modes. The connection will advertise the maximum Vivek> receive MTU as this value. OK, but that's throwing away the main advantage of conn

Re: [openib-general] How about ib_send_page() ?

2005-05-23 Thread Vivek Kashyap
On Thu, 19 May 2005, Roland Dreier wrote: > Vivek> The draft does allow for a negotiation per connection for > Vivek> the implementations that wish to take advantage of > Vivek> it. However, an implementation can by default choose to use > Vivek> a 'connected-mode MTU' e.g. 32K alw

Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Grant Grundler
On Wed, May 18, 2005 at 08:00:15PM -0700, Felix Marti wrote: > Hi Roland, > > define SMP :) Anytime a CPU is cache coherent with another CPU. > at these rates, system architecture comes into place, Definitely. The architecture puts boundaries on how coherency can be implemented...and thus avai

Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Roland Dreier
Vivek> The draft does allow for a negotiation per connection for Vivek> the implementations that wish to take advantage of Vivek> it. However, an implementation can by default choose to use Vivek> a 'connected-mode MTU' e.g. 32K always. It can then, for Vivek> every connection c

Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Hal Rosenstock
Hi Vivek, On Thu, 2005-05-19 at 12:41, Vivek Kashyap wrote: > > > > > > The most interesting optimization available is implementing the IPoIB > > connected mode draft, although I don't think it's as easy as Vivek > > indicated -- for example, I'm not sure how to deal with having > > different M

Re: [openib-general] How about ib_send_page() ?

2005-05-19 Thread Vivek Kashyap
> > The most interesting optimization available is implementing the IPoIB > connected mode draft, although I don't think it's as easy as Vivek > indicated -- for example, I'm not sure how to deal with having > different MTUs depending on the destination. The draft does allow for a negotiation

RE: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Felix Marti
1:42 PM To: Felix Marti Cc: Jeff Carr; openib-general@openib.org Subject: Re: [openib-general] How about ib_send_page() ? Felix> I get just above 5G on RX (goodput, as reported by netperf) Felix> on a single opteron 248 (100%) using standard ethernet MTU Felix> (1500). Felix

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Jeff Carr
Libor Michalek wrote: OK. Well I would rather make something generic. Besides, wasn't there some MS patent issue? The last thread on the subject that I read kinda made it sound like you were going to look into the issue and respond. Maybe I missed the response; there's a lot of mail in the archi

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Jeff Carr
Grant Grundler wrote: 4K -> 1.8 GB/s 16k -> 3.3 GB/s 64k -> 3.8 GB/s This seems reasonable. IIRC the ZX1 chipset has 6GB/s backplane but one CPU can only drive ~4GB/s. I have a E7501. Thanks for running this test. I'd not looked so closely at this before or been up to the wall against it where it

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Grant Grundler
On Tue, May 17, 2005 at 07:48:32PM -0700, Roland Dreier wrote: ... > NAPI isn't really a throughput optimization. It pretty much only > helps with small packet workloads. In general, yes, I agree. NAPI prevents the NIC from saturating the CPU with interrupts from small packets. It's also one defe

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Grant Grundler
On Wed, May 18, 2005 at 01:38:48PM -0700, Jeff Carr wrote: > Maybe. I'm not a PCI expert and have only used a PCI bus analyzer a few > times. I don't understand how disabling the interrupts would interfere > with DMA or add overhead. PCI ordering rules force the PCI Bus controller to service MMI

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Grant Grundler
On Wed, May 18, 2005 at 01:00:13PM -0700, Jeff Carr wrote: > Grant Grundler wrote: > >[EMAIL PROTECTED] dd if=/dev/shm/test of=/dev/null bs=4K > >dd: opening `/dev/shm/test': No such file or directory > > I just did dd if=/dev/zero of=test bs=1M count=768 I realized later last night that /dev/shm

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Jeff Carr
Roland Dreier wrote: The most interesting optimization available is implementing the IPoIB connected mode draft, although I don't think it's as easy as Vivek indicated -- for example, I'm not sure how to deal with having different MTUs depending on the destination. Thank you for that reference. I'l

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Roland Dreier
Felix> I get just above 5G on RX (goodput, as reported by netperf) Felix> on a single opteron 248 (100%) using standard ethernet MTU Felix> (1500). Felix> TX performance is higher (close to 7G), but it is probably Felix> not the kind of comparison that you're interested in, sin

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Jeff Carr
Grant Grundler wrote: We..Looks like I'm wrong. Previous email on this thread suggested it's possible by people who know a lot more about it than I do. But I'm still concerned it's going to affect latency. Maybe that's one reason why NAPI was made a compile time option? Anyway, it might be just

Re: [openib-general] How about ib_send_page() ?

2005-05-18 Thread Jeff Carr
Grant Grundler wrote: [EMAIL PROTECTED]:/# dd if=/dev/shm/test of=/dev/null bs=4K 196608+0 records in 196608+0 records out 805306368 bytes transferred in 0.628504 seconds (1281306571 bytes/sec) Yeah. Sounds like there is. Should be able to do several GB/s like that. I suppose it's possibly an issu
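The figures in the dd output above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the numbers are copied from the quoted output rather than measured again.

```python
# Figures copied from the dd output quoted above; variable names are illustrative.
BLOCK_SIZE = 4 * 1024   # bs=4K
RECORDS = 196608        # "196608+0 records in"
ELAPSED_S = 0.628504    # seconds reported by dd

total_bytes = RECORDS * BLOCK_SIZE
rate = total_bytes / ELAPSED_S   # bytes per second

print(total_bytes)               # 805306368, i.e. 768 MiB
print(f"{rate / 1e9:.2f} GB/s")  # 1.28 GB/s
```

The byte count matches a 768 MiB test file, and the rate agrees with the roughly 1.28 GB/s that dd reported for 4K blocks.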

RE: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Felix Marti
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roland Dreier Sent: Tuesday, May 17, 2005 7:49 PM To: Jeff Carr Cc: openib-general@openib.org Subject: Re: [openib-general] How about ib_send_page() ? First of all, let me say that to me, IPoIB performance

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Libor Michalek
On Tue, May 17, 2005 at 07:08:16PM -0700, Grant Grundler wrote: > On Tue, May 17, 2005 at 06:32:38PM -0700, Jeff Carr wrote: > > >>>But IPoIB can't really implement NAPI since it's sending work to > > >>>a shared HCA. > > > > Hmm. I'm not knowledgeable to know why; I'll have to take your word for

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Roland Dreier
First of all, let me say that to me, IPoIB performance tuning isn't really that interesting. IPoIB is very easy to set up and there's a wide variety of tools that spit out all sorts of numbers, so it's definitely a very accessible area of research, but in the end there are probably better ways to

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Grant Grundler
On Tue, May 17, 2005 at 06:32:38PM -0700, Jeff Carr wrote: > >>>But IPoIB can't really implement NAPI since it's sending work to > >>>a shared HCA. > > Hmm. I'm not knowledgeable to know why; I'll have to take your word for > it. I'm not sure yet all the conditions that the HCA can generate > i

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Jeff Carr
Grant Grundler wrote: If it's NAPI that means nothing, here's probably the best summary: http://lwn.net/Articles/30098/ Cool; I see now. But IPoIB can't really implement NAPI since it's sending work to a shared HCA. Hmm. I'm not knowledgeable to know why; I'll have to take your word for i

Re: Fw: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Vivek Kashyap
An implementation can work with the current draft. The delta to an implementation over an existing IPoIB-UD is not much. That should also help uncover or clarify any hidden issues not well covered in the draft. I've received a few comments - more on details - that I'll incorporate in the next ve

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Libor Michalek
On Mon, May 16, 2005 at 03:54:18PM -0700, Jeff Carr wrote: > Libor Michalek wrote: > > On Mon, May 16, 2005 at 03:26:57PM -0700, Jeff Carr wrote: > > > >>It seems to me it would be useful to have a simple ib_send_page() function. > >> > >>This is essentially what I'm going to end up writing for wh

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Shirley Ma
 > Note that IPoIB connected mode is currently an internet draft. Since there is an IP issue of SDP, it's better to implement IPoIB connected mode soon. We can ask for when this draft comes to RFC. Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(

Re: [openib-general] How about ib_send_page() ?

2005-05-17 Thread Hal Rosenstock
On Mon, 2005-05-16 at 21:55, Jeff Carr wrote: > Could (or would it help if) the MTU was increased to something much > larger than 2044? As the packet rate is one limiting factor, this has been discussed before. With UD, the IB limit is 4K (-4 for CRC) but the device limit is 2K (-4), which is the
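The MTU arithmetic in this reply can be written out as a short sketch. The function and constant names are hypothetical and do not come from any IPoIB code; only the 4K and 2K limits and the 4-byte deduction are taken from the message.

```python
# Illustrative arithmetic only; names are hypothetical, not IPoIB driver symbols.
DEDUCTION_BYTES = 4  # the "-4" that the message subtracts from each limit

def usable_mtu(limit_bytes: int) -> int:
    """Return the limit minus the 4-byte deduction described in the message."""
    return limit_bytes - DEDUCTION_BYTES

IB_UD_LIMIT = 4 * 1024      # "the IB limit is 4K"
DEVICE_UD_LIMIT = 2 * 1024  # "the device limit is 2K"

print(usable_mtu(IB_UD_LIMIT))      # 4092
print(usable_mtu(DEVICE_UD_LIMIT))  # 2044, the MTU Jeff asks about
```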

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Grant Grundler
On Mon, May 16, 2005 at 07:20:07PM -0700, Jeff Carr wrote: > >You also want to explore "netperf -C" option that's available with > >netperf 2.4.0-rc1 (See www.netperf.org). I've posted results here > >before about binding netperf/netserver processes to different CPUs. > > I think I remember this t

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Shirley Ma
>> But IPoIB can't really implement NAPI since it's sending work to >> a shared HCA. >> And any form of interrupt coalescing would interfere with >> any latency sensitive work as well (if present). >Surely. It would have to be configurable for people (like me) that wanted it. Configurable fe

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Jeff Carr
Grant Grundler wrote: vmstat doesn't tell you where the time is being spent. > Get a profile or try out the beta Pentium M or AMD64 perfmon support Yes, I would but I didn't think I could because I have Xeons. You also want to explore "netperf -C" option that's available with netperf 2.4.0-rc1 (See

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Jeff Carr
Roland Dreier wrote: Jeff> (side note: it would seem IPoIB could be re-written to Jeff> dramatically improve its performance). Out of curiosity, what would the rewrite change to obtain better performance? Could (or would it help if) the MTU was increased to something much larger than 204

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Grant Grundler
On Mon, May 16, 2005 at 06:00:49PM -0700, Jeff Carr wrote: > with vmstat on the server showing: > > -memory-- ---swap-- -io --system-- cpu > swpd free buff cache si so bi bo in cs us sy id wa > 0 1115720 440988 4983600 0 0 43943

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Jeff Carr
Roland Dreier wrote: Jeff> (side note: it would seem IPoIB could be re-written to Jeff> dramatically improve its performance). Out of curiosity, what would the rewrite change to obtain better performance? I'm just speculating that it could be rewritten to improve performance. There were m

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Jeff Carr
Libor Michalek wrote: On Mon, May 16, 2005 at 03:26:57PM -0700, Jeff Carr wrote: It seems to me it would be useful to have a simple ib_send_page() function. This is essentially what I'm going to end up writing for what I need IB to do. If there is anyone else that has similar needs or interests I'

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Roland Dreier
Jeff> (side note: it would seem IPoIB could be re-written to Jeff> dramatically improve its performance). Out of curiosity, what would the rewrite change to obtain better performance? Thanks, Roland

Re: [openib-general] How about ib_send_page() ?

2005-05-16 Thread Libor Michalek
On Mon, May 16, 2005 at 03:26:57PM -0700, Jeff Carr wrote: > It seems to me it would be useful to have a simple ib_send_page() function. > > This is essentially what I'm going to end up writing for what I need IB > to do. If there is anyone else that has similar needs or interests I'd > be happy

[openib-general] How about ib_send_page() ?

2005-05-16 Thread Jeff Carr
It seems to me it would be useful to have a simple ib_send_page() function. This is essentially what I'm going to end up writing for what I need IB to do. If there is anyone else that has similar needs or interests I'd be happy to work with them. The CM works well enough to allow me to initiate