from:"Greg Lindahl"

Re: [openib-general] Address List Change for Friday, 2/23/2007

2007-02-19 Thread Greg Lindahl

I see that the EWG list is now calling itself the Engineering Working
Group. Has it been renamed from the Enterprise Working Group? If so,
did the nature of the list change? Or was it a typo?

-- greg


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] Infiniband Network Library

2007-01-08 Thread Greg Lindahl

On Mon, Jan 08, 2007 at 12:26:00PM -0600, Sean Hubbell wrote:

 I have had to make significant
 workarounds for our current third-party network API that we purchased,
 and continue to watch it fall down and still not take advantage of the
 bandwidth that I need. With that said, does anyone on this list have a
 recommendation for an InfiniBand capable network library?

To amplify Roland's question: What does this library do that the
existing ways of using InfiniBand don't? Sockets, verbs, MPI...

-- greg



Re: [openib-general] mthca question

2006-11-19 Thread Greg Lindahl

Or remove SVN code which is misleading, as it continues to mislead
people repeatedly.

-- greg

 We should put this type of warning in all the infiniband/core modules
 that have moved to the kernel...
 
 
 On Wed, 2006-11-15 at 11:29 -0800, Roland Dreier wrote:
#warning The mthca driver is no longer kept up to date in svn.
#warning For the latest code, track the upstream kernel.

What does this mean?  What is the upstream kernel?  Where do
I download the latest sources from?
  
  this means that the definitive source for the mthca driver is the
  standard Linux kernel.  The upstream kernel just means Linus's kernel
  tree, which you can download from kernel.org or any of the many mirrors.
  
   - R.
  

Re: [openib-general] BandWidth doubt

2006-11-10 Thread Greg Lindahl

On Fri, Nov 10, 2006 at 03:32:34PM +0530, john t wrote:

 If I send data from mthca0-1 to mthca0-1 meaning from same port to the same
 port i.e. same port doing send/recv (also same cable doing send/recv) I get
 a BW of around 10 Gb/sec.

Note that the IB standard says in this case that the adaptor may not
send this traffic to the switch. So what you're seeing is a loopback
operation inside the HCA or inside the host.

-- greg



Re: [openib-general] PC to PC data transfer using Infiniband

2006-10-25 Thread Greg Lindahl

On Thu, Oct 26, 2006 at 03:16:50PM +1300, vishal wrote:

 I am interested in moving data from memory in one PC to  
 memory in the other using Infiniband hardware. Any suggestions on what
 would be the best way to do it ?

There are 2 main approaches, messages and RDMA. The best one depends
on the size of the data, how it's used, and the number of nodes that
might be touching a given area of memory.

-- greg



Re: [openib-general] IPoIB Question

2006-10-24 Thread Greg Lindahl

On Tue, Oct 24, 2006 at 08:35:18AM -0500, Sean Hubbell wrote:

 We are currently looking at the new tickless kernel. Do you have one 
 that you recommend?

The main ones to steer away from are 2.6.9-based kernels; those are the
slowest at TCP. Modern kernels, like the ones you see in Fedora 4 and
up and SLES 10, all seem to be good and about equal in this area.

I don't think we've tried a tickless kernel. We do most of our testing
on the various kernels that ship with distros, plus the tip-of-tree
kernel.org kernel.

-- greg



Re: [openib-general] IPoIB Question

2006-10-23 Thread Greg Lindahl

On Mon, Oct 23, 2006 at 07:53:06AM -0500, Hubbell, Sean C Contractor/Decibel wrote:

   I currently have several applications that use a legacy IPv4 protocol
 and I use IPoIB to utilize my infiniband network which works great. I
 have completed some timing and throughput analysis and noticed that I do
 not get very much more if I use an infiniband network interface than
 using my GigE network interface.

You might want to note that different InfiniBand implementations have
quite different performance of IPoIB, especially for UDP.

Another issue is that IPoIB has quite different performance with
different Linux kernels. This is especially evident for TCP, although
you can use SDP to accelerate TCP sockets and avoid this issue.

 My question is, am I using IPoIB correctly or are these the typical
 numbers that everyone is seeing?

It is certainly the case that there are some message patterns and
situations for which InfiniBand is not much of an improvement over
gigE.

-- greg




Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-19 Thread Greg Lindahl

On Thu, Oct 19, 2006 at 11:37:55AM -0400, Doug Ledford wrote:

 and ISTR that it
 isn't even required by the MPI spec since that leaves behavior of an MPI
 app undefined after a fork() call and hence any application written to
 depend on undefined behavior is broken by design,

Doug,

There are many things which MPI programs do that MPI vendors like to
support even though they're undefined in the standard. This is one of
the minor ones. The situation is similar to F77 extensions, there are
a bunch of them you must support to have a commercially viable
compiler.

-- greg



Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Greg Lindahl

On Tue, Oct 17, 2006 at 05:18:34PM -0700, Scott Weitzenkamp (sweitzen) wrote:

 I agree the 32-bit byte and packet counters are useless as they get
 pegged in a few seconds on a busy IB networks.  I thought there was an
 effort in IBTA to fix this.

Yes, it's in the management working group.

 For IB counters in a Cisco switch, we read and reset the 32-bit counters
 once per second and keep 64-bit counters internally.  This would be
 possible in OF too, right?

Yep. We keep 64 bit counters internally and dumb them down as required
to meet the standard.
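
The read-and-reset scheme described above can be sketched roughly as follows, in Python for brevity. This is an illustration only: `Port32` and `Counter64` are hypothetical names standing in for a hardware counter that pegs at 2^32 - 1 and a software accumulator; they are not an OpenFabrics or Cisco interface.

```python
MAX32 = 0xFFFFFFFF            # a 32-bit counter pegs (saturates) at this value
MASK64 = 0xFFFFFFFFFFFFFFFF   # width of the software-side accumulator


class Port32:
    """Hypothetical 32-bit hardware counter that saturates instead of wrapping."""

    def __init__(self):
        self.value = 0

    def count(self, n):
        # Hardware side: add traffic, pegging at the 32-bit maximum.
        self.value = min(self.value + n, MAX32)

    def read_and_reset(self):
        # Poller side: fetch the current value and clear the counter.
        v, self.value = self.value, 0
        return v


class Counter64:
    """Software accumulator that widens the 32-bit counter to 64 bits."""

    def __init__(self, hw):
        self.hw = hw
        self.total = 0

    def poll(self):
        # Must run often enough that the hardware counter never pegs between polls.
        self.total = (self.total + self.hw.read_and_reset()) & MASK64
        return self.total


hw = Port32()
wide = Counter64(hw)
for _ in range(3):
    hw.count(3_000_000_000)   # below 2**32, so no single interval pegs
    wide.poll()
print(wide.total)             # 9000000000, more than 32 bits can hold
```

The sketch only stays accurate while the polling interval is shorter than the time it takes the 32-bit counter to peg; anything counted after pegging is lost.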

-- greg



Re: [openib-general] Dropping NETIF_F_SG since no checksum feature.

2006-10-11 Thread Greg Lindahl

On Wed, Oct 11, 2006 at 04:21:41PM +0200, Or Gerlitz wrote:
 On 10/9/06, Michael S. Tsirkin [EMAIL PROTECTED] wrote:
 
  I'm trying to build a network device driver supporting a very large MTU
  (around 64K) on top of an infiniband connection, and I've hit a couple of
  issues I'd appreciate some feedback on:
 
 Does it mean you are implementing IPoIB RC? Cool ...

The ipath_ether device, which was submitted but rejected, has a 64k
MTU using UD.

-- greg


Re: [openib-general] HCAs with and without memory

2006-09-10 Thread Greg Lindahl

On Fri, Sep 08, 2006 at 03:49:57PM +0530, john t wrote:

 What is the difference between HCAs with memory and without memory.

And to answer for QLogic InfiniPath HCAs, we don't sell HCAs with
memory. We don't need it. There's actually a small amount of memory
within the single chip that makes up our HCA, and that's all that's
necessary.

-- greg



Re: [openib-general] [PATCH] opensm: truncate log file when fs is overflowed

2006-08-29 Thread Greg Lindahl

On Sun, Aug 27, 2006 at 06:28:06PM -0400, Doug Ledford wrote:

 I would definitely put the option in, and in fact would default it to
 *NOT* truncate.

I agree. I have never seen any other daemon with a logfile do this,
why are we out to surprise the admin? The admin might want the start
of the log instead of the end. And so on.

-- g


Re: [openib-general] [PATCH] osm: handle local events

2006-08-25 Thread Greg Lindahl

On Fri, Aug 25, 2006 at 05:17:04PM +0300, Sasha Khapyorsky wrote:

 So a more generic question: if some application performs a blocking read()
 from /dev/umadN, should this read() be interrupted and return an error (with
 an appropriate errno value) when the port state becomes DOWN?

If the SM gets a signal (alarm timeout) and the read() is interrupted
with errno=EINTR... presumably this is not the case you had in mind.

-- greg



Re: [openib-general] A critique of RDMA PUT/GET in HPC

2006-08-25 Thread Greg Lindahl

On Fri, Aug 25, 2006 at 10:13:01AM -0500, Tom Tucker wrote:

 He does say this, but his analysis does not support this conclusion. His
 analysis revolves around MPI send/recv, not the MPI 2.0 get/put
 services.

Nobody uses MPI put/get anyway, so leaving out analyzing that doesn't
change reality much.

 A valid conclusion IMO is that MPI send/recv can
 be most efficiently implemented over an unconnected reliable datagram
 protocol that supports 64bit tag matching at the data sink. And not
 coincidentally, Myricom has this ;-)

As do all of the non-VIA-family interconnects he mentions.  Since we
all landed on the same conclusion, you might think we're on to
something. Or not.

However, that's only part of the argument.  Another part is that the
buffer space needed to use RDMA put/get for all data links is huge.
And there are some other interesting points.

 I DO agree that it is interesting reading. :-), it's definitely got
 people fired up.

Heh. Glad you found it interesting.

-- greg



Re: [openib-general] basic IB doubt

2006-08-25 Thread Greg Lindahl

On Fri, Aug 25, 2006 at 03:21:20PM -0400, [EMAIL PROTECTED] wrote:

 I presume you meant invalidate the cache, not flush it, before accessing
 DMA'ed data.

Yes, this is what I meant. Sorry!

-- greg



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 01:57:39PM -0700, Sean Hefty wrote:
 OK, great. I'm fine with people using things which are supported, but
 then we need the big, blinking "Warning! This program is non-standard, and
 won't work with many of the devices supported by Open Fabrics!" sign.
 
 If an application were written to use Myrinet, would you consider it
 non-standard?

Er, this question is a bit existential for my taste. Myrinet has its
own standards. We're trying to create *inter-operable* hardware and
software in this community. So we follow the IB standard. Myricom is
doing their own thing, although of course they have software which
obeys the Ethernet, VIA, DAPL, and other standards. And I expect that
if they say they obey a standard, that they do. They're good people
that way.

 It's up to the application to verify that the hardware that they're
 using provides the required features, or adjust accordingly, and
 publish those requirements to the end users.

If that was being done (and it isn't), it would still be bad for the
ecosystem as a whole. But, basically, that's about the same as what I
proposed, quoted above.

-- g


Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Wed, Aug 23, 2006 at 04:46:52PM -0700, Roland Dreier wrote:

 Yes, Mellanox documents that it is safe to rely on the last byte of an
 RDMA being written last.

OK, great. I'm fine with people using things which are supported, but
then we need the big, blinking "Warning! This program is non-standard, and
won't work with many of the devices supported by Open Fabrics!" sign.

-- greg



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote:

 The other way to look at it is, the customer goes to the ISV and asks,
 what hardware should I buy, and the ISV says I support X version of MPI
 and vendor Y's hardware works with X version of MPI. 

I thought the goal of InfiniBand was to create an ecosystem where you
didn't have to do this. I guess I missed something somewhere.

Adding undocumented requirements to a standard isn't the way to entice
more people into implementing or using it.

I would challenge you to find a single ISV that would prefer a
situation where some infiniband middleware requires things which
aren't in the standard.

 So, if you want your hardware to work with
 basically almost any MPI today, since most of the MPIs assume this
 data placement ordering, then you will make your hardware so that
 it will guarantee this type of data delivery. 

I think there's some confusion here between practicality and theory.

There's no question that any IB vendor would make that kind of
decision, although it will be expensive and annoying for iWarp vendors
to do so. They'll feel forced to do so. But this is bad for the
community.

-- greg


Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 02:58:18PM -0700, Sean Hefty wrote:
 We're trying to create *inter-operable* hardware and
 software in this community. So we follow the IB standard.
 
 Atomic operations and RDD are optional, yet still part of the IB standard.
 An application that makes use of either of these isn't guaranteed to operate
 with all IB hardware.

But those are examples of things which are actually written down in the standard.

The example we were talking about isn't.

 But I do not see an issue with a vendor adding value beyond what's defined in
 the spec.

Neither do I. If you think so, you haven't understood my argument.

-- greg


Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 03:13:33PM -0700, Woodruff, Robert J wrote:

 If the feature gives them a huge advantage in performance (and it
 does) and all of the hardware vendors that they deal with already
 implement it, then yes, they will force, by de facto standard, that
 all other newcomers implement it or face the fact that no one will
 buy their hardware. It seems like that is what is happening in this
 case.

In this case the feature reduces performance on one HCA and increases
it on another. Which shows why it's a bad idea to pick features based
on a single implementation.

But you're still confusing practicality and theory. I can see why it
makes practical sense for newcomers to implement this new,
performance-reducing feature. But why is it theoretically good? And
shouldn't it be added to the standard, before all the poor iWarp people
discover the hard way that they need it?

-- greg



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 03:28:44PM -0700, Woodruff, Robert J wrote:

 Yes, IMO, the iWarp folks and the IBTA should consider making this a 
 requirement, but even if they do not the ISVs will still require it.

Ahah. So with all this heat and light, we agree that it should be
added to the standard.

 That being said, if you can show the ISVs how they can implement
 their completion model faster using some other mechanism than
 what they do now, they would probably listen, as they are going to do
 what gives them the best performance, standard or not. 

The other way to do it, faster on some HCAs, is to follow the
standard.  So, no showing needed. If MPI implementations implemented
this in addition to the other, non-standard way, and automagically
picked the right one, we wouldn't be having this discussion.

So, I think we're ending up in agreement. Feel free to disagree ;-)

-- greg




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 03:37:21PM -0700, Sean Hefty wrote:

 I'm missing the standard you're using to judge what's theoretically good and
 bad.

Having simpler programming to get good performance is a theoretical
good. Extra hacks for specific hardware are theoretically bad, and
practically good only when they end up with much better performance.

Silently non-standard software is bad by both accounts.

 Applications are written this way today.  A vendor can either:
 
 * Support those apps by providing the feature.
 * Require that the apps be rewritten to use their hardware.

That's what I was calling practical. It's clear what a hardware
vendor will do in that case.

 Whether apps should have been written this way seems irrelevant.  They are,
 and we should make decisions based on that, including extending the spec and/or
 implementation if needed.

In this case we're talking about code which can easily be changed to
follow the standard, in addition to having a hack mode that's faster
on 1 particular hardware implementation.

You seem to be implying that the applications are set in stone, and
that their authors have no interest in making them
standard-conformant. I don't think that's the case. If there is a
standard extension which can provide better performance on 1 particular
hardware implementation, let's add it to the standard. But let's
also make the software standard-conformant on other hardware.

-- greg



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 03:43:38PM -0700, Sean Hefty wrote:
 Actually, if a hardware implementation provided the same performance
 (in this case latency) by polling on a CQ as one where polling on
 memory was guaranteed to work, the customer may actually prefer the
 standard implementation.
 
 Polling on a CQ involves a function call, synchronization to the CQ, and
 formatting a structure to return to the user.  I don't see this ever being
 faster than polling memory.

Why don't you measure it, then? For example, an iWarp implementation
is going to be slowed down if it has to reorder segments to deliver
the last byte last. This expense might be more than the function call.
You guess not, but...

You're also assuming that programs are only checking the last byte of
the buffer. For all you know, Mellanox is delivering the whole buffer
in ascending order, and the user is checking bytes in the middle, too.
Which is a hazard of not-yet-specified standards extensions.
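
To make the hazard concrete, here is a toy Python simulation, not tied to any real HCA or RNIC. The `place_segments` and `last_byte_arrived` helpers are hypothetical; the only point is that polling the final byte signals completion correctly only when the adapter happens to place segments in ascending order.

```python
def place_segments(buf, payload, seg_size, order):
    """Write payload into buf one segment at a time, in the given arrival order."""
    for idx in order:
        start = idx * seg_size
        chunk = payload[start:start + seg_size]
        buf[start:start + len(chunk)] = chunk
        yield idx   # hand control back so the poller can look at the buffer


def last_byte_arrived(buf):
    # The non-standard completion test: poll the final byte of the buffer.
    return buf[-1] != 0


payload = bytes([1]) * 8   # all nonzero, so a zero byte means "not yet written"
seg_size = 2               # four segments, numbered 0..3

# In-order placement: once the last byte shows up, everything is there.
buf = bytearray(8)
for _ in place_segments(buf, payload, seg_size, [0, 1, 2, 3]):
    if last_byte_arrived(buf):
        break
in_order_complete = bytes(buf) == payload

# Reordered placement: the last segment lands first, so the poll fires early.
buf = bytearray(8)
for _ in place_segments(buf, payload, seg_size, [3, 0, 1, 2]):
    if last_byte_arrived(buf):
        break
reordered_complete = bytes(buf) == payload

print(in_order_complete, reordered_complete)   # True False
```

An application that also inspects bytes in the middle of the buffer depends on an even stronger ordering property than the one this sketch checks.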

-- greg



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 03:53:37PM -0700, Woodruff, Robert J wrote:

 If the overhead in polling the CQ rather than memory was not so
 high, they would have used it, but they found that it added  2us to
 the latency and found they could get better performance if they
 polled memory,

I keep on mentioning that measuring one instance of one implementation
isn't necessarily a good way to evaluate a standard. Or the
implementation.

We already have one example where Mellanox improved their
implementation in the DDR generation such that a performance
workaround -- choosing a 1k MTU for RC connections -- was no longer
needed. I was pleased when we all agreed to remove defaulting to the
workaround -- it was clearly the right thing to do.

-- g



Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl

On Thu, Aug 24, 2006 at 04:13:22PM -0700, Sean Hefty wrote:

 Why don't you measure it, then?
 
 Why?  Reading a memory location directly will be faster than calling a
 function to read from a memory location.

... sigh. This is not true; there are quite obvious implementations
where it doesn't hold. We had one, actually, and changed it due to
this issue. That was a practical choice.

 You're also assuming that programs are only checking the last byte of
 the buffer.
 
 The applications I care about are polling on the last byte.

And tomorrow, the next app may depend on the behavior that all the
bytes arrive in order. Slippery slope, you know.

-- greg



Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Greg Lindahl

On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote:

 So this seems to be ripping out chunks of upstream code (ipath_ht400)
 replacing them with something else (ipath_iba6110, ipath_iba6120.o)

To answer this piece of the question, we were acquired last April, and
of course we have to rename all our devices. Since patch doesn't have
a rename feature, this looks much worse than it really is.

-- greg



Re: [openib-general] basic IB doubt

2006-08-23 Thread Greg Lindahl

On Wed, Aug 23, 2006 at 09:29:18AM -0700, Sean Hefty wrote:

 I don't believe that there is any ordering guarantee by the architecture.
 However, specific adapters may behave this way, and I've seen applications
 make use of this by polling the last memory byte for a completion, for example.

Actually, that leads me to a question: does the vendor of that adaptor
say that this is actually safe? Just because something behaves one way
most of the time doesn't mean it does it all of the time. So is it
really smart to write non-standard-conforming programs unless the
vendor stands behind that behavior?

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-08-03 Thread Greg Lindahl

On Tue, Aug 01, 2006 at 01:25:44PM +0200, Christoph Hellwig wrote:

 Exactly.  Please don't even try to put brand names (especially if
 they're as stupid as this) in.  We don't call our wireless stack
 centrino just because intel contributed to it either.

Centrino: Intel-only brand name

WiFi: trade association brand-name, not joined by all players

802.11{a,b,g}: technical name of technologies

wireless: an overly generic name that people might think should
   include bluetooth, wireless usb, etc etc.

OpenFabrics is not a single company brand name, it is the name of
the community that's actually implementing this software stack,
like 'Gnome' or 'KDE'.

BTW, I've had meetings with about 5 startups that began like, "We have
an RDMA device, but it's not actually RDMA as defined by that IEEE
Committee." And these devices don't work like that definition. So
there's considerable difference of opinion as to what RDMA means.

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 09:52:48AM -0500, Steve Wise wrote:

 rdma_* is more descriptive than something like ofv_* or of_* in my
 opinion.  I would think the prefix should help describe the
 functionality being implemented:  Transport Neutral RDMA. 

Some functions are RDMA. Others are not. If all are called RDMA,
that's misleading.

For example, in IB, there is send/receive as well as RDMA. ULPs often
use send/receive for short messages.

I wouldn't know anything about the non-IB parts of Open Fabrics, but
I would bet that there is non-RDMA functionality in them.

The common concept is messaging, not RDMA.

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 10:24:11AM -0500, Steve Wise wrote:

 However, the IETF RDMA protocol defines SEND as well as READ, WRITE,
 etc.  So in my mind, that's all RDMA, not just read and write.

Well, most people think RDMA means RDMA. The RDMA protocol undoubtedly
defines SEND/RECV because it's needed in addition to RDMA to get good
performance. But trying to call all of that RDMA is a marketing slogan.

Here's why it's a problem: I've repeatedly seen people try to use RDMA
(get and put) all the time because they think it must be faster than
simple send and receive... that's what the slogans tell them. But then
they discover that they need to use ordinary SEND/RECV for shorter
messages and for conversations with a lot of participants. That's a
technical screwup caused by the marketing slogan.

Let's pick symbol names that match our organization name.

I'm a bit disappointed that several of you who were at the last
Sonoma conference forgot we discussed this in a public session right
before the name change. I am not on the steering committee, and
wouldn't be surprised if the openrdma domain name issue was the big
decider in the name choice, but the wisdom of having RDMA in our name
was in doubt for more reasons than just that.

-- g



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 09:01:16AM -0700, Caitlin Bestler wrote:

 That would imply that the purpose of the openfabrics stack
 is to replace netdev.

I don't think it implies that at all.

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 01:25:39PM -0400, James Lentini wrote:

 I agree that the term RDMA SEND is confusing. However, the data in an
 RDMA SEND is deposited directly (zero copy) into the user's memory.

There are many mechanisms other than DMA or RDMA which have this
property. You're confusing specification with implementation, too.
When I read from a disk on modern Unix, the data is deposited into
the user's memory, whether it's DMA or PIO.

The defining characteristic of RDMA is that it deposits or reads data
based on an address provided by the other side, *and* that it has
one-sided semantics. In ordinary messaging, data is transferred from
buffers which are much less flexibly addressed, and semantics are two-sided.
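
As a rough illustration of that distinction, here is a toy Python model. The `Node` class and its method names are hypothetical and are not the verbs API; the sketch ignores memory registration, keys, and completions. In the two-sided path the receiver chooses where data lands by posting a buffer; in the one-sided path the initiator supplies the remote address and the target takes no action.

```python
class Node:
    """Toy endpoint with a flat memory region and a queue of posted receives."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.posted = []   # (offset, length) pairs chosen by the receiver

    # Two-sided: the receiver decides where data lands by posting a buffer.
    def post_recv(self, offset, length):
        self.posted.append((offset, length))

    def deliver_send(self, data):
        if not self.posted:
            raise RuntimeError("no receive posted")
        offset, length = self.posted.pop(0)
        if len(data) > length:
            raise ValueError("message larger than posted buffer")
        self.memory[offset:offset + len(data)] = data

    # One-sided: the initiator names the remote address; the target does nothing.
    def put(self, addr, data):
        self.memory[addr:addr + len(data)] = data

    def get(self, addr, length):
        return bytes(self.memory[addr:addr + length])


target = Node(16)
target.post_recv(0, 4)         # target participates in the send/recv exchange
target.deliver_send(b"ping")
target.put(8, b"data")         # target is passive for put/get
print(target.get(0, 4), target.get(8, 4))
```

Note that a send with no posted receive fails in this model, while a put needs no cooperation from the target at all.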

  Here's why it's a problem: I've repeatedly seen people try to use RDMA
  (get and put) all the time because they think it must be faster than
 
 I'm assuming RDMA get/put correspond to RDMA READ/WRITE.

Yes, get and put are what the general community have traditionally
called these operations. These names emphasize the one-sided nature of
the operation, unlike the new official(tm) names.

  simple send and receive... that's what the slogans tell them. But then
  they discover that they need to use ordinary SEND/RECV for shorter
  messages and for conversations with a lot of participants. 
 
 By ordinary SEND/RECV, do you mean IB/iWARP SEND/RECV or traditional 
 (sockets) networking send(2)/recv(2)?

I was actually thinking of OpenIB SEND/RECV.

  That's a technical screwup caused by the marketing slogan.
 
 The terms RDMA read and RDMA write are technically accurate.

It seems we have different definitions of technical, then. Slogans
don't make good engineering.

 Perhaps someone can think of a better prefix. How about dav_ (direct 
 access verb)?

That's much better than rdma_, but do you really think the Linux folks
are going to be happy about OpenFabrics calls with a prefix that
doesn't look anything like Open Fabrics?

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 10:32:05AM -0700, Roland Dreier wrote:

 Greg, what would be your suggestion of a more generic (not
 IB-specific) replacement of the libibverbs name and ibv_ prefix?

Anything that makes it clear that it's an Open Fabrics call. Which is
what our organization and software stack are called.

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 10:39:49AM -0700, Sean Hefty wrote:

 Or maybe just verb.  Would that be better?

That's a good one.

 IMO, the underlying issue with using 'rdma' is that a software based 
 solution doesn't actually do 'rdma'.  I think this is Greg's complaint, and 
 why he uses the terms 'get/put' instead of rdma read/write.

Actually, no, it isn't that. It's philosophical, a reaction to the
marketing over-hyping of RDMA.

I'm stunned that you've never heard of put and get! Never used
CRAY SHMEM or any one-sided interconnect, I guess? MPI uses those
terms, too.

-- greg




Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 10:45:39AM -0700, Roland Dreier wrote:

 No other drivers have a brand name and it's pretty silly trying to
 brand IB/iWARP/RDMA/whatever drivers.

I don't see this as branding or marketing. I see it as trying to come
up with a name that's accurate.

What do you think of verb_ ?

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 10:54:55AM -0700, Caitlin Bestler wrote:

 Trying to characterize RDMA as consisting *solely* of 
 messages that identify target buffers in the message is
 off target.

You're using circular arguments: Because one particular subset of the
RDMA community defines RDMA in fashion X, it is off target to define
RDMA in any other fashion.

One-sided vs. two-sided is important. You've completely left that out.

Well, no matter: we don't need to argue about the definition of RDMA to
solve the question of what the transport-neutral prefix should be.

I have no doubt that we would never agree about the definition.

 Now if you can come up with a short acronym that conveys
 that then I am fine with using it.

Try "now if *someone* can come up with". How did you like verb_ ?

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 02:17:20PM -0400, James Lentini wrote:

 Dusting off my copy of vipl.h, circa 1996, I see that these operations 
 were called RDMA READ/WRITE in VIA.

Yes, and that's the predecessor to IB, so that's no surprise that it
uses the same term. The IETF RDMA people also use it. Do you think
that's all there is to RDMA? I am not surprised that as a storage guy,
that's what you're most familiar with.

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 01:03:16PM -0500, Steve Wise wrote:

 I agree.  Plus we already have precedence for rdma_ with the RDMA CMA...

That's precedent in about the same way as "I used the term 'wimps' in
a poster paper once, so now you should allow me to use 'wimps' in my
Astrophysical Journal article."

True story. Weakly Interacting Massive Particles. Which in turn
spawned MACHOs, MAssive Compact Halo Objects. Fun, but not the way to
do software engineering.

Hint: did you ever hold a discussion as to whether or not that was the
right transport-neutral name?

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 11:18:16AM -0700, Roland Dreier wrote:

 My gut reaction is negative.  The whole idea of verbs is a bit of
 technical jargon that makes no sense unless you've lived in the RDMA
 world for a while,

Given the way you are defining RDMA, I'm not surprised at the
conclusion you are coming to. We have been calling these the
transport neutral verbs, btw.

How about ofabric_ ?

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 01:34:41PM -0500, Steve Wise wrote:

 You seem to be the only one objecting to rdma_ and/or rdmav_. 

At Sonoma, I was not the only one. I forget, were you there?

 I've listened to your arguments for why you think rdma is a bad name,
 and I'm not convinced.  

I'm not surprised, I did not expect to convince everyone. However, it
is not the case that you get to pick the name by yourself. Nor I.

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 11:31:33AM -0700, Roland Dreier wrote:
 Greg> Hint: did you ever hold a discussion as to whether or not
 Greg> that was the right transport-neutral name?
 
 Jeeze, Sean posted the RDMA CM code to three mailing lists for review
 about 100 times.  Did you ever complain about the naming convention?

Roland, I'm not sure what to say. I suspect you think you're being
constructive, but I'm getting tired of being shot at for being the
messenger.

This is an issue important enough that having an explicit discussion
is a good idea. It shouldn't have come up as part of a patch.

And it wasn't clear to me that the RDMA CM was intended to be part
of the transport neutral verbs. If you look at the subject of this
thread, it's clear that it's about transport neutral verbs. So I
looked, and was surprised.

-- greg


Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-31 Thread Greg Lindahl

On Mon, Jul 31, 2006 at 12:04:21PM -0700, Roland Dreier wrote:

 Unless someone else has a problem with the rdmav_ name then I think we
 should let this die.

Sounds like a call for an open discussion on it, with a proper subject
line, even. And asking outside of openib-general. Which is what I am
suggesting.

-- greg



Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.

2006-07-28 Thread Greg Lindahl


 This patchset is a proposal to create new API's and data structures with
 transport neutral names.

We named ourselves OpenFabrics instead of OpenRDMA for a reason, did I
miss some point where we decided that we would use RDMA as a transport
neutral name in the source code?

-- greg


Re: [openib-general] MPI error when using a system call in mpi job.

2006-06-14 Thread Greg Lindahl

On Tue, Jun 13, 2006 at 05:11:47PM -0700, Ira Weiny wrote:

 After some tracking down he found that apparently if he used a system call
 [int system(const char *string)] the next MPI command will fail.

Are you sure MVAPICH supports fork()? It is not unusual for MPI
implementations to not support fork(). system() uses fork().
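
To make the system()/fork() connection concrete, here is a minimal sketch of what a system()-like call does on a typical Unix. toy_system is a made-up name; the real system() also adjusts signal handling, and a given libc may use a different process-creation primitive internally, so treat this as a conceptual model only.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for system(): run cmd through /bin/sh and
 * return its exit code, or -1 on failure or abnormal termination. */
int toy_system(const char *cmd)
{
    pid_t pid = fork();         /* the step an MPI library may not tolerate */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);             /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The fork() in the middle is the point: between fork() and exec the child briefly shares the parent's address-space layout, which is the window in which an MPI implementation that has registered memory with the HCA can get into trouble.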

-- greg


Re: [openib-general] [PATCH] osm: fix num of blocks of GUIDInfo GetTable query

2006-06-11 Thread Greg Lindahl

So this is a _critical_ bugfix ?

 Auuuch it is there! 
 My mistake. So please apply the patch to the OFED 1.0 branch too.
 BTW: Does osmtest -f a exercise this query on OFED 1.0?
 
  Huh ? What's
 
 https://openfabrics.org/svn/gen2/branches/1.0/src/userspace/management/osm/opensm/osm_sa_guidinfo_record.c
  
  -- Hal
  
  
   Eitan Zahavi
   Senior Engineering Director, Software Architect
   Mellanox Technologies LTD
   Tel:+972-4-9097208
   Fax:+972-4-9593245
   P.O. Box 586 Yokneam 20692 ISRAEL
  
  
-Original Message-
From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
Sent: Sunday, June 11, 2006 12:22 AM
To: Eitan Zahavi
Cc: OPENIB
Subject: Re: [PATCH] osm: fix num of blocks of GUIDInfo GetTable
 query
   
Eitan,
   
On Thu, 2006-06-08 at 07:24, Eitan Zahavi wrote:
 Hi Hal

 I'm working on passing osmtest check. Found a bug in the new
 GUIDInfoRecord query: If you had a physical port with zero
 guid_cap
 the code would loop on blocks 0..255 instead of trying the next
   port.

 I am still looking for why we might have a guid_cap == 0 on some
 ports.

 This patch resolves this new problem. osmtest passes on some
   arbitrary
 networks.

 Eitan

 Signed-off-by:  Eitan Zahavi [EMAIL PROTECTED]
   
Thanks. Applied to trunk only.
   
Let me know if it also should be applied to 1.0.
   
-- Hal
 

Re: [openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Greg Lindahl

On Thu, Jun 08, 2006 at 09:49:35AM -0700, Sean Hefty wrote:

 What if we started with something like the following compliance statement, and
 tried to add this to the spec?
 
 An SM, upon becoming the master, shall respect all existing communication in 
 the
 fabric, where possible.

Isn't this a quality of implementation issue? It's hard to imagine an
SM author not realizing this is a good thing to do.

If it was in the standard, how would you test it for compliance?

-- g


Re: [openib-general] which way to port squid for supporting large amount concurrent connections

2006-05-22 Thread Greg Lindahl

On Mon, May 22, 2006 at 07:56:03PM -0700, zhu shi song wrote:

 I won't wait sdp OK.  I hope to use another method to
 port squid.

How about IPoIB?

-- greg

Re: [openib-general] RFC: detecting duplicate MAD requests

2006-04-30 Thread Greg Lindahl

On Fri, Apr 28, 2006 at 11:23:44PM -0700, Sean Hefty wrote:

 The proposal is to only discard duplicate requests while a response to the 
 first
 request is being generated.

Ah. Does that happen that often?

-- greg

Re: [openib-general] Re: [PATCH 04/16] ehca: userspace support

2006-04-30 Thread Greg Lindahl

 Do you really need this heavy debug logging in the first place? You
 can use kprobes for arbitrary run-time inspection anyway, so logging
 everything seems wasteful.
 
 The problem I see with kprobes is that you have to set several kernel
 configuration options (e.g. CONFIG_KPROBES, CONFIG_DEBUG_INFO, etc.)
 on compile time to use it. Same problem with pr_debug().

Note that one usage of debug code is for a vendor to ask a customer to
turn it on to figure out weird problems that the vendor can't
replicate. Customers are more likely to cooperate if the effort is
small... rebuilding the kernel is not a small effort compared
to turning on debug that's already compiled in.
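
A minimal sketch of "debug that's already compiled in", in plain user-space C. The names toy_debug_level, toy_set_debug and toy_dbg are hypothetical; in a kernel driver the switch would more likely be something like a module parameter, which is an assumption on my part and not something from this thread.

```c
#include <stdarg.h>
#include <stdio.h>

/* 0 means quiet. The check costs one compare when debugging is off. */
static int toy_debug_level;

/* Flip debugging at run time; no rebuild needed. */
void toy_set_debug(int level)
{
    toy_debug_level = level;
}

/* Print to stderr only if level is enabled. Returns the number of
 * characters written, or 0 if the message was suppressed. */
int toy_dbg(int level, const char *fmt, ...)
{
    if (level > toy_debug_level)
        return 0;
    va_list ap;
    va_start(ap, fmt);
    int n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

The trade-off against kprobes or pr_debug() is the one described above: the code is always present and costs a branch, but a customer can enable it without touching the kernel configuration.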

-- greg


Re: [openib-general] RFC: detecting duplicate MAD requests

2006-04-28 Thread Greg Lindahl

On Fri, Apr 28, 2006 at 03:20:13PM -0700, Sean Hefty wrote:

 I'd like to propose that the MAD layer detect duplicate requests.

Sean,

You can't add this kind of thing piecemeal to a protocol and have it
work. If the sender doesn't see a response (perhaps the response was
lost, or was slow coming), and sends another MAD, this 2nd MAD will
have a different sequence number. How does the recipient know it's the
same request?  If the response was lost the first time, eating the 2nd
MAD without sending a response will result in another timeout and a
3rd MAD... so maybe the recipient remembers the response and sends it
again. Will that work? Well, no, it's not guaranteed, because the
sender may reject a stale response received after sending the 2nd
MAD...

Really, it's up to the MAD client to deal with duplicates in its own
way.

And yes, this class of issues shows up in practice. Ask anyone who's
ever worked on a large distributed system. Execute exactly once
semantics require end-to-end design.
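
One common end-to-end approach, sketched under assumptions: the client chooses a request ID once and reuses it on every retry, and the server replays its remembered response instead of executing the request again. Nothing here is MAD-layer code; toy_server and toy_handle are invented for illustration, and the fixed-size table simply forgets old entries on collision, which a real design would have to handle deliberately.

```c
#include <stdint.h>

#define TOY_SLOTS 16

/* Hypothetical server state: a counter that each new request bumps,
 * plus a small table of requests already executed. */
struct toy_server {
    uint64_t seen_id[TOY_SLOTS];
    int      reply[TOY_SLOTS];
    int      used[TOY_SLOTS];
    int      counter;
};

/* Execute "add delta to counter" at most once per req_id and return
 * the reply. A retry with the same req_id gets the stored reply. */
int toy_handle(struct toy_server *s, uint64_t req_id, int delta)
{
    unsigned slot = (unsigned)(req_id % TOY_SLOTS);

    if (s->used[slot] && s->seen_id[slot] == req_id)
        return s->reply[slot];      /* duplicate: replay, do not re-execute */

    s->counter += delta;            /* first time: actually do the work */
    s->used[slot] = 1;
    s->seen_id[slot] = req_id;
    s->reply[slot] = s->counter;
    return s->counter;
}
```

This only works because both ends agree on what identifies a request, which is why it has to live in the client protocol and cannot be bolted onto a lower layer that sees a fresh sequence number on every retry.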

-- greg

Re: [openib-general] [PATCH] opensm: handle stdout and stderr in osm_log

2006-04-27 Thread Greg Lindahl

On Thu, Apr 27, 2006 at 01:15:57AM +0300, Sasha Khapyorsky wrote:

 There is small patch for osm_log, this provide possibility to drop log
 output to stdout or stderr.

Isn't the Unix convention to use "-" to mean stdout? Or you can use
/dev/fd/{0,1}...
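
A small sketch of that convention, assuming the usual single-dash form where a file name of "-" stands for standard output. toy_open_log is a hypothetical helper, not the osm_log API.

```c
#include <stdio.h>
#include <string.h>

/* Return a stream for the log: stdout if path is "-", otherwise the
 * named file opened for appending. Returns NULL if fopen fails. */
FILE *toy_open_log(const char *path)
{
    if (strcmp(path, "-") == 0)
        return stdout;
    return fopen(path, "a");
}
```

With this shape no extra option is needed for stdout, and on systems that provide /dev/fd or /dev/stderr the ordinary fopen path handles those too. The caller does need to remember not to fclose() the stream when it is stdout.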

-- greg


Re: [openib-general] Re: TSO and IPoIB performance degradation

2006-04-27 Thread Greg Lindahl

On Wed, Apr 26, 2006 at 11:13:24PM -0500, Troy Benjegerdes wrote:

 David is right. If you care about performance, you are already using SDP
 or verbs layer for the transport anyway. If I am going to be doing IPoIB,
 it's because eventually I expect the packet might get off the IB network
 and onto some other network and go halfway across the country.

This is going to be a surprise to lots of people who want high-speed
gateways from IB to ethernet -- many clusters connect to fileservers
and other performance-sensitive gizmos that way.

-- greg

Re: [openib-general] Re: TSO and IPoIB performance degradation

2006-04-27 Thread Greg Lindahl

On Thu, Apr 27, 2006 at 04:22:40PM -0700, Grant Grundler wrote:

 Anything preventing such a gateway from routing SDP to ethernet?
 Those gateways obviously will grok IB protocols.
 I'm asking because I don't understand/know if there is a real
 barrier to an IB - ethernet gateway _without_ IPoIB.

I don't know if a SDP to ethernet gateway even exists, but I do know
that it's a lot more work than just an IPoIB to ethernet gateway --
the gateway is going to have to pass all its data through a TCP stack.
So I would expect SDP to ethernet to not run very fast, especially on
a gateway with lots of streams going.

-- greg


Re: [openib-general] RFC userspace / MPI multicast support

2006-04-20 Thread Greg Lindahl

On Thu, Apr 20, 2006 at 01:00:43PM -0400, Hal Rosenstock wrote:

 Also, is IPoIB always setup when running MPI ?

That's an easy one: No.

-- greg

Re: [openib-general] Help with CONFIG_PCI_MSI in the kernel

2006-04-04 Thread Greg Lindahl

On Mon, Apr 03, 2006 at 09:36:00PM -0700, Grant Grundler wrote:

 The only evidence I have is one AMD chipset is buggy WRT MSI.

Grant,

I know about that case, the kernel disables stuff if it sees an AMD
8131 due to a bug. What I am referring to was IPoIB performance on
Mellanox HCAs being improved with MSI. I figure if it's gotten this
far with Red Hat turning it off, concrete examples are in order.

-- greg


[openib-general] Help with CONFIG_PCI_MSI in the kernel

2006-04-03 Thread Greg Lindahl

Red Hat has started turning off CONFIG_PCI_MSI in their kernels (FC5
and the latest FC4 update). I remember a while back there was a
discussion about how MSI made the Mellanox HCA run faster, can someone
please add some concrete details about this to the bug? Thanks.

http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=186520

-- greg

Re: [openib-general] IPoIB destructor patches

2006-03-31 Thread Greg Lindahl

On Fri, Mar 31, 2006 at 08:37:58AM -0800, Roland Dreier wrote:

 If you want to forward the necessary patches to [EMAIL PROTECTED] for
 inclusion in 2.6.16.x releases, that would be great.

It would be nice to have a single person responsible for spotting
things that qualify for the stable kernel, and shepherd them into it.
This one is a good example, something that hurts IB 1.0 on some
new hardware.

I'm not volunteering Roland; anyone else want to volunteer? Here are
the criteria:

http://www.kroah.com/log/2005/03/09/

and a note that 2.6.16.y will be more relaxed than the above:

http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1274.html

-- greg


Re: [openib-general] Re: Problem configuring ipath_ether

2006-03-28 Thread Greg Lindahl

On Mon, Mar 27, 2006 at 08:18:28AM -0800, Shirley Ma wrote:

 Is it a good idea to send such a big messages size using UD mode?

The usual reason for not doing this with UDP is that it turns a small
amount of packet loss into lots more lossage. This is a real issue
with ethernet when it drops packets when there is congestion. With IB
the error rate is consistently low.

It could be that other HCAs don't like lots of UD messages in a row,
either sending or receiving them efficiently. We never tried, as you
can see the code is not written on top of verbs.
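
The packet-loss argument in the first paragraph can be put into numbers. This is a sketch under a simplifying assumption: each packet is lost independently with probability p, and losing any one packet loses the whole message. toy_message_loss is a made-up name.

```c
/* Probability that a message carried in n_packets packets is lost,
 * given an independent per-packet loss probability p and no
 * per-packet retransmission: 1 - (1 - p)^n_packets. */
double toy_message_loss(double p, unsigned n_packets)
{
    double all_arrive = 1.0;
    for (unsigned i = 0; i < n_packets; i++)
        all_arrive *= 1.0 - p;
    return 1.0 - all_arrive;
}
```

Under those assumptions, a 1% packet loss rate and a message split into 44 packets gives a message loss rate of roughly 36%, which is the amplification being described. When the per-packet error rate is very low, the same formula stays close to zero even for large messages.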

-- greg


Re: [openib-general] Problem configuring ipath_ether

2006-03-25 Thread Greg Lindahl

On Fri, Mar 24, 2006 at 10:38:35PM -0800, Matt Leininger wrote:

That could be.  I'll check.  I thought ipath_ether was supposed to
 work when IPoIB was enabled.

In general, our non-IB protocols work together with OpenIB.  This item
is the one thing that doesn't work, and it's due to an SMA bug that we
haven't fixed yet.

-- greg


Re: [openib-general] Re: Problem configuring ipath_ether

2006-03-25 Thread Greg Lindahl

On Sat, Mar 25, 2006 at 10:36:10PM +0200, Michael S. Tsirkin wrote:

 In what sense is it more like an ethernet device than IPoIB? Since IP over IB
 just puts an IP packet inside a UD packet without overhead, what wire protocol
 changes can thinkably be done, while still using InfiniBand protocols, to
 improve performance?

Michael,

Given that you're objecting to it being included in the kernel, I'd
expect that you would have already reviewed the code.

-- greg


Re: [openib-general] Problem configuring ipath_ether

2006-03-24 Thread Greg Lindahl

On Thu, Mar 23, 2006 at 11:50:18PM -0800, Matt Leininger wrote:

 I have the ipath driver up, running, and working with IPoIB.  I'm using
 2.6.16 with svn 5938.  The ipath_ether comes up as eth2.  I can set the
 netmask and broadcast, but when I try to set the ip address for this
 device I get the following error:

Matt,

ipath_ether uses InfiniBand protocols but is not the same as IPoIB. So
it's better to ask [EMAIL PROTECTED] about it. (The reason we ship
it at all is that it's faster than IPoIB -- this necessitates a
different wire format -- and it behaves more like an ethernet device
than IPoIB.)

My guess is that you're trying to use it with ib_mad (the in-kernel
SMA).  Do you see anything in /var/log/messages like "ipath_ether_open
timed out waiting for MLID"? The relevant entry in RELEASE-NOTES.txt
under the KNOWN LIMITATIONS section is:

   * IPoIB and ipath_ether do not work together, as ipath_ether requires
 the InfiniPath SMA, which needs to be disabled to use OpenIB.  This
 will be fixed in a future release.

Apologies for this not being more obvious.

-- greg


Re: [openib-general] IPoIB broadcast MC group membership

2006-02-22 Thread Greg Lindahl


On Tue, Feb 21, 2006 at 11:40:53PM -0800, Fabian Tillier wrote:

 You'd have to make the group 1X.  Note that the group being 1X doesn't
 limit unicast traffic to 1X rates, since the rate for unicast traffic
 would be set based on the rate reported in the path records for the
 various endpoints.
 
 So 4X SDR and 4X DDR nodes would have to set their inter-packet delay
 for the broadcast group to end up with a 1X packet injection rate.

So, basically, MVAPICH doesn't have code that does either the group
creation properly when there is a mixture of HCA bandwidths, or limit
the packet injection rate. And IPoIB could violate this rule depending
on how user programs use it, e.g. if I did a lot of broadcasting, I
could easily exceed 1X's bandwidth.
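
For the inter-packet delay mentioned in the quoted text, a rough sketch of the arithmetic. It assumes the delay is counted in multiples of the packet's own transmit time, which is my reading of IB static rate control and should be checked against the spec. The names and the rate encoding (multiples of 2.5 Gb/s) are made up for illustration.

```c
/* Link rates in units of 2.5 Gb/s, so 1X SDR is 1. */
enum { TOY_1X_SDR = 1, TOY_4X_SDR = 4, TOY_4X_DDR = 8 };

/* Number of idle packet-times a sender must leave after each packet so
 * that its average injection rate does not exceed group_rate.
 * Returns 0 when the sender is not faster than the group. */
unsigned toy_ipd(unsigned sender_rate, unsigned group_rate)
{
    if (group_rate == 0 || sender_rate <= group_rate)
        return 0;
    /* ceil(sender_rate / group_rate) - 1 */
    return (sender_rate + group_rate - 1) / group_rate - 1;
}
```

Under that assumption a 4X SDR sender in a 1X group would idle for 3 packet-times after each packet and a 4X DDR sender for 7, which is the throttling that neither MVAPICH nor a busy IPoIB broadcaster is doing.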

So this is more than just a "fix OpenSM" issue. It's more of a "fix
the spec" issue, if I'm understanding it correctly.

-- greg




Re: [openib-general] IPoIB broadcast MC group membership

2006-02-21 Thread Greg Lindahl

Is this a correct summary of this thread?

* IPoIB uses an InfiniBand multicast group to fake ethernet broadcast
  * This is optional, I'm not sure what functionality is lost without it

* MVAPICH uses a multicast group for some MPI collectives
  * This can be turned off by setting env var DISABLE_HARDWARE_MCST

* An IB multicast group has to use ports of the same speed
  * This one was a surprise to me

Ergo, when you mix 1X, 4X SDR, and 4X DDR hosts, it behaves
differently from a homogeneous network.

-- greg

Re: [openib-general] IPoIB broadcast MC group membership

2006-02-21 Thread Greg Lindahl

On Tue, Feb 21, 2006 at 05:21:33PM -0800, Roland Dreier wrote:

 No, but an IB multicast group has a speed associated to it.  This is
 to allow, say, a 4X port sending multicast packets to use the right
 static rate to avoid overrunning a 1X port that is also a member of
 the group.

Thanks for the clarification, Roland, I see that the previous
discussion had successfully confused me. So is it the case that the
create and the joins to a multicast group have to specify the correct
speed? And the problem then would be that an IB host at boot doesn't
know the right speed? Am I getting colder or warmer?

-- greg



Re: [openib-general] IPoIB broadcast MC group membership

2006-02-21 Thread Greg Lindahl

On Tue, Feb 21, 2006 at 08:17:02PM -0800, Fabian Tillier wrote:

 The node joining or creating the multicast group doesn't need to
 specify the rate - the SA can figure out the rate to use based on the
 requestor (for creation), or validate that the requestor supports the
 existing group's rate (for joining).

Um, but that gets back to my point: I want 1X, 4X SDR, and 4X DDR
nodes running IPoIB to share a multicast group. Are you saying this
can be done by making the group a 1X group? Or that it's impossible
to have such a group? Or that everyone would have to drop to 1X to
make such a group?

-- greg


Re: [openib-general] OpenIb and Ada95 binding

2006-01-31 Thread Greg Lindahl

On Tue, Jan 31, 2006 at 03:53:22PM +0100, Xavier Grave wrote:

 I'm using Ada95 and I would like to implement a thin binding in order to
 use infiniband.

If MPI meets your needs -- do you need recovery from failures? -- then
you can use one of the existing MPI/ADA bindings.

-- greg


Re: [openib-general] SA cache design

2006-01-11 Thread Greg Lindahl

Since no one's really answered this yet:

Many sysadmins are not going to want to install a relational database
to run an SA cache. So I'd stick to Berkeley DB if I were you.

-- greg



Re: [openib-general] RFC MPI and app. requirements of OpenIB

2005-12-22 Thread Greg Lindahl

On Thu, Dec 22, 2005 at 12:14:44PM -0800, Sean Hefty wrote:

 Can MPI operate with unreliable multicast support?  Does MPI plan on
 using IB multicast?

Given the large number of MPI implementations over IB, I don't think
there's a single answer.

-- greg


Re: [openib-general] RE: [PATCH] Opensm - fix segfault on exit - cont.

2005-12-20 Thread Greg Lindahl

On Tue, Dec 20, 2005 at 09:11:12AM +0200, Yael Kalka wrote:

  -   if (p_ur->signal)
  +   if (p_ur->signal != NULL)

Aren't these 2 statements required to execute the same according to
the C standard?

I wrote a tiny test program and gcc4.0.0 as distributed with Fedora
Core 3 generated identical assembly code for both.
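
A tiny test along those lines might look like the following. This is a reconstruction for illustration, not the program actually used; the function names are made up. As far as I know the C standard defines `if (e)` as taking the branch when e compares unequal to 0, and for a pointer that comparison is against a null pointer, so the two forms should behave the same.

```c
#include <stddef.h>

/* Implicit test: branch taken when p compares unequal to 0. */
int toy_test_implicit(const void *p)
{
    if (p)
        return 1;
    return 0;
}

/* Explicit test against NULL. */
int toy_test_explicit(const void *p)
{
    if (p != NULL)
        return 1;
    return 0;
}
```

Compiling this with `gcc -S` and comparing the two function bodies is one way to repeat the assembly check.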

-- greg


Re: [Openib-promoters] Re: [openib-general] Next workshop dates? Ideas for agenda???

2005-12-12 Thread Greg Lindahl

On Mon, Dec 12, 2005 at 09:55:03AM -0800, Bill Boas wrote:

 Now I believe that when a Linux 
 distribution, an IB company or a Tier One OEM decides that is a 
 version of the code that they will support, then that is a release. 

Why don't we imitate the Linux kernel process? OpenIB has to follow a
sane process of innovation followed by stabilization and bug-fixing in
order for the IB companies and Tier 1s to be able to make solid
releases.

-- greg


Re: [openib-general] [ANNOUNCE] Contribute RDS ( ReliableDatagramSockets) to OpenIB

2005-11-09 Thread Greg Lindahl

Caitlin,

Can you please use the standard quoting style? I can't tell which
comments are yours. Thanks.

-- greg

Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB

2005-11-09 Thread Greg Lindahl

On Tue, Nov 08, 2005 at 01:08:13PM -0800, Michael Krause wrote:

 If an application takes any action assuming that send complete means
 it is delivered, then it is subject to silent data corruption.

Right. That's the same as pretty much all other *transport* layers. I
don't think anyone's asserting RDS is any different: you can't assume
the other side's application received and acted on your message until
the other side's application tells you that it did.
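
A sketch of that distinction as a per-message state machine. The names (toy_msg, toy_on_send_complete, toy_on_app_ack, toy_safe_to_forget) are hypothetical and not part of RDS or any transport API. The point is only that a local send completion and a remote application acknowledgement are separate events.

```c
/* What the sender knows about one message. */
enum toy_state {
    TOY_UNSENT,
    TOY_SEND_COMPLETE,  /* the transport accepted or delivered it */
    TOY_APP_ACKED       /* the remote application said it acted on it */
};

struct toy_msg {
    unsigned       id;
    enum toy_state state;
};

/* Transport-level completion: says nothing about the remote application. */
void toy_on_send_complete(struct toy_msg *m)
{
    if (m->state == TOY_UNSENT)
        m->state = TOY_SEND_COMPLETE;
}

/* Application-level acknowledgement carried in a message from the peer. */
void toy_on_app_ack(struct toy_msg *m, unsigned acked_id)
{
    if (acked_id == m->id)
        m->state = TOY_APP_ACKED;
}

/* Only an application-level ack makes it safe to discard the message. */
int toy_safe_to_forget(const struct toy_msg *m)
{
    return m->state == TOY_APP_ACKED;
}
```

An application that frees or overwrites its copy at TOY_SEND_COMPLETE is making exactly the assumption the quoted text warns about.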

-- greg


Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB

2005-11-09 Thread Greg Lindahl

On Wed, Nov 09, 2005 at 12:18:28PM -0800, Michael Krause wrote:

 So, things like HCA failure are not transparent and one cannot simply 
 replay the operations since you don't know what was really seen by the 
 other side unless the application performs the resync itself.

I think you are over-stating the case. On the remote end, the kernel
piece of RDS knows what it presented to the remote application, ditto
on the local end. If only an HCA fails, and not the sending and
receiving kernels or applications, that knowledge is not lost.

Perhaps you were assuming that RDS would be implemented only in
firmware on the HCA, and there is no kernel piece that knows what's
going on. I hadn't seen that stated by anyone, and of course there are
several existing and contemplated OpenIB devices that are considerably
different from the usual offload engine. You could also choose to
implement RDS using an offload engine and still keep enough state in
the kernel to recover.
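
As a rough illustration of "enough state to recover", here is a sketch in Python. It is not taken from the RDS proposal: the Sender and Receiver classes, the per-message sequence numbers, and the link callable are all assumptions made for the example. The receiver remembers the last sequence number it handed to the application, the sender keeps a log of what it sent, and after a failure the two compare notes and replay only the difference.

```python
class Receiver:
    """Remembers what has been presented to the local application."""

    def __init__(self):
        self.delivered_upto = 0  # highest sequence number handed to the app
        self.inbox = []

    def accept(self, seq, payload):
        """Deliver in order; drop duplicates and anything past a gap."""
        if seq != self.delivered_upto + 1:
            return False
        self.inbox.append(payload)
        self.delivered_upto = seq
        return True


class Sender:
    """Keeps a log of sent messages until they are known to be delivered."""

    def __init__(self):
        self.next_seq = 1
        self.log = {}  # seq -> payload

    def send(self, payload, link):
        """Log the message, then hand it to the (possibly failing) link."""
        seq = self.next_seq
        self.next_seq += 1
        self.log[seq] = payload
        link(seq, payload)
        return seq

    def resync(self, receiver_delivered_upto, link):
        """Forget delivered messages, replay the rest in order."""
        for seq in [s for s in self.log if s <= receiver_delivered_upto]:
            del self.log[seq]
        replayed = sorted(self.log)
        for seq in replayed:
            link(seq, self.log[seq])
        return replayed
```

Here link stands in for whatever path carries the data, and a failed HCA is modelled as a link that silently drops everything. Because the state lives outside the link, swapping in a working one and calling resync is enough to catch up without duplicates.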

-- greg

Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB

2005-11-09 Thread Greg Lindahl

On Wed, Nov 09, 2005 at 01:57:06PM -0800, Michael Krause wrote:

 What you indicate above is that RDS 
 will implement a resync of the two sides of the association to determine 
 what has been successfully sent.

More accurate to say that it could implement that. I'm just
kibitzing on someone else's proposal.

 This then implies that the reliability of the underlying
 interconnect isn't as critical per se as the end-to-end RDS protocol
 will assure that data is delivered to the RDS components in the face
 of hardware failures.  Correct?

Yes. That's the intent that I see in the proposal. The implementation
required to actually support this may not be what the proposers had in
mind.

This sort of message service, by the way, has a long history in
distributed computing.

-- greg


Re: [openib-general] [ANNOUNCE] Contribute RDS (Reliable DatagramSockets) to OpenIB

2005-11-04 Thread Greg Lindahl

On Fri, Nov 04, 2005 at 11:54:27AM -0800, pandit ib wrote:

 Since a UDP application assumes the underlying transport is
 unreliable it should not have any problems running on RDS.
 On getting EWOULDBLOCK it will simply retry.

Most existing UDP applications do not expect a return error code of
EWOULDBLOCK. To begin with, the Linux manpages say that you have to
specify non-blocking to get this error in the first place. Another
possibility is ENOBUFS, for which the manpage notes: "Normally, this
does not occur in Linux. Packets are silently dropped when a device
queue overflows."
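
A minimal sketch of what handling these errors explicitly might look like, in Python for brevity. It assumes a non-blocking UDP socket; the helper names send_datagram and wait_until_writable, and the max_attempts and wait_for_room parameters, are hypothetical and not part of any RDS or OpenIB interface.

```python
import errno
import select

# Errors that mean "no room right now" rather than a real failure.
RETRYABLE = {errno.EWOULDBLOCK, errno.EAGAIN, errno.ENOBUFS}


def wait_until_writable(sock, timeout=0.1):
    """Block until the socket can take more data, or the timeout expires."""
    select.select([], [sock], [], timeout)


def send_datagram(sock, payload, addr, max_attempts=5,
                  wait_for_room=wait_until_writable):
    """Queue one datagram, retrying when the socket reports backpressure.

    Returns the number of attempts used. Re-raises the OSError if the
    error is not retryable or the attempts run out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            sock.sendto(payload, addr)
            return attempt
        except OSError as exc:
            if exc.errno not in RETRYABLE or attempt == max_attempts:
                raise
            wait_for_room(sock)
```

The caller has to put the socket in non-blocking mode first (sock.setblocking(False)), otherwise EWOULDBLOCK never shows up. An application that simply calls sendto() and treats any error as fatal has none of this logic.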

There was a somewhat famous case showing the lack of error handling in
UDP applications under Linux: Alan Cox decided to read the RFCs
differently from everyone else, with the result that an ICMP 'port
unreach' would later cause the same sending socket to return an error
for a send to some unrelated host. Many UDP-using apps considered this
a fatal error.  This was ~ 7 years ago, and this misfeature caused
enough anger that it was corrected soon after Alan stopped owning the
TCP/UDP stack.

In short, I'm not sure there would be much benefit in giving existing
UDP-expecting apps a reliable, ordered stream of datagrams. The only
apps that would see a benefit are those that know they can turn off
their reliability and ordering code, and handle backpressure
explicitly. Those folks would benefit from a simpler programming
interface than verbs.

-- greg

Re: [openib-general] Re: Opensm - osm_sa_path_record.c - variable declaration

2005-09-20 Thread Greg Lindahl

 The Windows compiler does not allow a declaration that is not at the
 beginning of the function, so I would like to have it changed.
 We can either move the declaration to the beginning of the function,
 or add {} around the declaration.

Isn't there a gcc flag to convince it to give an error in this case? If
so, adding it to the OpenIB makefile would be a good idea.

-- greg

Re: [openib-general] Infinipath support in OpenIB

2005-05-19 Thread Greg Lindahl

On Thu, May 19, 2005 at 12:46:04PM -0700, Tom Duffy wrote:

  The article states: 'InfiniPath will also support the OpenIB software stack 
  providing full InfiniBand compliance.'
 
 Also interesting that OpenIB is suddenly an industry standard.  ;-)

Tom,

There's a reason for that -- when we asked customers, they asked us to
support OpenIB in particular. So congratulations, not only is OpenIB a
part of the IB ecosystem, it's one that everyone seems to want to
support.

We didn't want to invent a new flavor of verbs or DAPL anyway. No
point to it, no benefit to anyone.

-- greg

Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-05-03 Thread Greg Lindahl

On Tue, May 03, 2005 at 11:43:25AM -0700, Andy Isaacson wrote:

 [1] You might want to allow the child to start a completely new RDMA
 context, but I don't see that as necessary.

Some people use a hybrid OpenMP+MPI model, so some MPI implementations
want to fork and have the child open a new communications context.
This is fairly rare, but unfortunately is a checklist item for most
MPI implementations, and this will get worse over time.

-- greg


Re: [openib-general] Re: RDMA memory registration

2005-04-29 Thread Greg Lindahl

On Fri, Apr 29, 2005 at 12:33:54PM -0700, Grant Grundler wrote:

 Being mostly clueless about Quadrics' implementation, I'm probably
 missing something that makes Quadrics an MMU but not the IB variants.
 Can someone clue me in please?

As far as I can tell it's mostly a marketing distinction. Many
Quadrics customers run with memory registration, and Mellanox could
probably alter their firmware to not require registration.  Myricom
certainly can, and in fact Patrick Geoffray claimed they were doing so
in their MX software. The only one I know of that isn't that flexible
is PathScale's InfiniPath. Ours is a pure hardware mechanism, but it
requires memory registration and is clearly not an MMU.

Confused yet?

-- greg

Re: [openib-general] Re: RDMA memory registration

2005-04-29 Thread Greg Lindahl

 Todd But that implies the hardware has an MMU and it also puts an
 Todd interrupt in the path per page sent.
 
 Well, there's one interrupt per non-resident page sent.  But nearly
 all of the time the page will be present.

It doesn't imply that there's an MMU, either. I know that Myricom uses
a little lookup routine in software on their nic, which most people
wouldn't call an MMU. I don't know what Mellanox does for this, they
don't talk much about what's hardware and what's software on their
nic. I think Quadrics actually uses the TLB of their risc cpu on their
nic for this lookup, but that's just a guess.

-- greg

Re: [openib-general] IB Address Translation service

2005-03-04 Thread Greg Lindahl

On Fri, Mar 04, 2005 at 11:58:33AM -0800, Tom Duffy wrote:

 In any event, I think being able to plop an IB network in an Ethernet
 world will require things like RARP to work.  If there is no spec now,
 it should be written.

Much more important is understanding the role of RARP in the ethernet
world.

It is *not* something you do to find _someone else's_ IP addr from
their MAC addr. It's what you do to find your _own_ IP addr because
you're booting. Ethernet protocols such as IP include enough IP
information to talk back to someone who sent you a packet. So you
don't need to find out an IP addr from a MAC for remote nodes on a
regular basis. Instead, you find out a MAC addr from an IP address,
which is ARP.
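
To make the direction of each lookup concrete, here is a toy sketch in Python. The table contents are made-up documentation addresses, and arp_lookup and rarp_lookup are hypothetical names; real ARP and RARP are of course wire protocols, not dictionary lookups.

```python
# Hypothetical IP -> MAC table, as a server or a neighbor cache might hold it.
HOSTS = {
    "192.0.2.10": "00:11:22:33:44:55",
    "192.0.2.11": "00:11:22:33:44:66",
}


def arp_lookup(table, peer_ip):
    """ARP direction: I know a peer's IP address and need its MAC address."""
    return table.get(peer_ip)


def rarp_lookup(table, own_mac):
    """RARP direction: I know only my own MAC address (I am booting)
    and ask a server which IP address is mine."""
    for ip, mac in table.items():
        if mac == own_mac:
            return ip
    return None
```

The first lookup happens all the time during normal traffic; the second happens once, at boot, and only for the asking node's own address.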

RARP is little used now that DHCP is popular.

Now it would be nice for ethernet broadcast packets to just work(tm)
with IPoIB. ping -b is an example of a user-level program that
generates a broadcast packet.  DHCP clients also generate such
packets, and DHCP servers listen for them. Getting a RARP client and
server to work ought to be the same as a DHCP client and server.
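
For anyone wanting to test this from user level without ping -b, a small Python sketch of a socket that is allowed to send broadcast datagrams. The function name make_broadcast_socket is hypothetical, and nothing here is specific to IPoIB; it is the same code you would run over ethernet.

```python
import socket


def make_broadcast_socket():
    """Create a UDP socket that is permitted to send to broadcast addresses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Without SO_BROADCAST, sending to a broadcast address is refused.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock
```

A test would then be something like sock.sendto(b"hello", ("255.255.255.255", 9999)), where port 9999 is an arbitrary placeholder, with a listener bound to that port on another node of the same subnet.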

-- greg


Re: [openib-general] another shutdown reminder

2005-01-05 Thread Greg Lindahl

On Wed, Jan 05, 2005 at 05:08:31PM -0500, Hal Rosenstock wrote:

 What TZ ? PST ?

Ah, if only you knew how funny that question is! 2 beer penalty and
loss of down for failure to appreciate the uniqueness of Sandia's
geography.

-- greg
