Re: [openib-general] Address List Change for Friday, 2/23/2007
I see that the EWG list is now calling itself the Engineering Working Group. Has it been renamed from the Enterprise Working Group? If so, did the nature of the list change? Or was it a typo?

-- greg
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Infiniband Network Library
On Mon, Jan 08, 2007 at 12:26:00PM -0600, Sean Hubbell wrote:

> I have had to make significant workarounds for our current third-party network API that we purchased, and continue to watch it fall down and still not take advantage of the bandwidth that I need. With that said, does anyone on this list have a recommendation for an InfiniBand capable network library?

To amplify Roland's question: What does this library do that the existing ways of using InfiniBand don't? Sockets, verbs, MPI...

-- greg
Re: [openib-general] mthca question
Or remove the SVN code, which is misleading and continues to mislead people repeatedly.

-- greg

> We should put this type of warning in all the infiniband/core modules that have moved to the kernel...
>
> On Wed, 2006-11-15 at 11:29 -0800, Roland Dreier wrote:
> > > > #warning The mthca driver is no longer kept up to date in svn.
> > > > #warning For the latest code, track the upstream kernel.
> > >
> > > What does this mean? What is the upstream kernel? Where do I download the latest sources from?
> >
> > This means that the definitive source for the mthca driver is the standard Linux kernel. The upstream kernel just means Linus's kernel tree, which you can download from kernel.org or any of the many mirrors.
> >
> > - R.
Re: [openib-general] BandWidth doubt
On Fri, Nov 10, 2006 at 03:32:34PM +0530, john t wrote:

> If I send data from mthca0-1 to mthca0-1, meaning from same port to the same port, i.e. same port doing send/recv (also same cable doing send/recv), I get a BW of around 10 Gb/sec.

Note that the IB standard says in this case that the adaptor may not send this traffic to the switch. So what you're seeing is a loopback operation inside the HCA or inside the host.

-- greg
Re: [openib-general] PC to PC data transfer using Infiniband
On Thu, Oct 26, 2006 at 03:16:50PM +1300, vishal wrote:

> I am interested in moving data from memory in one PC to memory in the other using Infiniband hardware. Any suggestions on what would be the best way to do it?

There are 2 main approaches, messages and RDMA. The best one depends on the size of the data, how it's used, and the number of nodes that might be touching a given area of memory.

-- greg
Re: [openib-general] IPoIB Question
On Tue, Oct 24, 2006 at 08:35:18AM -0500, Sean Hubbell wrote:

> We are currently looking at the new tickless kernel. Do you have one that you recommend?

The main ones to less-recommend are 2.6.9-based kernels; those are the slowest at TCP. Modern kernels, like the ones you see in Fedora 4 and up and SLES 10, seem to all be good and about equal in this area.

I don't think we've tried a tickless kernel. We do most of our testing on the various kernels that ship with distros, plus the tip-of-tree kernel.org kernel.

-- greg
Re: [openib-general] IPoIB Question
On Mon, Oct 23, 2006 at 07:53:06AM -0500, Hubbell, Sean C Contractor/Decibel wrote:

> I currently have several applications that use a legacy IPv4 protocol and I use IPoIB to utilize my infiniband network, which works great. I have completed some timing and throughput analysis and noticed that I do not get very much more if I use an infiniband network interface than using my GigE network interface.

You might want to note that different InfiniBand implementations have quite different performance of IPoIB, especially for UDP.

Another issue is that IPoIB has quite different performance with different Linux kernels. This is especially evident for TCP, although you can use SDP to accelerate TCP sockets and avoid this issue.

> My question is, am I using IPoIB correctly or are these the typical numbers that everyone is seeing?

It is certainly the case that there are some message patterns and situations for which InfiniBand is not much of an improvement over gigE.

-- greg
Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...
On Thu, Oct 19, 2006 at 11:37:55AM -0400, Doug Ledford wrote:

> and ISTR that it isn't even required by the MPI spec since that leaves behavior of an MPI app undefined after a fork() call and hence any application written to depend on undefined behavior is broken by design,

Doug,

There are many things which MPI programs do that MPI vendors like to support even though they're undefined in the standard. This is one of the minor ones.

The situation is similar to F77 extensions: there are a bunch of them you must support to have a commercially viable compiler.

-- greg
Re: [openib-general] sysfs exposure of port counters useless?
On Tue, Oct 17, 2006 at 05:18:34PM -0700, Scott Weitzenkamp (sweitzen) wrote:

> I agree the 32-bit byte and packet counters are useless as they get pegged in a few seconds on a busy IB network. I thought there was an effort in IBTA to fix this.

Yes, it's in the management working group.

> For IB counters in a Cisco switch, we read and reset the 32-bit counters once per second and keep 64-bit counters internally. This would be possible in OF too, right?

Yep. We keep 64-bit counters internally and dumb them down as required to meet the standard.

-- greg
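The read-and-reset scheme described in this post can be sketched roughly as follows. This is only an illustration, not code from OpenFabrics or any vendor: `WideCounter`, `read_and_reset`, and the sample values are hypothetical, and the sketch assumes the 32-bit hardware counter saturates ("pegs") at all ones rather than wrapping.

```python
PEGGED = 0xFFFFFFFF  # assumed: a saturating 32-bit counter sticks at this value


class WideCounter:
    """Accumulate a saturating 32-bit hardware counter into a 64-bit total."""

    def __init__(self, read_and_reset):
        # read_and_reset is a hypothetical callable that returns the current
        # 32-bit value and clears the hardware counter in one step.
        self.read_and_reset = read_and_reset
        self.total = 0          # kept modulo 2**64
        self.saturated = False  # True if any sample was pegged (counts lost)

    def poll(self):
        """Fold one hardware sample into the wide total."""
        sample = self.read_and_reset() & PEGGED
        if sample == PEGGED:
            self.saturated = True
        self.total = (self.total + sample) & 0xFFFFFFFFFFFFFFFF
        return self.total

    def as_32bit(self):
        """'Dumb down' the wide total to a pegging 32-bit view."""
        return min(self.total, PEGGED)


# Simulated hardware: three one-second samples from a busy port.
samples = iter([3_000_000_000, 3_500_000_000, 2_000_000_000])
counter = WideCounter(lambda: next(samples))
for _ in range(3):
    counter.poll()
print(counter.total)       # 8500000000
print(counter.as_32bit())  # 4294967295
```

The polling interval has to be short enough that no single sample pegs; the `saturated` flag is there to make lost counts visible rather than silent.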
Re: [openib-general] Dropping NETIF_F_SG since no checksum feature.
On Wed, Oct 11, 2006 at 04:21:41PM +0200, Or Gerlitz wrote:

> On 10/9/06, Michael S. Tsirkin wrote:
> > I'm trying to build a network device driver supporting a very large MTU (around 64K) on top of an infiniband connection, and I've hit a couple of issues I'd appreciate some feedback on:
>
> Does it mean you are implementing IPoIB RC? Cool ...

The ipath_ether device, which was submitted but rejected, has a 64k MTU using UD.

-- greg
Re: [openib-general] HCAs with and without memory
On Fri, Sep 08, 2006 at 03:49:57PM +0530, john t wrote:

> What is the difference between HCAs with memory and without memory?

And to answer for QLogic InfiniPath HCAs, we don't sell HCAs with memory. We don't need it. There's actually a small amount of memory within the single chip that makes up our HCA, and that's all that's necessary.

-- greg
Re: [openib-general] [PATCH] opensm: truncate log file when fs is overflowed
On Sun, Aug 27, 2006 at 06:28:06PM -0400, Doug Ledford wrote:

> I would definitely put the option in, and in fact would default it to *NOT* truncate.

I agree. I have never seen any other daemon with a logfile do this; why are we out to surprise the admin? The admin might want the start of the log instead of the end. And so on.

-- g
Re: [openib-general] [PATCH] osm: handle local events
On Fri, Aug 25, 2006 at 05:17:04PM +0300, Sasha Khapyorsky wrote:

> So more generic question: some application performs blocked read() from /dev/umadN, should this read() be interrupted and return error (with appropriate errno value) when the port state becomes DOWN?

If the SM gets a signal (alarm timeout) and the read() is interrupted with errno=EINTR... presumably this is not the case you had in mind.

-- greg
Re: [openib-general] A critique of RDMA PUT/GET in HPC
On Fri, Aug 25, 2006 at 10:13:01AM -0500, Tom Tucker wrote:

> He does say this, but his analysis does not support this conclusion. His analysis revolves around MPI send/recv, not the MPI 2.0 get/put services.

Nobody uses MPI put/get anyway, so leaving out analyzing that doesn't change reality much.

> A valid conclusion IMO is that MPI send/recv can be most efficiently implemented over an unconnected reliable datagram protocol that supports 64bit tag matching at the data sink. And not coincidentally, Myricom has this ;-)

As do all of the non-VIA-family interconnects he mentions. Since we all landed on the same conclusion, you might think we're on to something. Or not.

However, that's only part of the argument. Another part is that the buffer space needed to use RDMA put/get for all data links is huge. And there are some other interesting points.

> I DO agree that it is interesting reading. :-), it's definitely got people fired up.

Heh. Glad you found it interesting.

-- greg
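The buffer-space point in this post can be made concrete with a little arithmetic. The sketch below assumes a model that the post does not spell out: each process dedicates a fixed set of pre-registered RDMA buffers to every peer, versus one receive pool shared by all peers. The function names and all numbers (16 buffers per peer, 8 KiB buffers, a 256-buffer pool) are made up for illustration.

```python
def per_peer_rdma_buffers(num_procs, bufs_per_peer, buf_bytes):
    """Bytes one process pins if every peer gets its own dedicated buffers."""
    return (num_procs - 1) * bufs_per_peer * buf_bytes


def shared_receive_pool(pool_bufs, buf_bytes):
    """Bytes one process needs if all peers share a single receive pool."""
    return pool_bufs * buf_bytes


# Hypothetical job sizes; the dedicated cost grows linearly with job size
# per process, while the shared pool stays constant.
for n in (64, 1024, 16384):
    dedicated = per_peer_rdma_buffers(n, bufs_per_peer=16, buf_bytes=8192)
    shared = shared_receive_pool(pool_bufs=256, buf_bytes=8192)
    print(f"{n:6d} procs: dedicated {dedicated / 2**20:8.1f} MiB, "
          f"shared {shared / 2**20:4.1f} MiB")
```

Under these assumed numbers the dedicated scheme needs roughly 128 MiB per process at 1024 processes, which is the kind of growth the post calls huge.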
Re: [openib-general] basic IB doubt
On Fri, Aug 25, 2006 at 03:21:20PM -0400, [EMAIL PROTECTED] wrote:

> I presume you meant invalidate the cache, not flush it, before accessing DMA'ed data.

Yes, this is what I meant. Sorry!

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 01:57:39PM -0700, Sean Hefty wrote:

> > OK, great. I'm fine with people using things which are supported, but then we need the big, blinking "Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics!" sign.
>
> If an application were written to use Myrinet, would you consider it non-standard?

Er, this question is a bit existential for my taste. Myrinet has its own standards.

We're trying to create *inter-operable* hardware and software in this community. So we follow the IB standard.

Myricom is doing their own thing, although of course they have software which obeys the Ethernet, VIA, DAPL, and other standards. And I expect that if they say they obey a standard, that they do. They're good people that way.

> It's up to the application to verify that the hardware that they're using provides the required features, or adjust accordingly, and publish those requirements to the end users.

If that was being done (and it isn't), it would still be bad for the ecosystem as a whole. But, basically, that's about the same as what I proposed, quoted above.

-- g
Re: [openib-general] basic IB doubt
On Wed, Aug 23, 2006 at 04:46:52PM -0700, Roland Dreier wrote:

> Yes, Mellanox documents that it is safe to rely on the last byte of an RDMA being written last.

OK, great. I'm fine with people using things which are supported, but then we need the big, blinking "Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics!" sign.

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote:

> The other way to look at it is, the customer goes to the ISV and asks, what hardware should I buy, and the ISV says I support X version of MPI and vendor Y's hardware works with X version of MPI.

I thought the goal of InfiniBand was to create an ecosystem where you didn't have to do this. I guess I missed something somewhere. Adding undocumented requirements to a standard isn't the way to entice more people into implementing or using it.

I would challenge you to find a single ISV that would prefer a situation where some InfiniBand middleware requires things which aren't in the standard.

> So, if you want your hardware to work with basically almost any MPI today, since most of the MPIs assume this data placement ordering, then you will make your hardware so that it will guarantee this type of data delivery.

I think there's some confusion here between practicality and theory. There's no question that any IB vendor would make that kind of decision, although it will be expensive and annoying for iWarp vendors to do so. They'll feel forced to do so.

But this is bad for the community.

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 02:58:18PM -0700, Sean Hefty wrote:

> > We're trying to create *inter-operable* hardware and software in this community. So we follow the IB standard.
>
> Atomic operations and RDD are optional, yet still part of the IB standard. An application that makes use of either of these isn't guaranteed to operate with all IB hardware.

But those are examples of things which are actually written down in the standard. The example we were talking about isn't.

> But I do not see an issue with a vendor adding value beyond what's defined in the spec.

Neither do I. If you think so, you haven't understood my argument.

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:13:33PM -0700, Woodruff, Robert J wrote:

> If the feature gives them a huge advantage in performance (and it does) and all of the hardware vendors that they deal with already implement it, then yes, they will force, by de facto standard, that all other newcomers implement it or face the fact that no one will buy their hardware. It seems like that is what is happening in this case.

In this case the feature reduces performance on one HCA and increases it on another. Which shows why it's a bad idea to pick features based on a single implementation.

But you're still confusing practicality and theory. I can see why it makes practical sense for newcomers to implement this new, performance-reducing feature. But why is it theoretically good? And shouldn't it be added to the standard, before all the poor iWarp people discover the hard way that they need it?

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:28:44PM -0700, Woodruff, Robert J wrote:

> Yes, IMO, the iWarp folks and the IBTA should consider making this a requirement, but even if they do not the ISVs will still require it.

Ahah. So with all this heat and light, we agree that it should be added to the standard.

> That being said, if you can show the ISVs how they can implement their completion model faster using some other mechanism than what they do now, they would probably listen, as they are going to do what gives them the best performance, standard or not.

The other way to do it, faster on some HCAs, is to follow the standard. So, no showing needed. If MPI implementations implemented this in addition to the other, non-standard way, and automagically picked the right one, we wouldn't be having this discussion.

So, I think we're ending up in agreement. Feel free to disagree ;-)

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:37:21PM -0700, Sean Hefty wrote:

> I'm missing the standard you're using to judge what's theoretically good and bad.

Having simpler programming to get good performance is a theoretical good. Extra hacks for specific hardware are theoretically bad, and practically good only when they end up with much better performance. Silently non-standard software is bad on both counts.

> Applications are written this way today. A vendor can either:
> * Support those apps by providing the feature.
> * Require that the apps be rewritten to use their hardware.

That's what I was calling practical. It's clear what a hardware vendor will do in that case.

> Whether apps should have been written this way seems irrelevant. They are, and we should make decisions based on that, including extending the spec and/or implementation if needed.

In this case we're talking about code which can easily be changed to follow the standard, in addition to having a hack mode that's faster on 1 particular hardware implementation.

You seem to be implying that the applications are set in stone, and that their authors have no interest in making them standard-conformant. I don't think that's the case.

If there is a standard extension which can provide better performance on 1 particular hardware implementation, let's add it to the standard. But let's also make the software standard-conformant on other hardware.

-- greg
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:43:38PM -0700, Sean Hefty wrote:

> > Actually, if a hardware implementation provided the same performance (in this case latency) by polling on a CQ as one where polling on memory was guaranteed to work, the customer may actually prefer the standard implementation.
>
> Polling on a CQ involves a function call, synchronization to the CQ, and formatting a structure to return to the user. I don't see this ever being faster than polling memory.

Why don't you measure it, then? For example, an iWarp implementation is going to be slowed down if it has to reorder segments to deliver the last byte last. This expense might be more than the function call. You guess not, but...

You're also assuming that programs are only checking the last byte of the buffer. For all you know, Mellanox is delivering the whole buffer in ascending order, and the user is checking bytes in the middle, too. Which is a hazard of not-yet-specified standards extensions.

-- greg
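The hazard argued over in this thread, polling the last byte of a buffer as a completion signal, can be shown with a small simulation. This is a toy model, not any adapter's real behavior: `deliver`, `false_completions`, the segment size, and the arrival orders are all hypothetical, and a zero byte is assumed to mean "not written yet".

```python
def deliver(message, segment_size, order):
    """Place `message` into a zeroed buffer one segment at a time.

    `order` lists segment indices in arrival order. After each placement,
    yield (last_byte_set, fully_placed) as a polling receiver would see it.
    """
    buf = bytearray(len(message))
    placed = 0
    for idx in order:
        start = idx * segment_size
        chunk = message[start:start + segment_size]
        buf[start:start + len(chunk)] = chunk
        placed += len(chunk)
        yield buf[-1] != 0, placed == len(message)


def false_completions(message, segment_size, order):
    """Count steps where the last byte is set but data is still missing."""
    return sum(1 for last, done in deliver(message, segment_size, order)
               if last and not done)


msg = bytes([1]) * 16      # nonzero payload, so a zero byte means "not yet"
in_order = [0, 1, 2, 3]    # segments placed in ascending address order
reordered = [3, 0, 2, 1]   # the final segment happens to land first

print(false_completions(msg, 4, in_order))   # 0
print(false_completions(msg, 4, reordered))  # 3
```

With in-order placement the last byte is a reliable signal; once segments can land out of order, a receiver polling that byte would see "complete" while most of the buffer is still empty, which is why the behavior needs to be either guaranteed by the standard or not relied on.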
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:53:37PM -0700, Woodruff, Robert J wrote:

> If the overhead in polling the CQ rather than memory was not so high, they would have used it, but they found that it added 2us to the latency and found they could get better performance if they polled memory,

I keep on mentioning that measuring one instance of one implementation isn't necessarily a good way to evaluate a standard. Or the implementation.

We already have one example where Mellanox improved their implementation in the DDR generation such that a performance workaround -- choosing a 1k MTU for RC connections -- was no longer needed. I was pleased when we all agreed to remove defaulting to the workaround -- it was clearly the right thing to do.

-- g
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 04:13:22PM -0700, Sean Hefty wrote:

> > Why don't you measure it, then?
>
> Why? Reading a memory location directly will be faster than calling a function to read from a memory location.

... sigh. This is not true; there are quite obvious implementations where this is not true. We had one, actually, and changed it due to this issue. That was a practical choice.

> > You're also assuming that programs are only checking the last byte of the buffer.
>
> The applications I care about are polling on the last byte.

And tomorrow, the next app may depend on the behavior that all the bytes arrive in order. Slippery slope, you know.

-- greg
Re: [openib-general] Rollup patch for ipath and OFED
On Wed, Aug 23, 2006 at 06:01:32PM +0300, Michael S. Tsirkin wrote:

> So this seems to be ripping out chunks of upstream code (ipath_ht400) replacing them with something else (ipath_iba6110, ipath_iba6120.o)

To answer this piece of the question, we were acquired last April, and of course we have to rename all our devices. Since patch doesn't have a rename feature, this looks much worse than it really is.

-- greg
Re: [openib-general] basic IB doubt
On Wed, Aug 23, 2006 at 09:29:18AM -0700, Sean Hefty wrote:

> I don't believe that there is any ordering guarantee by the architecture. However, specific adapters may behave this way, and I've seen applications make use of this by polling the last memory byte for a completion, for example.

Actually, that leads me to a question: does the vendor of that adaptor say that this is actually safe? Just because something behaves one way most of the time doesn't mean it does it all of the time. So is it really smart to write non-standard-conforming programs unless the vendor stands behind that behavior?

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Tue, Aug 01, 2006 at 01:25:44PM +0200, Christoph Hellwig wrote:

> Exactly. Please don't even try to put brand names (especially if they're as stupid as this) in. We don't call our wireless stack centrino just because intel contributed to it either.

Centrino: Intel-only brand name
WiFi: trade association brand-name, not joined by all players
802.11{a,b,g}: technical name of technologies
wireless: an overly generic name that people might think should include bluetooth, wireless usb, etc etc.

OpenFabrics is not a single company brand name, it is the name of the community that's actually implementing this software stack, like 'Gnome' or 'KDE'.

BTW, I've had meetings with about 5 startups that began like, "We have an rdma device, but it's not actually RDMA as defined by that IEEE Committee." And these devices don't work like that definition. So there's considerable difference of opinion as to what RDMA means.

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 09:52:48AM -0500, Steve Wise wrote:

> rdma_* is more descriptive than something like ofv_* or of_* in my opinion. I would think the prefix should help describe the functionality being implemented: Transport Neutral RDMA.

Some functions are RDMA. Others are not. If all are called RDMA, that's misleading.

For example, in IB, there is send/receive as well as RDMA. ULPs often use send/receive for short messages. I wouldn't know anything about the non-IB parts of Open Fabrics, but I would bet that there is non-RDMA functionality in them.

The common concept is messaging, not RDMA.

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 10:24:11AM -0500, Steve Wise wrote:

> However, the IETF RDMA protocol defines SEND as well as READ, WRITE, etc. So in my mind, that's all RDMA, not just read and write.

Well, most people think RDMA means RDMA. The RDMA protocol undoubtedly defines SEND/RECV because it's needed in addition to RDMA to get good performance. But trying to call all of that RDMA is a marketing slogan.

Here's why it's a problem: I've repeatedly seen people try to use RDMA (get and put) all the time because they think it must be faster than simple send and receive... that's what the slogans tell them. But then they discover that they need to use ordinary SEND/RECV for shorter messages and for conversations with a lot of participants.

That's a technical screwup caused by the marketing slogan.

Let's pick symbol names that match our organization name. I'm a bit disappointed that several of you who were at the last Sonoma conference forgot we discussed this in a public session right before the name change. I am not on the steering committee, and wouldn't be surprised if the openrdma domain name issue was the big decider in the name choice, but the wisdom of having RDMA in our name was in doubt for more reasons than just that.

-- g
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 09:01:16AM -0700, Caitlin Bestler wrote:

> That would imply that the purpose of the openfabrics stack is to replace netdev.

I don't think it implies that at all.

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 01:25:39PM -0400, James Lentini wrote:

> I agree that the term RDMA SEND is confusing. However, the data in an RDMA SEND is deposited directly (zero copy) into the user's memory.

There are many mechanisms other than DMA or RDMA which have this property. You're confusing specification with implementation, too. When I read from a disk on modern Unix, the data is deposited into the user's memory, whether it's DMA or PIO.

The defining characteristic of RDMA is that it deposits or reads data based on an address provided by the other side, *and* that it has one-sided semantics. In ordinary messaging, data is transferred from buffers which are much less flexibly addressed, and semantics are two-sided.

> > Here's why it's a problem: I've repeatedly seen people try to use RDMA (get and put) all the time because they think it must be faster than
>
> I'm assuming RDMA get/put correspond to RDMA READ/WRITE.

Yes, get and put are what the general community have traditionally called these operations. These names emphasize the one-sided nature of the operation, unlike the new official(tm) names.

> > simple send and receive... that's what the slogans tell them. But then they discover that they need to use ordinary SEND/RECV for shorter messages and for conversations with a lot of participants.
>
> By ordinary SEND/RECV, do you mean IB/iWARP SEND/RECV or traditional (sockets) networking send(2)/recv(2)?

I was actually thinking of OpenIB SEND/RECV.

> > That's a technical screwup caused by the marketing slogan.
>
> The terms RDMA read and RDMA write are technically accurate.

It seems we have different definitions of technical, then. Slogans don't make good engineering.

> Perhaps someone can think of a better prefix. How about dav_ (direct access verb)?

That's much better than rdma_, but do you really think the Linux folks are going to be happy about OpenFabrics calls with a prefix that doesn't look anything like Open Fabrics?

-- greg
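The one-sided versus two-sided distinction drawn in this post can be illustrated with a toy in-process model. Nothing here is a real verbs API: `Node`, `post_recv`, `send_to`, `put`, and `get` are hypothetical names, and memory registration, keys, and completions queues are deliberately left out.

```python
class Node:
    """Toy endpoint: exposed memory plus a queue of posted receive buffers."""

    def __init__(self, size):
        self.memory = bytearray(size)  # region a peer may address directly
        self.posted = []               # receive buffers the owner has posted
        self.completed = []            # payloads delivered by two-sided sends

    # Two-sided: the target decides where data lands by posting a buffer,
    # and a send fails if the target has not taken part.
    def post_recv(self, length):
        self.posted.append(bytearray(length))

    def send_to(self, target, data):
        if not target.posted:
            raise RuntimeError("receiver not ready: no posted buffer")
        buf = target.posted.pop(0)
        if len(data) > len(buf):
            raise ValueError("message longer than posted buffer")
        buf[:len(data)] = data
        target.completed.append(bytes(buf[:len(data)]))

    # One-sided: the initiator supplies the remote address, and the target
    # runs no code at all.
    def put(self, target, addr, data):
        if addr < 0 or addr + len(data) > len(target.memory):
            raise IndexError("put outside exposed memory")
        target.memory[addr:addr + len(data)] = data

    def get(self, target, addr, length):
        if addr < 0 or addr + length > len(target.memory):
            raise IndexError("get outside exposed memory")
        return bytes(target.memory[addr:addr + length])


a, b = Node(16), Node(16)

a.put(b, 4, b"rdma")        # b takes no action
print(a.get(b, 4, 4))       # b'rdma'

b.post_recv(8)              # b must participate before a can send
a.send_to(b, b"msg")
print(b.completed)          # [b'msg']
```

In this model the sender of a two-sided message never names an address on the target, while `put` and `get` work only because the initiator knows one; that is the property the names get and put emphasize.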
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 10:32:05AM -0700, Roland Dreier wrote:

> Greg, what would be your suggestion of a more generic (not IB-specific) replacement of the libibverbs name and ibv_ prefix?

Anything that makes it clear that it's an Open Fabrics call. Which is what our organization and software stack are called.

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 10:39:49AM -0700, Sean Hefty wrote:

> Or maybe just verb. Would that be better?

That's a good one.

> IMO, the underlying issue with using 'rdma' is that a software based solution doesn't actually do 'rdma'. I think this is Greg's complaint, and why he uses the terms 'get/put' instead of rdma read/write.

Actually, no, it isn't that. It's philosophical, a reaction to the marketing over-hyping of RDMA.

I'm stunned that you've never heard of put and get! Never used CRAY SHMEM or any one-sided interconnect, I guess? MPI uses those terms, too.

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 10:45:39AM -0700, Roland Dreier wrote:

> No other drivers have a brand name and it's pretty silly trying to brand IB/iWARP/RDMA/whatever drivers.

I don't see this as branding or marketing. I see it as trying to come up with a name that's accurate.

What do you think of verb_?

-- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 10:54:55AM -0700, Caitlin Bestler wrote: Trying to characterize RDMA as consisting *solely* of messages that identify target buffers in the message is off target. You're using circular arguments: Because one particular subset of the RDMA community defines RDMA in fashion X, it is off target to define RDMA in any other fashion. One-sided vs. two-sided is important. You've completely left that out. Well, no matter: we don't need to argue about the definition of RDMA to solve the question of what the transport-neutral prefix should be. I have no doubt that we would never agree about the definition. Now if you can come up with a short acronym that conveys that then I am fine with using it. Try now if *someone* can come up with. How did you like verb_ ? -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 02:17:20PM -0400, James Lentini wrote: Dusting off my copy of vipl.h, circa 1996, I see that these operations were called RDMA READ/WRITE in VIA. Yes, and that's the predecessor to IB, so that's no surprise that it uses the same term. The IETF RDMA people also use it. Do you think that's all there is to RDMA? I am not surprised that as a storage guy, that's what you're most familiar with. -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 01:03:16PM -0500, Steve Wise wrote: I agree. Plus we already have precedent for rdma_ with the RDMA CMA... That's precedent about like I used the term 'wimps' in a poster paper once, so now you should allow me to use 'wimps' in my Astrophysical Journal article. True story. Weakly Interacting Massive Particles. Which in turn spawned MACHOs, MAssive Compact Halo Objects. Fun, but not the way to do software engineering. Hint: did you ever hold a discussion as to whether or not that was the right transport-neutral name? -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 11:18:16AM -0700, Roland Dreier wrote: My gut reaction is negative. The whole idea of verbs is a bit of technical jargon that makes no sense unless you've lived in the RDMA world for a while, Given the way you are defining RDMA, I'm not surprised at the conclusion you are coming to. We have been calling these the transport neutral verbs, btw. How about ofabric_ ? -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 01:34:41PM -0500, Steve Wise wrote: You seem to be the only one objecting to rdma_ and/or rdmav_. At Sonoma, I was not the only one. I forget, were you there? I've listened to your arguments for why you think rdma is a bad name, and I'm not convinced. I'm not surprised, I did not expect to convince everyone. However, it is not the case that you get to pick the name by yourself. Nor I. -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 11:31:33AM -0700, Roland Dreier wrote: Greg> Hint: did you ever hold a discussion as to whether or not that was the right transport-neutral name? Jeeze, Sean posted the RDMA CM code to three mailing lists for review about 100 times. Did you ever complain about the naming convention? Roland, I'm not sure what to say. I suspect you think you're being constructive, but I'm getting tired of being shot at for being the messenger. This is an issue important enough that having an explicit discussion is a good idea. It shouldn't have come up as part of a patch. And it wasn't clear to me that the RDMA CM was intended to be part of the transport neutral verbs. If you look at the subject of this thread, it's clear that it's about transport neutral verbs. So I looked, and was surprised. -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
On Mon, Jul 31, 2006 at 12:04:21PM -0700, Roland Dreier wrote: Unless someone else has a problem with the rdmav_ name then I think we should let this die. Sounds like a call for an open discussion on it, with a proper subject line, even. And asking outside of openib-general. Which is what I am suggesting. -- greg
Re: [openib-general] [PATCH 0/6] Tranport Neutral Verbs Proposal.
This patchset is a proposal to create new APIs and data structures with transport neutral names. We named ourselves OpenFabrics instead of OpenRDMA for a reason. Did I miss some point where we decided that we would use RDMA as a transport neutral name in the source code? -- greg
Re: [openib-general] MPI error when using a system call in mpi job.
On Tue, Jun 13, 2006 at 05:11:47PM -0700, Ira Weiny wrote: After some tracking down he found that apparently if he used a system call [int system(const char *string)] the next MPI command will fail. Are you sure MVAPICH supports fork()? It is not unusual for MPI implementations to not support fork(). system() uses fork(). -- greg
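To make the system()/fork() point concrete, here is a minimal sketch in plain C, not tied to MVAPICH or any MPI library. It only shows that system() runs its command in a separate child process whose exit status comes back to the caller. The helper name run_and_get_exit_code is hypothetical and exists purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run a shell command through system() and return the
 * child's exit code, or -1 if no child could be created or it did not exit
 * normally. */
int run_and_get_exit_code(const char *cmd)
{
    int status;

    assert(cmd != NULL);

    /* system() creates a child process that runs "sh -c cmd" and waits for
     * it. That child creation is what a fork-unaware MPI library may not
     * cope with. */
    status = system(cmd);

    if (status == -1)
        return -1;              /* the child could not be created */
    if (!WIFEXITED(status))
        return -1;              /* e.g. the child was killed by a signal */
    return WEXITSTATUS(status);
}
```

Calling run_and_get_exit_code("exit 3") from an MPI program would therefore create a new process in the middle of the job, which is the situation the question above is probing.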
Re: [openib-general] [PATCH] osm: fix num of blocks of GUIDInfo GetTable query
So this is a _critical_ bugfix? Auuuch it is there! My mistake. So please apply the patch to the OFED 1.0 branch too. BTW: Does osmtest -f a exercise this query on OFED 1.0? Huh? What's https://openfabrics.org/svn/gen2/branches/1.0/src/userspace/management/osm/opensm/osm_sa_guidinfo_record.c -- Hal Eitan Zahavi Senior Engineering Director, Software Architect Mellanox Technologies LTD Tel:+972-4-9097208 Fax:+972-4-9593245 P.O. Box 586 Yokneam 20692 ISRAEL -Original Message- From: Hal Rosenstock [mailto:[EMAIL PROTECTED] Sent: Sunday, June 11, 2006 12:22 AM To: Eitan Zahavi Cc: OPENIB Subject: Re: [PATCH] osm: fix num of blocks of GUIDInfo GetTable query Eitan, On Thu, 2006-06-08 at 07:24, Eitan Zahavi wrote: Hi Hal I'm working on passing osmtest check. Found a bug in the new GUIDInfoRecord query: If you had a physical port with zero guid_cap the code would loop on blocks 0..255 instead of trying the next port. I am still looking for why we might have a guid_cap == 0 on some ports. This patch resolves this new problem. osmtest passes on some arbitrary networks. Eitan Signed-off-by: Eitan Zahavi [EMAIL PROTECTED] Thanks. Applied to trunk only. Let me know if it also should be applied to 1.0. -- Hal
Re: [openib-general] RE: Failed multicast join withnew multicast module
On Thu, Jun 08, 2006 at 09:49:35AM -0700, Sean Hefty wrote: What if we started with something like the following compliance statement, and tried to add this to the spec? An SM, upon becoming the master, shall respect all existing communication in the fabric, where possible. Isn't this a quality of implementation issue? It's hard to imagine a SM author not realizing this is a good thing to do. If it was in the standard, how would you test it for compliance? -- g
Re: [openib-general] which way to port squid for supporting large amount concurrent connections
On Mon, May 22, 2006 at 07:56:03PM -0700, zhu shi song wrote: I won't wait sdp OK. I hope to use another method to port squid. How about IPoIB? -- greg
Re: [openib-general] RFC: detecting duplicate MAD requests
On Fri, Apr 28, 2006 at 11:23:44PM -0700, Sean Hefty wrote: The proposal is to only discard duplicate requests while a response to the first request is being generated. Ah. Does that happen that often? -- greg
Re: [openib-general] Re: [PATCH 04/16] ehca: userspace support
Do you really need this heavy debug logging in the first place? You can use kprobes for arbitrary run-time inspection anyway, so logging everything seems wasteful. The problem I see with kprobes is that you have to set several kernel configuration options (e.g. CONFIG_KPROBES, CONFIG_DEBUG_INFO, etc.) at compile time to use it. Same problem with pr_debug(). Note that one usage of debug code is for a vendor to ask a customer to turn it on to figure out weird problems that the vendor can't replicate. Customers are more likely to cooperate if the effort is small... rebuilding the kernel is not a small effort compared to turning on debug that's already compiled in. -- greg
Re: [openib-general] RFC: detecting duplicate MAD requests
On Fri, Apr 28, 2006 at 03:20:13PM -0700, Sean Hefty wrote: I'd like to propose that the MAD layer detect duplicate requests. Sean, You can't add this kind of thing piecemeal to a protocol and have it work. If the sender doesn't see a response (perhaps the response was lost, or was slow coming), and sends another MAD, this 2nd MAD will have a different sequence number. How does the recipient know it's the same request? If the response was lost the first time, eating the 2nd MAD without sending a response will result in another timeout and a 3rd MAD... so maybe the recipient remembers the response and sends it again. Will that work? Well, no, it's not guaranteed, because the sender may reject a stale response received after sending the 2nd MAD... Really, it's up to the MAD client to deal with duplicates in its own way. And yes, this class of issues shows up in practice. Ask anyone who's ever worked on a large distributed system. Execute exactly once semantics require end-to-end design. -- greg
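One way a client-level protocol might deal with duplicates end to end is sketched below. This is an illustration only, not the MAD layer or any OpenIB API: it assumes the client puts its own request identifier in the payload and reuses it on every retransmission, so the receiver can recognize a retry even though the transport-level sequence number changed. All names (reply_cache, handle_request, CACHE_SLOTS) are hypothetical, and the "work" performed is a stand-in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 8   /* arbitrary size chosen for the sketch */

/* Remembers the replies to the most recent requests, keyed by a
 * client-chosen identifier that stays the same across retransmissions. */
struct reply_cache {
    uint64_t id[CACHE_SLOTS];
    int      result[CACHE_SLOTS];
    int      used[CACHE_SLOTS];
    unsigned next;          /* next slot to overwrite (round robin) */
    unsigned executions;    /* how many requests were actually executed */
};

void cache_init(struct reply_cache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Execute a request at most once per identifier while it is still cached.
 * A retransmitted request gets the saved reply instead of running again. */
int handle_request(struct reply_cache *c, uint64_t request_id, int arg)
{
    unsigned i;

    assert(c != NULL);

    for (i = 0; i < CACHE_SLOTS; i++)
        if (c->used[i] && c->id[i] == request_id)
            return c->result[i];        /* duplicate: replay saved reply */

    c->executions++;                    /* first sighting: do the work */
    i = c->next;
    c->next = (c->next + 1) % CACHE_SLOTS;
    c->id[i] = request_id;
    c->result[i] = arg * 2;             /* stand-in for the real operation */
    c->used[i] = 1;
    return c->result[i];
}
```

The cache is bounded, so a duplicate that arrives after its entry has been overwritten would be executed again. Deciding how long replies must be remembered is part of the end-to-end design, which is why a generic layer cannot settle it alone.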
Re: [openib-general] [PATCH] opensm: handle stdout and stderr in osm_log
On Thu, Apr 27, 2006 at 01:15:57AM +0300, Sasha Khapyorsky wrote: There is small patch for osm_log, this provide possibility to drop log output to stdout or stderr. Isn't the Unix convention to use -- to mean stdout? Or you can use /dev/fd/{0,1}... -- greg
Re: [openib-general] Re: TSO and IPoIB performance degradation
On Wed, Apr 26, 2006 at 11:13:24PM -0500, Troy Benjegerdes wrote: David is right. If you care about performance, you are already using SDP or verbs layer for the transport anyway. If I am going to be doing IPoIB, it's because eventually I expect the packet might get off the IB network and onto some other network and go halfway across the country. This is going to be a surprise to lots of people who want high-speed gateways from IB to ethernet -- many clusters connect to fileservers and other performance-sensitive gizmos that way. -- greg
Re: [openib-general] Re: TSO and IPoIB performance degradation
On Thu, Apr 27, 2006 at 04:22:40PM -0700, Grant Grundler wrote: Anything preventing such a gateway from routing SDP to ethernet? Those gateways obviously will grok IB protocols. I'm asking because I don't understand/know if there is a real barrier to an IB - ethernet gateway _without_ IPoIB. I don't know if a SDP to ethernet gateway even exists, but I do know that it's a lot more work than just an IPoIB to ethernet gateway -- the gateway is going to have to pass all its data through a TCP stack. So I would expect SDP to ethernet to not run very fast, especially on a gateway with lots of streams going. -- greg
Re: [openib-general] RFC userspace / MPI multicast support
On Thu, Apr 20, 2006 at 01:00:43PM -0400, Hal Rosenstock wrote: Also, is IPoIB always setup when running MPI ? That's an easy one: No. -- greg
Re: [openib-general] Help with CONFIG_PCI_MSI in the kernel
On Mon, Apr 03, 2006 at 09:36:00PM -0700, Grant Grundler wrote: The only evidence I have is one AMD chipset is buggy WRT MSI. Grant, I know about that case, the kernel disables stuff if it sees an AMD 8131 due to a bug. What I am referring to was IPoIB performance on Mellanox HCAs being improved with MSI. I figure if it's gotten this far with Red Hat turning it off, concrete examples are in order. -- greg
[openib-general] Help with CONFIG_PCI_MSI in the kernel
Red Hat has started turning off CONFIG_PCI_MSI in their kernels (FC5 and the latest FC4 update). I remember a while back there was a discussion about how MSI made the Mellanox HCA run faster. Can someone please add some concrete details about this to the bug? Thanks. http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=186520 -- greg
Re: [openib-general] IPoIB destructor patches
On Fri, Mar 31, 2006 at 08:37:58AM -0800, Roland Dreier wrote: If you want to forward the necessary patches to [EMAIL PROTECTED] for inclusion in 2.6.16.x releases, that would be great. It would be nice to have a single person responsible for spotting things that qualify for the stable kernel, and shepherd them into it. This one is a good example, something that hurts IB 1.0 on some new hardware. I'm not volunteering Roland; anyone else want to volunteer? Here are the criteria: http://www.kroah.com/log/2005/03/09/ and a note that 2.6.16.y will be more relaxed than the above: http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1274.html -- greg
Re: [openib-general] Re: Problem configuring ipath_ether
On Mon, Mar 27, 2006 at 08:18:28AM -0800, Shirley Ma wrote: Is it a good idea to send such a big message size using UD mode? The usual reason for not doing this with UDP is that it turns a small amount of packet loss into lots more lossage. This is a real issue with ethernet when it drops packets when there is congestion. With IB the error rate is consistently low. It could be that other HCAs don't like lots of UD messages in a row, either sending or receiving them efficiently. We never tried, as you can see the code is not written on top of verbs. -- greg
Re: [openib-general] Problem configuring ipath_ether
On Fri, Mar 24, 2006 at 10:38:35PM -0800, Matt Leininger wrote: That could be. I'll check. I thought ipath_ether was supposed to work when IPoIB was enabled. In general, our non-IB protocols work together with OpenIB. This item is the one thing that doesn't work, and it's due to an SMA bug that we haven't fixed yet. -- greg
Re: [openib-general] Re: Problem configuring ipath_ether
On Sat, Mar 25, 2006 at 10:36:10PM +0200, Michael S. Tsirkin wrote: In what sense is it more like an ethernet device than IPoIB? Since IP over IB just puts an IP packet inside a UD packet without overhead, what wire protocol changes can thinkably be done, while still using InfiniBand protocols, to improve performance? Michael, Given that you're objecting to it being included in the kernel, I'd expect that you would have already reviewed the code. -- greg
Re: [openib-general] Problem configuring ipath_ether
On Thu, Mar 23, 2006 at 11:50:18PM -0800, Matt Leininger wrote: I have the ipath driver up, running, and working with IPoIB. I'm using 2.6.16 with svn 5938. The ipath_ether comes up as eth2. I can set the netmask and broadcast, but when I try to set the ip address for this device I get the following error: Matt, ipath_ether uses InfiniBand protocols but is not the same as IPoIB. So it's better to ask [EMAIL PROTECTED] about it. (The reason we ship it at all is that it's faster than IPoIB -- this necessitates a different wire format -- and it behaves more like an ethernet device than IPoIB.) My guess is that you're trying to use it with ib_mad (the in-kernel SMA). Do you see anything in /var/log/messages like ipath_ether_open timed out waiting for MLID? The relevant entry in RELEASE-NOTES.txt under the KNOWN LIMITATIONS section is: * IPoIB and ipath_ether do not work together, as ipath_ether requires the InfiniPath SMA, which needs to be disabled to use OpenIB. This will be fixed in a future release. Apologies for this not being more obvious. -- greg
Re: [openib-general] IPoIB broadcast MC group membership
On Tue, Feb 21, 2006 at 11:40:53PM -0800, Fabian Tillier wrote: You'd have to make the group 1X. Note that the group being 1X doesn't limit unicast traffic to 1X rates, since the rate for unicast traffic would be set based on the rate reported in the path records for the various endpoints. So 4X SDR and 4X DDR nodes would have to set their inter-packet delay for the broadcast group to end up with a 1X packet injection rate. So, basically, MVAPICH doesn't have code that does either the group creation properly when there is a mixture of HCA bandwidths, or limit the packet injection rate. And IPoIB could violate this rule depending on how user programs use it, e.g. if I did a lot of broadcasting, I could easily exceed 1X's bandwidth. So this is more than just a "fix OpenSM" issue. It's more of a "fix the spec" issue, if I'm understanding it correctly. -- greg
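The arithmetic behind the 1X group discussion can be sketched as follows. This is only an illustration of the reasoning in the thread, not OpenSM, MVAPICH, or IPoIB code. It assumes nominal signalling rates of 2.5 Gb/s for 1X SDR, 10 Gb/s for 4X SDR, and 20 Gb/s for 4X DDR, and the function names group_rate and slowdown_factor are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Nominal signalling rates in Mb/s (assumed values for the sketch). */
enum {
    RATE_1X_SDR = 2500,
    RATE_4X_SDR = 10000,
    RATE_4X_DDR = 20000
};

/* A group that every member can receive from cannot run faster than its
 * slowest member, so the usable group rate is the minimum member rate. */
unsigned group_rate(const unsigned *member_rates, size_t n)
{
    size_t i;
    unsigned min;

    assert(member_rates != NULL && n > 0);

    min = member_rates[0];
    for (i = 1; i < n; i++)
        if (member_rates[i] < min)
            min = member_rates[i];
    return min;
}

/* How many times slower than its own link rate a sender has to inject
 * packets into the group so that it does not overrun the slowest member. */
unsigned slowdown_factor(unsigned link_rate, unsigned grp_rate)
{
    assert(grp_rate > 0 && link_rate >= grp_rate);
    return link_rate / grp_rate;
}
```

With one member of each kind, the group rate comes out as 1X, a 4X SDR sender has to slow down by a factor of 4 and a 4X DDR sender by a factor of 8. How that factor is encoded as an inter-packet delay is defined by the InfiniBand specification and is not modelled here.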
Re: [openib-general] IPoIB broadcast MC group membership
Is this a correct summary of this thread?
* IPoIB uses an InfiniBand multicast group to fake ethernet broadcast
  * This is optional, I'm not sure what functionality is lost without it
* MVAPICH uses a multicast group for some MPI collectives
  * This can be turned off by setting env var DISABLE_HARDWARE_MCST
* An IB multicast group has to use ports of the same speed
  * This one was a surprise to me
Ergo, when you mix 1X, 4X SDR, and 4X DDR hosts, it behaves differently from a homogeneous network. -- greg
Re: [openib-general] IPoIB broadcast MC group membership
On Tue, Feb 21, 2006 at 05:21:33PM -0800, Roland Dreier wrote: No, but an IB multicast group has a speed associated to it. This is to allow, say, a 4X port sending multicast packets to use the right static rate to avoid overrunning a 1X port that is also a member of the group. Thanks for the clarification, Roland, I see that the previous discussion had successfully confused me. So is it the case that the create and the joins to a multicast group have to specify the correct speed? And the problem then would be that an IB host at boot doesn't know the right speed? Am I getting colder or warmer? -- greg
Re: [openib-general] IPoIB broadcast MC group membership
On Tue, Feb 21, 2006 at 08:17:02PM -0800, Fabian Tillier wrote: The node joining or creating the multicast group doesn't need to specify the rate - the SA can figure out the rate to use based on the requestor (for creation), or validate that the requestor supports the existing group's rate (for joining). Um, but that gets back to my point: I want 1X, 4X SDR, and 4X DDR nodes running IPoIB to share a multicast group. Are you saying this can be done by making the group a 1X group? Or that it's impossible to have such a group? Or that everyone would have to drop to 1X to make such a group? -- greg
Re: [openib-general] OpenIb and Ada95 binding
On Tue, Jan 31, 2006 at 03:53:22PM +0100, Xavier Grave wrote: I'm using Ada95 and I would like to implement a thin binding in order to use infiniband. If MPI meets your needs -- do you need recovery from failures? -- then you can use one of the existing MPI/ADA bindings. -- greg
Re: [openib-general] SA cache design
Since no one's really answered this yet: Many sysadmins are not going to want to install a relational database to run an SA cache. So I'd stick to Berkeley DB if I were you. -- greg
Re: [openib-general] RFC MPI and app. requirements of OpenIB
On Thu, Dec 22, 2005 at 12:14:44PM -0800, Sean Hefty wrote: Can MPI operate with unreliable multicast support? Does MPI plan on using IB multicast? Given the large number of MPI implementations over IB, I don't think there's a single answer. -- greg
Re: [openib-general] RE: [PATCH] Opensm - fix segfault on exit - cont.
On Tue, Dec 20, 2005 at 09:11:12AM +0200, Yael Kalka wrote: - if (p_ur->signal) + if (p_ur->signal != NULL) Aren't these 2 statements required to execute the same according to the C standard? I wrote a tiny test program and gcc4.0.0 as distributed with Fedora Core 3 generated identical assembly code for both. -- greg
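A tiny test program along the lines mentioned above might look like the sketch below. It is not the original test, and struct user_req is a hypothetical stand-in for whatever p_ur points to in OpenSM. Only the two forms of the condition are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the structure behind p_ur in the patch. */
struct user_req {
    void *signal;
};

/* Form used before the patch: the pointer itself is the condition. */
int has_signal_implicit(const struct user_req *p_ur)
{
    assert(p_ur != NULL);
    if (p_ur->signal)
        return 1;
    return 0;
}

/* Form used after the patch: explicit comparison against NULL. */
int has_signal_explicit(const struct user_req *p_ur)
{
    assert(p_ur != NULL);
    if (p_ur->signal != NULL)
        return 1;
    return 0;
}
```

In C an if statement takes its branch when the controlling expression compares unequal to 0, and for a pointer that comparison is against the null pointer, so both functions return the same value for every input. The patch therefore changes style, not behaviour.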
Re: [Openib-promoters] Re: [openib-general] Next workshop dates? Ideas for agenda???
On Mon, Dec 12, 2005 at 09:55:03AM -0800, Bill Boas wrote: Now I believe that when a Linux distribution, an IB company or a Tier One OEM decides that is a version of the code that they will support, then that is a release. Why don't we imitate the Linux kernel process? OpenIB has to follow a sane process of innovation followed by stabilization and bug-fixing in order for the IB companies and Tier 1s to be able to make solid releases. -- greg
Re: [openib-general] [ANNOUNCE] Contribute RDS ( ReliableDatagramSockets) to OpenIB
Caitlin, Can you please use the standard quoting style? I can't tell which comments are yours. Thanks. -- greg
Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB
On Tue, Nov 08, 2005 at 01:08:13PM -0800, Michael Krause wrote: If an application takes any action assuming that send complete means it is delivered, then it is subject to silent data corruption. Right. That's the same as pretty much all other *transport* layers. I don't think anyone's asserting RDS is any different: you can't assume the other side's application received and acted on your message until the other side's application tells you that it did. -- greg
Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB
On Wed, Nov 09, 2005 at 12:18:28PM -0800, Michael Krause wrote: So, things like HCA failure are not transparent and one cannot simply replay the operations since you don't know what was really seen by the other side unless the application performs the resync itself. I think you are over-stating the case. On the remote end, the kernel piece of RDS knows what it presented to the remote application, ditto on the local end. If only an HCA fails, and not the sending and receiving kernels or applications, that knowledge is not lost. Perhaps you were assuming that RDS would be implemented only in firmware on the HCA, and there is no kernel piece that knows what's going on. I hadn't seen that stated by anyone, and of course there are several existing and contemplated OpenIB devices that are considerably different from the usual offload engine. You could also choose to implement RDS using an offload engine and still keep enough state in the kernel to recover. -- greg
Re: [openib-general] [ANNOUNCE] Contribute RDS (ReliableDatagramSockets) to OpenIB
On Wed, Nov 09, 2005 at 01:57:06PM -0800, Michael Krause wrote: What you indicate above is that RDS will implement a resync of the two sides of the association to determine what has been successfully sent. More accurate to say that it could implement that. I'm just kibitzing on someone else's proposal. This then implies that the reliability of the underlying interconnect isn't as critical per se as the end-to-end RDS protocol will assure that data is delivered to the RDS components in the face of hardware failures. Correct? Yes. That's the intent that I see in the proposal. The implementation required to actually support this may not be what the proposers had in mind. This sort of message service, by the way, has a long history in distributed computing. -- greg
Re: [openib-general] [ANNOUNCE] Contribute RDS (Reliable DatagramSockets) to OpenIB
On Fri, Nov 04, 2005 at 11:54:27AM -0800, pandit ib wrote: Since a UDP application assumes the underlying transport is unreliable it should not have any problems running on RDS. On getting EWOULDBLOCK it will simply retry. Most existing UDP applications do not expect a return error code of EWOULDBLOCK. To begin with, the Linux manpages say that you have to specify non-blocking to get this error in the first place. Another possibility is ENOBUFS, for which the manpage gives the advice "Normally, this does not occur in Linux. Packets are silently dropped when a device queue overflows." There was a somewhat famous case showing lack of error handling in UDP applications under Linux, where Alan Cox decided to read the RFCs differently from everyone else, and caused an ICMP 'port unreach' to later cause the same sending socket to return an error for a send to some unrelated host. Many UDP-using apps considered this a fatal error. This was ~ 7 years ago, and this misfeature caused enough anger that it was corrected soon after Alan stopped owning the TCP/UDP stack. In short, I'm not sure there would be much benefit for giving existing UDP-expecting apps a reliable, ordered stream of datagrams. The only apps which would see a benefit are those that know that they can turn off their reliability and ordering code, and handle backpressure explicitly. Those folks would benefit from a simpler programming interface than verbs. -- greg
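Handling backpressure explicitly could look something like the sketch below. It is plain C against the standard errno values, not RDS code, and the enum and function names are hypothetical. Which errors count as retryable is a policy choice made here for illustration: EWOULDBLOCK/EAGAIN, ENOBUFS, and EINTR are retried, and everything else is treated as fatal.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of the outcome of a datagram send. */
enum send_action {
    SEND_OK,        /* the datagram was handed to the stack */
    SEND_RETRY,     /* transient condition: back off and try again */
    SEND_FATAL      /* anything else: give up on this socket */
};

/* rc is the return value of send()/sendto(); err is errno captured right
 * after the call (only meaningful when rc is negative). */
enum send_action classify_send_result(long rc, int err)
{
    if (rc >= 0)
        return SEND_OK;

    assert(err != 0);   /* a failed send is expected to set errno */

    switch (err) {
    case EAGAIN:        /* non-blocking socket, nothing could be queued */
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:   /* same condition where the two values differ */
#endif
    case ENOBUFS:       /* output queue full */
    case EINTR:         /* interrupted before anything was sent */
        return SEND_RETRY;
    default:
        return SEND_FATAL;
    }
}
```

An application that falls into the default branch on any unexpected error behaves like the UDP programs described above, which treated a surprising error return as fatal. Only code written with the retry cases in mind gets any use out of a transport that reports backpressure.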
Re: [openib-general] Re: Opensm - osm_sa_path_record.c - variable declaration
The Windows compiler does not allow declarations that are not at the beginning of the function, so I would like to have it changed. We can either move the declaration to the beginning of the function, or add {} around the declaration. Isn't there a gcc flag to convince it to give an error in this case? If so, adding it to the OpenIB makefile would be a good idea. -- greg
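The two fixes proposed in the quoted text can be sketched like this. The functions are hypothetical examples, not the code from osm_sa_path_record.c. Both compile as C89, which is what a compiler that rejects mixed declarations and statements expects.

```c
#include <assert.h>
#include <stddef.h>

/* Fix 1: move every declaration to the beginning of the function. */
int sum_decl_at_top(const int *v, int n)
{
    int total;      /* declared before the first statement */
    int i;

    assert(v != NULL || n == 0);
    total = 0;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Fix 2: wrap the late declaration in {} so that it begins a new block. */
int sum_decl_in_block(const int *v, int n)
{
    assert(v != NULL || n == 0);    /* a statement comes first here */
    {
        int total = 0;              /* legal: first thing in this block */
        int i;

        for (i = 0; i < n; i++)
            total += v[i];
        return total;
    }
}
```

On the gcc side, -Wdeclaration-after-statement warns about a declaration that follows a statement in a block, and combining it with -Werror turns that warning into a build failure. Whether the OpenIB makefiles should carry the flag is the open question in the message above.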
Re: [openib-general] Infinipath support in OpenIB
On Thu, May 19, 2005 at 12:46:04PM -0700, Tom Duffy wrote:

> The article states: 'InfiniPath will also support the OpenIB software stack providing full InfiniBand compliance.' Also interesting that OpenIB is suddenly an industry standard. ;-)

Tom,

There's a reason for that -- when we asked customers, they asked us to support OpenIB in particular. So congratulations, not only is OpenIB a part of the IB ecosystem, it's one that everyone seems to want to support.

We didn't want to invent a new flavor of verbs or DAPL anyway. No point to it, no benefit to anyone.

-- greg
Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation
On Tue, May 03, 2005 at 11:43:25AM -0700, Andy Isaacson wrote:

> [1] You might want to allow the child to start a completely new RDMA context, but I don't see that as necessary.

Some people use a hybrid OpenMP+MPI model, so some MPI implementations want to fork and have the child open a new communications context. This is fairly rare, but unfortunately is a checklist item for most MPI implementations, and this will get worse over time.

-- greg
Re: [openib-general] Re: RDMA memory registration
On Fri, Apr 29, 2005 at 12:33:54PM -0700, Grant Grundler wrote:

> Being mostly clueless about Quadrics implementation, I'm probably missing something that makes Quadrics an MMU but not the IB variants. Can someone clue me in please?

As far as I can tell it's mostly a marketing distinction. Many Quadrics customers run with memory registration, and Mellanox could probably alter their firmware to not require registration. Myricom certainly can, and in fact Patrick Geoffray claimed they were doing so in their MX software.

The only one I know of that isn't that flexible is PathScale's InfiniPath. Ours is a pure hardware mechanism, but it requires memory registration and is clearly not an MMU. Confused yet?

-- greg
Re: [openib-general] Re: RDMA memory registration
Todd> But that implies the hardware has an MMU and it also puts an
Todd> interrupt in the path per page sent.

Well, there's one interrupt per non-resident page sent. But nearly all of the time the page will be present.

It doesn't imply that there's an MMU, either. I know that Myricom uses a little lookup routine in software on their nic, which most people wouldn't call an MMU. I don't know what Mellanox does for this, they don't talk much about what's hardware and what's software on their nic. I think Quadrics actually uses the TLB of their risc cpu on their nic for this lookup, but that's just a guess.

-- greg
Re: [openib-general] IB Address Translation service
On Fri, Mar 04, 2005 at 11:58:33AM -0800, Tom Duffy wrote:

> In any event, I think being able to plop an IB network in an Ethernet world will require things like RARP to work. If there is no spec now, it should be written.

Much more important is understanding the role of RARP in the ethernet world. It is *not* something you do to find _someone else's_ IP addr from their MAC addr. It's what you do to find your _own_ IP addr because you're booting.

Ethernet protocols such as IP include enough IP information to talk back to someone who sent you a packet. So you don't need to find out an IP addr from a MAC for remote nodes on a regular basis. Instead, you find out a MAC addr from an IP address, which is ARP. RARP is little used now that DHCP is popular.

Now it would be nice for ethernet broadcast packets to just work(tm) with IPoIB. ping -b is an example of a user-level program that generates a broadcast packet. DHCP clients also generate such packets, and DHCP servers listen for them. Getting a RARP client and server to work ought to be the same as a DHCP client and server.

-- greg
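As an illustration of the kind of user-level broadcast traffic mentioned above, here is a minimal sketch using Python's standard socket module. The helper names open_broadcast_socket and send_broadcast are hypothetical, and the sketch says nothing about whether a given IPoIB driver actually delivers the packet; it only shows what the application side does.

```python
import socket

# Limited broadcast address: "every host on the local link".
BROADCAST_ADDR = "255.255.255.255"


def open_broadcast_socket():
    """Create a UDP socket that is permitted to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The stack normally rejects sends to a broadcast address
    # unless SO_BROADCAST has been set on the socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def send_broadcast(sock, payload, port):
    """Send one datagram to the local broadcast address on the given port."""
    return sock.sendto(payload, (BROADCAST_ADDR, port))
```

A DHCP client does roughly this when it first looks for a server, which is why broadcast support in the underlying network matters more than a separate RARP spec.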
Re: [openib-general] another shutdown reminder
On Wed, Jan 05, 2005 at 05:08:31PM -0500, Hal Rosenstock wrote:

> What TZ ? PST ?

Ah, if only you knew how funny that question is! 2 beer penalty and loss of down for failure to appreciate the uniqueness of Sandia's geography.

-- greg