Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule
Hi,

Let me first explain why the current OFED release does not support SRP-HA on RHEL4.

SRP-HA uses Device Mapper multipath. The multipath prerequisites include a udev version higher than 050, while RHEL4 distributions include udev 039. udev is an important part of the distribution, and I do not think users will be ready to upgrade it in order to have SRP-HA.

To the best of my knowledge, the main reason multipath needs at least udev 050 is that it uses the RUN option (this option executes its given parameter after the device exists). Multipath uses the RUN option to execute kpartx, which handles the partitions of the new device. SRP-HA also uses the RUN option to execute the multipath command.

I have an idea for how to overcome this problem: I want to implement an srp-multipath-daemon. This daemon will receive kpartx and multipath requests through a shared message queue. udev will use the PROGRAM option (which executes its given parameter immediately, before the device exists) to post a request to this shared message queue and return immediately. The daemon will wait for the device to be created, and only then will it execute the commands.

In any case, this technique will not be part of the coming OFED release.

Ishai

-----Original Message-----
From: Sharma, Karun [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 17, 2006 5:11 AM
To: Tziporet Koren; Open Fabrics
Cc: openib
Subject: RE: [openfabrics-ewg] OFED 1.1 release schedule

The plan is OK with Silverstorm. I have a question though: what are the plans to support the SRP-HA feature on RHEL4 kernels?

Thanks
Karun

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
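The daemon idea described above (queue the request from udev's PROGRAM hook, then act only once the device node exists) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `srp_request`, `srp_request_init`, `srp_wait_for_device` and `srp_build_command` are hypothetical, the message-queue transport itself is left out, and the commands the daemon would finally run are shown in a simplified form.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical request kinds; the names are illustrative only. */
enum srp_req_kind { SRP_REQ_MULTIPATH = 1, SRP_REQ_KPARTX = 2 };

/* Hypothetical record that the udev PROGRAM helper would post to the queue. */
struct srp_request {
    enum srp_req_kind kind;
    char devnode[64];               /* device node path, e.g. "/dev/sdb" */
};

/* Fill in a request. Returns 0 on success, -1 if the path does not fit. */
int srp_request_init(struct srp_request *req, enum srp_req_kind kind,
                     const char *devnode)
{
    if (strlen(devnode) >= sizeof(req->devnode))
        return -1;
    req->kind = kind;
    strcpy(req->devnode, devnode);
    return 0;
}

/* Poll until the device node exists or the attempts run out.
 * Returns 1 if the node exists, 0 if it never appeared. */
int srp_wait_for_device(const char *devnode, int attempts, long delay_ms)
{
    struct stat st;
    struct timespec ts = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int i;

    for (i = 0; i < attempts; i++) {
        if (stat(devnode, &st) == 0)
            return 1;
        nanosleep(&ts, NULL);
    }
    return 0;
}

/* Build the command line the daemon would run once the device exists.
 * Returns the command length, or -1 if the buffer is too small. */
int srp_build_command(const struct srp_request *req, char *out, size_t outlen)
{
    const char *tool = (req->kind == SRP_REQ_KPARTX) ? "kpartx -a" : "multipath";
    int n = snprintf(out, outlen, "%s %s", tool, req->devnode);

    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

In a real daemon the request would arrive over the shared message queue and the resulting command would be executed only after `srp_wait_for_device` returns 1; both parts are omitted here.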
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule
The plan is OK with Silverstorm. I have a question though: what are the plans to support the SRP-HA feature on RHEL4 kernels?

Thanks
Karun

From: [EMAIL PROTECTED] on behalf of Tziporet Koren
Sent: Mon 10/16/2006 1:03 PM
To: Open Fabrics
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 release schedule

This is the plan to do the 1.1 release this week:

We will publish the 1.1-pre1 package tomorrow (Tue. 17-Oct).
Only blocker issues from RC7 will be updated:
1. SRP fix for Cisco FC gateway
2. Small updates for the install
3. Fix in diagnet to support SM on a switch
4. Activate scaling code of ehca as default in the install
5. Documentation update

Each company will have 3 days for the latest certification process and then the release can be done on Thursday.

Company owners – please approve if this is OK with you. If not, please elaborate on the blocking reasons.

Thanks,
Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule
This plan is OK with Cisco.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
Sent: Monday, October 16, 2006 10:04 AM
To: Open Fabrics
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 release schedule

This is the plan to do the 1.1 release this week:

We will publish the 1.1-pre1 package tomorrow (Tue. 17-Oct).
Only blocker issues from RC7 will be updated:
1. SRP fix for Cisco FC gateway
2. Small updates for the install
3. Fix in diagnet to support SM on a switch
4. Activate scaling code of ehca as default in the install
5. Documentation update

Each company will have 3 days for the latest certification process and then the release can be done on Thursday.

Company owners – please approve if this is OK with you. If not, please elaborate on the blocking reasons.

Thanks,
Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule
Hi,

> We will publish 1.1-pre1 package tomorrow (Tue. 17-Oct)
> Only blocker issues from RC7 will be updated:
> 1. SRP fix for Cisco FC gateway
> 2. Small updates for the install

We are currently working on the one install issue I mentioned in another thread. We found that the 64-bit and 32-bit binaries were built properly, but during packaging the 32-bit binaries were not picked up; instead, the 64-bit binaries were packaged together with the 32-bit libraries. We are trying to understand how the specific files for each rpm are selected. We would appreciate any hints or suggestions. Has anyone else observed this problem?

> 3. Fix in diagnet to support SM on a switch
> 4. Activate scaling code of ehca as default in the install

Great, thanks!

> 5. Documentation update
> Each company will have 3 days for latest certification process and then the release can be done on Thursday.

Provided we can solve the issue above tomorrow, this should be OK for us.

Regards
Hoang-Nam Nguyen
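One way to check which flavour of binary actually ended up in a package is to look at the ELF identification bytes of each installed file. The sketch below is not part of the OFED build scripts; `elf_class_bits` and `elf_class_of_file` are hypothetical helper names. It relies only on the ELF header layout: the magic bytes 0x7f 'E' 'L' 'F' followed by the class byte, where 1 means 32-bit and 2 means 64-bit.

```c
#include <stddef.h>
#include <stdio.h>

/* Inspect the first bytes of an ELF header.
 * Returns 32 or 64 for a recognised ELF class, 0 otherwise. */
int elf_class_bits(const unsigned char *hdr, size_t len)
{
    if (len < 5)
        return 0;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return 0;
    switch (hdr[4]) {               /* EI_CLASS byte */
    case 1:  return 32;             /* ELFCLASS32 */
    case 2:  return 64;             /* ELFCLASS64 */
    default: return 0;
    }
}

/* Convenience wrapper: read the header bytes from a file on disk.
 * Returns 0 if the file cannot be read or is not an ELF file. */
int elf_class_of_file(const char *path)
{
    unsigned char hdr[5];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return 0;
    n = fread(hdr, 1, sizeof(hdr), f);
    fclose(f);
    return elf_class_bits(hdr, n);
}
```

Running such a check over the files unpacked from the 32-bit rpm would show whether a 64-bit executable slipped in next to the 32-bit libraries.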
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Or Gerlitz wrote:
>> Or Gerlitz wrote:
>>> Vladimir Sokolovsky wrote:
>
>>> Did you have any special reason to assign host1:ib1 an IP address ***before*** the failover? Is the reason for that to have it join the IPv4 multicast group at "batch time", that is, not during the failover?
>
>> ib1 interface is loaded in any case (with or without configuration) if ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It can't be configured with IP 0.0.0.0 - it fails to start with this configuration.
>> So, I gave it some IP in a different IP subnet.
>
> Not sure what you mean by "loaded": the trigger for IPoIB to register network devices is plain IB, that is, the "device (not link!) up" event it gets through the IB stack client-register hotplug mechanism. For example, if the HCA has two ports, IPoIB will register ib0 and ib1 (the same goes for two HCAs with one port each, etc.).
> However, I think the trigger for IPoIB to attempt the SA query that has the port GID associated with the IPoIB netdevice join an mcast group is the user action of bringing this device "UP" (e.g. assigning an IP address to it).
>
> Not sure what you mean by "start": you can just do nothing before the failure of ib0, and during the failover from ib0 to ib1 assign ib1 the address which used to be ib0's.

By "loaded" I meant that ib1 is configured with an IP address and other parameters after executing '/etc/init.d/network start' or '/etc/init.d/openibd start'. I worked on SuSE 10 and saw that even if ifcfg-ib1 does not exist, ib1 gets the same configuration as ib0. This does not happen on RedHat 4.0.

>
>>> I think we want arping to send a gratuitous ARP with the MAC of ib1, so didn't you need to provide the -U or -A command line option to arping?
>
>> You are right, I used 'arping -A ...' (forgot to insert it in the email).
>> Actually, I have added my flag '-R' which means '-A over IPoIB'
>
> Thanks for the patch. I am not sure I fully follow the code path when the "unsolicited" flag is set, but I do see that, unlike the -A/-U options, you have made the -R option not set the "unsolicited" flag. Can you explain what the issue was?

There was no issue, it is just a drop version. So, you can change it as you wish.

>
> Or.
>

Regards,
Vladimir
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
> Or Gerlitz wrote:
>> Vladimir Sokolovsky wrote:

>> Did you have any special reason to assign host1:ib1 an IP address ***before*** the failover? Is the reason for that to have it join the IPv4 multicast group at "batch time", that is, not during the failover?

> ib1 interface is loaded in any case (with or without configuration) if ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It can't be configured with IP 0.0.0.0 - it fails to start with this configuration.
> So, I gave it some IP in a different IP subnet.

Not sure what you mean by "loaded": the trigger for IPoIB to register network devices is plain IB, that is, the "device (not link!) up" event it gets through the IB stack client-register hotplug mechanism. For example, if the HCA has two ports, IPoIB will register ib0 and ib1 (the same goes for two HCAs with one port each, etc.).

However, I think the trigger for IPoIB to attempt the SA query that has the port GID associated with the IPoIB netdevice join an mcast group is the user action of bringing this device "UP" (e.g. assigning an IP address to it).

Not sure what you mean by "start": you can just do nothing before the failure of ib0, and during the failover from ib0 to ib1 assign ib1 the address which used to be ib0's.

>> I think we want arping to send a gratuitous ARP with the MAC of ib1, so didn't you need to provide the -U or -A command line option to arping?

> You are right, I used 'arping -A ...' (forgot to insert it in the email).
> Actually, I have added my flag '-R' which means '-A over IPoIB'

Thanks for the patch. I am not sure I fully follow the code path when the "unsolicited" flag is set, but I do see that, unlike the -A/-U options, you have made the -R option not set the "unsolicited" flag. Can you explain what the issue was?

Or.
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Hi Or,
See below.

Regards,
Vladimir

Or Gerlitz wrote:
> Vladimir Sokolovsky wrote:
>> Hi Or,
>> I am working on IPoIB failover.
>> I tried Michael Tsirkin's patch for ipoib (updating neighbor structure) and it fixes the issue Roland was talking about.
>>
>> Meanwhile I have tested the following flow:
>>
>> Setup description:
>>   host1 - 2 IB ports connected to IB switch.
>>     ib0: 11.0.0.1
>>     ib1: 12.0.0.1
>>   host2 - port 1 connected to the IB switch.
>>     ib0: 11.0.0.2
>>     opensm over port1
>>
>> Flow description:
>>   - ping host2 -> 11.0.0.1 (passed)
>>   - set port1 of the host1 to 'DOWN' state (disconnect the port from IB subnet)
>>   - ping host2 -> 11.0.0.1 (failed)
>>   - ifconfig ib0 0.0.0.0 (on host1)
>>   - ifconfig ib1 11.0.0.1 (on host1)
>>   - arping -I ib1 11.0.0.1 (on host1)
>>   - ping host2 -> 11.0.0.1 (passed)
>>
>> arping in this case was not really necessary because ping issues ARP requests by itself.
>
> Hi Vlad,
>
> Did you have any special reason to assign host1:ib1 an IP address ***before*** the failover? Is the reason for that to have it join the IPv4 multicast group at "batch time", that is, not during the failover?

ib1 interface is loaded in any case (with or without configuration) if ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It can't be configured with IP 0.0.0.0 - it fails to start with this configuration.
So, I gave it some IP in a different IP subnet.

>>   - arping -I ib1 11.0.0.1 (on host1)
>
>   -U  Unsolicited ARP mode to update neighbours' ARP caches. No replies are expected
>   -A  The same as -U, but ARP REPLY packets used instead of ARP REQUEST.
>
> I think we want arping to send a gratuitous ARP with the MAC of ib1, so didn't you need to provide the -U or -A command line option to arping?

You are right, I used 'arping -A ...' (forgot to insert it in the email).
Actually, I have added my flag '-R' which means '-A over IPoIB'

> If I understand correctly, a gratuitous ARP was not sent in your usage case, so I am not sure Michael's patch was exercised.
>
>> Note: I updated the original arping to be able to send broadcast using ipv4_bcast_addr.
>
> Can you please send the patch to arping?

Attached (arping_full_ib.c - arping.c with changes for IPoIB).

>> Also, I have tested ssh over IPoIB with the same flow. In this case arping also wasn't necessary, but it makes an update of neighbors with the new MAC address (of ib1 interface) more quickly.
>
> Two interesting test cases you might want to validate your approach with are something "long", i.e. something that delivers much traffic before and after the failover: iperf or netperf over TCP AND UDP. I have not validated it, but I think UDP would not generate ARP, so the gratuitous ARP is the only way to update the remote system with the MAC change.

I will test it later.

> Or.

/*
 * arping.c
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 *
 * Authors: Alexey Kuznetsov, <[EMAIL PROTECTED]>
 */

#include #include #include #include #include #include #include #include #include #include
#include #include #include #include #include #include #include #include #include #include
#include "SNAPSHOT.h"

static void usage(void) __attribute__((noreturn));

int quit_on_reply=0;
char *device="eth0";
int ifindex;
char *source;
struct in_addr src, dst;
char *target;
int dad, unsolicited, advert;
int quiet;
int count=-1;
int timeout;
int unicasting;
int s;
int broadcast_only;
int ib_arprep;

struct sockaddr_ll me;
struct sockaddr_ll he;

struct timeval start, last;

int sent, brd_sent;
int received, brd_recv, req_recv;

static const u8 ipv4_bcast_addr[] = {
	0x00, 0xff, 0xff, 0xff,
	0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
};

#define MS_TDIFF(tv1,tv2) ( ((tv1).tv_sec-(tv2).tv_sec)*1000 + \
			   ((tv1).tv_usec-(tv2).tv_usec)/1000 )

void usage(void)
{
	fprintf(stderr,
		"Usage: arping [-fqbDUAV] [-c count] [-w timeout] [-I device] [-s source] destination\n"
		" -f : quit on first reply\n"
		" -q : be quiet\n"
		" -b : keep broadcasting, don't go unicast\n"
		" -D : duplicate address detection mode\n"
		" -U : Unsolicited ARP mode, update your neighbours\n"
		" -A : ARP answer mode, update your neighbours\n"
		" -V : print version and exit\n"
		" -c count : how many packets to send\n"
		" -w timeout : how long to wait for a reply\n"
		" -I device : which ethernet device to use (eth0)\n"
		" -s source : source ip address\n"
		" -R : send ARP reply to IPoIB broadcast address
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Vladimir Sokolovsky wrote:
> Hi Or,
> I am working on IPoIB failover.
> I tried Michael Tsirkin's patch for ipoib (updating neighbor structure) and it fixes the issue Roland was talking about.
>
> Meanwhile I have tested the following flow:
>
> Setup description:
>   host1 - 2 IB ports connected to IB switch.
>     ib0: 11.0.0.1
>     ib1: 12.0.0.1
>   host2 - port 1 connected to the IB switch.
>     ib0: 11.0.0.2
>     opensm over port1
>
> Flow description:
>   - ping host2 -> 11.0.0.1 (passed)
>   - set port1 of the host1 to 'DOWN' state (disconnect the port from IB subnet)
>   - ping host2 -> 11.0.0.1 (failed)
>   - ifconfig ib0 0.0.0.0 (on host1)
>   - ifconfig ib1 11.0.0.1 (on host1)
>   - arping -I ib1 11.0.0.1 (on host1)
>   - ping host2 -> 11.0.0.1 (passed)
>
> arping in this case was not really necessary because ping issues ARP requests by itself.

Hi Vlad,

Did you have any special reason to assign host1:ib1 an IP address ***before*** the failover? Is the reason for that to have it join the IPv4 multicast group at "batch time", that is, not during the failover?

>   - arping -I ib1 11.0.0.1 (on host1)

  -U  Unsolicited ARP mode to update neighbours' ARP caches. No replies are expected
  -A  The same as -U, but ARP REPLY packets used instead of ARP REQUEST.

I think we want arping to send a gratuitous ARP with the MAC of ib1, so didn't you need to provide the -U or -A command line option to arping?

If I understand correctly, a gratuitous ARP was not sent in your usage case, so I am not sure Michael's patch was exercised.

> Note: I updated the original arping to be able to send broadcast using ipv4_bcast_addr.

Can you please send the patch to arping?

> Also, I have tested ssh over IPoIB with the same flow. In this case arping also wasn't necessary, but it makes an update of neighbors with the new MAC address (of ib1 interface) more quickly.

Two interesting test cases you might want to validate your approach with are something "long", i.e. something that delivers much traffic before and after the failover: iperf or netperf over TCP AND UDP. I have not validated it, but I think UDP would not generate ARP, so the gratuitous ARP is the only way to update the remote system with the MAC change.

Or.
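To make the gratuitous ARP discussed above more concrete, here is a rough sketch of how such an ARP payload could be laid out for IPoIB. It is not taken from Vladimir's arping patch; the function name `ipoib_build_gratuitous_arp` and the macro names are made up for illustration. It assumes the standard ARP layout with the InfiniBand hardware type (32) and a 20-byte IPoIB hardware address, and it leaves the choice of target hardware address to the caller. What makes the packet gratuitous is that the sender and target protocol addresses are the same IP.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPOIB_HW_TYPE   32          /* ARP hardware type for InfiniBand */
#define IPOIB_HW_LEN    20          /* IPoIB hardware address: flags/QPN + GID */
#define ARP_PROTO_IPV4  0x0800
#define ARP_OP_REQUEST  1           /* request form, as with arping -U */
#define ARP_OP_REPLY    2           /* reply form, as with arping -A */
#define IPOIB_ARP_LEN   (8 + 2 * (IPOIB_HW_LEN + 4))

/* Write a gratuitous ARP payload into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t ipoib_build_gratuitous_arp(uint8_t *buf, size_t buflen, int use_reply,
                                  const uint8_t sender_hw[IPOIB_HW_LEN],
                                  const uint8_t target_hw[IPOIB_HW_LEN],
                                  const uint8_t ip[4])
{
    uint8_t *p = buf;

    if (buflen < IPOIB_ARP_LEN)
        return 0;

    *p++ = 0;                        /* hardware type, big-endian */
    *p++ = IPOIB_HW_TYPE;
    *p++ = ARP_PROTO_IPV4 >> 8;      /* protocol type, big-endian */
    *p++ = ARP_PROTO_IPV4 & 0xff;
    *p++ = IPOIB_HW_LEN;             /* hardware address length */
    *p++ = 4;                        /* protocol address length */
    *p++ = 0;                        /* opcode, big-endian */
    *p++ = use_reply ? ARP_OP_REPLY : ARP_OP_REQUEST;

    memcpy(p, sender_hw, IPOIB_HW_LEN); p += IPOIB_HW_LEN;  /* new port's address */
    memcpy(p, ip, 4);                   p += 4;             /* sender IP */
    memcpy(p, target_hw, IPOIB_HW_LEN); p += IPOIB_HW_LEN;
    memcpy(p, ip, 4);                   p += 4;             /* target IP == sender IP */

    return (size_t)(p - buf);
}
```

In the failover flow, `sender_hw` would be the hardware address of ib1 and `ip` the address moved over from ib0 (11.0.0.1 in the test setup), so that neighbours replace their cached entry for that IP.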
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Hi Or,

I am working on IPoIB failover.
I tried Michael Tsirkin's patch for ipoib (updating neighbor structure) and it fixes the issue Roland was talking about.

Meanwhile I have tested the following flow:

Setup description:
  host1 - 2 IB ports connected to IB switch.
    ib0: 11.0.0.1
    ib1: 12.0.0.1
  host2 - port 1 connected to the IB switch.
    ib0: 11.0.0.2
    opensm over port1

Flow description:
  - ping host2 -> 11.0.0.1 (passed)
  - set port1 of the host1 to 'DOWN' state (disconnect the port from IB subnet)
  - ping host2 -> 11.0.0.1 (failed)
  - ifconfig ib0 0.0.0.0 (on host1)
  - ifconfig ib1 11.0.0.1 (on host1)
  - arping -I ib1 11.0.0.1 (on host1)
  - ping host2 -> 11.0.0.1 (passed)

arping in this case was not really necessary because ping issues ARP requests by itself.

Also, I have tested ssh over IPoIB with the same flow. In this case arping also wasn't necessary, but it updates the neighbors with the new MAC address (of the ib1 interface) more quickly.

Note: I updated the original arping to be able to send broadcast using ipv4_bcast_addr.

We should decide about the initial configuration of the IPoIB interfaces for high availability: should they be in different IP subnets or stay in the same one?

Regards,
Vladimir

Eitan Zahavi wrote:
> Hi Roland,
>
> We are trying this approach and will probably be done with it tomorrow.
> So I guess Vlad will be able to update the group soon.
>
> Eitan Zahavi
>
>> -----Original Message-----
>> From: Roland Dreier
>> Sent: Thursday, July 13, 2006 11:11 PM
>> To: Or Gerlitz
>> Cc: Tziporet Koren; OpenFabricsEWG; openib
>> Subject: Re: [openib-general] OFED 1.1 release - schedule and features
>>
>> > So if the link which ib0 maps to is DOWN you move the ib0 IPv4 address to
>> > another device whose link is UP (eg ib1) and you somehow have ib1 send a
>> > gratuitous ARP?
>>
>> I think there may be a problem in the way IPoIB deals with gratuitous ARPs. Because if a neighbour structure is updated by the networking core, there's no way for IPoIB to know about that and update the associated IB path.
>>
>> Has anyone actually tried this failover approach?
>>
>> - R.
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
At 03:49 PM 7/12/2006, Fabian Tillier wrote:
> Hi Mike,
>
> On 7/12/06, Michael Krause <[EMAIL PROTECTED]> wrote:
>> At 09:48 AM 7/12/2006, Jeff Broughton wrote:
>>> Modifying the sockets API is just defining yet another RDMA API, and we have so many already
>>
>> I disagree. This effort has distilled the API to basically one for RDMA developers. Applications are supported over this via either MPI or Sockets.
>
> There's been a lot of effort to make the RDMA verbs easy to use. With the RDMA CM, socket-like connection semantics can be used to establish the connection between QPs. The connection establishment is the hard part - doing I/O is trivial in comparison.
>
> The verbs and RDMA CM have nothing to do with MPI.
>
> If an application is going to be RDMA aware, I don't see any reason it shouldn't just use the verbs directly and use the RDMA CM to establish the connections.

What's your point? It seems you are in agreement that there is a single RDMA API that people can use.

>> It seems rather self limiting to think the traditional BSD synchronous Sockets API is all the world should be able to use when it comes to Sockets. Sockets developers could easily incorporate the extensions into their applications providing them with improved designs and flexibility without having to learn about RDMA itself.
>
> Wait, you want applications to be able to register memory and issue RDMA operations, but not have to learn about RDMA? How does that make sense?

The Sockets API extensions allow developers to register memory. That has been a desire of many when it comes to SDP or copy avoidance technology, as it optimizes the performance path by eliminating the need to do per-op registration. Many applications already know their working sets, and they can use this to enable the OS and underlying infrastructure to take advantage of that fact to improve performance and the quality of the solution.

The extensions also provide async communications and event collection mechanisms that improve performance over the rather limiting select / poll supported by Sockets today. They currently do not support explicit RDMA, but it is rather trivial to add such calls and remove the need to interject SDP if desired. The benefits of such new API extensions are there for those that want to eliminate one more ULP with its unfortunate IP cloud overhead.

>> If the couple of calls necessary to extend this API to support direct RDMA would allow them to eliminate SDP entirely, well, that has benefits that go beyond just its all Sockets;
>
> For a socket implementation to support RDMA, the socket must have an underlying RDMA QP. This means that if you want the application to not have to be verbs-aware, you can't really get rid of SDP - you're just extending SDP to let the application have a part in memory registration and RDMA, while still supporting the traditional BSD operations. This is IMO more complex than just letting applications interface directly with verbs, especially since the SDP implementation will size the QP for its own use, without a means for negotiating with the user so that you don't cause buffer overruns.

Please take a look at the API extensions. I never stated that one gets rid of SDP unless one adds the RDMA-explicit calls. As for complexity, well, the goal is to extend to Sockets developers the optimal communication paradigm already available on OSes such as Windows without having to live with the same unfortunate constraints imposed by the OS. The same logic applies to extending the benefits derived from MPI, which supports async communications as well as put / get semantics, which would be analogous to the additional RDMA interfaces I referenced.

I find it strange that people would argue against improving the Sockets developer's tool suite when the benefits are already proven elsewhere within the industry and even within this open source effort. Giving the millions of Sockets developers the choice of a set of extensions that work over both RDMA and traditional network stacks seems like a no-brainer. Trying to force them to use a native RDMA API, even if semantically similar to Sockets, seems like a poor path to pursue. Leave the RDMA API to the middleware providers and those that need to be close to the metal.

>> it also eliminates the IP cloud that hovers over SDP licensing. Something that many developers and customers would appreciate.
>
> I believe that Microsoft's IP claims only apply to SDP over IB -- I don't believe SDP over iWarp is affected. I don't know how the RDMA verbs moving towards a hardware independent (wrt IB vs. iWarp) affects the IP claims, but it should certainly make things interesting if a single SDP code base can work over both IB and iWarp.

SDP is SDP and it isn't just restricted to IB. I'll leave it to the lawyers to sort it out, but having a single SDP with minor code execution path deltas for the IB-specifics isn't that hard to construct. It has been done on other OSes already.

Mike
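For readers less familiar with the readiness model that Mike contrasts with async communications, the following sketch shows the traditional pattern: the application asks `poll` whether a socket is readable and only then issues the read itself. It uses only standard POSIX calls (`socketpair`, `poll`, `write`, `close`) and does not attempt to show the Open Group extensions; `wait_readable` and `readiness_demo` are hypothetical helper names.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Demonstrate the model on a local socket pair: nothing is readable
 * until the peer writes, after which poll reports readiness.
 * Returns 1 if both observations hold, 0 if not, -1 on setup failure. */
int readiness_demo(void)
{
    int sv[2];
    int before, after;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    before = wait_readable(sv[0], 10);          /* expected: timeout */
    n = write(sv[1], "x", 1);
    after = (n == 1) ? wait_readable(sv[0], 100) : -1;

    close(sv[0]);
    close(sv[1]);
    return (before == 0 && after == 1) ? 1 : 0;
}
```

With this model every transfer costs a readiness check plus a separate read or write call into a buffer the stack has not seen before, which is the per-operation overhead that pre-registered memory and asynchronous completion are meant to avoid.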
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Hi Mike,

On 7/12/06, Michael Krause <[EMAIL PROTECTED]> wrote:
> At 09:48 AM 7/12/2006, Jeff Broughton wrote:
>> Modifying the sockets API is just defining yet another RDMA API, and we have so many already
>
> I disagree. This effort has distilled the API to basically one for RDMA developers. Applications are supported over this via either MPI or Sockets.

There's been a lot of effort to make the RDMA verbs easy to use. With the RDMA CM, socket-like connection semantics can be used to establish the connection between QPs. The connection establishment is the hard part - doing I/O is trivial in comparison.

The verbs and RDMA CM have nothing to do with MPI.

If an application is going to be RDMA aware, I don't see any reason it shouldn't just use the verbs directly and use the RDMA CM to establish the connections.

> It seems rather self limiting to think the traditional BSD synchronous Sockets API is all the world should be able to use when it comes to Sockets. Sockets developers could easily incorporate the extensions into their applications providing them with improved designs and flexibility without having to learn about RDMA itself.

Wait, you want applications to be able to register memory and issue RDMA operations, but not have to learn about RDMA? How does that make sense?

> If the couple of calls necessary to extend this API to support direct RDMA would allow them to eliminate SDP entirely, well, that has benefits that go beyond just its all Sockets;

For a socket implementation to support RDMA, the socket must have an underlying RDMA QP. This means that if you want the application to not have to be verbs-aware, you can't really get rid of SDP - you're just extending SDP to let the application have a part in memory registration and RDMA, while still supporting the traditional BSD operations. This is IMO more complex than just letting applications interface directly with verbs, especially since the SDP implementation will size the QP for its own use, without a means for negotiating with the user so that you don't cause buffer overruns.

> it also eliminates the IP cloud that hovers over SDP licensing. Something that many developers and customers would appreciate.

I believe that Microsoft's IP claims only apply to SDP over IB -- I don't believe SDP over iWarp is affected. I don't know how the RDMA verbs moving towards a hardware independent (wrt IB vs. iWarp) affects the IP claims, but it should certainly make things interesting if a single SDP code base can work over both IB and iWarp.

- Fab
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
At 09:48 AM 7/12/2006, Jeff Broughton wrote:
> Mike,
>
> The whole purpose of SDP is to make sockets go faster without having to have the applications modified. This is what the customers want. I've heard this time and time again, across a wide spectrum of customers.

I am well aware of this. However, Linux / Unix do not support async communications, which severely limits the potential performance benefits of SDP. When we wrote the SDP specification it was fully understood that optimal performance is achieved through async communications. We spent considerable time constructing SDP to support both synchronous and asynchronous communication paradigms, from which many applications would benefit. Customers want to be able to use RDMA interconnects without recompilation, and through the use of SDP and shared libraries this is certainly practical to execute. Developers, however, are not the same as customers, and it is developers who would benefit from the Sockets extensions; this would in turn benefit customers.

> Modifying the sockets API is just defining yet another RDMA API, and we have so many already

I disagree. This effort has distilled the API to basically one for RDMA developers. Applications are supported over this via either MPI or Sockets. It seems rather self limiting to think the traditional BSD synchronous Sockets API is all the world should be able to use when it comes to Sockets. Sockets developers could easily incorporate the extensions into their applications, providing them with improved designs and flexibility without having to learn about RDMA itself. If the couple of calls necessary to extend this API to support direct RDMA would allow them to eliminate SDP entirely, well, that has benefits that go beyond just its all Sockets; it also eliminates the IP cloud that hovers over SDP licensing. Something that many developers and customers would appreciate.

In the end, this effort could choose to progress Sockets technology and extend the number of developers and applications that can achieve optimal performance with only minor knowledge growth. Or they can live with the limitations of the BSD Sockets API and either accept the performance loss or be forced to jump through the hoops of using other rather niche or obscure APIs to accomplish what is possible with a small number of Sockets extensions, which were defined by people with years of experience implementing Sockets and working with application developers.

Mike

> -Jeff
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Krause
> Sent: Wednesday, July 12, 2006 9:23 AM
> To: Tziporet Koren; Scott Weitzenkamp (sweitzen)
> Cc: OpenFabricsEWG; openib
> Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 release - schedule and features
>
> At 12:59 AM 7/12/2006, Tziporet Koren wrote:
>> Scott Weitzenkamp (sweitzen) wrote:
>>> For SDP, I would like to see "improved stability" (maybe you have this in mind under "beta quality"), also how about "AIO support"? The rest of the list looks good.
>>
>> Yes - beta quality means improved stability.
>> AIO is not planned for 1.1 (schedule issue). If needed we can add it to 1.2
>
> Would be nice if people thought about implementing the Sockets API Extensions from the OpenGroup. They provide explicit memory management and async communications which will allow SDP performance to be fully exploited. The benefits go beyond what is found in AIO or on other OS such as Windows. If one were to extend slightly to have explicit RDMA Read and Write from the Sockets API, then it would be quite possible to eliminate SDP entirely for new applications, leaving SDP strictly for legacy Sockets environments.
>
> Mike
>
>> Tziporet
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Mike,

The whole purpose of SDP is to make sockets go faster without having to have the applications modified. This is what the customers want. I've heard this time and time again, across a wide spectrum of customers.

Modifying the sockets API is just defining yet another RDMA API, and we have so many already

-Jeff

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Krause
Sent: Wednesday, July 12, 2006 9:23 AM
To: Tziporet Koren; Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib
Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 release - schedule and features

At 12:59 AM 7/12/2006, Tziporet Koren wrote:
> Scott Weitzenkamp (sweitzen) wrote:
>> For SDP, I would like to see "improved stability" (maybe you have this in mind under "beta quality"), also how about "AIO support"? The rest of the list looks good.
>
> Yes - beta quality means improved stability.
> AIO is not planned for 1.1 (schedule issue). If needed we can add it to 1.2

Would be nice if people thought about implementing the Sockets API Extensions from the OpenGroup. They provide explicit memory management and async communications which will allow SDP performance to be fully exploited. The benefits go beyond what is found in AIO or on other OS such as Windows. If one were to extend slightly to have explicit RDMA Read and Write from the Sockets API, then it would be quite possible to eliminate SDP entirely for new applications, leaving SDP strictly for legacy Sockets environments.

Mike

> Tziporet
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
On Wed, 2006-07-12 at 06:51, Tziporet Koren wrote:
> Hal Rosenstock wrote:
> >> • OSM:
> >>   – Partition Manager (Pkey)
> >
> > Also, primitive QoS support.
> >
> >>   – Pre-computed routing load from file
> >
> > Also, diags:
> >   Add saquery tool
> >   Enhancement to ibnetdiscover tool with grouping function
>
> OK - I will update my plans with these features.

Thanks.

> BTW - I count on you to be the owner of madeye

We can cover SLES10 and RHEL4 on x86_64 and x86. Can Mellanox or someone else pick up the other "holes" in the matrix?

-- Hal
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
Hal Rosenstock wrote:
>> • OSM:
>>   – Partition Manager (Pkey)
>
> Also, primitive QoS support.
>
>>   – Pre-computed routing load from file
>
> Also, diags:
>   Add saquery tool
>   Enhancement to ibnetdiscover tool with grouping function

OK - I will update my plans with these features.
BTW - I count on you to be the owner of madeye

Tziporet
Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features
On Wed, 2006-07-12 at 01:53, Tziporet Koren wrote:
> Hi All,
>
> I wish to start the release process of OFED 1.1.
> I would like that we will have a meeting next Monday to review this
> proposal of the release features and schedule.
> If possible I wish to move the meeting hour from 9am PST to 11am or
> 11:30am PST
>
> Tziporet
>
> -
>
> Schedule:
> Target release date: 24-Aug
> Intermediate milestones:
> • Development: now – 31-Jul
> • Create 1.1 branch of user level code and rc1: 24-Jul
> • Features freeze (rc2): 31-Jul
> • Code freeze (rc-x): 18-Aug
>
> Features:
> • OS:
>   • Novell:
>     – SLES 9.0 SP3*
>     – SLES10 (official release)*
>   • Redhat:
>     – Redhat EL4 up2
>     – Redhat EL4 up3
>   • kernel.org:
>     – Kernel 2.6.17*
>   * changes from last release
>
> Note: Fedora C4 and SuSE Pro 10 were dropped from the list since I
> have not seen so many customers requesting them.
> We will keep the backport patches for these OSes and make sure OFED
> compiles and loads properly but will not do full QA cycle.
> Please reply if this is acceptable
>
> • General changes:
>   – lib32 on 64 bits systems
>   – Add madeye utility
>   – Kernel code based on 2.6.18
>   – Bug fixes
> • Core:
>   – Set options in CMA & uCMA (needed for Intel MPI)
>   – HCA fatal - full flow support
>   – Huge pages support
> • OSM:
>   – Partition Manager (Pkey)

Also, primitive QoS support.

>   – Pre-computed routing load from file

Also, diags:
  Add saquery tool
  Enhancement to ibnetdiscover tool with grouping function

> • SDP:
>   – Beta quality
>   – Improved latency
>   – Improved bandwidth of small messages (by implementing the
>     Nagle algorithm)
>   – Support the backlog parameter in the listen call
>   – Interoperability with other SDP implementations
>   – Support sending/receiving out of band data
> • SRP:
>   – GA quality
>   – DM (Device Mapper) - for high availability
>   – Basic failover/failback testing with daemon+srp+XVM/MPP and
>     Engenio target
> • IPoIB
>   – Performance tuning
>   – Bonding - for high availability
> • uDAPL:
>   – Scalability features needed for Intel MPI – take from trunk
>   • Arlin & James – please reply if there are more features
>     needed.
> • OSU - MVAPICH
>   – Based on 0.97 (we will not move to 0.98 since we tested it
>     and found it is less stable than 0.97)
>   – Message coalescing
> • Open MPI
>   – TBD from Jeff
> • MPI tests:
>   – Replace with the new test versions from LLNL, Intel, OSU
> • iSER
>   – Any update Voltaire will drive to kernel 2.6.18
> • RDS:
>   – TBD – Oracle and SilverStorm should decide what should be in.
>
> Tziporet Koren
> Software Director
> Mellanox Technologies
> mailto: [EMAIL PROTECTED]
> Tel +972-4-9097200, ext 380
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg