Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-17 Thread Ishai Rabinovitz




Hi,
Let me first explain why the current OFED release does not support SRP-HA on
RHEL4.
SRP-HA uses Device Mapper multipath, and the multipath prerequisites include
a udev version higher than 050. RHEL4 distributions include udev 039. udev is
an important part of the distribution, and I do not think users will be
willing to upgrade it just to get SRP-HA.
To the best of my knowledge, the main reason multipath needs at least udev 050
is that it uses the RUN option (this option executes its given parameter after
the device exists). Multipath uses the RUN option to execute kpartx, which
handles the partitions of the new device. SRP-HA also uses the RUN option to
execute the multipath command.
I have an idea for how to overcome this problem. I want to implement an
srp-multipath-daemon. This daemon will receive kpartx and multipath requests
through a shared message queue. udev will use the PROGRAM option (which
executes its given parameter immediately, before the device exists) to post a
request to this shared message queue and return immediately. The daemon will
wait for the device to be created and only then execute the commands.
In any case, this technique will not be part of the coming OFED release.
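
The daemon idea described above can be sketched roughly as follows. This is only an illustration of the "queue the request now, run it once the device exists" pattern, not the proposed srp-multipath-daemon itself: an in-process `queue.Queue` stands in for the shared message queue, and the names `wait_for_device` and `serve`, the polling interval, and the timeout are all assumptions made for the example.

```python
import os
import queue
import subprocess
import time


def wait_for_device(path, timeout=10.0, poll_interval=0.1):
    """Poll until `path` exists; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


def serve(requests, run=subprocess.call, timeout=10.0):
    """Drain `requests`, running each command once its device node exists.

    Each request is a (device_path, argv) tuple, as a udev PROGRAM helper
    might post it. Returns the argv lists that were actually executed;
    requests whose device never appears are skipped.
    """
    executed = []
    while True:
        try:
            device_path, argv = requests.get_nowait()
        except queue.Empty:
            return executed
        if wait_for_device(device_path, timeout=timeout):
            run(argv)
            executed.append(argv)
```

A real daemon would block on an inter-process queue instead of draining an in-memory one, but the ordering guarantee is the same: the command runs only after the device node is present.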
Ishai
 
 
-----Original Message-----
From: Sharma, Karun [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 17, 2006 5:11 AM
To: Tziporet Koren; Open Fabrics
Cc: openib
Subject: RE: [openfabrics-ewg] OFED 1.1 release schedule
 


The plan is OK with Silverstorm.

I have a question though. What are the plans to support the SRP-HA feature on
RHEL4 kernels?

 

Thanks

Karun
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-16 Thread Sharma, Karun



The plan is OK with Silverstorm.
I have a question though. What are the plans to support the SRP-HA feature on RHEL4 kernels?
 
Thanks
Karun


From: [EMAIL PROTECTED] on behalf of Tziporet Koren
Sent: Mon 10/16/2006 1:03 PM
To: Open Fabrics
Cc: openib
Subject: [openfabrics-ewg] OFED 1.1 release schedule


This is the plan to do the 1.1 release this week:
 
We will publish 1.1-pre1 package tomorrow (Tue. 17-Oct)
Only blocker issues from RC7 will be updated:

1. SRP fix for Cisco FC gateway
2. Small updates for the install
3. Fix in diagnet to support SM on a switch
4. Activate scaling code of ehca as default in the install
5. Documentation update
 
Each company will have 3 days for latest certification process and then the release can be done on Thursday.
 
Company owners – please approve if this is OK with you.
If not please elaborate the blocking reasons.
 
Thanks,
 
Tziporet Koren
Software Director
Mellanox Technologies
mailto: [EMAIL PROTECTED]
Tel +972-4-9097200, ext 380

Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-16 Thread Scott Weitzenkamp (sweitzen)



This plan is OK with Cisco.
 
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

  
  
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tziporet Koren
  Sent: Monday, October 16, 2006 10:04 AM
  To: Open Fabrics
  Cc: openib
  Subject: [openfabrics-ewg] OFED 1.1 release schedule

  This is the plan to do the 1.1 release this week:

  We will publish 1.1-pre1 package tomorrow (Tue. 17-Oct)
  Only blocker issues from RC7 will be updated:

  1. SRP fix for Cisco FC gateway
  2. Small updates for the install
  3. Fix in diagnet to support SM on a switch
  4. Activate scaling code of ehca as default in the install
  5. Documentation update

  Each company will have 3 days for latest certification process and then the
  release can be done on Thursday.

  Company owners – please approve if this is OK with you.
  If not please elaborate the blocking reasons.

  Thanks,

  Tziporet Koren
  Software Director
  Mellanox Technologies
  mailto: [EMAIL PROTECTED]
  Tel +972-4-9097200, ext 380
   

Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-16 Thread Hoang-Nam Nguyen
Hi,
> We will publish 1.1-pre1 package tomorrow (Tue. 17-Oct)
> Only blocker issues from RC7 will be updated:
> 1. SRP fix for Cisco FC gateway
> 2. Small updates for the install
Currently we're working on the one install issue I mentioned in another
thread.
We found that the 64- and 32-bit binaries were built properly, but during
packaging the 64-bit binaries were picked up together with the 32-bit
libraries, instead of the 32-bit binaries. We're trying to understand how the
specific files for each rpm are selected. We would appreciate any
hints or suggestions.
Has anyone else also observed this problem?
> 3. Fix in diagnet to support SM on a switch
> 4. Activate scaling code of ehca as default in the install
Great, thanks!
> 5. Documentation update
> Each company will have 3 days for latest certification process and then
> the release can be done on Thursday.
Assuming we can solve the issue above tomorrow, this should be OK for us.
Regards
Hoang-Nam Nguyen





Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-24 Thread Vladimir Sokolovsky
Or Gerlitz wrote:
>> Or Gerlitz wrote:
>>> Vladimir Sokolovsky wrote:
>
>>> Did you have any special reason to assign host1:ib1 an IP address 
>>> ***before*** the failover? is the reason for that happen to be 
>>> having it joins the IPv4 multicast group at "batch time", that is 
>>> not during the failover?
>
>> ib1 interface is loaded in any case (with or without configuration) 
>> if ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It 
>> can't be configured with IP 0.0.0.0 - it fails to start with this 
>> configuration.
>> So, I gave it some IP in a different IP subnet.
>
> Not sure what you mean by "loaded": the trigger for IPoIB to registers 
> network devices is plain IB, that is "device (not link!) up" event it 
> gets through the ib stack client register hotplug mechanism, for 
> exampe if the HCA has two ports, IPoIB will register ib0 and ib1 (same 
> for two HCAs each of them with one port etc).
> However, I think the trigger for IPoIB to attempt doing the SA Q to 
> have the port GID associated with IPoIB netdevice join an mcast group 
> is the user action towards having this device being "UP" (eg the 
> assignment of IP address to it).
>
> Not sure what you mean by "start", you can just do nothing before the 
> failure of ib0 and during the failover from ib0 to ib1, assign ib1 the 
> address which used to be of ib0.
By "loaded" I meant that ib1 is configured with an IP address and other
parameters after executing '/etc/init.d/network start' or
'/etc/init.d/openibd start'.
I worked on SuSE 10 and saw that even if ifcfg-ib1 does not exist, ib1
gets the same configuration as ib0. This does not happen on RedHat 4.0.

>
>>> I think we want arping to send a gratuitous arp with the MAC of ib1
>>> so weren't you need to provide the -U or -A command line to arping?
>
>> You are right I used 'arping -A ...' (fogot to insert it in the 
>> email). Actually, I have added my flag '-R' which means '-A over IPoIB'
>
> thanks for the patch, i am not sure to fully follow the code path when 
> the "unsolicited" flag is set, but i do see what unlike in the -A/-U 
> options you have made the -R option not to set the "unsolicited" flag, 
> can you explain what was the issue?
There was no issue, it is just a drop version. So you can change it as you
wish.
>
> Or.
>

Regards,
Vladimir




Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-24 Thread Or Gerlitz
> Or Gerlitz wrote:
>> Vladimir Sokolovsky wrote:

>> Did you have any special reason to assign host1:ib1 an IP address 
>> ***before*** the failover? is the reason for that happen to be having 
>> it joins the IPv4 multicast group at "batch time", that is not during 
>> the failover?

> ib1 interface is loaded in any case (with or without configuration) if 
> ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It can't be 
> configured with IP 0.0.0.0 - it fails to start with this configuration.
> So, I gave it some IP in a different IP subnet.

Not sure what you mean by "loaded": the trigger for IPoIB to register
network devices is plain IB, that is, the "device (not link!) up" event it
gets through the IB stack client-register hotplug mechanism. For example,
if the HCA has two ports, IPoIB will register ib0 and ib1 (the same goes for
two HCAs with one port each, etc.).

However, I think the trigger for IPoIB to attempt the SA query that has
the port GID associated with the IPoIB netdevice join an mcast group is the
user action that brings this device "UP" (e.g. assigning an IP address
to it).

Not sure what you mean by "start": you can just do nothing before the
failure of ib0 and, during the failover from ib0 to ib1, assign ib1 the
address which used to belong to ib0.

>> I think we want arping to send a gratuitous arp with the MAC of ib1
>> so weren't you need to provide the -U or -A command line to arping?

> You are right I used 'arping -A ...' (fogot to insert it in the email). 
> Actually, I have added my flag '-R' which means '-A over IPoIB'

Thanks for the patch. I am not sure I fully follow the code path when
the "unsolicited" flag is set, but I do see that, unlike the -A/-U
options, you have made the -R option not set the "unsolicited" flag.
Can you explain what the issue was?

Or.





Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-23 Thread Vladimir Sokolovsky

Hi Or,
See below,

Regards,
Vladimir

Or Gerlitz wrote:
> Vladimir Sokolovsky wrote:
>> Hi Or,
>> I am working on IPoIB failover.
>> I tried Michael's Tsirkin patch for ipoib (updating neighbor
>> structure) and it fixes the issue Roland was talking about.
>>
>> Meanwhile I have tested the following flow:
>> /*_Setup description:_*/
>>
>>    host1 - 2 IB ports connected to IB switch.
>>    ib0: 11.0.0.1
>>    ib1: 12.0.0.1
>>
>>    host2 - port 1 connected to the IB switch.
>>    ib0: 11.0.0.2
>>    opensm over port1
>>
>> /*_Flow description:_*/
>>    - ping host2 -> 11.0.0.1 (passed)
>>    - set port1 of the host1 to 'DOWN' state (disconnect the port from
>>      IB subnet)
>>    - ping host2 -> 11.0.0.1 (failed)
>>    - ifconfig ib0 0.0.0.0 (on host1)
>>    - ifconfig ib1 11.0.0.1 (on host1)
>>    - arping -I ib1 11.0.0.1 (on host1)
>>    - ping host2 -> 11.0.0.1 (passed)
>>
>> arping in this case was not really necessary because ping issues ARP
>> requests by himself.
>
> Hi Vlad,
>
> Did you have any special reason to assign host1:ib1 an IP address
> ***before*** the failover? is the reason for that happen to be having
> it joins the IPv4 multicast group at "batch time", that is not during
> the failover?

ib1 interface is loaded in any case (with or without configuration) if
ib0 is loaded by /etc/init.d/network or /etc/init.d/openibd. It can't be
configured with IP 0.0.0.0 - it fails to start with this configuration.

So, I gave it some IP in a different IP subnet.

>>    - arping -I ib1 11.0.0.1 (on host1)
>
> -U  Unsolicited ARP mode to update neighbours' ARP caches. No replies
>     are expected
>
> -A  The same as -U, but ARP REPLY packets used instead of ARP REQUEST.
>
> I think we want arping to send a gratuitous arp with the MAC of ib1
> so weren't you need to provide the -U or -A command line to arping?

You are right, I used 'arping -A ...' (forgot to insert it in the email).
Actually, I have added my flag '-R' which means '-A over IPoIB'.

> If i understand correct, gratuitous arp was not sent in your usage
> case so i am not sure Michael's patch was exercised.
>
>> Note: I updated the original arping to be able to send broadcast using
>> ipv4_bcast_addr.
>
> Can you please send the patch to arping?

Attached (arping_full_ib.c - arping.c with changes for IPoIB).

>> Also, I have tested ssh over IPoIB with the same flow. In this case
>> arping also wasn't necessary , but it makes an update of neighbors
>> with the new MAC address (of ib1 interface) more quickly.
>
> Two interesting test cases you might want to validate your approach
> with is something "long" ie that delivers much traffic before and
> after the failover ie: iperf or netperf over TCP AND UDP. I have not
> validated it but i think UDP would not generate ARP so the gratuitous
> is the only way to update the remote system with the MAC change.

I will test it later.

> Or.




/*
 * arping.c
 *
 *  This program is free software; you can redistribute it and/or
 *  modify it under the terms of the GNU General Public License
 *  as published by the Free Software Foundation; either version
 *  2 of the License, or (at your option) any later version.
 *
 * Authors: Alexey Kuznetsov, <[EMAIL PROTECTED]>
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include 

#include "SNAPSHOT.h"

static void usage(void) __attribute__((noreturn));

int quit_on_reply=0;
char *device="eth0";
int ifindex;
char *source;
struct in_addr src, dst;
char *target;
int dad, unsolicited, advert;
int quiet;
int count=-1;
int timeout;
int unicasting;
int s;
int broadcast_only;
int ib_arprep;

struct sockaddr_ll me;
struct sockaddr_ll he;

struct timeval start, last;

int sent, brd_sent;
int received, brd_recv, req_recv;


static const u8 ipv4_bcast_addr[] = {
0x00, 0xff, 0xff, 0xff,
0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
};


#define MS_TDIFF(tv1,tv2) ( ((tv1).tv_sec-(tv2).tv_sec)*1000 + \
   ((tv1).tv_usec-(tv2).tv_usec)/1000 )

void usage(void)
{
fprintf(stderr,
"Usage: arping [-fqbDUAV] [-c count] [-w timeout] [-I device] [-s source] destination\n"
"  -f : quit on first reply\n"
"  -q : be quiet\n"
"  -b : keep broadcasting, don't go unicast\n"
"  -D : duplicate address detection mode\n"
"  -U : Unsolicited ARP mode, update your neighbours\n"
"  -A : ARP answer mode, update your neighbours\n"
"  -V : print version and exit\n"
"  -c count : how many packets to send\n"
"  -w timeout : how long to wait for a reply\n"
"  -I device : which ethernet device to use (eth0)\n"
"  -s source : source ip address\n"
"  -R : send ARP reply to IPoIB broadcast address

Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-23 Thread Or Gerlitz
Vladimir Sokolovsky wrote:
> Hi Or,
> I am working on IPoIB failover.
> I tried Michael's Tsirkin patch for ipoib (updating neighbor structure) 
> and it fixes the issue Roland was talking about.
> 
> Meanwhile I have tested the following flow:
> /*_Setup description:_*/
> 
>host1 - 2 IB ports connected to IB switch.
>ib0: 11.0.0.1
>ib1: 12.0.0.1
> 
>host2 - port 1 connected to the IB switch.
>ib0: 11.0.0.2
>opensm over port1
> 
> 
> /*_Flow description:_*/
>- ping host2 -> 11.0.0.1 (passed)
>- set port1 of the host1 to 'DOWN' state (disconnect the port from IB 
> subnet)
>- ping host2 -> 11.0.0.1 (failed)
>- ifconfig ib0 0.0.0.0 (on host1)
>- ifconfig ib1 11.0.0.1 (on host1)
>- arping -I ib1 11.0.0.1 (on host1)
>- ping host2 -> 11.0.0.1 (passed)
> 
> arping in this case was not really necessary because ping issues ARP 
> requests by himself.

Hi Vlad,

Did you have any special reason to assign host1:ib1 an IP address
***before*** the failover? Is the reason to have it join the IPv4
multicast group at "batch time", that is, not during the failover?

 >- arping -I ib1 11.0.0.1 (on host1)

-U  Unsolicited ARP mode to update neighbours' ARP caches. No replies 
are expected

-A  The same as -U, but ARP REPLY packets used instead of ARP REQUEST.

I think we want arping to send a gratuitous ARP with the MAC of ib1,
so didn't you need to pass the -U or -A option to arping?

If I understand correctly, a gratuitous ARP was not sent in your usage case,
so I am not sure Michael's patch was exercised.
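
To make the difference between the two modes concrete, here is a minimal sketch of the ARP payload a gratuitous announcement over IPoIB would carry. It assumes the conventional layout (sender and target IP both set to the announced address) and the IPoIB values from RFC 4391 (hardware type 32, 20-byte hardware addresses); the function name `gratuitous_arp` and its parameters are made up for the example, and this is not a transcription of arping's own code.

```python
import socket
import struct

ARPHRD_INFINIBAND = 32    # ARP hardware type for InfiniBand (RFC 4391)
ETH_P_IP = 0x0800         # protocol type: IPv4
IPOIB_HWADDR_LEN = 20     # 1 flags byte + 3-byte QPN + 16-byte GID
ARPOP_REQUEST, ARPOP_REPLY = 1, 2


def gratuitous_arp(hwaddr, ip, broadcast_hw, reply=False):
    """Build an ARP payload announcing that `ip` is reachable at `hwaddr`.

    reply=False gives a request-style announcement (like -U),
    reply=True a reply-style one (like -A).
    """
    if len(hwaddr) != IPOIB_HWADDR_LEN or len(broadcast_hw) != IPOIB_HWADDR_LEN:
        raise ValueError("IPoIB hardware addresses are 20 bytes")
    ip_bytes = socket.inet_aton(ip)
    header = struct.pack("!HHBBH", ARPHRD_INFINIBAND, ETH_P_IP,
                         IPOIB_HWADDR_LEN, 4,
                         ARPOP_REPLY if reply else ARPOP_REQUEST)
    # Assumption: a reply-style announcement repeats the sender's own
    # hardware address as the target; a request-style one uses broadcast.
    target_hw = hwaddr if reply else broadcast_hw
    # Field order: sender hw, sender IP, target hw, target IP.
    return header + hwaddr + ip_bytes + target_hw + ip_bytes
```

Either form carries the new hardware address of ib1 in the sender field, which is what lets remote neighbours refresh their caches without first asking.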

 > Note: I updated the original arping to be able to send broadcast using
 > ipv4_bcast_addr.

Can you please send the patch to arping?

 > Also, I have tested ssh over IPoIB with the same flow. In this case
 > arping also wasn't necessary , but it makes an update of neighbors
 > with the new MAC address (of ib1 interface) more quickly.

Two interesting test cases you might want to validate your approach with
are something "long", i.e. something that delivers a lot of traffic before
and after the failover, e.g. iperf or netperf over TCP and over UDP. I have
not validated it, but I think UDP would not generate ARP, so the gratuitous
ARP is the only way to update the remote system with the MAC change.

Or.






Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-20 Thread Vladimir Sokolovsky
Hi Or,
I am working on IPoIB failover.
I tried Michael Tsirkin's patch for ipoib (updating neighbor structure)
and it fixes the issue Roland was talking about.

Meanwhile I have tested the following flow:
/*_Setup description:_*/
 
host1 - 2 IB ports connected to IB switch.
ib0: 11.0.0.1
ib1: 12.0.0.1
 
host2 - port 1 connected to the IB switch.
ib0: 11.0.0.2
opensm over port1
 
 
/*_Flow description:_*/
- ping host2 -> 11.0.0.1 (passed)
- set port1 of the host1 to 'DOWN' state (disconnect the port from 
IB subnet)
- ping host2 -> 11.0.0.1 (failed)
- ifconfig ib0 0.0.0.0 (on host1)
- ifconfig ib1 11.0.0.1 (on host1)
- arping -I ib1 11.0.0.1 (on host1)
- ping host2 -> 11.0.0.1 (passed)

arping in this case was not really necessary because ping issues ARP
requests by itself.
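
The host1 side of the flow above can be written down as an ordered command list. The sketch below only builds the argv lists and does not run anything; the helper name `failover_commands` is hypothetical, the `-A` flag is taken from the follow-up discussion in this thread, and `-c 3` is an assumption added so that arping terminates.

```python
def failover_commands(failed_if, standby_if, address, gratuitous=True):
    """Return the commands to run on the failing host, in order, as argv lists."""
    commands = [
        # Release the address from the interface whose link went down.
        ["ifconfig", failed_if, "0.0.0.0"],
        # Bring the same address up on the standby interface.
        ["ifconfig", standby_if, address],
    ]
    if gratuitous:
        # Announce the new hardware address so peers refresh their caches.
        commands.append(["arping", "-A", "-c", "3", "-I", standby_if, address])
    return commands
```

For the setup described here, `failover_commands("ib0", "ib1", "11.0.0.1")` yields the two ifconfig steps followed by the arping announcement.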

Also, I have tested ssh over IPoIB with the same flow. In this case
arping also wasn't necessary, but it makes the neighbors update to
the new MAC address (of the ib1 interface) more quickly.
Note: I updated the original arping to be able to send broadcast using
ipv4_bcast_addr.
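
For readers unfamiliar with the 20-byte IPoIB hardware address, the following sketch splits one into its parts, using the byte values of the `ipv4_bcast_addr` array from the arping patch posted in this thread. The layout (one flags byte, a 24-bit QPN, a 16-byte GID) follows RFC 4391; the helper name `split_ipoib_hwaddr` is an assumption for the example.

```python
import ipaddress

# Byte values copied from the ipv4_bcast_addr array in the posted arping patch.
IPV4_BCAST_ADDR = bytes([
    0x00, 0xff, 0xff, 0xff,
    0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
])


def split_ipoib_hwaddr(hwaddr):
    """Split a 20-byte IPoIB hardware address into (flags, qpn, gid)."""
    if len(hwaddr) != 20:
        raise ValueError("IPoIB hardware addresses are 20 bytes")
    flags = hwaddr[0]
    qpn = int.from_bytes(hwaddr[1:4], "big")
    # A GID has the same 128-bit shape as an IPv6 address, so IPv6Address
    # is used here purely as a convenient formatter.
    gid = ipaddress.IPv6Address(hwaddr[4:20])
    return flags, qpn, gid


flags, qpn, gid = split_ipoib_hwaddr(IPV4_BCAST_ADDR)
print(hex(qpn), gid)
```

The broadcast address decodes to QPN 0xffffff and a multicast GID, which is why sending an ARP "broadcast" over IPoIB means sending to a multicast group rather than to an all-ones link address.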

We should decide on the initial configuration of the IPoIB interfaces for
high availability: should they be in different IP subnets or stay in
the same one?

Regards,
Vladimir


Eitan Zahavi wrote:
> Hi Roland,
>
> We are trying this approach and will probably be done with it tomorrow.
> So I guess Vlad will be able to update the group soon.
>
> Eitan Zahavi
>
>   
>> -Original Message-
>> From: Roland Dreier
>> Sent: Thursday, July 13, 2006 11:11 PM
>> To: Or Gerlitz
>> Cc: Tziporet Koren; OpenFabricsEWG; openib
>> Subject: Re: [openib-general] OFED 1.1 release - schedule and features
>>
>>  > So if the link which ib0 maps to is DOWN you move the ib0 IPv4
>> 
> address  > to
>   
>> another device whose link is UP (eg ib1) and you somehow have ib1  >
>> 
> send a
>   
>> gratuitous ARP?
>>
>> I think there may be a problem in the way IPoIB deals with gratuitous
>> 
> ARPs.  Because
>   
>> if a neighbour structure is updated by the networking core, there's no
>> 
> way for IPoIB
>   
>> to know about that and update the associated IB path.
>>
>> Has anyone actually tried this failover approach?
>>
>>  - R.
>> 
>
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
>
>   





Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-13 Thread Michael Krause


At 03:49 PM 7/12/2006, Fabian Tillier wrote:
> Hi Mike,
>
> On 7/12/06, Michael Krause <[EMAIL PROTECTED]> wrote:
>> At 09:48 AM 7/12/2006, Jeff Broughton wrote:
>>> Modifying the sockets API is just defining yet another RDMA API, and
>>> we have so many already
>>
>> I disagree.  This effort has distilled the API to basically one for RDMA
>> developers.  Applications are supported over this via either MPI or
>> Sockets.
>
> There's been a lot of effort to make the RDMA verbs easy to use.  With
> the RDMA CM, socket-like connection semantics can be used to establish
> the connection between QPs.  The connection establishment is the hard
> part - doing I/O is trivial in comparison.  The verbs and RDMA CM
> have nothing to do with MPI.
>
> If an application is going to be RDMA aware, I don't see any reason it
> shouldn't just use the verbs directly and use the RDMA CM to establish
> the connections.

What's your point?  It seems you are in agreement that there is a
single RDMA API that people can use.

>> It seems rather self limiting to think the traditional BSD synchronous
>> Sockets API is all the world should be able to use when it comes to
>> Sockets.  Sockets developers could easily incorporate the extensions
>> into their applications providing them with improved designs and
>> flexibility without having to learn about RDMA itself.
>
> Wait, you want applications to be able to register memory and issue
> RDMA operations, but not have to learn about RDMA?  How does that make
> sense?

The Sockets API extensions allow developers to register memory.  That has
been a desire of many when it comes to SDP or copy avoidance technology, as
it optimizes the performance path by eliminating the need to do per-op
registration.  Many applications already know their working sets, and they
can use this to enable the OS and underlying infrastructure to take
advantage of this fact to improve performance and quality of the solution.
The extensions provide the async communications and event collection
mechanisms to also improve performance over the rather limiting select /
poll supported by Sockets today.

It currently does not support explicit RDMA, but it is rather trivial to
add such calls and remove the need to interject SDP if desired.  The
benefits of such new API extensions are there for those that want to
eliminate one more ULP with its unfortunate IP cloud overhead.

>> If the couple of calls necessary to extend this API to support direct
>> RDMA would allow them to eliminate SDP entirely, well, that has benefits
>> that go beyond just its all Sockets;
>
> For a socket implementation to support RDMA, the socket must have an
> underlying RDMA QP.  This means that if you want the application to
> not have to be verbs-aware, you can't really get rid of SDP - you're
> just extending SDP to let the application have a part in memory
> registration and RDMA, while still supporting the traditional BSD
> operations.  This is IMO more complex than just letting applications
> interface directly with verbs, especially since the SDP implementation
> will size the QP for its own use, without a means for negotiating with
> the user so that you don't cause buffer overruns.

Please take a look at the API extensions.  I never stated that one gets
rid of SDP unless one adds the RDMA-explicit calls.

As for complexity, well, the goal is to extend to Sockets developers the
optimal communication paradigm already available on OS such as Windows
without having to live with the same unfortunate constraints imposed by
the OS.  The same logic applies to extending the benefits derived from MPI,
which supports async communications as well as put / get semantics, which
would be analogous to the additional RDMA interfaces I referenced.

I find it strange that people would argue against improving the Sockets
developer's tool suite when the benefits are already proven elsewhere
within the industry and even within this open source effort.  Giving
the millions of Sockets developers the choice of a set of extensions that
work over both RDMA and traditional network stacks seems like a no
brainer.  Trying to force them to use a native RDMA API, even if
semantically similar to Sockets, seems like a poor path to pursue.
Leave the RDMA API to the middleware providers and those that need to be
close to the metal.

>> it also eliminates the IP cloud that hovers over SDP licensing.
>> Something that many developers and customers would appreciate.
>
> I believe that Microsoft's IP claims only apply to SDP over IB -- I
> don't believe SDP over iWarp is affected.  I don't know how the RDMA
> verbs moving towards a hardware independent (wrt IB vs. iWarp) affects
> the IP claims, but it should certainly make things interesting if a
> single SDP code base can work over both IB and iWarp.

SDP is SDP and it isn't just restricted to IB.  I'll leave it to the
lawyers to sort it out, but having a single SDP with minor code
execution path deltas for the IB-specifics isn't that hard to
construct.  It has been done on other OS already.

Mike



Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Fabian Tillier
Hi Mike,

On 7/12/06, Michael Krause <[EMAIL PROTECTED]> wrote:
>
> At 09:48 AM 7/12/2006, Jeff Broughton wrote:
>
>> Modifying the sockets API is just defining yet another RDMA API, and we have
>> so many already
>
> I disagree.  This effort has distilled the API to basically one for RDMA
> developers.  Applications are supported over this via either MPI or Sockets.

There's been a lot of effort to make the RDMA verbs easy to use.  With
the RDMA CM, socket-like connection semantics can be used to establish
the connection between QPs.  The connection establishment is the hard
part - doing I/O is trivial in comparison.  The verbs and RDMA CM
have nothing to do with MPI.

If an application is going to be RDMA aware, I don't see any reason it
shouldn't just use the verbs directly and use the RDMA CM to establish
the connections.

>It seems rather self limiting to think the traditional BSD synchronous
> Sockets API is all the world should be able to use when it comes to Sockets.
>  Sockets developers could easily incorporate the extensions into their
> applications providing them with improved designs and flexibility without
> having to learn about RDMA itself.

Wait, you want applications to be able to register memory and issue
RDMA operations, but not have to learn about RDMA?  How does that make
sense?

>  If the couple of calls necessary to
> extend this API to support direct RDMA would allow them to eliminate SDP
> entirely, well, that has benefits that go beyond just its all Sockets;

For a socket implementation to support RDMA, the socket must have an
underlying RDMA QP.  This means that if you want the application to
not have to be verbs-aware, you can't really get rid of SDP - you're
just extending SDP to let the application have a part in memory
registration and RDMA, while still supporting the traditional BSD
operations.  This is IMO more complex than just letting applications
interface directly with verbs, especially since the SDP implementation
will size the QP for its own use, without a means for negotiating with
the user so that you don't cause buffer overruns.

> it also eliminates the IP cloud that hovers over SDP licensing.   Something
> that many developers and customers would appreciate.

I believe that Microsoft's IP claims only apply to SDP over IB -- I
don't believe SDP over iWarp is affected.  I don't know how the RDMA
verbs moving towards a hardware independent (wrt IB vs. iWarp) affects
the IP claims, but it should certainly make things interesting if a
single SDP code base can work over both IB and iWarp.

- Fab




Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Michael Krause


At 09:48 AM 7/12/2006, Jeff Broughton wrote:
> Mike,
>
> The whole purpose of SDP is to make sockets go faster without having to
> have the applications modified.  This is what the customers want.  I've
> heard this time and time again, across a wide spectrum of customers.

I am well aware of this.  However, Linux / Unix do not support async
communications, which severely limits the potential performance benefits
of SDP.  When we wrote the SDP specification it was fully understood
that optimal performance is achieved through async communications.  We
spent considerable time constructing SDP to support both synchronous and
asynchronous communication paradigms, from which many applications
would benefit.

Customers want to be able to use RDMA interconnects without recompilation,
and through the use of SDP and shared libraries this is certainly
practical to execute.  Developers however are not the same as customers,
and it is developers who would benefit from the Sockets extensions, and
this would in turn benefit customers.

> Modifying the sockets API is just defining yet another RDMA API, and we
> have so many already

I disagree.  This effort has distilled the API to basically one for
RDMA developers.  Applications are supported over this via either
MPI or Sockets.  It seems rather self limiting to think
the traditional BSD synchronous Sockets API is all the world should be
able to use when it comes to Sockets.  Sockets developers could
easily incorporate the extensions into their applications providing them
with improved designs and flexibility without having to learn about RDMA
itself.  If the couple of calls necessary to extend this API to
support direct RDMA would allow them to eliminate SDP entirely, well,
that has benefits that go beyond just its all Sockets; it also eliminates
the IP cloud that hovers over SDP licensing.  Something that
many developers and customers would appreciate.

In the end, this effort could choose to progress Sockets technology and
extend the number of developers and applications that can achieve optimal
performance with only minor knowledge growth, or they can live with the
limitations of the BSD Sockets API and either accept performance loss or
be forced to jump through the hoops of using other rather niche or
obscure API to accomplish what is possible with a small number of Sockets
extensions which were defined by people with years of experience
implementing Sockets and working with application developers.

Mike

> -Jeff




> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Krause
> Sent: Wednesday, July 12, 2006 9:23 AM
> To: Tziporet Koren; Scott Weitzenkamp (sweitzen)
> Cc: OpenFabricsEWG; openib
> Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 release - schedule and features
>
> At 12:59 AM 7/12/2006, Tziporet Koren wrote:
>> Scott Weitzenkamp (sweitzen) wrote:
>>> For SDP, I would like to see "improved stability" (maybe you have this
>>> in mind under "beta quality"), also how about "AIO support"?  The rest
>>> of the list looks good.
>>
>> Yes - beta quality means improved stability.
>> AIO is not planned for 1.1 (schedule issue). If needed we can add it
>> to 1.2
>
> Would be nice if people thought about implementing the Sockets API
> Extensions from the OpenGroup.  They provide explicit memory
> management and async communications which will allow SDP performance to
> be fully exploited.  The benefits go beyond what is found in
> AIO or on other OS such as Windows.  If one were to extend slightly
> to have explicit RDMA Read and Write from the Sockets API, then it would
> be quite possible to eliminate SDP entirely for new applications leaving
> SDP strictly for legacy Sockets environments.
>
> Mike
>
>> Tziporet


Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Jeff Broughton



Mike,
 
The whole purpose of SDP is to make sockets go faster without having to
have the applications modified.  This is what the customers want.  I've
heard this time and time again, across a wide spectrum of customers.

Modifying the sockets API is just defining yet another RDMA API, and we
have so many already.
 
-Jeff

  
  
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Krause
Sent: Wednesday, July 12, 2006 9:23 AM
To: Tziporet Koren; Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; openib
Subject: Re: [openfabrics-ewg] [openib-general] OFED 1.1 release - schedule and features

At 12:59 AM 7/12/2006, Tziporet Koren wrote:
Scott Weitzenkamp (sweitzen) wrote:
> For SDP, I would like to see "improved stability" (maybe you have this
> in mind under "beta quality"), also how about "AIO support"?  The rest
> of the list looks good.
>

Yes - beta quality means improved stability.
AIO is not planned for 1.1 (schedule issue). If needed we can add it
to 1.2

Would be nice if people thought about implementing the Sockets API
Extensions from the OpenGroup.  They provide explicit memory
management and async communications which will allow SDP performance to
be fully exploited.  The benefits go beyond what is found in AIO or
on other OS such as Windows.  If one were to extend slightly to have
explicit RDMA Read and Write from the Sockets API, then it would be quite
possible to eliminate SDP entirely for new applications leaving SDP strictly
for legacy Sockets environments.

Mike

Tziporet


Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Hal Rosenstock
On Wed, 2006-07-12 at 06:51, Tziporet Koren wrote:
> Hal Rosenstock wrote:
> >> • OSM:
> >>
> >> –Partition Manager (Pkey)
> >> 
> >
> > Also, primitive QoS support.
> >
> >   
> >> –Pre-computed routing load from file 
> >> 
> >
> > Also, diags:
> >
> > Add saquery tool
> >
> > Enhancement to ibnetdiscover tool with grouping function
> >   
> 
> OK - I will update my plans with these features.

Thanks.

> BTW - I count on you to be the owner of madeye

We can cover SLES10 and RHEL4 on x86_64 and x86. Can Mellanox or someone
else pick up the other "holes" in the matrix?

-- Hal



Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Tziporet Koren
Hal Rosenstock wrote:
>> • OSM:
>>
>> –Partition Manager (Pkey)
>> 
>
> Also, primitive QoS support.
>
>   
>> –Pre-computed routing load from file 
>> 
>
> Also, diags:
>
> Add saquery tool
>
> Enhancement to ibnetdiscover tool with grouping function
>   

OK - I will update my plans with these features.
BTW - I count on you to be the owner of madeye

Tziporet


Re: [openib-general] [openfabrics-ewg] OFED 1.1 release - schedule and features

2006-07-12 Thread Hal Rosenstock
On Wed, 2006-07-12 at 01:53, Tziporet Koren wrote:
> Hi All,
> 
>  
> 
> I wish to start the release process of OFED 1.1.
> 
> I would like us to have a meeting next Monday to review this
> proposal for the release features and schedule.
> 
> If possible I wish to move the meeting hour from 9am PST to 11am or
> 11:30am PST
> 
>  
> 
> Tziporet
> 
>  
> 
> -
> 
>  
> 
> Schedule: 
> 
> Target release date: 24-Aug
> 
> Intermediate milestones:
> 
> • Development: now – 31-Jul
> 
> • Create 1.1 branch of user level code and rc1: 24-Jul
> 
> • Features freeze (rc2): 31-Jul
> 
> • Code freeze (rc-x): 18-Aug
> 
>  
> 
> Features:
> 
> • OS:
> 
> • Novell:
> 
> –SLES 9.0 SP3*
> 
> –SLES10 (official release)*
> 
> • Redhat:
> 
> –Redhat EL4 up2
> 
> –Redhat EL4 up3
> 
> • kernel.org:
> 
> –Kernel 2.6.17*
> 
> * changes from last release
> 
>  
> 
> Note: Fedora C4 and SuSE Pro 10 were dropped from the list since I
> have not seen many customers requesting them.
> 
> We will keep the backport patches for these OSes and make sure OFED
> compiles and loads properly, but will not do a full QA cycle.
> 
> Please reply if this is acceptable
> 
>  
> 
> • General changes:
> 
> –lib32 on 64 bits systems
> 
> –Add madeye utility
> 
> –Kernel code based on 2.6.18
> 
> –Bug fixes
> 
> • Core:
> 
> –Set options in CMA & uCMA (needed for Intel MPI)
> 
> –HCA fatal - full flow support
> 
> –Huge pages support
> 
> • OSM:
> 
> –Partition Manager (Pkey)

Also, primitive QoS support.

> –Pre-computed routing load from file 

Also, diags:

Add saquery tool

Enhancement to ibnetdiscover tool with grouping function

> • SDP:
> 
> –Beta quality
> 
> –Improved latency 
> 
> –Improved bandwidth of small messages (by implementing the
> Nagle algorithm)
> 
> –Support the backlog parameter in the listen call 
> 
> –Interoperability with other SDP implementations 
> 
> –support sending/receiving out of band data 
> 
> • SRP:
> 
> –GA quality
> 
> –DM (Device Mapper) - for high availability
> 
> –Basic failover/failback testing with daemon+srp+XVM/MPP and
> Engenio target
> 
> • IPoIB
> 
> –Performance tuning
> 
> –Bonding - for high availability
> 
> • uDAPL:
> 
> –Scalability features needed for Intel MPI – take from trunk
> 
> • Arlin & James – please reply if there are more features
> needed.
> 
> • OSU - MVAPICH
> 
> –Based on 0.97 (we will not move to 0.98 since we tested it
> and found it is less stable than 0.97)
> 
> –Message coalescing
> 
> • Open MPI
> 
> –TBD from Jeff
> 
> • MPI tests:
> 
> –Replace with the new test versions from LLNL, Intel, OSU
> 
> • iSER
> 
> –Any update Voltaire will drive to kernel 2.6.18
> 
> • RDS:
> 
> –TBD – Oracle and SilverStorm should decide what should be in.
> 
>  
> 
>  
> 
> Tziporet Koren
> 
> Software Director
> 
> Mellanox Technologies
> 
> mailto: [EMAIL PROTECTED]
> Tel +972-4-9097200, ext 380
> 
>  
> 
> 
> 
> __
> 
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg

