Re: Linux and system area networks

2001-06-28 Thread Bernd Eckenfels

In article <[EMAIL PROTECTED]> you wrote:
> We seem to have come full circle.  My original question was about
> providing a better way for sockets applications to take advantage of
> SAN hardware.  W2K Datacenter introduces "Winsock Direct," which will
> bypass the protocol stack when appropriate.  The Infiniband people are
> working on a "Sockets Direct" standard, which is a similar idea.  No
> one seems to care about this for Linux.

Well, there is some work being done by the zero-copy folks, and there is the
sendfile() function. Really, not much more than an mmap()ed network socket
is needed.
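
As a rough illustration of the sendfile() path, here is a minimal Python
sketch (Python only for brevity; the helper name send_file_zero_copy is made
up for this example and is not from the thread). The standard library's
socket.sendfile() uses os.sendfile() where the platform provides it, so the
kernel hands the file data to the socket without it passing through a
user-space buffer; elsewhere it falls back to plain send().

```python
import socket


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over a connected stream socket.

    Returns the number of bytes sent. The function name is hypothetical;
    the underlying call is the standard library's socket.sendfile().
    """
    with open(path, "rb") as f:
        # Uses os.sendfile() when available, otherwise falls back to send().
        return sock.sendfile(f)
```

The socket has to be a connected, blocking SOCK_STREAM socket and the file
has to be opened in binary mode.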

Besides, it looks like SAN will go all the way in the IP direction sooner or
later anyway :)

There are some interesting features, like accessing MS SQL Server 7.0 through
VIA interfaces over a SAN. I am not sure how open VIA is.

Greetings
Bernd
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Linux and system area networks

2001-06-28 Thread Roland Dreier

Pekka> If you used sockets, I believe the normal way to use SAN
Pekka> boards is to just make them look like network cards with a
Pekka> large MTU. Sure it works, but it's not very efficient :) (I
Pekka> have to admit I've not played with toys of that kind at
Pekka> all, though)

We seem to have come full circle.  My original question was about
providing a better way for sockets applications to take advantage of
SAN hardware.  W2K Datacenter introduces "Winsock Direct," which will
bypass the protocol stack when appropriate.  The Infiniband people are
working on a "Sockets Direct" standard, which is a similar idea.  No
one seems to care about this for Linux.

Roland



Re: Linux and system area networks

2001-06-28 Thread Pekka Pietikainen

On Thu, Jun 28, 2001 at 07:28:20PM +0200, Bogdan Costescu wrote:
> On Wed, 27 Jun 2001, Pekka Pietikainen wrote:
> 
> I'm sorry, but I don't understand your reference to MPI here. MPI is a
> high-level API; MPI can run on top of whatever communication features
> exist: TCP/IP, shared memory, VI, etc.

Well, as I understood it, the discussion was about how you can
make good use of your new $$$ SAN boards with your existing applications.
If you use something like MPI, you just switch to a new implementation
optimized for your network (and hope the new one is compatible
with your code ;) )

Of course you can use some lower-level API and get better 
performance, but your programs will undoubtedly be more complicated
and probably need to be rewritten for new APIs every now and then.

If you used sockets, I believe the normal way to use SAN boards
is to just make them look like network cards with a large MTU.
Sure it works, but it's not very efficient :) (I have to admit
I've not played with toys of that kind at all, though)

-- 
Pekka Pietikainen






Re: Linux and system area networks

2001-06-28 Thread Bogdan Costescu

On Wed, 27 Jun 2001, Pekka Pietikainen wrote:

> Providing a wrapper library like WSD for use with Infiniband and the
> current SAN boards would probably be a useful exercise, but to really get
> good performance (especially latency-wise) you probably want to use
> something like MPI. For many applications a wrapper will be enough, though.

I'm sorry, but I don't understand your reference to MPI here. MPI is a
high-level API; MPI can run on top of whatever communication features
exist: TCP/IP, shared memory, VI, etc.
MPI (as well as other "standards" for parallel programming - PVM, OpenMP)
came from the need to have a common interface, not to have all parallel
programs include specific code to deal with TCP/IP, shared memory, VI,
etc. whenever they were available. Instead, MPI serves as a middle-man
between them and the parallel programs. So, MPI cannot be faster than the
underlying communication features.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [EMAIL PROTECTED]





Re: Linux and system area networks

2001-06-27 Thread Pekka Pietikainen

On Tue, Jun 26, 2001 at 07:36:30AM -0500, Jesse Pollard wrote:
> > I think you misunderstood the point.  Microsoft is providing this WSD
> > DLL as a standard part of W2K now.  This means that hardware vendors
> > just have to write a SAN service provider, and all Winsock-using
> > applications benefit transparently.  No matter how good your TCP/IP
> > implementation is, you still lose (especially in latency) compared to
> > using reliable hardware transport.  Oracle-with-VI and DAFS-vs-NFS
> > benchmarks show this quite clearly.
> 
> You do lose on security. You can't use IPSec over such a device without
> some drastic overhaul.
And the performance gains are not as obvious as one would hope, as
there is some overhead caused by the WSD switch software
that transparently maps connections onto standard IP networks and SAN
boards depending on who you are talking to.

For some performance comparisons of WSD, native VI, and TCP, see the
paper "WSDLite: a Lightweight Alternative to Windows Sockets Direct
Path". There's a link to the paper at http://citeseer.nj.nec.com/388853.html
(it seems you have to use the "Cached:" links).

Providing a wrapper library like WSD for use with Infiniband and the
current SAN boards would probably be a useful exercise, but to really get
good performance (especially latency-wise) you probably want to use
something like MPI. For many applications a wrapper will be enough, though.
-- 
Pekka Pietikainen



Re: Linux and system area networks

2001-06-26 Thread Jesse Pollard

> 
> > "Pete" == Pete Zaitcev <[EMAIL PROTECTED]> writes:
> 
> Roland> The rough idea is that WSD is a new user space library
> Roland> that looks at sockets calls and decides if they have to go
> Roland> through the usual kernel network stack, or if they can be
> Roland> handed off to a "SAN service provider" which bypasses the
> Roland> network stack and uses hardware reliable transport and
> Roland> possibly RDMA.
> 
> Pete> That can be done in Linux just as easily, using the same kind of DLLs
> Pete> (they are called .so for "shared object"). If you look at
> Pete> Ashok Raj's Infi presentation, you may discern "user-level
> Pete> sockets", if you look hard enough. I invite you to try, if
> Pete> errors of others did not teach you anything.
> 
> I think you misunderstood the point.  Microsoft is providing this WSD
> DLL as a standard part of W2K now.  This means that hardware vendors
> just have to write a SAN service provider, and all Winsock-using
> applications benefit transparently.  No matter how good your TCP/IP
> implementation is, you still lose (especially in latency) compared to
> using reliable hardware transport.  Oracle-with-VI and DAFS-vs-NFS
> benchmarks show this quite clearly.

You do lose on security. You can't use IPSec over such a device without
some drastic overhaul.

> Linux has nothing to compare to Winsock Direct.  I agree, one could
> put an equivalent in glibc, or one could take advantage of Linux's
> relatively low system call latency and put something in the kernel.
> The unfortunate consequence of this is that SAN (system area network)
> hardware vendors are not going to support Linux very well.
> 
> BTW, do you have a pointer to Ashok Raj's presentation?

That would be useful. We had a presentation here, but it did not
show any great detail (mostly marketing drivel: "it will be faster/more
efficient/less overhead...", but nothing about security).
 
> Roland> This means that all applications that use Winsock benefit
> Roland> from the advanced network hardware.  Also, it means that
> Roland> Windows is much easier for hardware vendors to support
> Roland> than other OSes.  For example, Alacritech's TCP/IP offload
> Roland> NIC only works under Windows.  Microsoft is also including
> Roland> Infiniband support in Windows XP and Windows 2002.
> 
> Pete> IMHO, Alacritech is about to join scores and scores of
> Pete> vendors who tried that before. Customers understand very
> Pete> soon that a properly written host based stack works much
> Pete> better in the face of a changing environment: Faster CPUs,
> Pete> new CPUs (IA-64), new network protocols (ECN). Besides, it
> Pete> is easy to "accelerate" a bad network stack, but try to
> Pete> outdo a well done stack.
> 
> OK, how about an Infiniband network with a TCP/IP gateway at the edge?
> Have we thought about how Linux servers should use the gateway to talk
> to internet hosts?  Surely there's no point in running TCP/IP inside
> the Infiniband network, so there needs to be some concept of "socket
> over Infiniband."

One of the problems I haven't seen explained is how address translation
between TCP/IP and any SAN would work, much less how security is going to
be controlled. Personally, I think it will end up equivalent to TCP/IP over
fibre channel...

-
Jesse I Pollard, II
Email: [EMAIL PROTECTED]

Any opinions expressed are solely my own.



Re: Linux and system area networks

2001-06-25 Thread Alan Cox

> OK, how about an Infiniband network with a TCP/IP gateway at the edge?
> Have we thought about how Linux servers should use the gateway to talk
> to internet hosts?  Surely there's no point in running TCP/IP inside
> the Infiniband network, so there needs to be some concept of "socket
> over Infiniband."

So write the library. It shouldn't need the kernel involved, and you can
take over AF_INET socket syscalls with an LD_PRELOAD so it can be transparent.
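
The interposition idea can be sketched without writing the real thing. An
actual LD_PRELOAD shim would be a C shared object that overrides the libc
socket calls; the Python below is only an analogy showing the same shape:
wrap the existing connect() so that unmodified callers get rerouted. The
names interpose_connect, san_hosts and san_connect are invented for this
example, and san_connect stands in for whatever a SAN provider would do.

```python
import contextlib
import socket


@contextlib.contextmanager
def interpose_connect(san_hosts, san_connect):
    """Temporarily divert AF_INET connect() calls aimed at SAN peers.

    `san_hosts` (addresses reachable over the SAN) and `san_connect`
    (a callable taking the socket and the address) are hypothetical
    stand-ins for a real SAN provider.
    """
    real_connect = socket.socket.connect

    def connect(self, address):
        is_san_peer = (
            self.family == socket.AF_INET
            and isinstance(address, tuple)
            and address[0] in san_hosts
        )
        if is_san_peer:
            return san_connect(self, address)
        # Everything else goes through the normal network stack.
        return real_connect(self, address)

    socket.socket.connect = connect
    try:
        yield
    finally:
        # Put the original method back so the change stays scoped.
        socket.socket.connect = real_connect
```

Code inside the "with" block keeps calling sock.connect() as usual; only the
destination decides which path is taken.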



Re: Linux and system area networks

2001-06-25 Thread Alan Cox

> a properly written host based stack works much better in
> the face of a changing environment: Faster CPUs, new CPUs
> (IA-64), new network protocols (ECN). Besides, it is easy
> to "accelerate" a bad network stack, but try to outdo a
> well done stack.

Putting the stack partly in user space can sometimes be a benefit. Linux 8086
does this to cut down kernel size, for example ;)



Re: Linux and system area networks

2001-06-25 Thread Roland Dreier

> "Pete" == Pete Zaitcev <[EMAIL PROTECTED]> writes:

Roland> The rough idea is that WSD is a new user space library
Roland> that looks at sockets calls and decides if they have to go
Roland> through the usual kernel network stack, or if they can be
Roland> handed off to a "SAN service provider" which bypasses the
Roland> network stack and uses hardware reliable transport and
Roland> possibly RDMA.

Pete> That can be done in Linux just as easily, using the same kind of DLLs
Pete> (they are called .so for "shared object"). If you look at
Pete> Ashok Raj's Infi presentation, you may discern "user-level
Pete> sockets", if you look hard enough. I invite you to try, if
Pete> errors of others did not teach you anything.

I think you misunderstood the point.  Microsoft is providing this WSD
DLL as a standard part of W2K now.  This means that hardware vendors
just have to write a SAN service provider, and all Winsock-using
applications benefit transparently.  No matter how good your TCP/IP
implementation is, you still lose (especially in latency) compared to
using reliable hardware transport.  Oracle-with-VI and DAFS-vs-NFS
benchmarks show this quite clearly.

Linux has nothing to compare to Winsock Direct.  I agree, one could
put an equivalent in glibc, or one could take advantage of Linux's
relatively low system call latency and put something in the kernel.
The unfortunate consequence of this is that SAN (system area network)
hardware vendors are not going to support Linux very well.

BTW, do you have a pointer to Ashok Raj's presentation?

Roland> This means that all applications that use Winsock benefit
Roland> from the advanced network hardware.  Also, it means that
Roland> Windows is much easier for hardware vendors to support
Roland> than other OSes.  For example, Alacritech's TCP/IP offload
Roland> NIC only works under Windows.  Microsoft is also including
Roland> Infiniband support in Windows XP and Windows 2002.

Pete> IMHO, Alacritech is about to join scores and scores of
Pete> vendors who tried that before. Customers understand very
Pete> soon that a properly written host based stack works much
Pete> better in the face of a changing environment: Faster CPUs,
Pete> new CPUs (IA-64), new network protocols (ECN). Besides, it
Pete> is easy to "accelerate" a bad network stack, but try to
Pete> outdo a well done stack.

OK, how about an Infiniband network with a TCP/IP gateway at the edge?
Have we thought about how Linux servers should use the gateway to talk
to internet hosts?  Surely there's no point in running TCP/IP inside
the Infiniband network, so there needs to be some concept of "socket
over Infiniband."

Thanks,
  Roland



Re: Linux and system area networks

2001-06-25 Thread Pete Zaitcev

> I'd like to find out if anyone has thought about how Linux will handle
> some of the new network technologies people are starting to push.
> Specifically I'm talking about "System Area Networks," that is, things
> like Infiniband, as well as TCP/IP offload.

Infiniband is doing relatively well, as much as anything can
with Intel at the helm (see "Itanic"). This has nothing to
do with TCP/IP offload, which is an extremely stupid idea.
The whole thing stems from a desire by vendors to sell
"smart" (== very expensive) NICs.

RDMA in Infiniband is, in my view, little more than
traditional DMA in any other advanced server I/O bus.
Sun UPA is packetized, for instance.

> The rough idea is that WSD is a new user space library that looks at
> sockets calls and decides if they have to go through the usual kernel
> network stack, or if they can be handed off to a "SAN service
> provider" which bypasses the network stack and uses hardware reliable
> transport and possibly RDMA.

That can be done in Linux just as easily, using the same kind of DLLs
(they are called .so for "shared object"). If you look
at Ashok Raj's Infi presentation, you may discern "user-level
sockets", if you look hard enough. I invite you to try, if
errors of others did not teach you anything.

> This means that all applications that use Winsock benefit from the
> advanced network hardware.  Also, it means that Windows is much easier
> for hardware vendors to support than other OSes.  For example,
> Alacritech's TCP/IP offload NIC only works under Windows.  Microsoft
> is also including Infiniband support in Windows XP and Windows 2002.

IMHO, Alacritech is about to join scores and scores of vendors
who tried that before. Customers understand very soon that
a properly written host based stack works much better in
the face of a changing environment: Faster CPUs, new CPUs
(IA-64), new network protocols (ECN). Besides, it is easy
to "accelerate" a bad network stack, but try to outdo a
well done stack.

> So I guess my question is whether anyone has started thinking about
> the architectural changes needed to make System Area Networking and
> TCP/IP offload easier under Linux.

The zero-copy work that DaveM and Co. are doing pretty much addresses this.

-- Pete



Linux and system area networks

2001-06-25 Thread Roland Dreier

I'd like to find out if anyone has thought about how Linux will handle
some of the new network technologies people are starting to push.
Specifically I'm talking about "System Area Networks," that is, things
like Infiniband, as well as TCP/IP offload.

In the past people have advocated VIA as a way to use network hardware
that provides reliability and remote DMA (RDMA).  However, VI never
really caught on because it requires applications to be completely
rewritten.  In addition, the corporate backers of VI seem to have
mostly given up on it.

Late last year, Network Appliance proposed something they called
"DASockets," which would mostly preserve socket semantics.  However
that seems to have been put on hold.

Microsoft recently introduced something called "Winsock Direct" in W2K
Datacenter.  For more info you can look at:

http://www.microsoft.com/windows2000/en/datacenter/help/default.asp?url=/WINDOWS2000/en/datacenter/help/WSD_and_SAN.htm

The rough idea is that WSD is a new user space library that looks at
sockets calls and decides if they have to go through the usual kernel
network stack, or if they can be handed off to a "SAN service
provider" which bypasses the network stack and uses hardware reliable
transport and possibly RDMA.
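
To make the "switch" idea concrete, here is a toy Python sketch of the
decision it makes. This is not Microsoft's implementation or API: Switch,
kernel_stack, san_provider and the peer names are all invented for
illustration, and the two providers just report which path they would have
used instead of moving any data.

```python
from typing import Callable, Iterable

# A provider takes (peer, payload) and reports which path carried the data.
Provider = Callable[[str, bytes], str]


def kernel_stack(peer: str, payload: bytes) -> str:
    """Stand-in for the ordinary TCP/IP path through the kernel."""
    return f"kernel:{peer}:{len(payload)}"


def san_provider(peer: str, payload: bytes) -> str:
    """Stand-in for a vendor-supplied SAN service provider."""
    return f"san:{peer}:{len(payload)}"


class Switch:
    """Chooses a provider per destination, invisibly to the caller."""

    def __init__(self, san_peers: Iterable[str]) -> None:
        self.san_peers = frozenset(san_peers)

    def send(self, peer: str, payload: bytes) -> str:
        # Peers on the SAN bypass the stack; everyone else uses TCP/IP.
        provider: Provider = (
            san_provider if peer in self.san_peers else kernel_stack
        )
        return provider(peer, payload)
```

The application only ever calls send(); whether a given peer is served by the
SAN provider or by the kernel stack is decided inside the switch.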

This means that all applications that use Winsock benefit from the
advanced network hardware.  Also, it means that Windows is much easier
for hardware vendors to support than other OSes.  For example,
Alacritech's TCP/IP offload NIC only works under Windows.  Microsoft
is also including Infiniband support in Windows XP and Windows 2002.
(Intel will be pushing Infiniband onto motherboards pretty soon, which
will bring network hardware with reliable transport and RDMA into the
mainstream.)

So I guess my question is whether anyone has started thinking about
the architectural changes needed to make System Area Networking and
TCP/IP offload easier under Linux.

Thanks,
  Roland
-- 
Roland Dreier <[EMAIL PROTECTED]>
GPG Key fingerprint = A89F B5E9 C185 F34D BD50  4009 37E2 25CC E0EE FAC0