Re: Network pipes
Diomidis Spinellis wrote:
> I think I can package the proposed sh changes as a separate command, following Luigi's suggestion. The syntax will not include a pipe symbol and layout, but the performance benefits will still be there. It will also be a lot more portable and also usable within any shell.

The source code and the documentation for the netpipe standalone command are now available for download under a BSD license at http://www.spinellis.gr/sw/unix/netpipe/.

Netpipe connects a specified remote command over a TCP/IP socket to a local input generation command and/or a local output processing command. The input and output of the remote command are redirected so that its input comes from the local input generation command and its output is sent to the local output processing command. The remote command is executed on the machine accessed through the login command. The netpipe executable should be available through the execution path on the remote machine. The braces used for delimiting the commands and their arguments should be space-separated and can be nested. This feature allows you to set up complex and efficient topologies of distributed communicating processes.

Although the initial netpipe communication setup is performed through client-server intermediaries such as ssh(1) or rsh(1), the communication channel that netpipe establishes is a direct socket connection between the local and the remote commands. Without netpipe, when piping remote data through ssh(1) or rsh(1), each data block is read at the local end by the respective client, sent to the remote daemon, and written out again to the remote process. Netpipe removes the inefficiency of the multiple data copies and context switches and can in some cases provide dramatic throughput improvements.

On the other hand, the confidentiality and integrity of the data passing through netpipe's data channel are not protected; netpipe should therefore be used only within a confined LAN environment. (The authentication process uses the protocol of the underlying login program and is no more or less vulnerable than using the program in isolation; ssh(1) remains secure, rsh(1) continues to be insecure.)

Diomidis - http://www.spinellis.gr
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: Network pipes
On Thu, 2003-07-24 at 21:51, Leo Bicknell wrote:
> In a message written on Thu, Jul 24, 2003 at 12:48:23PM -0700, Tim Kientzle wrote:
> > Another approach would be to add a new option to SSH so that it could encrypt only the initial authentication, then pass data unencrypted after that. This would go a long way to addressing the performance concerns.
>
> ssh -c none?

[EMAIL PROTECTED]:~$ uname -srm
FreeBSD 5.1-RELEASE i386
[EMAIL PROTECTED]:~$ ssh -c none localhost
No valid ciphers for protocol version 2 given, using defaults.

Nice idea. OpenSSH has deliberately broken this, and last time I looked will not entertain unbreaking it. The patch is trivial, though.

> Note, you don't want to use password authentication in this case, but public key should still be ok.
Re: Network pipes
Kirk Strauser wrote:
> # ssh -f remotehost 'nc -l -p 54321 | dd of=/dev/st0 bs=32k'
> # tar cvf - / | nc remotehost 54321
>
> Netcat implements TCP/UDP transport and basically nothing else. Isn't that what you're trying to achieve?

You still have the overhead of two nc instances copying data and context switching. The same problem applies to the ssh -c none approach. My original proposal would set up a direct socket connection from tar to dd.

I think I can package the proposed sh changes as a separate command, following Luigi's suggestion. The syntax will not include a pipe symbol and layout, but the performance benefits will still be there. It will also be a lot more portable and also usable within any shell.

Many thanks to all for your feedback.

Diomidis - http://www.spinellis.gr
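The copying overhead mentioned above comes from an intermediary process sitting between the data producer and the network. As a rough illustration (this is not nc or ssh source code; the function name `relay` and the 32k buffer size are assumptions made for the sketch), a relay has to do something like the following for every block, and each read and write is a system call plus a copy between kernel and user space:

```python
import os


def relay(src_fd, dst_fd, bufsize=32 * 1024):
    """Copy everything from src_fd to dst_fd until EOF.

    Returns (bytes_copied, read_calls). Every block crosses the
    user/kernel boundary twice: once on read, once on write.
    """
    total = 0
    reads = 0
    while True:
        block = os.read(src_fd, bufsize)
        if not block:          # EOF on the source descriptor
            return total, reads
        reads += 1
        view = memoryview(block)
        while len(view) > 0:   # handle short writes
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(block)
```

With a direct socket between the producer and the consumer, no process runs such a loop on either host; the producer's own write(2) calls go straight to the socket.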
Re: Network pipes
On Thu, Jul 24, 2003 at 05:51:27PM -0400, Leo Bicknell wrote:
> In a message written on Thu, Jul 24, 2003 at 12:48:23PM -0700, Tim Kientzle wrote:
> > Another approach would be to add a new option to SSH so that it could encrypt only the initial authentication, then pass data unencrypted after that. This would go a long way to addressing the performance concerns.
>
> ssh -c none?
>
> Note, you don't want to use password authentication in this case, but public key should still be ok.

I have patches for this, they may be a little out-of-date:

http://www.incunabulum.com/code/patches/openssh/
http://www.incunabulum.com/code/patches/openssl/

BMS
Re: Network pipes
At 2003-07-25T06:06:01Z, Diomidis Spinellis [EMAIL PROTECTED] writes:
> You still have the overhead of two nc instances copying data and context switching.

Forgive my ignorance, but is that significantly higher than two /bin/sh instances copying data and context switching?

-- 
Kirk Strauser
Re: Network pipes
Kirk Strauser wrote:
> At 2003-07-25T06:06:01Z, Diomidis Spinellis [EMAIL PROTECTED] writes:
> > You still have the overhead of two nc instances copying data and context switching.
>
> Forgive my ignorance, but is that significantly higher than two /bin/sh instances copying data and context switching?

When the shell connects two local processes with a pipe, it just redirects the output of one and the input of the other to the two ends of a pipe(2) IPC abstraction and leaves them to communicate with each other, simply wait(2)ing until they finish. The shell does not shuffle the data between the two processes.

The same can also be done when connecting a local process with a remote process through a socket(2); there is no need for an intermediary, and this is what I have implemented.

Diomidis - http://www.spinellis.gr
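The plumbing described above can be sketched in a few lines of Python. This is only an illustration of the pipe(2)/fork/dup2/exec/wait pattern, not code taken from /bin/sh; the names `run_pipeline` and `_exec_with` are made up for the example, and the second pipe exists only so the demo can capture the consumer's output (a real shell would let the consumer inherit its own stdout).

```python
import os


def _exec_with(stdin_fd, stdout_fd, argv, close_fds):
    """Child side: redirect stdin/stdout, drop spare descriptors, exec."""
    if stdin_fd is not None:
        os.dup2(stdin_fd, 0)
    if stdout_fd is not None:
        os.dup2(stdout_fd, 1)
    for fd in close_fds:
        os.close(fd)
    try:
        os.execvp(argv[0], argv)
    finally:
        os._exit(127)  # reached only if exec failed


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return the consumer's output."""
    pipe_r, pipe_w = os.pipe()   # connects producer to consumer
    out_r, out_w = os.pipe()     # demo only: captures consumer output
    all_fds = (pipe_r, pipe_w, out_r, out_w)

    pid_a = os.fork()
    if pid_a == 0:
        _exec_with(None, pipe_w, producer, all_fds)
    pid_b = os.fork()
    if pid_b == 0:
        _exec_with(pipe_r, out_w, consumer, all_fds)

    # The parent closes its copies and never reads or writes the
    # producer-to-consumer pipe; it only waits for the children.
    for fd in (pipe_r, pipe_w, out_w):
        os.close(fd)
    with os.fdopen(out_r, "rb") as captured:
        output = captured.read()
    os.waitpid(pid_a, 0)
    os.waitpid(pid_b, 0)
    return output
```

For instance, `run_pipeline(["echo", "hello"], ["tr", "a-z", "A-Z"])` runs the equivalent of `echo hello | tr a-z A-Z` with the parent acting only as the plumber.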
Re: Network pipes
At 2003-07-25T15:48:13Z, Diomidis Spinellis [EMAIL PROTECTED] writes:
> The same can also be done when connecting a local process with a remote process through a socket(2); there is no need for an intermediary, and this is what I have implemented.

Gotcha.

-- 
Kirk Strauser
Network pipes
I am currently testing a set of modifications to /bin/sh that allow a user to create a pipeline over the network using a socket as its endpoints. Currently a command like

    tar cvf - / | ssh remotehost dd of=/dev/st0 bs=32k

has tar sending each block through a pipe to a local ssh process, ssh communicating through a socket with a remote ssh daemon, and dd communicating with sshd through a pipe again. The changed shell allows you to write

    tar cvf - / |@ ssh remotehost -- dd of=/dev/st0 bs=32k | :

The effect of the above command is that a socket is created between the local and the remote host, with the standard output of tar and the standard input of dd redirected to that socket. Authentication is still performed using ssh (or any other remote login mechanism you specify before the -- argument), but the flow between the two processes is from then on not protected in terms of integrity and privacy. Thus the method will mostly be useful within the context of a LAN or a VPN.

The authentication design requires the users to have a special command in their path on the remote host, but does not require an additional privileged server or the reservation of special ports.

By eliminating two processes, the associated context switches, the data copying, and (in the case of ssh) encryption, performance is markedly improved:

    dd if=/dev/zero bs=4k count=8192 | ssh remotehost -- dd of=/dev/null
    33554432 bytes transferred in 17.118648 secs (1960110 bytes/sec)

    dd if=/dev/zero bs=4k count=8192 |@ ssh remotehost -- dd of=/dev/null | :
    33554432 bytes transferred in 4.452980 secs (7535276 bytes/sec)

Even eliminating the encryption overhead by using rsh you can still see

    dd if=/dev/zero bs=4k count=8192 | rsh remotehost -- dd of=/dev/null
    33554432 bytes transferred in 131.907130 secs (254379 bytes/sec)

    dd if=/dev/zero bs=4k count=8192 |@ rsh remotehost -- dd of=/dev/null | :
    33554432 bytes transferred in 86.545385 secs (387709 bytes/sec)

My questions are:

1. How do you feel about integrating these changes to the /bin/sh in -CURRENT? Note that network pipes are a different process plumbing mechanism, so they really do belong to a shell; implementing them through a separate command would be inelegant.
2. Do you see any problems with the new syntax introduced?
3. After the remote process starts running, standard error output is lost. Do you find this a significant problem?
4. Both sides of the remote process are communication endpoints and have to be connected to other local processes via pipes. Would it be enough to document this behaviour or should it be hidden from the user by means of forked read/write processes?

Diomidis - http://www.spinellis.gr
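The dd figures quoted in this message can be cross-checked with a little arithmetic. The sketch below only reuses the byte count and elapsed times reported above; the dictionary keys and the helper names `throughput` and `speedup` are labels invented for the example.

```python
# 8192 blocks of 4k each, as in `dd bs=4k count=8192`
TOTAL_BYTES = 8192 * 4096

# Elapsed seconds as reported by dd in the message
ELAPSED = {
    "| ssh": 17.118648,
    "|@ ssh": 4.452980,
    "| rsh": 131.907130,
    "|@ rsh": 86.545385,
}


def throughput(label):
    """Bytes per second for one of the four runs."""
    return TOTAL_BYTES / ELAPSED[label]


def speedup(baseline, improved):
    """How many times faster `improved` finished than `baseline`."""
    return ELAPSED[baseline] / ELAPSED[improved]


if __name__ == "__main__":
    for label in ELAPSED:
        print(f"{label:7s} {throughput(label):12.0f} bytes/sec")
    print(f"ssh speedup: {speedup('| ssh', '|@ ssh'):.2f}x")
    print(f"rsh speedup: {speedup('| rsh', '|@ rsh'):.2f}x")
```

The network pipe comes out roughly 3.8 times faster than the plain ssh pipeline and roughly 1.5 times faster than the plain rsh pipeline. Only the ratio within each pair is meaningful from these numbers alone; comparing the ssh figures against the rsh figures requires knowing that both pairs ran on the same hardware.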
Re: Network pipes
hi, i have the following questions:

* strange benchmark results! Given the description, I would expect the |@ rsh and |@ ssh cases to give the same throughput, and in any case | rsh to be faster than | ssh. How come, instead, the times differ by an order of magnitude? Can you run the tests in similar conditions to appreciate the gains better?

* I do not understand how you can remove the pipe in the remote host without modifying both sshd and sh there. I think it would be very important to understand how much |@ depends on the behaviour of the remote daemon.

* the loss of encryption on the channel is certainly something that might escape the attention of the user. I also wonder in how many cases you really need the extra performance to justify the extra plumbing mechanism.

* there are subtle implications of your new plumbing in the way processes are started. With A | B | C the shell first creates the pipes, then it can start the processes in any order, and they can individually fail to start without any direct consequence other than an I/O failure. A |@ B |@ C requires that you start things from the end of the chain (because you cannot start a process until you have a [socket] descriptor from the next stage in the chain), and if a process fails to start you cannot even start the next one in the sequence. Not that this is bad, just very different from regular pipes.

All the above leaves me a bit puzzled on whether or not this is a useful addition... In fact, i am not convinced that network pipes should be implemented in the shell...

cheers
luigi
Re: Network pipes
Luigi Rizzo wrote:
> * strange benchmark results! Given the description, I would expect the |@ rsh and |@ ssh cases to give the same throughput, and in any case | rsh to be faster than | ssh. How come, instead, the times differ by an order of magnitude? Can you run the tests in similar conditions to appreciate the gains better?

They were executed on different machines. The ssh result was between freefall.freebsd.org and ref5, the rsh result was between old low-end Pentium machines on my home network.

> * I do not understand how you can remove the pipe in the remote host without modifying both sshd and sh there. I think it would be very important to understand how much |@ depends on the behaviour of the remote daemon.

The remote daemon is only used for authentication. Thus any remote host command execution method can be used without modifying the client or the server. What the modified shell does is start on the remote machine a separate command, netpipe. Netpipe takes as arguments the originating host, the socket port, the command to execute, and its arguments. Netpipe opens the socket back to the originating host, redirects its I/O to the socket, and execs the specified command.

> * the loss of encryption on the channel is certainly something that might escape the attention of the user. I also wonder in how many cases you really need the extra performance to justify the extra plumbing mechanism.

I felt the need for such functionality when moving GBs of data between different machines for creating a disk copy and backup to tape. My requirements may be atypical; this is why I asked for input.

> * there are subtle implications of your new plumbing in the way processes are started. With A | B | C the shell first creates the pipes, then it can start the processes in any order, and they can individually fail to start without any direct consequence other than an I/O failure. A |@ B |@ C requires that you start things from the end of the chain (because you cannot start a process until you have a [socket] descriptor from the next stage in the chain), and if a process fails to start you cannot even start the next one in the sequence. Not that this is bad, just very different from regular pipes.

It is even worse. You cannot write A |@ B |@ C because sockets are created on the originating host. For the above to work you would need a mechanism to create another socket between the B and C machines. Maybe the syntax should be changed to make such constructions counterintuitive.

Diomidis
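The connect-back step described in this message (take a host, a port and a command; open a socket to the originating host; redirect I/O to it; exec the command) can be sketched as follows. This is a hedged illustration in Python, not the actual netpipe source: the function names `connect_back_and_exec` and `demo` are invented, and the demo runs both ends on the loopback address instead of starting the helper through ssh or rsh.

```python
import os
import socket


def connect_back_and_exec(host, port, argv):
    """Remote side: connect to host:port, make the socket the
    process's stdin and stdout, then exec the requested command."""
    sock = socket.create_connection((host, port))
    os.dup2(sock.fileno(), 0)   # command reads from the socket
    os.dup2(sock.fileno(), 1)   # command writes to the socket
    os.execvp(argv[0], argv)    # does not return on success


def demo():
    """Originating side, collapsed onto one machine for illustration."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # let the kernel pick a port
    listener.listen(1)
    port = listener.getsockname()[1]

    pid = os.fork()
    if pid == 0:                         # stands in for the remote host
        listener.close()
        try:
            connect_back_and_exec("127.0.0.1", port, ["tr", "a-z", "A-Z"])
        finally:
            os._exit(127)                # reached only if exec failed

    conn, _ = listener.accept()
    listener.close()
    conn.sendall(b"hello\n")
    conn.shutdown(socket.SHUT_WR)        # signals EOF to the command
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:
            break
        chunks.append(data)
    conn.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

Because the listening socket lives on the originating host and the helper connects back to it, the originating host must be reachable from the remote one on the chosen port.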
Re: Network pipes
----- Diomidis Spinellis's Original Message -----
> Luigi Rizzo wrote:
> > * the loss of encryption on the channel is certainly something that might escape the attention of the user. I also wonder in how many cases you really need the extra performance to justify the extra plumbing mechanism.

Once you pass a certain order of magnitude, it becomes an overriding priority. That is the reason why many backup systems are hand-crafted.

> I felt the need for such functionality when moving GB data between different machines for creating a disk copy and backup to tape. My requirements may be atypical, this is why I asked for input.

Your requirements are not atypical. There are folks out here dealing with 100s or 1000s of TB.

> > * there are subtle implications of your new plumbing in the way processes are started. With A | B | C the shell first creates the pipes, then it can start the processes in any order, and they can individually fail to start without any direct consequence other than an I/O failure. A |@ B |@ C requires that you start things from the end of the chain (because you cannot start a process until you have a [socket] descriptor from the next stage in the chain), and if a process fails to start you cannot even start the next one in the sequence. Not that this is bad, just very different from regular pipes.
>
> It is even worse. You can not write A |@ B |@ C because sockets are created on the originating host. For the above to work you would need a mechanism to create another socket between the B and C machines. Maybe the syntax should be changed to make such constructions counterintuitive.

Syntactic consistency should be a high priority.

-john
Re: Network pipes
i understand the motivations (speeding up massive remote backups and the like), but i do not believe you need to introduce a new shell construct (with very different semantics) just to accommodate this.

I believe it is a lot more intuitive to say something like

    foo [flags] source-host tar cvzf - /usr dest-host dd of=/dev/bar

and have your 'foo' command do the authentication using ssh or whatever you require with the flags, create both ends of the socket, call dup() as appropriate and then exec the source and destination pipelines.

cheers
luigi
Re: Network pipes
Diomidis Spinellis wrote this message on Thu, Jul 24, 2003 at 14:04 +0300:
> separate command netpipe. Netpipe takes as arguments the originating host, the socket port, the command to execute, and its arguments. Netpipe opens the socket back to the originating host, redirects its I/O to the socket, and execs the specified command.

This breaks NAT firewalls. It is a very common occurrence to only accept incoming connections, and only on certain ports. This means any sort of firewall will probably be broken by this. :(

i.e. behind a NAT to a public system, the return connection can't be established. From any system to a NAT-redirected ssh server, the incoming connection won't make it to the destination machine.

I think this should just be a utility like Luigi suggested. This will help solve these problems.

-- 
John-Mark Gurney    Voice: +1 415 225 5579
All that I will do, has been done, All that I have, has not.
Re: Network pipes
> I think this should just be a utility like Luigi suggested. This will help solve these problems.

And in large part the traditional netpipe/socket tools in combination with the -L and -R flags of SSH solve these issues rather handily. And when used with ssh-keyagent rather nicely.

Dw
Re: Network pipes
On Thu, Jul 24, 2003 at 10:36:41AM -0700, John-Mark Gurney wrote:
> Diomidis Spinellis wrote this message on Thu, Jul 24, 2003 at 14:04 +0300:
> > separate command netpipe. Netpipe takes as arguments the originating host, the socket port, the command to execute, and its arguments. Netpipe opens the socket back to the originating host, redirects its I/O to the socket, and execs the specified command.
>
> This breaks NAT firewalls. It is a very common occurrence to only accept incoming connections, and only on certain ports. This means any sort of firewall will probably be broken by this. :(

actually it is the other way around -- this solution simply won't work on firewalled systems. But to tell the truth, i doubt you'd do a multi-GB backup through a NAT and be worried about the encryption overhead.

cheers
luigi

> i.e. behind a NAT to a public system, the return connection can't be established. From any system to a NAT-redirected ssh server, the incoming connection won't make it to the destination machine.
>
> I think this should just be a utility like Luigi suggested. This will help solve these problems.
Re: Network pipes
Dirk-Willem van Gulik wrote:
> > I think this should just be a utility like Luigi suggested. This will help solve these problems.
>
> And in large part the traditional netpipe/socket tools in combination with the -L and -R flags of SSH solve these issues rather handily. And when used with ssh-keyagent rather nicely.

But piping GBs of data through an encrypted SSH connection is still slow. The performance issues the OP is trying to address are real.

Another approach would be to add a new option to SSH so that it could encrypt only the initial authentication, then pass data unencrypted after that. This would go a long way to addressing the performance concerns.

Tim
Re: Network pipes
In a message written on Thu, Jul 24, 2003 at 12:48:23PM -0700, Tim Kientzle wrote:
> Another approach would be to add a new option to SSH so that it could encrypt only the initial authentication, then pass data unencrypted after that. This would go a long way to addressing the performance concerns.

ssh -c none?

Note, you don't want to use password authentication in this case, but public key should still be ok.

You could also set up something like kerberos and use krsh or similar...

-- 
Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org
Re: Network pipes
At 2003-07-24T08:19:49Z, Diomidis Spinellis [EMAIL PROTECTED] writes:
> tar cvf - / |@ ssh remotehost -- dd of=/dev/st0 bs=32k | :
>
> The effect of the above command is that a socket is created between the local and the remote host with the standard output of tar and the standard input of dd redirected to that socket. Authentication is still performed using ssh (or any other remote login mechanism you specify before the -- argument), but the flow between the two processes is from then on not protected in terms of integrity and privacy. Thus the method will mostly be useful within the context of a LAN or a VPN.

Isn't this almost the same as:

    # ssh -f remotehost 'nc -l -p 54321 | dd of=/dev/st0 bs=32k'
    # tar cvf - / | nc remotehost 54321

Netcat implements TCP/UDP transport and basically nothing else. Isn't that what you're trying to achieve?

-- 
Kirk Strauser