On Wednesday 14 June 2006 08:12, 张韡武 <[EMAIL PROTECTED]> wrote 
about '[gentoo-user] how does a pipe work? Which process wait for which 
one, or they don't actually wait each other?':
> How does a pipe actually work? I mean, when there is a pipe like this:
> $ appA | appB
> what happens if appA produces output while appB is still busy processing
> the data and does not require any data from its input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply
> keeps outputting data, and this data is cached in memory (as long as
> there is enough memory) and will finally be fed to appB when appB
> finishes its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requiring data, appA is suspended
> and the resource goes to appB. appA is only given the resource (CPU?)
> again when appB finishes its business and begins to accept more data;
>
> Which one is true? I know 1) and 2) usually make no difference to most
> users; that's why a detailed explanation of "1) or 2)" is so hard to
> google up.

Neither! Both!

The implementation of pipes varies from *NIX to *NIX, and possibly within 
the same *NIX, since a shell might 'enhance' the pipes provided by the 
kernel/libc.  (The shell binary is ultimately responsible for setting up 
the pipe, so it may arbitrarily 'decorate' a standard pipe.)
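
For the curious, this is roughly what any POSIX shell does for
"appA | appB" (a minimal sketch, not taken from any particular shell's
source): pipe(), two fork()s, dup2() to point the children's stdout/stdin
at the pipe ends, then exec.

    /* sketch of the plumbing a shell performs for "appA | appB" */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];                     /* fd[0] = read end, fd[1] = write end */
        pid_t a, b;

        if (pipe(fd) == -1) { perror("pipe"); exit(1); }

        if ((a = fork()) == 0) {       /* child 1: appA, stdout -> pipe */
            dup2(fd[1], STDOUT_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("appA", "appA", (char *)NULL);
            perror("appA"); _exit(127);
        }
        if ((b = fork()) == 0) {       /* child 2: appB, stdin <- pipe */
            dup2(fd[0], STDIN_FILENO);
            close(fd[0]); close(fd[1]);
            execlp("appB", "appB", (char *)NULL);
            perror("appB"); _exit(127);
        }

        /* the shell closes both ends, then waits for both children */
        close(fd[0]); close(fd[1]);
        waitpid(a, NULL, 0);
        waitpid(b, NULL, 0);
        return 0;
    }

The important point: the shell only creates and hands out the file 
descriptors; the buffering and blocking happen in the kernel once data 
starts flowing.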

In any case, you can't depend on any particular behavior if you want your 
shell script to be portable.

In Linux/bash I believe it works like this: each pipe has a small (~1 
page) FIFO buffer, held in kernel memory.  Both processes are started and 
compete for CPU time in the standard way, and either one may block when 
it performs ordinary, blocking I/O on the pipe: appA blocks if the FIFO 
gets full; appB blocks if the FIFO gets empty.
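
If you want to see that FIFO for yourself, here is a rough probe (a 
sketch assuming Linux/POSIX semantics): make the write end non-blocking 
and count how many bytes fit before write() refuses.  With normal 
blocking I/O that same write() would simply put the writer to sleep until 
the reader drains the pipe -- which is essentially your possibility 2), 
with the small buffer from possibility 1) sitting in front of it.

    /* count how many bytes a pipe holds before the writer would block */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        char byte = 'x';
        long total = 0;

        if (pipe(fd) == -1) { perror("pipe"); return 1; }
        fcntl(fd[1], F_SETFL, O_NONBLOCK);   /* fail with EAGAIN, don't sleep */

        while (write(fd[1], &byte, 1) == 1)  /* nothing reads fd[0], so it fills */
            total++;

        if (errno == EAGAIN)
            printf("this pipe buffers %ld bytes\n", total);
        return 0;
    }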

If you really must know: Use the Source, Luke.

> In my case appA gets the data from another host that has very short
> timeout settings, and appB is used to compress the data obtained from
> appA.  The compression is very slow, usually around 30Kbps, while the
> network is very fast, around 10Mbps.  appB compresses the data chunk by
> chunk, so if Linux actually works in mode 2), the network connection is
> dropped whenever the interval between two chunks of appB's compression
> is longer than the network timeout setting.  appA actually doesn't know
> how to restart the connection from where it was dropped, so
> understanding this difference matters to me.

This also depends a lot on the application. appA can use asynchronous I/O, 
provide a larger buffer (perhaps even a temporary file), and/or send 
keepalives through the network.  Also, appB's compression may be 
interrupted while more data is written to the buffer.
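
To illustrate the first two options: appA can keep draining the fast 
network socket into a user-space buffer and feed the slow pipe only as 
fast as appB will take it.  A minimal select()-based sketch (net_fd is 
assumed to be an already-connected socket; the 4MB buffer size is an 
arbitrary choice):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/select.h>
    #include <unistd.h>

    #define BUFCAP (4 * 1024 * 1024)     /* 4MB of slack between net and pipe */

    void relay(int net_fd)
    {
        static char buf[BUFCAP];
        size_t head = 0, tail = 0;       /* buf[head..tail) is pending data */
        int net_open = 1;

        fcntl(STDOUT_FILENO, F_SETFL, O_NONBLOCK);

        while (net_open || head < tail) {
            fd_set rs, ws;

            /* compact the buffer when it runs out of room at the end */
            if (net_open && tail == BUFCAP && head > 0) {
                memmove(buf, buf + head, tail - head);
                tail -= head; head = 0;
            }

            FD_ZERO(&rs); FD_ZERO(&ws);
            if (net_open && tail < BUFCAP) FD_SET(net_fd, &rs);
            if (head < tail)               FD_SET(STDOUT_FILENO, &ws);

            int maxfd = net_fd > STDOUT_FILENO ? net_fd : STDOUT_FILENO;
            if (select(maxfd + 1, &rs, &ws, NULL, NULL) < 0)
                break;

            if (FD_ISSET(net_fd, &rs)) {          /* fast: keep the sender happy */
                ssize_t n = read(net_fd, buf + tail, BUFCAP - tail);
                if (n <= 0) net_open = 0; else tail += n;
            }
            if (FD_ISSET(STDOUT_FILENO, &ws)) {   /* slow: feed appB when ready */
                ssize_t n = write(STDOUT_FILENO, buf + head, tail - head);
                if (n > 0) head += n;
            }
        }
    }

This way the remote host sees steady reads even while appB is busy 
compressing, as long as the buffer doesn't fill up.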

> I made several experiments and my appA and appB both work fine, but I
> don't dare share this appA/appB with others until I understand the
> mechanism.

With the speeds you mention, the timeout would have to be ~8s or less for 
the socket to be dropped.[1]  Once a socket is established, it is 
amazingly stable; the timeout for an established socket is usually more 
like 5-10 minutes, or even an hour.  Heck, I think OpenBSD 3.8 defaulted 
to a 1 DAY timeout before the OS reaped an established socket.
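
And if the timeout you are fighting really is at the TCP level, appA can 
ask the kernel to probe an idle connection on its own.  A minimal sketch, 
assuming appA owns the socket descriptor (SO_KEEPALIVE is portable; the 
TCP_KEEP* knobs are Linux-specific).  Note this does nothing against an 
application-level timeout on the far end -- that needs the 
application-level keepalives mentioned above.

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    void enable_keepalive(int sock)
    {
        int on = 1, idle = 30, interval = 10, probes = 5;

        /* turn TCP keepalive probing on (the default idle time is ~2 hours) */
        setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));

        /* Linux only: probe after 30s idle, every 10s, give up after 5 misses */
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,     sizeof(idle));
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT,   &probes,   sizeof(probes));
    }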

Also, you generally want to compress stuff before putting it on the wire, 
not after...

-- 
"If there's one thing we've established over the years,
it's that the vast majority of our users don't have the slightest
clue what's best for them in terms of package stability."
-- Gentoo Developer Ciaran McCreesh

[1] That's assuming a 15Kbps compression rate and the ability to send 
full-size 16KB IP packets: 16KB is ~128Kb, and 128Kb / 15Kbps is roughly 
8.5s.  Most likely ~1s would suffice, since packets are generally 1500B, 
i.e. about 12Kb, in size.
