Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?

2006-06-14 Thread Boyd Stephen Smith Jr.
On Wednesday 14 June 2006 08:12, 张韡武 <[EMAIL PROTECTED]> wrote 
about '[gentoo-user] how does a pipe work? Which process wait for which 
one, or they don't actually wait each other?':
> How does a pipe actually work? I mean, when there is a pipe like this:
> $ appA | appB
> What happens if appA produces output while appB is still busy processing
> earlier data and is not reading any input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply
> keeps outputting data, and this data is cached in memory as long as
> there is enough memory, and is finally fed to appB when appB has
> finished its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended
> and the resource goes to appB. appA is only given the resource (CPU?)
> again when appB has finished its business and begins to accept more data;
>
> Which one is true? I know that 1) and 2) usually make no difference to
> most users, which is why a detailed explanation of "1) or 2)" is so hard
> to find with Google.

Neither! Both!

The implementation of pipes varies from *NIX to *NIX, and possibly within 
the same *NIX, since a shell might 'enhance' the pipes provided by the 
kernel/libc.  (The shell is what sets up the pipe between the two 
processes, so it may arbitrarily 'decorate' a standard pipe.)

In any case, you can't depend on any particular behavior if you want your 
shell script to be portable.

In Linux/bash I believe it works like this:  Each pipe has a small (~1 
page) FIFO buffer in memory.  (Not sure if this is kernel or userspace.)  
Both processes are started and compete for CPU time in the standard way.  
Either process may block when it performs standard, blocking I/O on 
the pipe: appA will block if the FIFO gets full; appB will block if the 
FIFO gets empty.
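
For anyone who would rather measure than guess, here is a minimal Python 
sketch (Python is my choice for illustration, not something from the 
original setup).  It fills an otherwise unused pipe with non-blocking 
writes and counts how many bytes go in before the OS reports that a 
blocking write would have had to wait.  The function name and chunk size 
are made up; the number it prints depends on the OS and kernel version.

```python
import os


def measure_pipe_capacity(chunk_size=1024):
    """Return how many bytes fit in an empty pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # In non-blocking mode, "the writer would be put to sleep" shows up
        # as BlockingIOError instead of an actual sleep.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the FIFO is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe capacity:", measure_pipe_capacity(), "bytes")
```

With ordinary blocking I/O, the point where this sketch gets the exception 
is the point where appA would be put to sleep until appB reads something.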

If you really must know: Use the Source, Luke.

> In my case appA gets the data from another host that has a very short
> timeout setting, and appB compresses the data obtained from appA. The
> compression is very expensive, usually running at about 30Kbps; the
> network is very fast, around 10Mbps. appB compresses the data chunk by
> chunk. If Linux actually works in mode 2), the network connection is
> dropped whenever the interval between two chunks being compressed by
> appB is longer than the network timeout setting. appA does not know how
> to resume the connection from where it was dropped, so understanding
> this difference matters to me.

This also depends a lot on the application. appA can use asynchronous I/O, 
provide a larger buffer (perhaps even a temporary file), and/or send 
keepalives through the network.  Also, appB's compression may be 
interrupted while more data is written to the buffer.
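
One way to get the "larger buffer" without touching appA is a small relay 
placed between the two programs.  The Python sketch below is only an 
illustration under my own assumptions: the function name is invented, the 
queue is unbounded (so memory grows for as long as appB lags behind), and 
src must be a buffered binary stream that offers read1().

```python
import queue
import threading


def relay(src, dst, chunk_size=65536):
    """Copy src to dst through an unbounded in-memory queue.

    A reader thread drains src as fast as data arrives, so the producer
    upstream is not held back by a slow consumer downstream.
    """
    pending = queue.Queue()  # unbounded: trades memory for never stalling src

    def reader():
        while True:
            chunk = src.read1(chunk_size)  # at most one underlying read
            pending.put(chunk)
            if not chunk:  # b"" signals end of input
                return

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    while True:
        chunk = pending.get()
        if not chunk:
            break
        dst.write(chunk)  # may block here; the reader thread keeps draining
    dst.flush()
    thread.join()
```

Saved as a hypothetical relay.py that ends with 
relay(sys.stdin.buffer, sys.stdout.buffer), it would be used as 
"appA | python3 relay.py | appB".  A temporary file instead of the queue 
would bound memory use at the cost of disk I/O.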

> I ran several experiments and my appA and appB both work fine, but I
> don't dare share appA/appB with others until I understand the
> mechanism.

With the speeds you mention, the timeout would have to be ~8s or less for 
the socket to be dropped.[1]  Once established, sockets are amazingly 
stable; the timeout for an established socket is usually more like 5-10 
minutes or even an hour.  Heck, I think OpenBSD 3.8 defaulted to a 1 DAY 
timeout before the OS reaped an established socket.

Also, you generally want to compress stuff before putting it on the wire, 
not after...

-- 
"If there's one thing we've established over the years,
it's that the vast majority of our users don't have the slightest
clue what's best for them in terms of package stability."
-- Gentoo Developer Ciaran McCreesh

[1] That's assuming a 15Kbps compression rate and the ability to send 
full-size 16KB IP packets.  Most likely, ~1s would suffice, since packets 
are generally 1500B ~= 12Kb in size.
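
The arithmetic behind this footnote can be checked with a few lines of 
Python.  This is only a sketch: the function name is made up, and I am 
assuming 1Kb = 1000 bits for the rate and 1KB = 1024 bytes for the packet 
size.  The idea is that the stall between reads is about the time the 
consumer needs to work through one packet's worth of data.

```python
def seconds_to_consume(size_bytes, rate_kbps):
    """Seconds a consumer running at rate_kbps (kilobits/s) needs for size_bytes."""
    return size_bytes * 8 / (rate_kbps * 1000)


if __name__ == "__main__":
    packets = (("16KB packet", 16 * 1024), ("1500B packet", 1500))
    for label, size in packets:
        for rate in (15, 30):  # 15Kbps as in the footnote, 30Kbps as in the question
            secs = seconds_to_consume(size, rate)
            print(f"{label} at {rate}Kbps: {secs:.1f}s")
```

A 1500B packet is 12,000 bits, so at 15Kbps it takes 0.8s, which is where 
the "~1s" comes from; the 16KB case comes out at a little over 8s.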




Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?

2006-06-14 Thread Uwe Thiem
On 14 June 2006 14:12, 张韡武 wrote:
> Hello. This might be OT, but I am quite interested in this and have
> been unable to find a real in-depth explanation of pipes on the
> Internet.
>
> How does a pipe actually work? I mean, when there is a pipe like this:
> $ appA | appB
> What happens if appA produces output while appB is still busy processing
> earlier data and is not reading any input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply
> keeps outputting data, and this data is cached in memory as long as
> there is enough memory, and is finally fed to appB when appB has
> finished its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended
> and the resource goes to appB. appA is only given the resource (CPU?)
> again when appB has finished its business and begins to accept more data;
>
> Which one is true? I know that 1) and 2) usually make no difference to
> most users, which is why a detailed explanation of "1) or 2)" is so hard
> to find with Google.

Neither nor. ;-)

It's called "blocking IO". An app that has to wait for IO (whether it's 
reading or writing doesn't matter) doesn't get CPU until the IO resource is 
ready. Of course, the scheduler in your kernel can take your app off the CPU 
for other reasons even if the resource is ready.
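
Here is a small Python sketch of that behaviour, with the caveat that it 
is an illustration only: two threads in one process stand in for appA and 
appB, the function name is made up, and the payload is simply assumed to 
be larger than the pipe buffer.  The writer thread goes to sleep inside 
os.write() once the pipe is full and only finishes after the reader starts 
draining it.

```python
import os
import threading
import time


def demo_blocking_write(payload_size=1 << 20, reader_delay=0.3):
    """Show a writer sleeping on a full pipe until the reader drains it.

    Returns (writer_was_blocked, bytes_read).
    """
    read_fd, write_fd = os.pipe()
    payload = b"x" * payload_size  # assumed larger than the pipe buffer

    def writer():
        view = memoryview(payload)
        while view:
            # A blocking write sleeps whenever the pipe is full.
            written = os.write(write_fd, view)
            view = view[written:]
        os.close(write_fd)  # gives the reader its end-of-file

    thread = threading.Thread(target=writer)
    thread.start()

    time.sleep(reader_delay)  # the "busy" consumer is not reading yet
    writer_was_blocked = thread.is_alive()

    bytes_read = 0
    while True:
        chunk = os.read(read_fd, 65536)
        if not chunk:
            break
        bytes_read += len(chunk)

    thread.join()
    os.close(read_fd)
    return writer_was_blocked, bytes_read


if __name__ == "__main__":
    blocked, count = demo_blocking_write()
    print("writer blocked while reader was busy:", blocked)
    print("bytes delivered:", count)
```

No data is lost and none is buffered beyond the pipe's capacity; the 
writer just does not get to run until there is room again.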

Uwe

-- 
Mark Twain: I rather decline two drinks than a German adjective.
http://www.SysEx.com.na
-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?

2006-06-14 Thread Devon Miller

It's mode 2. When appB stops reading, appA will continue writing until
the pipe is full (about 4k I believe) at which time appA will block in
a write.

dcm




[gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?

2006-06-14 Thread 张韡武
Hello. This might be OT, but I am quite interested in this and have
been unable to find a real in-depth explanation of pipes on the
Internet.

How does a pipe actually work? I mean, when there is a pipe like this:
$ appA | appB
What happens if appA produces output while appB is still busy processing
earlier data and is not reading any input?

possibility 1) as long as appA can get the resource (CPU?), it simply
keeps outputting data, and this data is cached in memory as long as
there is enough memory, and is finally fed to appB when appB has
finished its business and begins to accept more data;

possibility 2) as soon as appB stops requesting data, appA is suspended
and the resource goes to appB. appA is only given the resource (CPU?)
again when appB has finished its business and begins to accept more data;

Which one is true? I know that 1) and 2) usually make no difference to
most users, which is why a detailed explanation of "1) or 2)" is so hard
to find with Google.

In my case appA gets the data from another host that has a very short
timeout setting, and appB compresses the data obtained from appA. The
compression is very expensive, usually running at about 30Kbps; the
network is very fast, around 10Mbps. appB compresses the data chunk by
chunk. If Linux actually works in mode 2), the network connection is
dropped whenever the interval between two chunks being compressed by
appB is longer than the network timeout setting. appA does not know how
to resume the connection from where it was dropped, so understanding
this difference matters to me.

I ran several experiments and my appA and appB both work fine, but I
don't dare share appA/appB with others until I understand the
mechanism.

Thank you in advance.
