Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?
On Wednesday 14 June 2006 08:12, 张韡武 <[EMAIL PROTECTED]> wrote about '[gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?':

> How does a pipe actually work? I mean, when there is a pipe like this:
>
> $ appA | appB
>
> What happens if appA produces output while appB is still busy processing the data and is not requesting any more data from its input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply keeps outputting data, and this data is cached in memory, as long as there is enough memory, and is finally fed to appB when appB has finished its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended and the resource goes to appB. appA is only given the resource (CPU?) when appB has finished its business and begins to accept more data;
>
> Which one is true? I know that usually 1) and 2) make no difference to most users, which is why a detailed explanation of "1) or 2)" is so hard to google.

Neither! Both!

The implementation of pipes varies from *NIX to *NIX, and possibly within the same *NIX, since a shell might 'enhance' the pipes provided by the kernel/libc. (The shell is what sets up the pipeline, so it could in principle 'decorate' a standard pipe.) In any case, you can't depend on any particular behavior if you want your shell script to be portable.

In Linux/bash I believe it works like this: Each pipe has a small (~1 page) FIFO buffer in memory. (Not sure if this is kernel or userspace.) Both processes are started and compete for CPU time in the standard way. Either process may block when it performs standard, blocking I/O on the pipe. appA will block if the FIFO gets full; appB will block if the FIFO gets empty.

If you really must know: Use the Source, Luke.

> In my case appA gets the data from another host which has very short timeout settings; appB is used to compress the data obtained from appA. The compression is very demanding, usually running at 30Kbps; the network is very fast, around 10Mbps. appB compresses the data chunk by chunk. If Linux actually works in mode 2), the network connection is dropped when the interval between two chunks of data being compressed by appB is longer than the network timeout setting. appA actually doesn't know how to restart the connection from where it was dropped, so understanding this difference matters to me.

This also depends a lot on the application. appA can use asynchronous I/O, provide a larger buffer (perhaps even a temporary file), and/or send keepalives through the network. Also, appB's compression may be interrupted while more data is written to the buffer.

> I made several experiments and my appA and appB both work fine, but I don't dare to share appA/appB with others until I understand the mechanism.

With the speeds you mention, the timeout would have to be ~8s or less for the socket to be dropped.[1] Once a socket is established, it is amazingly stable; the timeout for an established socket is usually more like 5-10 minutes or even an hour. Heck, I think OBSD 3.8 defaulted to a 1 DAY timeout before the OS reaped an established socket.

Also, you generally want to compress stuff before putting it on the wire, not after...

-- 
"If there's one thing we've established over the years, it's that the vast majority of our users don't have the slightest clue what's best for them in terms of package stability." -- Gentoo Developer Ciaran McCreesh

[1] That's assuming a 15Kbps compression rate and the ability to send full-size 16KB IP packets (16KB = 128Kb, and 128Kb / 15Kbps is about 8.5s). Most likely, ~1s would suffice, since packets are generally 1500B ~= 12Kb in size.
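To make the "provide a larger buffer" suggestion above concrete, here is a minimal Python sketch of a relay that could sit between the two programs. It is only an illustration under stated assumptions: the names `relay` and `drain_into_queue` are made up for this example, the queue is deliberately unbounded (so memory use grows with the backlog), and nothing here comes from appA or appB themselves.

```python
import queue
import threading


def drain_into_queue(src, buf, chunk_size=4096):
    """Read from src as soon as data is available and park it in buf."""
    while True:
        chunk = src.read1(chunk_size)  # returns b"" at end of input
        if not chunk:
            buf.put(None)              # sentinel: no more data
            return
        buf.put(chunk)


def relay(src, dst, chunk_size=4096):
    """Copy src to dst through an unbounded in-memory queue.

    The reader thread keeps draining src even while dst is slow,
    so the upstream writer is not stalled by a full pipe.
    Assumption: the backlog fits in memory.
    """
    buf = queue.Queue()                # unbounded by default
    reader = threading.Thread(
        target=drain_into_queue, args=(src, buf, chunk_size), daemon=True
    )
    reader.start()
    while True:
        chunk = buf.get()
        if chunk is None:
            break
        dst.write(chunk)               # may block if the consumer is slow
    dst.flush()
    reader.join()
```

Saved in a (hypothetical) script that ends with `relay(sys.stdin.buffer, sys.stdout.buffer)`, this could be placed in the middle of the pipeline as `appA | python3 relay.py | appB`. The slow side then only delays the relay's writer, while its reader thread keeps emptying appA's pipe.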
Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?
On 14 June 2006 14:12, 张韡武 wrote:

> Hello. This might be OT, but I am pretty interested in this and have been unlucky, not able to find a real in-depth explanation of pipes on the Internet.
>
> How does a pipe actually work? I mean, when there is a pipe like this:
>
> $ appA | appB
>
> What happens if appA produces output while appB is still busy processing the data and is not requesting any more data from its input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply keeps outputting data, and this data is cached in memory, as long as there is enough memory, and is finally fed to appB when appB has finished its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended and the resource goes to appB. appA is only given the resource (CPU?) when appB has finished its business and begins to accept more data;
>
> Which one is true? I know that usually 1) and 2) make no difference to most users, which is why a detailed explanation of "1) or 2)" is so hard to google.

Neither nor. ;-)

It's called "blocking I/O". An app that has to wait for I/O (whether it's reading or writing doesn't matter) doesn't get CPU until the I/O resource is ready. Of course, the scheduler in your kernel can take your app off the CPU for other reasons even if the resource is ready.

Uwe

-- 
Mark Twain: I rather decline two drinks than a German adjective.
http://www.SysEx.com.na

-- 
gentoo-user@gentoo.org mailing list
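The blocking behaviour described above can be watched directly with a small Python sketch. The function name `timed_blocking_read` and the half-second delay are assumptions chosen for the illustration; the only system calls involved are the standard `os.pipe`, `os.read` and `os.write`.

```python
import os
import threading
import time


def timed_blocking_read(delay=0.5):
    """Read from an empty pipe and report how long the read had to wait."""
    r, w = os.pipe()

    def late_writer():
        time.sleep(delay)          # the pipe stays empty during this time
        os.write(w, b"ready")
        os.close(w)

    writer = threading.Thread(target=late_writer)
    writer.start()

    start = time.monotonic()
    data = os.read(r, 1024)        # blocks: no CPU is used while waiting
    waited = time.monotonic() - start

    writer.join()
    os.close(r)
    return data, waited


if __name__ == "__main__":
    data, waited = timed_blocking_read()
    print(f"got {data!r} after waiting {waited:.2f}s")
```

The reader is simply put to sleep by the kernel until the writer delivers something, which is the read-side half of the picture. The write side behaves the same way once the pipe is full.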
Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?
It's mode 2. When appB stops reading, appA will continue writing until the pipe is full (about 4k, I believe), at which point appA will block in a write.

dcm

On 6/14/06, 张韡武 <[EMAIL PROTECTED]> wrote:

> Hello. This might be OT, but I am pretty interested in this and have been unlucky, not able to find a real in-depth explanation of pipes on the Internet.
>
> How does a pipe actually work? I mean, when there is a pipe like this:
>
> $ appA | appB
>
> What happens if appA produces output while appB is still busy processing the data and is not requesting any more data from its input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply keeps outputting data, and this data is cached in memory, as long as there is enough memory, and is finally fed to appB when appB has finished its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended and the resource goes to appB. appA is only given the resource (CPU?) when appB has finished its business and begins to accept more data;
>
> Which one is true? I know that usually 1) and 2) make no difference to most users, which is why a detailed explanation of "1) or 2)" is so hard to google.
>
> In my case appA gets the data from another host which has very short timeout settings; appB is used to compress the data obtained from appA. The compression is very demanding, usually running at 30Kbps; the network is very fast, around 10Mbps. appB compresses the data chunk by chunk. If Linux actually works in mode 2), the network connection is dropped when the interval between two chunks of data being compressed by appB is longer than the network timeout setting. appA actually doesn't know how to restart the connection from where it was dropped, so understanding this difference matters to me.
>
> I made several experiments and my appA and appB both work fine, but I don't dare to share appA/appB with others until I understand the mechanism.
>
> Thank you in advance.
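Rather than trusting anyone's memory of the buffer size, it can be measured. The following Python sketch fills a pipe with non-blocking writes until the kernel refuses more; the function name `pipe_capacity` and the 1024-byte chunk size are assumptions for the example. For reference, the Linux pipe(7) man page gives the capacity as one page before kernel 2.6.11 and 65,536 bytes since then; other systems differ, so treat the number you get as specific to your machine.

```python
import os


def pipe_capacity(chunk_size=1024):
    """Return how many bytes fit into a fresh pipe before a write would block."""
    r, w = os.pipe()
    os.set_blocking(w, False)      # fail with BlockingIOError instead of sleeping
    chunk = b"\0" * chunk_size
    total = 0
    try:
        while True:
            total += os.write(w, chunk)
    except BlockingIOError:
        pass                       # pipe is full: a blocking writer would stall here
    finally:
        os.close(r)
        os.close(w)
    return total


if __name__ == "__main__":
    print(f"pipe capacity on this system: {pipe_capacity()} bytes")
```

Whatever the figure turns out to be, it is the amount appA can run ahead of appB before its next blocking write is put to sleep.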
[gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?
Hello. This might be OT, but I am pretty interested in this and have been unlucky, not able to find a real in-depth explanation of pipes on the Internet.

How does a pipe actually work? I mean, when there is a pipe like this:

$ appA | appB

What happens if appA produces output while appB is still busy processing the data and is not requesting any more data from its input?

possibility 1) as long as appA can get the resource (CPU?), it simply keeps outputting data, and this data is cached in memory, as long as there is enough memory, and is finally fed to appB when appB has finished its business and begins to accept more data;

possibility 2) as soon as appB stops requesting data, appA is suspended and the resource goes to appB. appA is only given the resource (CPU?) when appB has finished its business and begins to accept more data;

Which one is true? I know that usually 1) and 2) make no difference to most users, which is why a detailed explanation of "1) or 2)" is so hard to google.

In my case appA gets the data from another host which has very short timeout settings; appB is used to compress the data obtained from appA. The compression is very demanding, usually running at 30Kbps; the network is very fast, around 10Mbps. appB compresses the data chunk by chunk. If Linux actually works in mode 2), the network connection is dropped when the interval between two chunks of data being compressed by appB is longer than the network timeout setting. appA actually doesn't know how to restart the connection from where it was dropped, so understanding this difference matters to me.

I made several experiments and my appA and appB both work fine, but I don't dare to share appA/appB with others until I understand the mechanism.

Thank you in advance.