Re: [Monotone-devel] Re: Automate stdio chunk size
On Monday 10 July 2006 14:45, Bruce Stephens wrote:
> How about making the chunk size settable using a new command (leaving the default as it is)? Or set the default to 1M (or BUFSIZ, or something), and then clients that would deadlock have a way to set it to something smaller.
>
> I'm not sure whether 1M would cause a problem. I wouldn't rule it out for some simpler clients (using synchronous I/O and polling various inputs).

After thinking about it for a while, it is no longer clear to me why there is a need for chunked output *at all* ...

The reading side of a pipe can always read the data in arbitrarily sized packets (independently of the sender), even when using synchronous I/O, by simply specifying the size in the read() call. The sender must of course check how many bytes of its write() call actually got written.

- Thomas

-- 
Thomas Moschny [EMAIL PROTECTED]

___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel
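[Editor's sketch] The point about read() and write() sizes can be illustrated with a small Python example. It is only a sketch under stated assumptions: the helper names `write_all` and `read_in_pieces` are made up for illustration and are not part of monotone, and the example uses an in-process pipe rather than a real `automate stdio` session.

```python
import os

def write_all(fd, data):
    """Write all of data to fd, retrying whenever write() is partial."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write returns the number of bytes actually written,
        # which may be fewer than requested.
        total += os.write(fd, view[total:])
    return total

def read_in_pieces(fd, piece_size):
    """Read until EOF, asking for at most piece_size bytes per read()."""
    pieces = []
    while True:
        piece = os.read(fd, piece_size)
        if not piece:  # empty result means the writer closed its end
            break
        pieces.append(piece)
    return pieces

def demo(payload=b"x" * 1000, piece_size=64):
    # The payload is kept small so it fits in the pipe buffer and the
    # single-threaded demo cannot block.
    r, w = os.pipe()
    write_all(w, payload)
    os.close(w)
    pieces = read_in_pieces(r, piece_size)
    os.close(r)
    return pieces
```

The reader picks its own piece size without the writer knowing about it. Note that the reader here only learns that the output is complete because the writer closes the pipe, which is not an option when several commands share one stream.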
Re: [Monotone-devel] Re: Automate stdio chunk size
Thomas Moschny wrote:
> After thinking about it for a while, it is no longer clear to me why there is a need for chunked output *at all* ...
>
> The reading side of a pipe can always read the data in arbitrarily sized packets (independently of the sender), even when using synchronous I/O, by simply specifying the size in the read() call. The sender must of course check how many bytes of its write() call actually got written.

Well, maybe there is no need for chunked output, but there is definitely a need for some EOF token which tells the client "hey, I got all the data". Ideally this would be paired with a checksum of the data just output, so the client can verify that it received all of the data correctly.

Thomas Keller.
Re: [Monotone-devel] Re: Automate stdio chunk size
On Mon, 2006-07-10 at 16:09 +0200, Thomas Keller wrote:
> Thomas Moschny wrote:
>> After thinking about it for a while, it is no longer clear to me why there is a need for chunked output *at all* ...
>>
>> The reading side of a pipe can always read the data in arbitrarily sized packets (independently of the sender), even when using synchronous I/O, by simply specifying the size in the read() call. The sender must of course check how many bytes of its write() call actually got written.
>
> Well, maybe there is no need for chunked output, but there is definitely a need for some EOF token which tells the client "hey, I got all the data". Ideally this would be paired with a checksum of the data just output, so the client can verify that it received all of the data correctly.

We can't use an in-stream EOF token, because the stream should be binary-safe. So this means prefixing each data chunk with the size of that chunk. A chunk is output when it reaches the maximum size (because having a known maximum size seems convenient), or when the stream is flushed (my understanding is that this is the Right Thing to do, plus it could be nice if we have commands that take a long time to finish).

I think we need to keep the chunked output format, but there's no reason not to increase the maximum size. It's just that when we do, we should bump the interface version, since the maximum size is a documented part of the interface. But I'm not sure how important this is, since I doubt anyone is relying on that.

There are changes to inventory in the works that would require changing the interface version anyway. Perhaps we should increase the chunk size at the same time we land that?

Tim
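[Editor's sketch] Size-prefixed chunking with a known maximum size might look roughly like the following. This is a simplified, hypothetical framing (`<size>:<flag>:<payload>`, with flag `m` for "more" and `l` for "last"), chosen only to illustrate the idea; it is not claimed to be monotone's actual automate stdio packet layout, and the function names are invented for this example.

```python
import io

def encode_chunks(data, max_size):
    """Yield framed chunks of the form b'<size>:<flag>:<payload>'.

    Hypothetical format: flag is b'm' when more chunks follow and
    b'l' on the last chunk. Empty output is sent as one empty chunk.
    """
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    offsets = range(0, len(data), max_size) or [0]
    for start in offsets:
        payload = data[start:start + max_size]
        flag = b"l" if start + max_size >= len(data) else b"m"
        yield b"%d:%s:" % (len(payload), flag) + payload

def _read_field(stream):
    """Read one colon-terminated header field from a binary stream."""
    field = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a chunk header")
        if byte == b":":
            return bytes(field)
        field += byte

def decode_chunks(stream):
    """Reassemble the payload from a binary stream of framed chunks."""
    out = bytearray()
    while True:
        size = int(_read_field(stream))
        flag = _read_field(stream)
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a chunk payload")
        # The payload is never scanned for markers, so any byte is allowed.
        out += payload
        if flag == b"l":
            return bytes(out)
```

For example, `list(encode_chunks(b"abcde", 2))` gives `[b"2:m:ab", b"2:m:cd", b"1:l:e"]`. Because the reader consumes exactly `size` payload bytes, the payload can contain colons, NUL bytes, or anything else, and the reader never needs a buffer larger than the maximum chunk size.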
Re: [Monotone-devel] Re: Automate stdio chunk size
> We can't use an in-stream EOF token, because the stream should be binary-safe. So this means prefixing each data chunk with the size of that chunk. A chunk is output when it reaches the maximum size (because having a known maximum size seems convenient), or when the stream is flushed (my understanding is that this is the Right Thing to do, plus it could be nice if we have commands that take a long time to finish).

Well, the EOF token wouldn't really have to be '\0', just something a parser could distinguish from the normal output flow. For example, in emails the header is separated from the body by a double newline, \n\n. If basic_io became the standard for all output of the automation interface, there could even be some well-defined end token there, like

    ...
    command_finished 1234...

where the 1234... part could be the checksum of the complete output echoed before that token.

Thomas.
Re: [Monotone-devel] Re: Automate stdio chunk size
On Mon, 2006-07-10 at 18:50 +0200, Thomas Keller wrote:
>> We can't use an in-stream EOF token, because the stream should be binary-safe. So this means prefixing each data chunk with the size of that chunk. A chunk is output when it reaches the maximum size (because having a known maximum size seems convenient), or when the stream is flushed (my understanding is that this is the Right Thing to do, plus it could be nice if we have commands that take a long time to finish).
>
> Well, the EOF token wouldn't really have to be '\0', just something a parser could distinguish from the normal output flow. For example, in emails the header is separated from the body by a double newline, \n\n. If basic_io became the standard for all output of the automation interface, there could even be some well-defined end token there, like

basic_io is not always appropriate, for example for automate get_file. This command also means that the output stream can contain arbitrary binary data, so no in-stream EOF token would be safe.

Tim
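[Editor's sketch] A short Python example of why an in-stream end token is unsafe once the output may carry arbitrary file contents. The token value and the function `read_until_token` are hypothetical and exist only for this illustration.

```python
END_TOKEN = b"\ncommand_finished"  # hypothetical marker, for illustration only

def read_until_token(stream):
    """Naive reader: treat everything before the first END_TOKEN as output."""
    end = stream.find(END_TOKEN)
    if end == -1:
        raise ValueError("no end token found")
    return stream[:end]

def demo():
    # A file whose contents happen to include the marker itself,
    # followed by some non-text bytes.
    file_contents = b"line one" + END_TOKEN + b" appears inside the file\x00\xff"
    stream = file_contents + END_TOKEN  # what the sender would emit
    recovered = read_until_token(stream)
    return file_contents, recovered
```

Here `demo()` returns a `recovered` value of just `b"line one"`: the reader stops at the copy of the marker inside the file and silently truncates the output. Escaping the marker would avoid this, at the cost of the sender and every client having to transform the data.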
Re: [Monotone-devel] Re: Automate stdio chunk size
On Monday 10 July 2006 17:48 Timothy Brownawell wrote:
> There are changes to inventory in the works that would require changing the interface version anyway. Perhaps we should increase the chunk size at the same time we land that?

Yes. And I think we should change the docs (for the new interface version) to *not* specify a maximum chunk size, thus allowing us to change it freely later, for example through a command line option.

- Thomas

-- 
Thomas Moschny [EMAIL PROTECTED]
Re: [Monotone-devel] Re: Automate stdio chunk size
On Mon, Jul 10, 2006 at 08:17:36PM +0200, Thomas Moschny wrote:
> On Monday 10 July 2006 17:48 Timothy Brownawell wrote:
>> There are changes to inventory in the works that would require changing the interface version anyway. Perhaps we should increase the chunk size at the same time we land that?
>
> Yes. And I think we should change the docs (for the new interface version) to *not* specify a maximum chunk size, thus allowing us to change it freely later, for example through a command line option.

Err, yes, I'm sort of surprised that's in the docs at all.

The point of having an upper limit is to put an upper bound on how much memory monotone has to use. 1M seems a bit large for this purpose, and I'd be astonished if you actually had to go to 1M to get the benefit. Could someone run timings at different block sizes and pick one that gives most of the speed benefit without being huge?

-- Nathaniel

-- 
Damn the Solar System. Bad light; planets too distant; pestered with comets; feeble contrivance; could make a better one myself. -- Lord Jeffrey
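[Editor's sketch] One possible shape for such a timing run, written in Python. This is an assumption-laden sketch: it measures how long it takes to push a fixed amount of data through an in-process pipe at a given block size, not monotone's real automate stdio path, and `time_pipe_transfer` is a made-up helper. Results will depend on the operating system and should be treated as rough indications only.

```python
import os
import threading
import time

def time_pipe_transfer(total_bytes, block_size):
    """Send total_bytes through a pipe in block_size writes.

    Returns (elapsed_seconds, bytes_received).
    """
    r, w = os.pipe()
    received = []

    def drain():
        # Reader thread: consume until the writer closes its end.
        count = 0
        while True:
            piece = os.read(r, 65536)
            if not piece:
                break
            count += len(piece)
        received.append(count)

    reader = threading.Thread(target=drain)
    reader.start()

    block = b"\0" * block_size
    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        n = min(block_size, total_bytes - sent)
        view = memoryview(block)[:n]
        while len(view):  # handle partial writes
            written = os.write(w, view)
            view = view[written:]
        sent += n
    os.close(w)
    reader.join()
    elapsed = time.perf_counter() - start
    os.close(r)
    return elapsed, received[0]

if __name__ == "__main__":
    for size in (1024, 4096, 16384, 65536, 262144, 1048576):
        elapsed, got = time_pipe_transfer(64 * 1024 * 1024, size)
        print(f"block {size:>8} bytes: {elapsed:.3f} s for {got} bytes")
```

Running the script prints one line per block size, which is enough to see where the curve flattens on a given machine.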
Re: [Monotone-devel] Re: Automate stdio chunk size
On 7/10/06, Nathaniel Smith [EMAIL PROTECTED] wrote:
> On Mon, Jul 10, 2006 at 08:17:36PM +0200, Thomas Moschny wrote:
>
> The point of having an upper limit is to put an upper bound on how much memory monotone has to use. 1M seems a bit large for this purpose, and I'd be astonished if you actually had to go to 1M to get the benefit. Could someone run timings at different block sizes and pick one that gives most of the speed benefit without being huge?

Note that this interface will probably be run on a pipe, and different block sizes will have a different impact on different operating systems (and even between versions).

I remember there was a paper on the drastic differences in speed between the pipe handling on Windows 2000 and XP for different block sizes. Maybe someone can recollect where that paper was...

Best regards,
~Nuno Lucas

> -- Nathaniel