Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Thomas Moschny
On Monday 10 July 2006 14:45, Bruce Stephens wrote:
 How about making the chunk size settable using a new command (leaving
 the default as it is)?  Or set the default to 1M (or BUFSIZ, or
 something), and then clients that would deadlock have a way to set it
 to something smaller.

 I'm not sure whether 1M would cause a problem.  I wouldn't rule it out
 for some simpler clients (using synchronous I/O and polling various
 inputs).

After thinking a while about it, it is no longer clear to me why there is a 
need for chunked output *at all* ...

The reading side of a pipe can always read the data in arbitrarily (and 
independently of the sender) sized packets, even when using synchronous I/O, 
by simply specifying the size in the read() call. The sender must of course 
check how many bytes of its write() call actually got written.
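
A minimal sketch of that point, in Python rather than C: the reader picks its own packet size through the size argument to read(), and the writer loops until a possibly short write() has delivered everything. The helper names (write_all, read_in_packets, transfer) are made up for illustration and are not part of monotone.

```python
import os
import threading


def write_all(fd, data):
    """Write all of data to fd, retrying after short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may write fewer bytes than requested
        view = view[written:]


def read_in_packets(fd, packet_size):
    """Yield packets of at most packet_size bytes until EOF."""
    while True:
        packet = os.read(fd, packet_size)
        if not packet:  # empty read means the writer closed its end
            return
        yield packet


def transfer(payload, packet_size):
    """Send payload through a pipe and return the packets the reader saw."""
    read_fd, write_fd = os.pipe()

    def produce():
        try:
            write_all(write_fd, payload)
        finally:
            os.close(write_fd)

    producer = threading.Thread(target=produce)
    producer.start()
    packets = list(read_in_packets(read_fd, packet_size))
    producer.join()
    os.close(read_fd)
    return packets
```

Note that the reader only learns that the data is complete when the writer closes the pipe, which does not help when several command outputs share one stream.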

- Thomas

-- 
Thomas Moschny  [EMAIL PROTECTED]


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Thomas Keller

Thomas Moschny wrote
After thinking a while about it, it is no longer clear to me why there is a 
need for chunked output *at all* ...


The reading side of a pipe can always read the data in arbitrarily (and 
independently of the sender) sized packets, even when using synchronous I/O, 
by simply specifying the size in the read() call. The sender must of course 
check how many bytes of its write() call actually got written.


Well, maybe there is no need for chunked output, but there is definitely 
a need for some EOF token that tells the client it has received all the 
data. Ideally this would be paired with a checksum of the data just 
output, so the client can verify that it received everything correctly.


Thomas Keller.




Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Timothy Brownawell
On Mon, 2006-07-10 at 16:09 +0200, Thomas Keller wrote:
 Thomas Moschny wrote
  After thinking a while about it, it is no longer clear to me why there is a
  need for chunked output *at all* ...
  
  The reading side of a pipe can always read the data in arbitrarily (and 
  independently of the sender) sized packets, even when using synchronous I/O,
  by simply specifying the size in the read() call. The sender must of course 
  check how many bytes of its write() call actually got written.
 
 Well, maybe there is no need for chunked output, but there is definitely 
 the need for some EOF token which tells the client hey, I got all the 
 data. Ideally this would be paired with the checksum of the just 
 outputted data so the client can ensure that it got all data correctly.

We can't use an in-stream EOF token, because the stream should be
binary-safe. So this means prefixing each data chunk with the size of
that chunk. A chunk is output when it reaches the maximum size (because
having a known maximum size seems convenient), or when the stream is
flushed (my understanding is that this is the Right Thing to do, plus it
could be nice if we have commands that take a long time to finish).
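
A rough sketch of size-prefixed chunking as described above, in Python. The "<size>:<data>" framing and the names ChunkedWriter and read_chunks are assumptions made for illustration; this is not the actual automate stdio wire format.

```python
import io


class ChunkedWriter:
    """Buffer output and emit it as size-prefixed chunks."""

    def __init__(self, out, max_chunk=1024):
        self.out = out
        self.max_chunk = max_chunk
        self.buf = bytearray()

    def write(self, data):
        self.buf += data
        # Emit a chunk whenever the buffer reaches the maximum size.
        while len(self.buf) >= self.max_chunk:
            self._emit(self.max_chunk)

    def flush(self):
        # A flush emits whatever is buffered, even if it is short.
        if self.buf:
            self._emit(len(self.buf))
        self.out.flush()

    def _emit(self, size):
        chunk = bytes(self.buf[:size])
        del self.buf[:size]
        self.out.write(b"%d:" % len(chunk) + chunk)


def read_chunks(inp):
    """Yield chunk payloads; the payload bytes are never inspected."""
    while True:
        header = bytearray()
        while True:
            c = inp.read(1)
            if not c:
                if header:
                    raise ValueError("stream ended inside a chunk header")
                return
            if c == b":":
                break
            header += c
        size = int(header.decode("ascii"))
        data = inp.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        yield data


out = io.BytesIO()
writer = ChunkedWriter(out, max_chunk=4)
writer.write(b"ab\x00:5:zzq")  # contains NUL, colons and digits
writer.flush()
chunks = list(read_chunks(io.BytesIO(out.getvalue())))
```

Because the reader consumes exactly the announced number of bytes, the payload may contain any byte values, including ones that look like a header.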

I think we need to keep the chunked output format, but there's no reason
not to increase the maximum size. When we do, though, we should bump
the interface version, since the maximum size is a documented part of
the interface. I'm not sure how important this is, though, since I doubt
anyone is relying on it.

There are changes to inventory in the works that would require changing
the interface version anyway; perhaps we should increase the chunk size
at the same time we land those?

Tim






Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Thomas Keller

We can't use an in-stream EOF token, because the stream should be
binary-safe. So this means prefixing each data chunk with the size of
that chunk. A chunk is output when it reaches the maximum size (because
having a known maximum size seems convenient), or when the stream is
flushed (my understanding is that this is the Right Thing to do, plus it
could be nice if we have commands that take a long time to finish).


Well, the EOF token wouldn't really have to be '\0', just something a 
parser could distinguish from the normal output flow. For example, in emails 
the header is separated from the body by a double newline (\n\n). If basic_io 
became the standard for all output of the automation interface, there 
could even be some well-defined end token there, like


...

command_finished 1234...

where the 1234... part could be the checksum for the complete output 
echoed before that token.
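
A sketch of how a client might consume such an end token, assuming line-oriented output in which no ordinary line starts with command_finished, and assuming SHA-1 as the checksum. The function names are hypothetical.

```python
import hashlib
import io


def emit_with_end_token(lines):
    """Return the output lines followed by an end token carrying a checksum."""
    body = b"".join(lines)
    digest = hashlib.sha1(body).hexdigest().encode("ascii")
    return body + b"command_finished " + digest + b"\n"


def read_until_end_token(stream):
    """Collect lines up to the end token and verify the checksum."""
    hasher = hashlib.sha1()
    body = []
    for line in stream:
        if line.startswith(b"command_finished "):
            expected = line.split()[1].decode("ascii")
            if hasher.hexdigest() != expected:
                raise ValueError("checksum mismatch")
            return b"".join(body)
        hasher.update(line)
        body.append(line)
    raise ValueError("stream ended before the end token")


stream = emit_with_end_token([b"a 1\n", b"b 2\n"])
output = read_until_end_token(io.BytesIO(stream))
```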


Thomas.




Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Timothy Brownawell
On Mon, 2006-07-10 at 18:50 +0200, Thomas Keller wrote:
  We can't use an in-stream EOF token, because the stream should be
  binary-safe. So this means prefixing each data chunk with the size of
  that chunk. A chunk is output when it reaches the maximum size (because
  having a known maximum size seems convenient), or when the stream is
  flushed (my understanding is that this is the Right Thing to do, plus it
  could be nice if we have commands that take a long time to finish).
 
 Well, the EOF token wouldn't really have to be '\0', just something a 
 parser could distinguish from the normal output flow. F.e. in emails the 
 header is separated from the body by double newlines \n\n. If basic_io 
 would become standard for all output of the automation interface there 
 could even be some well-defined end token there, like

basic_io is not always appropriate, for example for automate get_file.
That command also means that the output stream can contain arbitrary
binary data, so no in-stream EOF token would be safe.
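
To illustrate the problem with a small Python sketch: if a file's contents happen to include the bytes chosen as the end token, a reader that scans for the token stops early, while a length prefix is unaffected. The token and the framing below are made up for the example.

```python
END_TOKEN = b"\ncommand_finished\n"  # hypothetical in-stream end marker


def read_with_token(stream):
    """Naive reader: everything before the first end token is the output."""
    body, found, _ = stream.partition(END_TOKEN)
    if not found:
        raise ValueError("no end token")
    return body


def read_with_length(stream):
    """Length-prefixed reader: '<size>:' followed by exactly size bytes."""
    header, _, rest = stream.partition(b":")
    size = int(header.decode("ascii"))
    return rest[:size]


# File contents that happen to contain the end token themselves.
file_contents = b"\x00\x01 binary" + END_TOKEN + b"more binary \xff"

truncated = read_with_token(file_contents + END_TOKEN)
intact = read_with_length(b"%d:" % len(file_contents) + file_contents)
```

Here truncated stops at the token embedded in the file, while intact matches the file byte for byte.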

Tim






Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Thomas Moschny
On Monday 10 July 2006 17:48 Timothy Brownawell wrote:
 There are changes to inventory in the works, that would require changing
 the interface version anyway, perhaps we should increase the chunk size
 at the same time we land that?

Yes. And I think we should change the docs (for the new interface version) to 
*not* specify a maximum chunk size, thus allowing us to change it freely 
later, for example through a command line option.

- Thomas

-- 
Thomas Moschny [EMAIL PROTECTED]




Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Nathaniel Smith
On Mon, Jul 10, 2006 at 08:17:36PM +0200, Thomas Moschny wrote:
 On Monday 10 July 2006 17:48 Timothy Brownawell wrote:
  There are changes to inventory in the works, that would require changing
  the interface version anyway, perhaps we should increase the chunk size
  at the same time we land that?
 
 Yes. And I think we should change the docs (for the new interface version) to 
 *not* specify a maximum chunk size, thus allowing us to change it freely 
 later, for example through a command line option.

Err, yes, I'm sort of surprised that's in the docs at all.

The point of having an upper limit is to put an upper bound on how
much memory monotone has to use.  1M seems a bit large for this
purpose, and I'd be astonished if you actually had to go to 1M to get
the benefit.  Could someone run timings at different block sizes
and pick one that gives most of the speed benefit without being huge?
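
One rough way to get a first impression, sketched in Python: push a fixed payload through a plain OS pipe at several block sizes and time it. This is only a stand-in that measures raw pipe throughput; a real comparison would have to time monotone's automate stdio itself. The function name and the list of sizes are arbitrary choices for the example.

```python
import os
import threading
import time


def time_transfer(payload, block_size):
    """Send payload through a pipe in block_size writes; return (bytes, seconds)."""
    read_fd, write_fd = os.pipe()
    view = memoryview(payload)

    def produce():
        try:
            for start in range(0, len(view), block_size):
                block = view[start:start + block_size]
                while block:  # retry after short writes
                    written = os.write(write_fd, block)
                    block = block[written:]
        finally:
            os.close(write_fd)

    producer = threading.Thread(target=produce)
    started = time.perf_counter()
    producer.start()
    received = 0
    while True:
        data = os.read(read_fd, block_size)
        if not data:
            break
        received += len(data)
    producer.join()
    elapsed = time.perf_counter() - started
    os.close(read_fd)
    return received, elapsed


if __name__ == "__main__":
    payload = b"x" * (4 * 1024 * 1024)
    for size in (1024, 4096, 16384, 65536, 262144, 1048576):
        received, elapsed = time_transfer(payload, size)
        print(f"{size:>8} bytes/block: {elapsed:.4f} s")
```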

-- Nathaniel

-- 
Damn the Solar System.  Bad light; planets too distant; pestered with
comets; feeble contrivance; could make a better one myself.
  -- Lord Jeffrey




Re: [Monotone-devel] Re: Automate stdio chunk size

2006-07-10 Thread Nuno Lucas

On 7/10/06, Nathaniel Smith [EMAIL PROTECTED] wrote:

On Mon, Jul 10, 2006 at 08:17:36PM +0200, Thomas Moschny wrote:
The point of having an upper-limit is to put an upper bound on how
much memory monotone has to use.  1M seems a bit large for this
purpose, and I'm astonished if you actually have to go to 1M to get
the benefit.  Could someone run timings at different block sizes
and pick one that gives most of the speed benefit without being huge?


Note that this interface will probably be run on a pipe and different
block sizes will have a different impact on different operating
systems (and even between versions).

I remember there was a paper on the drastic differences in speed
between the pipe handling on Windows 2000 and XP for different block
sizes.
Maybe someone can recollect where that paper was...


Best regards,
~Nuno Lucas


-- Nathaniel


