On 8/30/11 12:22 PM, jdrewsen wrote:
Walter suggested that I should write an article about using the wrapper.
I've now taken the first steps on writing such an article. I will have
to get the library API rock stable before I can finish it though.

I have a suggestion for you - write and test an asynchronous copy program.

It is a continuous source of surprise to me that even seasoned programmers don't realize that this is an inefficient copy routine:

while (read(source, buffer))
  write(target, buffer);

If the methods are synchronous and the speeds of source and target are independent, the net transfer rate of the routine is R1*R2/(R1+R2), where R1 and R2 are the transfer rates of the source and destination respectively. In the worst case R1=R2 and the net transfer rate is half of either.
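
For reference, a sketch of where that rate comes from, assuming the loop strictly alternates between reading and writing and that per-call overhead is negligible. N (the number of bytes copied) and T (the total time) are symbols introduced here only for illustration:

```latex
\[
T = \frac{N}{R_1} + \frac{N}{R_2}
\qquad\Longrightarrow\qquad
R_{\mathrm{net}} = \frac{N}{T} = \frac{R_1 R_2}{R_1 + R_2}
\]
```

Setting R1 = R2 = R gives R/2, and for any positive rates the result is strictly below min(R1, R2), since each side sits idle while the other works.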

This equation is very easy to derive from first principles, yet many people are incredulous about it. Consequently, many classic file copying programs (including cp; I don't know about wget or curl) use the inefficient method. As the variety of data sources increases (SSD, magnetic, networked, etc.) I predict async I/O will become increasingly prevalent. In an async approach with a queue, reads and writes overlap, so transfer proceeds at the optimal speed min(R1, R2). That's why I'm insisting the async range should be super easy to use, encapsulated, and robust: if people reach for the async range by default for their dealings with networked data, they'll write optimal code, sometimes even without knowing it.
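
To make the queue idea concrete, here is a minimal sketch in Python rather than in the library under discussion. It is not the wrapper's API: the name async_copy and its parameters chunk_size and max_chunks are hypothetical, and it assumes the source and target are ordinary file-like objects with read and write methods. A reader thread fills a bounded queue while the calling thread drains it, so the slower side sets the pace.

```python
import io
import queue
import threading


def async_copy(source, target, chunk_size=64 * 1024, max_chunks=16):
    """Copy source to target, overlapping reads and writes.

    Hypothetical helper for illustration: `source` needs read(n),
    `target` needs write(bytes). Returns the number of bytes copied.
    """
    chunks = queue.Queue(maxsize=max_chunks)  # bounded, so memory use stays capped
    errors = []

    def reader():
        try:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:  # empty read means end of input
                    break
                chunks.put(chunk)  # blocks while the queue is full
        except Exception as exc:  # hand read errors to the writer side
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: no more data

    # Daemon thread: if target.write raises, a blocked reader
    # will not keep the process alive.
    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    total = 0
    while True:
        chunk = chunks.get()  # blocks while the queue is empty
        if chunk is None:
            break
        target.write(chunk)
        total += len(chunk)

    worker.join()
    if errors:
        raise errors[0]
    return total
```

Any pair of file-like objects should work here, for example the response returned by urllib.request.urlopen as the source and a file opened with open(path, "wb") as the target. The bounded queue is a deliberate choice: an unbounded one would let a fast source buffer the whole payload in memory ahead of a slow target.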

If your article discusses this and shows e.g. how to copy data optimally from one server to another using HTTP, or from one server to a file etc, and if furthermore you show how your API makes all that a trivial five-liner, that would be a very instructive piece.


Andrei
