On 6 November 2013 14:29, John Arbash Meinel <j...@arbash-meinel.com> wrote:
>>> I would be perfectly happy with PUT if we were already a RESTful
>>> API, but it seems a bit strange to just tack that on, and it will be
>>> one more special case that we run into when trying to debug,
>>> etc. (logs will likely be different, anyone working in the code will
>>> have to think about multiple paths, etc.)
>>
>> The reason is that if you've got a large charm (and we'll probably
>> end up uploading tools through this mechanism at some point) PUT
>> streams the bytes nicely, but we *really* don't want a single RPC
>> containing the entire charm as an arbitrarily large blob, so we'd
>> have to add quite a bit more mechanism to "stream" the data with
>> RPC, and even then you have to work out how big your data packets
>> are, and you incur round trip latency for as many packets as you
>> send - this would make charm upload quite a bit slower.
>>
>> I suspect that the amount of work outlined above is actually quite
>> a bit less than would need to be done to implement charm streaming
>> uploads over the RPC interface.
>>
>
> The chunked implementation in Go just uses io.Copy, which reads and
> writes everything in 32kB chunks. We could just as easily do the same
> thing, or make them 1MB chunks, or whatever. We can just as easily
> pipeline the RPC requests, which is what Transfer-Encoding: chunked
> is doing.

I'm not sure I understand this.

How about I explain what would be necessary to stream charms over
the RPC interface?

The sequence of RPC operations might look a little like this:

-> UploadCharm {RequestId: 1, URL: "cs:precise/wordpress:28", SHA256: "abcb4464b3b3d3f3de"}
<- {RequestId: 1, StreamId: 1234}
-> WriteData {RequestId: 2, StreamId: 1234, Data: base64encodeddata}
<- {RequestId: 2}
-> WriteData {RequestId: 3, StreamId: 1234, Data: base64encodeddata}
<- {RequestId: 3}
... repeat for as many data blocks as are in the charm.
-> CloseStream {RequestId: 99, StreamId: 1234}
<- {RequestId: 99}

To do this, we'd need a new "stream" entity in the API,
and we'd need to implement the above operations on it.
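To make that concrete, here's a rough sketch of what the client side
of the non-pipelined version might look like in Go. Everything here
is hypothetical - the Caller interface stands in for whatever RPC
client we'd use, and the type and method names just mirror the trace
above:

// Hypothetical client-side loop for the stream-based protocol.
// None of these calls exist in the current API.
package charmclient

import (
	"encoding/base64"
	"io"
	"os"
)

// Caller stands in for whatever RPC client we'd use.
type Caller interface {
	Call(method string, args, result interface{}) error
}

type UploadCharmArgs struct {
	URL, SHA256 string
}

type UploadCharmResult struct {
	StreamId int
}

type WriteDataArgs struct {
	StreamId int
	Data     string // base64-encoded chunk
}

type CloseStreamArgs struct {
	StreamId int
}

// uploadCharm sends the charm archive one chunk per RPC round trip.
func uploadCharm(c Caller, path, curl, sha256 string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	var res UploadCharmResult
	if err := c.Call("UploadCharm", UploadCharmArgs{curl, sha256}, &res); err != nil {
		return err
	}

	buf := make([]byte, 32*1024) // the "packet size" parameter
	for {
		n, err := f.Read(buf)
		if n > 0 {
			args := WriteDataArgs{res.StreamId, base64.StdEncoding.EncodeToString(buf[:n])}
			// One full round trip per chunk before the next one goes out.
			if err := c.Call("WriteData", args, nil); err != nil {
				return err
			}
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
	}
	return c.Call("CloseStream", CloseStreamArgs{res.StreamId}, nil)
}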

If we wanted to pipeline, we'd have to get more sophisticated.
We could include the offset of each data block in the RPC requests.

A pipelined streaming operation might look like this:

-> UploadCharm {RequestId: 1, URL: "cs:precise/wordpress:28", SHA256: "abcb4464b3b3d3f3de"}
<- {RequestId: 1, StreamId: 1234}
-> WriteData {RequestId: 2, StreamId: 1234, Offset: 0, Data: base64encodeddata}
-> WriteData {RequestId: 3, StreamId: 1234, Offset: 65536, Data: base64encodeddata}
<- {RequestId: 2}
-> WriteData {RequestId: 4, StreamId: 1234, Offset: 131072, Data: base64encodeddata}
<- {RequestId: 3}
... repeat for as many data blocks as are in the charm.
-> CloseStream {RequestId: 99, StreamId: 1234}
<- {RequestId: 99}

This is eminently doable (and I've implemented this kind of thing in the past),
but it is considerably more complex than just using TCP streaming
as nature intended.

And it's still not great - there are parameters that need tuning,
and the best values depend on the actual connection in use.
For example: how big do you make each chunk? (That's the "packet
size" I mentioned above.) How many outstanding requests do you
allow in flight at once?
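For illustration, here's a rough sketch of the pipelined client,
continuing the hypothetical types from the previous sketch.
chunkSize and window are exactly the knobs that would need tuning:

// Sketch of the pipelined variant. Reuses Caller and
// CloseStreamArgs from the previous sketch (same package).
type WriteDataOffsetArgs struct {
	StreamId int
	Offset   int64
	Data     string
}

func uploadPipelined(c Caller, r io.Reader, streamId, chunkSize, window int) error {
	sem := make(chan struct{}, window) // bounds requests in flight
	errc := make(chan error, 1)

	var offset int64
	buf := make([]byte, chunkSize)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			args := WriteDataOffsetArgs{
				StreamId: streamId,
				Offset:   offset,
				// Encode before the next Read reuses buf.
				Data: base64.StdEncoding.EncodeToString(buf[:n]),
			}
			offset += int64(n)
			sem <- struct{}{} // blocks when the window is full
			go func() {
				defer func() { <-sem }()
				if err := c.Call("WriteData", args, nil); err != nil {
					select {
					case errc <- err:
					default:
					}
				}
			}()
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
	}
	// Fill the window to wait for all outstanding writes.
	for i := 0; i < window; i++ {
		sem <- struct{}{}
	}
	select {
	case err := <-errc:
		return err
	default:
	}
	return c.Call("CloseStream", CloseStreamArgs{streamId}, nil)
}

And even that glosses over retries, server-side reassembly of
out-of-order writes, and back-pressure.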

TCP already does sliding windows - it's a much better fit for streaming data.
That's what it was designed for. The chunk size that io.Copy uses isn't
that important, as each packet doesn't entail a round-trip.
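For comparison, here's roughly the entire client side of the PUT
approach (the /charms path and Content-Type are assumptions for
illustration, not an existing endpoint):

// Streaming the same charm over HTTP PUT. With an unknown content
// length, net/http sends the body with Transfer-Encoding: chunked,
// and TCP's sliding window does the flow control for us.
package charmclient

import (
	"fmt"
	"net/http"
	"os"
)

func putCharm(client *http.Client, serverURL, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	req, err := http.NewRequest("PUT", serverURL+"/charms", f)
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/zip")
	// ContentLength is left unset, so the body goes out chunked.

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("charm upload failed: %v", resp.Status)
	}
	return nil
}

That's the whole client side - no stream ids, no windowing logic,
no chunk-size decisions.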
