Hi Chris, (inline)
On 17/08/11 21:12, David Lutterkort wrote:
On Wed, 2011-08-17 at 09:57 -0400, Chris Lalancette wrote:
Hey Marios,
I know this is several months out of date, but I was just doing some
testing on the blob creation stuff and noticing that my libdeltacloud tests
were failing. I traced it down to the fact that the blob_id parameter changed
from param[:blob_id] to param[:blob] when you added the streaming stuff to
blobs.
Yes, that's right. Initially we had one operation for creating blobs:
POST /api/buckets/:bucket
and this accepted (amongst others) the 'blob_id' parameter to define the
name of the blob
Then, in order to implement streaming PUT through deltacloud I added:
PUT /api/buckets/:bucket/:blob
The name change for the parameter was, I can only guess, some attempt to
maintain consistency (i.e. 'blob' over 'blob_id'), though in hindsight
it was not really necessary. Your suggested patch:
post "#{Sinatra::UrlForHelper::DEFAULT_URI_PREFIX}/buckets/:bucket" do
bucket_id = params[:bucket]
- blob_id = params['blob']
+ blob_id = params['blob'] || params['blob_id']
seems fine to me in that it won't break anything. If it maintains
compatibility with your stuff then I personally have no objection to
making this addition. More on PUT vs POST below.
I think it's another case where the code does something special for the
HTML UI - the official API for creating a new blob is
PUT /api/buckets/:bucket/:blob; looking at this now, it seems strange
that we have two different ways to create blobs, and I am wondering if
we shouldn't drop the PUT, and only use POST for everything.
Yes, we have two methods for creating blobs: POST
(http://incubator.apache.org/deltacloud/api#h4_3_8) and PUT
(http://incubator.apache.org/deltacloud/api#h4_3_7).
The POST method is non-streaming:
client ---TEMP_FILE---> deltacloud ---STREAM---> provider
i.e., the client sends the blob to deltacloud, which receives the entire
request and creates a temp_file for the blob data, and then streams this
to the provider.
The PUT operation is streaming:
client ---STREAM---> deltacloud ---STREAM---> provider
i.e., the client sends the blob to deltacloud, which does not wait to
receive the entire request and instead starts streaming the blob data to
the provider as this is received.
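
To make the difference concrete, here is a rough sketch of the two relay
styles. It is not the actual deltacloud code: relay_buffered and
relay_streaming are made-up names, and StringIO objects stand in for the
client request body and the provider connection.

```ruby
require "stringio"
require "tempfile"

# POST-style (hypothetical helper): spool the whole body to a temp file
# first, then forward the temp file to the provider.
def relay_buffered(client_io, provider_io)
  Tempfile.create("blob") do |tmp|
    tmp.binmode
    IO.copy_stream(client_io, tmp)   # wait for the entire request
    tmp.rewind
    IO.copy_stream(tmp, provider_io) # only now start sending
  end
end

# PUT-style (hypothetical helper): forward each chunk as soon as it is
# read, so nothing larger than chunk_size is held at once.
def relay_streaming(client_io, provider_io, chunk_size = 8192)
  while (chunk = client_io.read(chunk_size))
    provider_io.write(chunk)
  end
end

# Stand-ins for a real client upload and a real provider connection.
data = "blob bytes " * 1000
buffered_out = StringIO.new
streamed_out = StringIO.new

relay_buffered(StringIO.new(data), buffered_out)
relay_streaming(StringIO.new(data), streamed_out)
```

Both end up delivering the same bytes; the difference is only in when
deltacloud starts sending and how much it has to hold on to meanwhile.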
Now, in order to create a blob on a given cloud provider service, you
invariably must specify the content_length of the blob. For a PUT
operation, the content_length is exactly as defined by the sending
client in the PUT to deltacloud. Thus, we can take that content_length
and start sending the data to the provider as we are receiving it.
However, for a POST operation, the content_length of the blob is not
the Content-Length that the client sends in its POST to deltacloud, because
the request body also includes the multipart/form-data boundary, which
varies depending on the sending client. It became very messy/difficult to
try and parse the boundary and 'guess' the content length of the blob in
order to start streaming, which is why we decided to go with PUT. In fact,
the cloud providers themselves (EC2, Rackspace, Azure) use PUT operations
to create blobs (with POST supported as an alternative).
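
A small illustration of the content length problem, under some
assumptions: multipart_body is a hand-rolled helper written just for this
example, the 'blob_data' field name and both boundary strings are
invented, and real clients add their own headers and boundaries.

```ruby
# Hypothetical helper: builds a multipart/form-data body by hand so we
# can measure it. Field names other than 'blob_id' are assumptions.
def multipart_body(blob_data, boundary, blob_id = "myblob")
  parts = []
  parts << "--#{boundary}\r\n" \
           "Content-Disposition: form-data; name=\"blob_id\"\r\n\r\n" \
           "#{blob_id}\r\n"
  parts << "--#{boundary}\r\n" \
           "Content-Disposition: form-data; name=\"blob_data\"; " \
           "filename=\"#{blob_id}\"\r\n" \
           "Content-Type: application/octet-stream\r\n\r\n" \
           "#{blob_data}\r\n"
  parts << "--#{boundary}--\r\n"
  parts.join
end

blob = "x" * 1024

# PUT: the request body is the blob, so Content-Length is the blob size.
put_length = blob.bytesize

# POST: the same blob wrapped by two different (invented) client boundaries.
post_length_a = multipart_body(blob, "----clientA").bytesize
post_length_b = multipart_body(blob, "------------------------clientB").bytesize

puts "PUT body:  #{put_length} bytes"
puts "POST body: #{post_length_a} bytes (client A)"
puts "POST body: #{post_length_b} bytes (client B)"
```

The POST lengths are both larger than the blob and differ from each
other, so the request's Content-Length alone does not tell us the size
to announce to the provider.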
Thus, we have both POST (non streaming, only to support HTML forms and
the web browser interface) and PUT (streaming). If we want to remove one
of those methods then I would definitely vote to remove POST since imho
the streaming functionality for creating blobs is absolutely necessary
for 'real world' use. Forcing deltacloud to buffer all blob objects
before sending them on to the provider is obviously not very useful.
marios
David