Re: [openstack-dev] [glance] proposed priorities for Mitaka

Flavio Percoco Tue, 15 Sep 2015 03:09:08 -0700

On 15/09/15 02:46 +0200, Monty Taylor wrote:

On 09/15/2015 02:06 AM, Clint Byrum wrote:

Excerpts from Doug Hellmann's message of 2015-09-14 13:46:16 -0700:

Excerpts from Clint Byrum's message of 2015-09-14 13:25:43 -0700:

Excerpts from Doug Hellmann's message of 2015-09-14 12:51:24 -0700:

Excerpts from Flavio Percoco's message of 2015-09-14 14:41:00 +0200:

On 14/09/15 08:10 -0400, Doug Hellmann wrote:


After having some conversations with folks at the Ops Midcycle a
few weeks ago, and observing some of the more recent email threads
related to glance, glance-store, the client, and the API, I spent
last week contacting a few of you individually to learn more about
some of the issues confronting the Glance team. I had some very
frank, but I think constructive, conversations with all of you about
the issues as you see them. As promised, this is the public email
thread to discuss what I found, and to see if we can agree on what
the Glance team should be focusing on going into the Mitaka summit
and development cycle and how the rest of the community can support
you in those efforts.

I apologize for the length of this email, but there's a lot to go
over. I've identified 2 high priority items that I think are critical
for the team to be focusing on starting right away in order to use
the upcoming summit time effectively. I will also describe several
other issues that need to be addressed but that are less immediately
critical. First the high priority items:

1. Resolve the situation preventing the DefCore committee from
  including image upload capabilities in the tests used for trademark
  and interoperability validation.

2. Follow through on the original commitment of the project to
  provide an image API by completing the integration work with
  nova and cinder to ensure V2 API adoption.


Hi Doug,

First and foremost, I'd like to thank you for taking the time to dig
into these issues, and for reaching out to the community seeking
information and a better understanding of what the real issues are. I
can imagine how much time you had to dedicate to this and I'm glad you
did.

Now, to your email, I very much agree with the priorities you
mentioned above and I'd like whoever wins Glance's PTL election to
bring the focus back to that.

Please, find some comments in-line for each point:


I. DefCore

The primary issue that attracted my attention was the fact that
DefCore cannot currently include an image upload API in its
interoperability test suite, and therefore we do not have a way to
ensure interoperability between clouds for users or for trademark
use. The DefCore process has been long, and at times confusing,
even to those of us following it sort of closely. It's not entirely
surprising that some projects haven't been following the whole time,
or aren't aware of exactly what the whole thing means. I have
proposed a cross-project summit session for the Mitaka summit to
address this need for communication more broadly, but I'll try to
summarize a bit here.


+1

I think it's quite sad that some projects, especially those considered
to be part of the `starter-kit:compute`[0], don't follow closely
what's going on in DefCore. I personally consider this a task PTLs
should incorporate in their role duties. I'm glad you proposed such a
session; I hope it'll help raise awareness of this effort and help
move things forward on that front.


Until fairly recently a lot of the discussion was around process
and priorities for the DefCore committee. Now that those things are
settled, and we have some approved policies, it's time to engage
more fully.  I'll be working during Mitaka to improve the two-way
communication.


DefCore is using automated tests, combined with business policies,
to build a set of criteria for allowing trademark use. One of the
goals of that process is to ensure that all OpenStack deployments
are interoperable, so that users who write programs that talk to
one cloud can use the same program with another cloud easily. This
is a *REST API* level of compatibility. We cannot insert cloud-specific
behavior into our client libraries, because not all cloud consumers
will use those libraries to talk to the services. Similarly, we
can't put the logic in the test suite, because that defeats the
entire purpose of making the APIs interoperable. For this level of
compatibility to work, we need well-defined APIs, with a long support
period, that work the same no matter how the cloud is deployed. We
need the entire community to support this effort. From what I can
tell, that is going to require some changes to the current Glance
API to meet the requirements. I'll list those requirements, and I
hope we can discuss them to a degree that ensures everyone understands
them. I don't want this email thread to get bogged down in
implementation details or API designs, though, so let's try to keep
the discussion at a somewhat high level, and leave the details for
specs and summit discussions. I do hope you will correct any
misunderstandings or misconceptions, because unwinding this as an
outside observer has been quite a challenge and it's likely I have
some details wrong.

As I understand it, there are basically two ways to upload an image
to glance using the V2 API today. The "POST" API pushes the image's
bits through the Glance API server, and the "task" API instructs
Glance to download the image separately in the background. At one
point apparently there was a bug that caused the results of the two
different paths to be incompatible, but I believe that is now fixed.
However, the two separate APIs each have different issues that make
them unsuitable for DefCore.
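
As a rough illustration of the two paths described above, the sketch below only builds request descriptions and sends nothing. It assumes the V2 endpoints as I understand them (create the record with POST /v2/images, send the bits with PUT /v2/images/{image_id}/file, or create an "import" task with POST /v2/tasks). The function names are hypothetical, and the task input field names should be checked against the Glance API reference before relying on them.

```python
import json


def direct_upload_requests(name, disk_format, container_format):
    """Describe the direct path: create the image record, then send the bits."""
    create = {
        "method": "POST",
        "path": "/v2/images",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "name": name,
            "disk_format": disk_format,
            "container_format": container_format,
        }),
    }
    upload = {
        "method": "PUT",
        # {image_id} is filled in from the response to the create call.
        "path": "/v2/images/{image_id}/file",
        "headers": {"Content-Type": "application/octet-stream"},
        "body": None,  # the raw image bytes would be streamed here
    }
    return [create, upload]


def task_import_request(name, disk_format, container_format, source_url):
    """Describe the task path: ask Glance to fetch the image in the background."""
    body = {
        "type": "import",
        "input": {
            # Field names are assumed from the task API of that period.
            "import_from": source_url,
            "import_from_format": disk_format,
            "image_properties": {
                "name": name,
                "disk_format": disk_format,
                "container_format": container_format,
            },
        },
    }
    return {
        "method": "POST",
        "path": "/v2/tasks",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

The point of the comparison is that the first path moves the image bytes through the Glance API server, while the second hands Glance a URL and returns before any data has moved.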

The DefCore process relies on several factors when designating APIs
for compliance. One factor is the technical direction, as communicated
by the contributor community -- that's where we tell them things
like "we plan to deprecate the Glance V1 API". In addition to the
technical direction, DefCore looks at the deployment history of an
API. They do not want to require deploying an API if it is not seen
as widely usable, and they look for some level of existing adoption
by cloud providers and distributors as an indication that the
API is desired and can be successfully used. Because we have multiple
upload APIs, the message we're sending on technical direction is
weak right now, and so they have focused on deployment considerations
to resolve the question.


The task upload process you're referring to is the one that uses the
`import` task, which allows you to download an image from an external
source, asynchronously, and import it in Glance. This is the old
`copy-from` behavior that was moved into a task.

The "fun" thing about this - and I'm sure other folks in the Glance
community will disagree - is that I don't consider tasks to be a
public API. That is to say, I would expect tasks to be an internal API
used by cloud admins to perform some actions (based on its current
implementation). Eventually, some of these tasks could be triggered
from the external API but as background operations that are triggered
by the well-known public ones and not through the task API.


Does that mean it's more of an "admin" API?


I think it is basically just a half-way done implementation that is
exposed directly to users of Rackspace Cloud and, AFAIK, nobody else.
When last I tried to make integration tests in shade that exercised the
upstream glance task import code, I was met with an implementation that
simply did not work, because the pieces behind it had never been fully
implemented upstream. That may have been resolved, but in the process
of trying to write tests and make this work, I discovered a system that
made very little sense from a user standpoint. I want to upload an
image, why do I want a task?!


Ultimately, I believe end-users of the cloud simply shouldn't care
about what tasks are or aren't and more importantly, as you mentioned
later in the email, tasks make clouds not interoperable. I'd be pissed
if my public image service would ask me to learn about tasks to be
able to use the service.


It would be OK if a public API set up to do a specific task returned a
task ID that could be used with a generic task API to check status, etc.
So the idea of tasks isn't completely bad, it's just too vague as it's
exposed right now.
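
A minimal sketch of that idea, assuming a purpose-specific call has already returned a task ID and that a generic lookup reports one of the states "pending", "processing", "success" or "failure". The `get_status` callable and `wait_for_task` are hypothetical stand-ins, not an existing client API, and a real client would sleep between polls.

```python
def wait_for_task(get_status, task_id, max_polls=10):
    """Poll a generic task-status lookup until the task finishes.

    get_status is any callable mapping a task ID to a status string;
    it stands in for a request to a generic task API.
    """
    for _ in range(max_polls):
        status = get_status(task_id)
        if status in ("success", "failure"):
            return status
    # The task never reached a terminal state within the poll budget.
    return "timeout"
```

The caller never has to know what kind of task it is waiting on; it only needs the ID handed back by the specific API.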


I think it is a concern, because it is assuming users will want to do
generic things with a specific API. This turns into a black-box game where
the user shoves a task in and then waits to see what comes out the other
side. Not something I want to encourage users to do or burden them with.

We have an API whose sole purpose is to accept image uploads. That
Rackspace identified a scaling pain point there is _good_. But why not
*solve* it for the user, instead of introducing more complexity?


That's fair. I don't actually care which API we have, as long as it
meets the other requirements.


What I'd like to see is the upload image API given the ability to
respond with a URL that can be uploaded to using the object storage API
we already have in OpenStack. Exposing users to all of these operator
choices is just wasting their time. Just simply say "Oh, you want to
upload an image? That's fine, please upload it as an object over there
and POST here again when it is ready to be imported." This will make
perfect sense to a user reading docs, and doesn't require them to grasp
an abstract concept like "tasks" when all they want to do is upload
their image.
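
The flow proposed above can be sketched with an in-memory stand-in. Everything here is hypothetical: `FakeImageService`, its methods and the `upload_url` field are invented for illustration and do not exist in Glance, and a plain dict plays the role of the object store.

```python
class FakeImageService:
    """In-memory stand-in for the proposed behaviour; not a real Glance API."""

    def __init__(self, object_store):
        self.object_store = object_store
        self.images = {}

    def create_image(self, name):
        # Step 1: the service answers with a place to upload the object.
        upload_url = "uploads/" + name
        self.images[name] = {"status": "queued", "upload_url": upload_url}
        return {"name": name, "upload_url": upload_url}

    def import_image(self, name):
        # Step 3: the user POSTs again once the object is in place.
        image = self.images[name]
        data = self.object_store.get(image["upload_url"])
        if data is None:
            raise ValueError("nothing has been uploaded for this image yet")
        image["status"] = "active"
        image["size"] = len(data)
        return dict(image)


def upload_image(service, object_store, name, data):
    """Run the three steps: ask where to upload, upload, request the import."""
    reply = service.create_image(name)
    # Step 2: stands in for an upload through the object storage API.
    object_store[reply["upload_url"]] = data
    return service.import_image(name)
```

From the user's side there is no task concept at all: one call to learn where the bits go, one upload, one call to say the image is ready.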


And what would it do if the backing store for the image service
isn't Swift or another object storage system that supports direct
uploads? Return a URL that pointed back to itself, maybe?


For those operators who don't have concerns about scaling the glance
API service to their users' demands, glance's image upload API works
perfectly well today.  The indirect approach is only meant to deal with
the situation where the operator expects a lot of really large images to
be uploaded simultaneously, and would like to take advantage of the Swift
API's rather rich set of features for making that a positive experience.
There is also a user benefit to using the Swift API, which is that a
segmented upload can more easily be resumed.


Yes, BUT ...

If there are going to be two legitimate ways to upload an image, that needs to be discoverable so that scripts (or things like ansible or razor or juju or terraform or *insert system tool here*) can accomplish "please upload this here image file into this here cloud"


I don't think there's a problem with having 2 legitimate ways to
*upload* the image data but rather in those 2 ways not being always
deployed/enabled.

In Glance's v2, there are currently 2 ways to create images. Each of
these has a specific way to import - upload or downloading from - the
data. One of these create workflows has 2 ways to "attach" data to an
image - uploading or using locations. As far as Glance's public API
goes, I'd really like this to be shrunk down to just 1 way to create
images and then allow for those 3, useful, ways to attach data to an
image to be used through that single call, which is basically what we
have in V1.

It's really not about the REST API itself. Literally zero percent of the people are doing that. People use tools. Tools write to APIs. And nobody who is running an OpenStack cloud should have to write their own branded tools - that's a cost that's completely silly to bear. An operator running an OpenStack cloud should be able to say to their users "go use the ansible openstack modules" or "go use the juju openstack provider".
Which brings us back to your excellent point - both of these are totally legitimate ways to upload to the cloud, except small clouds often don't run swift, and large clouds may want to handle the situation you mention and leverage swift. So how about:
glance image-create my-great-image
returns: 200 OK {
 upload-url: 'https://example.com/some/url/location',
 is_swift: False
}

OR

glance image-create my-great-image
returns: 200 OK {
 upload-url: 'https://example.com/some/url/location',
 is_swift: True
}
and if is_swift is true, then the user (or script) knows it can use the threaded swift uploader. If it's false, the user (or script) just uploads content to the URL. The process is completely sane, is pretty much the same for both types of cloud, and has one known and understandable either-or deployer difference, each fork of which is open source and has a defined semantic.
Details, of course - and I know there are at least 5 more to work out - but hopefully that makes sense and doesn't disenfranchise anyone?
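
For what it's worth, the client-side branching this implies is small. The sketch below assumes the response fields from the example above (`upload-url`, `is_swift`), which are part of the proposal and not an existing Glance response; the strategy labels are made up for illustration.

```python
def choose_upload_strategy(create_response):
    """Pick how to send the image bits based on the proposed response fields.

    Returns a (strategy, url) pair. A missing is_swift is treated as False,
    so a cloud that says nothing gets the plain upload.
    """
    url = create_response["upload-url"]
    if create_response.get("is_swift", False):
        # The cloud fronts uploads with Swift: a segmented, threaded
        # uploader can be used against this URL.
        return ("swift-segmented", url)
    # Otherwise just send the content to the URL in a single request.
    return ("plain-upload", url)
```

A tool such as an ansible module would call this once after image-create and never need cloud-specific configuration.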



IMHO, the above is already complicated enough. It's introspectable,
sure, but it already says too much about the cloud that the user
shouldn't care about. For the sake of expanding that thought, though, I'd
say the user should just get a URL and the rest should be handled
transparently.

We once talked about a possibility of allowing users to use
glance_store directly to upload the image to the cloud store when
scenarios like the one above (or Rackspace's specifically) exist.
Again, just throwing it out there for the sake of discussing scenarios
and use-cases rather than implementation details that I personally
don't care about right now.

Now, IMO HTTP has facilities for that too, it's just that glanceclient
(and lo, many HTTP clients) aren't well versed in those deeper, optional
pieces of HTTP. That is why Swift works the way it does, and I like
the idea of glance simply piggybacking on the experience of many years
of production refinement that are available and codified in Swift and
any other OpenStack Object Storage API implementations (like the CEPH
RADOS gateway).


++

Flavio

--
@flaper87
Flavio Percoco


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
