On 15/09/15 02:46 +0200, Monty Taylor wrote:
On 09/15/2015 02:06 AM, Clint Byrum wrote:
Excerpts from Doug Hellmann's message of 2015-09-14 13:46:16 -0700:
Excerpts from Clint Byrum's message of 2015-09-14 13:25:43 -0700:
Excerpts from Doug Hellmann's message of 2015-09-14 12:51:24 -0700:
Excerpts from Flavio Percoco's message of 2015-09-14 14:41:00 +0200:
On 14/09/15 08:10 -0400, Doug Hellmann wrote:

After having some conversations with folks at the Ops Midcycle a few weeks ago, and observing some of the more recent email threads related to glance, glance-store, the client, and the API, I spent last week contacting a few of you individually to learn more about some of the issues confronting the Glance team. I had some very frank, but I think constructive, conversations with all of you about the issues as you see them. As promised, this is the public email thread to discuss what I found, and to see if we can agree on what the Glance team should be focusing on going into the Mitaka summit and development cycle and how the rest of the community can support you in those efforts.

I apologize for the length of this email, but there's a lot to go over. I've identified 2 high priority items that I think are critical for the team to be focusing on starting right away in order to use the upcoming summit time effectively. I will also describe several other issues that need to be addressed but that are less immediately critical. First the high priority items:

1. Resolve the situation preventing the DefCore committee from including image upload capabilities in the tests used for trademark and interoperability validation.

2. Follow through on the original commitment of the project to provide an image API by completing the integration work with nova and cinder to ensure V2 API adoption.

Hi Doug,

First and foremost, I'd like to thank you for taking the time to dig into these issues, and for reaching out to the community seeking information and a better understanding of what the real issues are.
I can imagine how much time you had to dedicate to this and I'm glad you did.

Now, to your email, I very much agree with the priorities you mentioned above and I'd like whoever wins Glance's PTL election to bring focus back to that.

Please find some comments in-line for each point:

I. DefCore

The primary issue that attracted my attention was the fact that DefCore cannot currently include an image upload API in its interoperability test suite, and therefore we do not have a way to ensure interoperability between clouds for users or for trademark use. The DefCore process has been long, and at times confusing, even to those of us following it sort of closely. It's not entirely surprising that some projects haven't been following the whole time, or aren't aware of exactly what the whole thing means. I have proposed a cross-project summit session for the Mitaka summit to address this need for communication more broadly, but I'll try to summarize a bit here.

+1

I think it's quite sad that some projects, especially those considered to be part of the `starter-kit:compute`[0], don't follow closely what's going on in DefCore. I personally consider this a task PTLs should incorporate in their role duties. I'm glad you proposed such a session; I hope it'll help raise awareness of this effort and help move things forward on that front.

Until fairly recently a lot of the discussion was around process and priorities for the DefCore committee. Now that those things are settled, and we have some approved policies, it's time to engage more fully. I'll be working during Mitaka to improve the two-way communication.

DefCore is using automated tests, combined with business policies, to build a set of criteria for allowing trademark use. One of the goals of that process is to ensure that all OpenStack deployments are interoperable, so that users who write programs that talk to one cloud can use the same program with another cloud easily.
This is a *REST API* level of compatibility. We cannot insert cloud-specific behavior into our client libraries, because not all cloud consumers will use those libraries to talk to the services. Similarly, we can't put the logic in the test suite, because that defeats the entire purpose of making the APIs interoperable. For this level of compatibility to work, we need well-defined APIs, with a long support period, that work the same no matter how the cloud is deployed. We need the entire community to support this effort. From what I can tell, that is going to require some changes to the current Glance API to meet the requirements. I'll list those requirements, and I hope we can discuss them to a degree that ensures everyone understands them. I don't want this email thread to get bogged down in implementation details or API designs, though, so let's try to keep the discussion at a somewhat high level, and leave the details for specs and summit discussions. I do hope you will correct any misunderstandings or misconceptions, because unwinding this as an outside observer has been quite a challenge and it's likely I have some details wrong.

As I understand it, there are basically two ways to upload an image to glance using the V2 API today. The "POST" API pushes the image's bits through the Glance API server, and the "task" API instructs Glance to download the image separately in the background. At one point apparently there was a bug that caused the results of the two different paths to be incompatible, but I believe that is now fixed. However, the two separate APIs each have different issues that make them unsuitable for DefCore.

The DefCore process relies on several factors when designating APIs for compliance. One factor is the technical direction, as communicated by the contributor community -- that's where we tell them things like "we plan to deprecate the Glance V1 API". In addition to the technical direction, DefCore looks at the deployment history of an API.
They do not want to require deploying an API if it is not seen as widely usable, and they look for some level of existing adoption by cloud providers and distributors as an indication that the API is desired and can be successfully used. Because we have multiple upload APIs, the message we're sending on technical direction is weak right now, and so they have focused on deployment considerations to resolve the question.

The task upload process you're referring to is the one that uses the `import` task, which allows you to download an image from an external source, asynchronously, and import it in Glance. This is the old `copy-from` behavior that was moved into a task.

The "fun" thing about this - and I'm sure other folks in the Glance community will disagree - is that I don't consider tasks to be a public API. That is to say, I would expect tasks to be an internal API used by cloud admins to perform some actions (based on its current implementation). Eventually, some of these tasks could be triggered from the external API but as background operations that are triggered by the well-known public ones and not through the task API.

Does that mean it's more of an "admin" API?

I think it is basically just a half-way done implementation that is exposed directly to users of Rackspace Cloud and, AFAIK, nobody else. When last I tried to make integration tests in shade that exercised the upstream glance task import code, I was met with an implementation that simply did not work, because the pieces behind it had never been fully implemented upstream. That may have been resolved, but in the process of trying to write tests and make this work, I discovered a system that made very little sense from a user standpoint. I want to upload an image, why do I want a task?!

Ultimately, I believe end-users of the cloud simply shouldn't care about what tasks are or aren't and more importantly, as you mentioned later in the email, tasks make clouds not interoperable.
I'd be pissed if my public image service would ask me to learn about tasks to be able to use the service.

It would be OK if a public API set up to do a specific task returned a task ID that could be used with a generic task API to check status, etc. So the idea of tasks isn't completely bad, it's just too vague as it's exposed right now.

I think it is a concern, because it is assuming users will want to do generic things with a specific API. This turns into a black-box game where the user shoves a task in and then waits to see what comes out the other side. Not something I want to encourage users to do or burden them with. We have an API whose sole purpose is to accept image uploads. That Rackspace identified a scaling pain point there is _good_. But why not *solve* it for the user, instead of introducing more complexity?

That's fair. I don't actually care which API we have, as long as it meets the other requirements.

What I'd like to see is the upload image API given the ability to respond with a URL that can be uploaded to using the object storage API we already have in OpenStack. Exposing users to all of these operator choices is just wasting their time. Just simply say "Oh, you want to upload an image? That's fine, please upload it as an object over there and POST here again when it is ready to be imported." This will make perfect sense to a user reading docs, and doesn't require them to grasp an abstract concept like "tasks" when all they want to do is upload their image.

And what would it do if the backing store for the image service isn't Swift or another object storage system that supports direct uploads? Return a URL that pointed back to itself, maybe?

For those operators who don't have concerns about scaling the glance API service to their users' demands, glance's image upload API works perfectly well today.
The indirect approach is only meant to deal with the situation where the operator expects a lot of really large images to be uploaded simultaneously, and would like to take advantage of the Swift API's rather rich set of features for making that a positive experience. There is also a user benefit to using the Swift API, which is that a segmented upload can more easily be resumed.

Yes, BUT ...

If there are going to be two legitimate ways to upload an image, that needs to be discoverable so that scripts (or things like ansible or razor or juju or terraform or *insert system tool here*) can accomplish "please upload this here image file into this here cloud"
I don't think there's a problem with having 2 legitimate ways to *upload* the image data but rather in those 2 ways not always being deployed/enabled. In Glance's v2, there are currently 2 ways to create images. Each of these has a specific way to import the data: uploading it or downloading it from an external source. One of these create workflows has 2 ways to "attach" data to an image - uploading or using locations. As far as Glance's public API goes, I'd really like this to be shrunk down to just 1 way to create images and then allow for those 3, useful, ways to attach data to an image to be used through that single call, which is basically what we have in V1.
It's really not about the REST API itself. Literally zero percent of the people are doing that. People use tools. Tools write to APIs. And nobody who is running an OpenStack cloud should have to write their own branded tools - that's a cost that's completely silly to bear. An operator running an openstack cloud should be able to say to their users "go use the ansible openstack modules" or "go use the juju openstack provider"

Which brings us back to your excellent point - both of these are totally legitimate ways to upload to the cloud, except small clouds often don't run swift, and large clouds may want to handle the situation you mention and leverage swift. So how about:

glance image-create my-great-image

returns:
200 OK
{ upload-url: 'https://example.com/some/url/location', is_swift: True }

OR

glance image-create my-great-image

returns:
200 OK
{ upload-url: 'https://example.com/some/url/location', is_swift: False }

and if is_swift is true, then the user (or script) knows it can use the threaded swift uploader. If it's false, the user (or script) just uploads content to the URL. The process is completely sane, is pretty much the same for both types of cloud, and has one known and understandable either-or deployer difference, each fork of which is open source and each fork of which has a defined semantic.

Details, of course - and I know there are at least 5 more to work out - but hopefully that makes sense and doesn't disenfranchise anyone?
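[Editorial sketch, not part of the original message.] The client-side branching described above could look roughly like the following. It assumes the create call returns a JSON body with the `upload-url` and `is_swift` fields from the example; that response shape is only a proposal in this thread, not an existing Glance API. The function name `plan_upload` and the strategy labels are hypothetical, and no network call is made.

```python
import json


def plan_upload(create_response: str) -> dict:
    """Pick an upload strategy from a hypothetical image-create response.

    Assumes the proposed (not implemented) response fields
    'upload-url' and 'is_swift' discussed in this thread.
    """
    body = json.loads(create_response)
    url = body["upload-url"]
    # A missing flag is treated as "not Swift", so the simple path is the default.
    if body.get("is_swift", False):
        # Large clouds: hand the URL to a segmented, resumable Swift-style upload.
        return {"strategy": "swift-segmented", "url": url}
    # Small clouds: send the image bits to the URL in a single request.
    return {"strategy": "single-request", "url": url}


if __name__ == "__main__":
    example = '{"upload-url": "https://example.com/some/url/location", "is_swift": false}'
    print(plan_upload(example))
```

A tool would then call whichever uploader matches the returned strategy; the point of the sketch is only that a single discoverable flag lets the same script work against both kinds of cloud.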
IMHO, the above is already complicated enough. It's introspectable, sure, but it already says too much about the cloud that the user shouldn't care about. For the sake of expanding that thought, though, I'd say the user should just get a URL and the rest should be handled transparently. We once talked about a possibility of allowing users to use glance_store directly to upload the image to the cloud store when scenarios like the one above (or Rackspace's specifically) exist. Again, just throwing it out there for the sake of discussing scenarios and use-cases rather than implementation details that I personally don't care about right now.
Now, IMO HTTP has facilities for that too, it's just that glanceclient (and lo, many HTTP clients) aren't well versed in those deeper, optional pieces of HTTP. That is why Swift works the way it does, and I like the idea of glance simply piggybacking on the experience of many years of production refinement that are available and codified in Swift and any other OpenStack Object Storage API implementations (like the Ceph RADOS gateway).
++

Flavio

--
@flaper87
Flavio Percoco
__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev