
Thanks for raising this. I have been interested in the project for some time, but I never got a chance to wrap my head around it. I also have a few concerns; please see inline.

On 09/25/2017 01:27 PM, Zhenguo Niu wrote:
Hi folks,

First of all, thanks to the audience for the Mogan project update in the TC room during the Denver PTG. Here we would like to get more suggestions before we apply for inclusion.

Speaking only for myself, I find the current direction of one API+scheduler for VM/bare metal/container unfortunate. After container management moved out into a separate project, Zun, bare metal with Nova and Ironic continues to be a pain point.

#. API
Only part of the Nova APIs and parameters apply to bare metal instances. Meanwhile, to stay interoperable with the virtual drivers, bare-metal-specific APIs such as deploy-time RAID and advanced partitioning cannot be included. It's true that Nova can support various compute drivers, but in reality the level of support is not equal across hypervisors, especially for bare metal in a virtualization world. But I understand the problems with that, as Nova was designed to provide compute resources (virtual machines) rather than bare metal.

A correction: any compute resources.

Nova works okay with bare metals. It's never going to work perfectly though, because we always have to find a common subset of features between VM and BM. RAID is a good example indeed. We have a solution for the future, but it's not going to satisfy everyone.

Now I have a question: to what extent do you plan to maintain the "cloud" nature of the API? Let's take RAID as an example. Ironic can apply a very generic or a very specific configuration. You can request "just RAID-5" or you can ask for specific disks to be combined in a specific way. I believe the latter is not something we want to expose to cloud users, as it's not going to be a cloud any more.
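To make the distinction concrete, here is a rough sketch of the two kinds of request, written in the shape of ironic's target_raid_config. The controller and disk identifiers are made up, and is_cloud_friendly() is a hypothetical policy check for illustration only, not something that exists in Ironic, Nova or Mogan.

```python
# Illustrative only: two requests shaped like ironic's target_raid_config.

# "Just RAID-5": says nothing about which hardware is used.
GENERIC_RAID = {
    "logical_disks": [
        {"size_gb": "MAX", "raid_level": "5"},
    ]
}

# Specific disks on a specific controller (identifiers are made up).
SPECIFIC_RAID = {
    "logical_disks": [
        {
            "size_gb": 500,
            "raid_level": "5",
            "controller": "controller-0",
            "physical_disks": ["disk-1", "disk-2", "disk-3"],
        },
    ]
}

# Hypothetical policy: keys that tie a request to one machine's layout.
HARDWARE_SPECIFIC_KEYS = {"controller", "physical_disks"}


def is_cloud_friendly(raid_config):
    """Return True if no logical disk names concrete hardware."""
    return all(
        HARDWARE_SPECIFIC_KEYS.isdisjoint(disk)
        for disk in raid_config.get("logical_disks", [])
    )


print(is_cloud_friendly(GENERIC_RAID))
print(is_cloud_friendly(SPECIFIC_RAID))
```

A user-facing API could accept only requests of the first kind and leave the second kind to operators, but where exactly to draw that line is the open question.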

#. Scheduler
Bare metal doesn't fit into the model of 1:1 nova-compute to resource, as nova-compute processes can't be run on the inventory nodes themselves. That is to say, host aggregates, availability zones, and other things based on the compute service (host) can't be applied to bare metal resources. And for grouping such as anti-affinity, the granularity also differs from virtual machines: bare metal users may want their HA instances placed in different failure domains, not merely on different nodes. In short, we can only get rigid, resource-class-only scheduling for bare metal.

It's not rigid. Okay, it's rigid, but it's not as rigid as what we used to have.

If you're going back to the VCPUs-memory-disk triad, you're making it more rigid. Of these three, only memory has ever made practical sense for deployers. VCPUs is a bit subtle, as it depends on whether hyper-threading is enabled, and I've rarely seen people use it.

But our local_gb thing is an outright lie. Of the 20 disks a machine can easily have, which one do you report for local_gb? Well, in the best case people used ironic root device hints with ironic-inspector to figure it out. Which is great, but requires ironic-inspector. In the worst case people just put a random number there to make scheduling work. This is horrible, please make sure not to go back to it.

What I would love to see from a bare metal scheduling project is scheduling based on inventory. I was thinking of being able to express things like "give me a node with 2 GPUs of at least 256 CUDA cores each". Do you plan on this kind of thing? This would truly mean flexible scheduling.
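A minimal sketch of what I mean by matching against inventory, assuming a toy in-memory model. The Node and Gpu classes, the node names and pick_nodes() are all hypothetical and not taken from Mogan, Nova or Placement.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    cuda_cores: int


@dataclass
class Node:
    name: str
    gpus: list = field(default_factory=list)


def matches(node, min_gpus, min_cuda_cores):
    """True if at least `min_gpus` GPUs on the node have `min_cuda_cores` or more."""
    qualifying = [g for g in node.gpus if g.cuda_cores >= min_cuda_cores]
    return len(qualifying) >= min_gpus


def pick_nodes(nodes, min_gpus, min_cuda_cores):
    """Return the names of all nodes that satisfy the request."""
    return [n.name for n in nodes if matches(n, min_gpus, min_cuda_cores)]


# Made-up inventory for illustration.
INVENTORY = [
    Node("node-a", [Gpu(512), Gpu(256)]),
    Node("node-b", [Gpu(1024)]),                      # only one GPU
    Node("node-c", [Gpu(128), Gpu(128), Gpu(384)]),   # only one GPU is big enough
    Node("node-d", []),
]

# "Give me a node with 2 GPUs of at least 256 CUDA cores each."
print(pick_nodes(INVENTORY, min_gpus=2, min_cuda_cores=256))
```

The point is that the request is evaluated per device against what the node actually contains, rather than against a single summed-up number per node.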

Which brings me to one of my biggest reservations about Mogan: I don't think copying Nova's architecture is a good idea overall. In particular, I think you have flavors, which do not map onto the bare metal world at all, IMO.

And most of the cloud providers in the market offer virtual machines and bare metal as separate resources, but unfortunately, it's hard to achieve this with one compute service.

Do you have proof of the first statement? And do you mean public clouds? Our customers deploy hybrid environments, to the best of my knowledge. Nobody I know uses one compute service in the whole cloud anyway.

I heard people are deploying separate Nova installations for virtual machines and bare metal, with many downstream hacks to the bare metal single-driver Nova, but as the changes to Nova would be massive and possibly invasive to virtual machines, it does not seem practical to upstream them.

I think you're overestimating the problem. In TripleO we deploy separate virtual nova compute nodes. If ironic is enabled, its nova computes go to controllers. Then you can use host aggregates to split flavors between VM and BM. With resource classes it's even more trivial: you get this split naturally.

So we created Mogan [1] about one year ago, which aims to offer bare metal as a first-class resource to users, with a set of bare-metal-specific APIs and a bare-metal-centric scheduler (with the Placement service). It was more of an experimental project at the beginning, but the outcome makes us believe it's the right way. Mogan will fully embrace Ironic for bare metal provisioning, and with RSD server [2] introduced to OpenStack, it will be a new world for bare metal, as with that we can compose hardware resources on the fly.

Good that you touched this topic, because I have a question here :)

With ironic you *request* a node. With RSD and similar you *create* a node, which is closer to VMs than to traditional BMs. This gives a similar problem to what we have with nova now. Namely, exact vs non-exact filters. How do you solve it? Assuming you plan on using flavors (which I think is a bad idea), do you use exact or non-exact filters? How do you handle the difference between the approaches?
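For clarity, this is the difference I have in mind, as a simplified sketch. The dictionaries and both functions are hypothetical illustrations and not actual Nova or Mogan scheduler filters.

```python
def exact_match(requested, node):
    """Exact filter: every requested property must equal the node's value."""
    return all(node.get(key) == value for key, value in requested.items())


def at_least_match(requested, node):
    """Non-exact filter: the node must have at least what was requested."""
    return all(node.get(key, 0) >= value for key, value in requested.items())


# Made-up flavor and nodes for illustration.
flavor = {"cpus": 8, "memory_mb": 16384}
same_size_node = {"cpus": 8, "memory_mb": 16384}
bigger_node = {"cpus": 32, "memory_mb": 131072}

print(exact_match(flavor, same_size_node), at_least_match(flavor, same_size_node))
print(exact_match(flavor, bigger_node), at_least_match(flavor, bigger_node))
```

With pre-existing nodes, a non-exact filter can hand a whole 32-CPU machine to a user who asked for 8 CPUs, while an exact filter never offers that machine at all. With composed nodes the request could instead be treated as a recipe for what to build, so the two cases seem to need different handling.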

Also, I would like to clarify the overlap between Mogan and Nova. I bet there are some users who want to use one API for compute resource management, as they don't care whether it's a virtual machine or a bare metal server. The bare metal driver with Nova is still the right choice for such users to get raw-performance compute resources. Mogan, on the other hand, is for real bare metal users and for cloud providers who want to offer bare metal as a separate resource.

Thank you for your time!

[1] https://wiki.openstack.org/wiki/Mogan
[2] https://www.intel.com/content/www/us/en/architecture-and-technology/rack-scale-design-overview.html

Best Regards,
Zhenguo Niu

OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
