On 07/25/2015 03:27 PM, Bhandaru, Malini K wrote:
> Thanks Jay for covering Host Aggregate Resource Pool tracking at the
> mid-cycle meetup.
> I could see the implementation being very similar to the extensible
> resource tracker defined today, but I would like to better understand
> the value it provides.
> 1) Is it to quickly determine whether a scheduling request can be honored?

The idea behind resource pools actually doesn't have much to do with the speed of a particular scheduling request. It has to do with refactoring the currently awkward way that resources are modeled in the Nova database.

There are some providers of resources that don't fit that model well, namely Ironic and shared storage providers. For Ironic, we have butchered the compute_nodes table in the database with a "hypervisor_hostname" column that means nothing to any driver other than Ironic. On top of that, because of this butchering and the design of nova-compute, there can only ever be a single nova-compute running the Ironic virt driver.

The resource tracker in that nova-compute also completely fudges the resource usage records, because Ironic's resources aren't elastic; they are static. Ironic can report that it has 100 PB of storage available, but that number means nothing at all, because the storage actually comes in chunks that map to the underlying hardware of the individual Ironic nodes.
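To make the Ironic case concrete, here is a rough sketch (illustration only; the names are made up and this is not Nova code):

    # Illustration only -- made-up names, not Nova's actual schema.
    # Ironic resources come in fixed, node-sized chunks, so summing
    # them into one elastic total is meaningless.

    ironic_nodes = [
        {"name": "node-1", "local_gb": 500, "in_use": False},
        {"name": "node-2", "local_gb": 500, "in_use": False},
    ]

    # What a resource-tracker-style report produces: one big total.
    total_gb = sum(n["local_gb"] for n in ironic_nodes if not n["in_use"])

    # total_gb == 1000, yet a request for 600 GB cannot land anywhere,
    # because no single node has 600 GB free:
    fits_600 = any(n["local_gb"] >= 600 and not n["in_use"]
                   for n in ironic_nodes)  # False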

Similarly, if shared storage is used in Nova, the usage and total capacity numbers reported by the resource trackers in the nova-computes that utilize that shared storage are completely incorrect: each tracker reports the shared store's full capacity as its own, so the totals get multiply counted, and usage recorded by one host is invisible to the others.
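A quick sketch of the double counting (again, made-up names):

    # Illustration only. Three nova-computes backed by one shared
    # 10 TB datastore; each resource tracker reports the full store
    # as its own local capacity.

    SHARED_STORE_GB = 10000
    reported = {node: SHARED_STORE_GB
                for node in ("compute-1", "compute-2", "compute-3")}

    naive_total = sum(reported.values())  # 30,000 GB -- triple-counted
    actual_total = SHARED_STORE_GB        # 10,000 GB -- the real number

    # Usage is just as wrong: a 2 TB instance disk placed via compute-1
    # shrinks the real free space for all three hosts, but only
    # compute-1's tracker records any usage at all.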

The resource pools idea is meant to correct this reporting for providers of resources that don't meet the original Nova resource tracking ideals.

> 2) Is it to capture some statistics (usage trends, mean/median over the
> day/week, etc.) for capacity planning?

No, this doesn't come into play with the resource pool concepts at all.

> To determine the actual host in such a pool, weighting conditions still apply ..

No, you are misunderstanding what the resource pool is. A resource pool isn't a collection of compute hosts. It is the set of inventory records for a particular provider of one or more types of resources. So, a traditional compute node, with no shared storage, would be a single resource pool for CPU, RAM, LocalDisk, PCI Devices, and NUMA nodes. A compute node that provided no local disk but instead relied on shared storage would be a resource pool for CPU, RAM, PCI Devices, and NUMA nodes. A separate resource pool representing the shared disk storage would exist and be associated with that compute node via the host aggregates table.
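Roughly, the data would be shaped something like this (made-up field names, purely to show the relationships; not a schema proposal):

    # Illustration only -- made-up field names, not a schema proposal.
    # Each pool is an inventory record for one provider of one or more
    # resource types; aggregates associate pools with each other.

    pools = [
        # Traditional compute node: a single pool for everything.
        {"provider": "compute-1",
         "inventory": {"CPU": 32, "RAM_MB": 131072, "LOCAL_DISK_GB": 2000,
                       "PCI_DEVICE": 2, "NUMA_NODE": 2},
         "aggregates": []},

        # Diskless compute node: no disk inventory of its own ...
        {"provider": "compute-2",
         "inventory": {"CPU": 32, "RAM_MB": 131072,
                       "PCI_DEVICE": 2, "NUMA_NODE": 2},
         "aggregates": ["shared-storage-agg"]},

        # ... and a separate pool for the shared store, tied to
        # compute-2 through the host aggregate.
        {"provider": "nfs-datastore-1",
         "inventory": {"DISK_GB": 10000},
         "aggregates": ["shared-storage-agg"]},
    ]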

The scheduler will look at the resource pools table to identify the set of providers of resources that match what was in the launch request spec. Once the scheduler determines that list of resource pools, it would be able to weigh those providers in much the same way that it does today. The difference with this approach is that the resource usage amounts will be accurate for all types of resources, instead of being totally inaccurate for shared resources and for providers of "fixed static resources" like Ironic.
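In very rough pseudocode, the selection step could look like this (illustrative only, not actual scheduler code):

    # Illustration only -- not the actual scheduler code or schema.
    def pools_with_room(pools, resource, amount, usages):
        """Yield pools whose free inventory of `resource` covers `amount`."""
        for pool in pools:
            total = pool["inventory"].get(resource, 0)
            used = usages.get((pool["provider"], resource), 0)
            if total - used >= amount:
                yield pool

    pools = [
        {"provider": "compute-1",
         "inventory": {"CPU": 32, "RAM_MB": 131072}},
        {"provider": "nfs-datastore-1",
         "inventory": {"DISK_GB": 10000}},
    ]
    usages = {("compute-1", "CPU"): 28}

    request = {"CPU": 4, "RAM_MB": 8192, "DISK_GB": 100}
    candidates = {res: [p["provider"]
                        for p in pools_with_room(pools, res, amt, usages)]
                  for res, amt in request.items()}
    # {'CPU': ['compute-1'], 'RAM_MB': ['compute-1'],
    #  'DISK_GB': ['nfs-datastore-1']}
    # Weighing then runs over these providers much as it does today,
    # but against usage numbers that are actually correct.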

Best,
-jay

> If hosts with more free resources are weighted higher, we will spread the
> workload across the pool; if the opposite, we will consolidate workloads
> on the already-active hosts.

> So this deeper dive into the elements of the pool is inescapable.

> I could see heuristics being used instead of checking each of the hosts
> in the resource pool.

> Regards
> Malini


