On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes <jaypi...@gmail.com> wrote:
> Hi Miguel Angel, comments/answers inline :)
>
> On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
>>
>> Hi!,
>>
>> In the context of [1] (generic resource pools / scheduling in nova)
>> and [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk
>> a few weeks ago with Jay Pipes.
>>
>> The idea was leveraging the generic resource pools and scheduling
>> mechanisms defined in [1] to find the right hosts and track the total
>> available bandwidth per host (and per host "physical network");
>> something in neutron (still to be defined where) would notify the new
>> API about the total amount of "NIC_BW_KB" available on every
>> host/physnet.
>
> Yes, what we discussed was making it initially per host, meaning the host
> would advertise a total aggregate bandwidth amount for all NICs that it
> uses for the data plane as a single amount.
>
> The other way to track this resource class (NIC_BW_KB) would be to make
> the NICs themselves be resource providers, and then the scheduler could
> pick a specific NIC to bind the port to based on available NIC_BW_KB on a
> particular NIC.
>
> The former method makes things conceptually easier at the expense of
> introducing greater potential for retrying placement decisions (since the
> specific NIC to bind a port to wouldn't be known until the claim is made
> on the compute host). The latter method adds complexity to the filtering
> and scheduler in order to make more accurate placement decisions that
> would result in fewer retries.
>
>> That part is quite clear to me.
>>
>> From [1] I'm not sure which blueprint introduces the ability to
>> schedule based on the resource allocation/availability itself
>> ("resource-providers-scheduler" seems more like an optimization of the
>> scheduler/DB interaction, right?)
>
> Yes, you are correct about the above blueprint; it's only for moving the
> Python-side filters to be a DB query.
>
> The resource-providers-allocations blueprint:
>
> https://review.openstack.org/300177
>
> is the one where we convert the various consumed resource amount fields
> to live in the single allocations table that may be queried for usage
> information.
>
> We aim to use the ComputeNode object as a facade that hides the migration
> of these data fields as much as possible, so that the scheduler actually
> does not need to know that the schema has changed underneath it. Of
> course, this only works for *existing* resource classes, like vCPU, RAM,
> etc. It won't work for *new* resource classes like the discussed
> NIC_BW_KB because, clearly, we don't have an existing field in the
> instance_extra or other tables that contains that usage amount, and
> therefore we can't use the ComputeNode object as a facade over a
> non-existing piece of data.
>
> Eventually, the intent is to change the ComputeNode object to return a
> new AllocationList object that would contain all of the compute node's
> resources in a tabular format (mimicking the underlying allocations
> table):
>
> https://review.openstack.org/#/c/282442/20/nova/objects/resource_provider.py
>
> Once this is done, the scheduler can be fitted to query this
> AllocationList object to make resource usage and placement decisions in
> the Python-side filters.
>
> We are still debating, on the resource-providers-scheduler-db-filters
> blueprint:
>
> https://review.openstack.org/#/c/300178/
>
> whether to change the existing FilterScheduler or create a brand new
> scheduler driver. I could go either way, frankly.
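
Either way works fine from our side. Just to check that I'm reading the
allocations/AllocationList idea above correctly: once NIC_BW_KB is a normal
resource class with an inventory per provider, the per-host availability
check should reduce to something like the sketch below (purely illustrative;
the field names are just my reading of the proposed inventories/allocations
schema and may not match what finally merges in the reviews above):

    # Illustrative only: 'total', 'reserved', 'allocation_ratio' and 'used'
    # mirror my reading of the proposed inventories/allocations schema.

    def nic_bw_kb_available(inventory, allocations, requested_kb):
        """Check whether a host still has 'requested_kb' of NIC_BW_KB free.

        inventory: dict with 'total', 'reserved' and 'allocation_ratio'
                   for the NIC_BW_KB class on this resource provider.
        allocations: list of dicts with the 'used' amounts already claimed
                     against the same provider / resource class.
        """
        capacity = ((inventory['total'] - inventory['reserved'])
                    * inventory.get('allocation_ratio', 1.0))
        used = sum(a['used'] for a in allocations)
        return used + requested_kb <= capacity

    # e.g. a NIC advertised with 10485760 NIC_BW_KB total, nothing reserved:
    # nic_bw_kb_available({'total': 10485760, 'reserved': 0,
    #                      'allocation_ratio': 1.0},
    #                     [{'used': 2048}], 2048)  -> True
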
> If we made a brand new scheduler driver, it would do a query against the
> compute_nodes table in the DB directly. The legacy FilterScheduler would
> manipulate the AllocationList object returned by the
> ComputeNode.allocations attribute. Either way we get to where we want to
> go: representing all quantitative resources in a standardized and
> consistent fashion.
>
>> And that brings me to another point: at the moment of filtering hosts,
>> nova (I guess) will have the neutron port information, and it has to
>> somehow identify whether the port is tied to a minimum bandwidth QoS
>> policy.
>
> Yes, Nova's conductor gathers information about the requested networks
> *before* asking the scheduler where to place the instance:
>
> https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362
>
>> That would require identifying that the port has a "qos_policy_id"
>> attached to it, then asking neutron for the specific QoS policy [3],
>> then looking for a minimum bandwidth rule (still to be defined), and
>> extracting the required bandwidth from it.
>
> Yep, exactly correct.
>
>> That moves, again, some of the responsibility to examine and
>> understand external resources into nova.
>
> Yep, it does. The alternative is more retries for placement decisions,
> because accurate decisions cannot be made until the compute node is
> already selected and the claim happens on the compute node.
>
>> Could it make sense to make that part pluggable via stevedore, so
>> we would provide something that takes the "resource id" (a port in
>> this case) and returns the requirements translated to resource classes
>> (NIC_BW_KB in this case)?
>
> Not sure Stevedore makes sense in this context. Really, we want *less*
> extensibility and *more* consistency. So, I would envision rather a
> system where Nova would call to Neutron before scheduling when it has
> received a port or network ID in the boot request and ask Neutron whether
> the port or network has any resource constraints on it. Neutron would
> return a standardized response containing each resource class and the
> amount requested in a dictionary (or better yet, an os_vif.objects.*
> object, serialized). Something like:
>
> {
>   'resources': {
>     '<UUID of port or network>': {
>       'NIC_BW_KB': 2048,
>       'IPV4_ADDRESS': 1
>     }
>   }
> }
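
So on the neutron side that would essentially mean translating the port's
QoS policy into that dict, something along the lines of the sketch below?
(Every helper name here is invented, and the minimum bandwidth rule type
itself is still to be defined as part of [2].)

    # Rough sketch only: get_policy() stands in for whatever the neutron
    # QoS code ends up exposing, and 'minimum_bandwidth' / 'min_kbps' are
    # placeholder names for the rule that [2] still has to define.

    def port_resource_request(port, qos_plugin):
        """Translate a port into the {resource_class: amount} dict above."""
        resources = {'IPV4_ADDRESS': 1}  # one v4 address per port, as in
                                         # the example dict above

        policy_id = port.get('qos_policy_id')
        if policy_id:
            policy = qos_plugin.get_policy(policy_id)  # hypothetical call
            for rule in policy.get('rules', []):
                if rule.get('type') == 'minimum_bandwidth':
                    resources['NIC_BW_KB'] = rule['min_kbps']

        return {'resources': {port['id']: resources}}
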
Oh, true, that's a great idea: having some API that translates a neutron
resource into scheduling constraints. The external call will still be
required, but the coupling issue goes away.

> In the case of the NIC_BW_KB resource class, Nova's scheduler would look
> for compute nodes that had a NIC with that amount of bandwidth still
> available. In the case of the IPV4_ADDRESS resource class, Nova's
> scheduler would use the generic-resource-pools interface to find a
> resource pool of IPV4_ADDRESS resources (i.e. a Neutron routed network or
> subnet allocation pool) that has available IP space for the request.

Not sure about the IPV4_ADDRESS part, because I haven't looked yet at how
routed networks are resolved with this new framework, but for the other
constraints it makes perfect sense to me.

> Best,
> -jay
>
>> Best regards,
>> Miguel Ángel Ajo
>>
>> [1] http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
>> [2] https://bugs.launchpad.net/neutron/+bug/1560963
>> [3] http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy
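
PS: one more practical detail that crossed my mind: a single boot request
can carry several ports, so whatever nova hands to the scheduler probably
needs to aggregate the per-port answers first. Again, just a sketch
following the shape of the example dict earlier in the thread:

    # Sketch: merge the per-port {'resources': {uuid: {class: amount}}}
    # responses into one {class: total_amount} dict for the whole request.

    from collections import defaultdict

    def aggregate_port_requests(responses):
        totals = defaultdict(int)
        for response in responses:
            for per_port in response['resources'].values():
                for resource_class, amount in per_port.items():
                    totals[resource_class] += amount
        return dict(totals)

    # aggregate_port_requests([
    #     {'resources': {'port-1': {'NIC_BW_KB': 2048, 'IPV4_ADDRESS': 1}}},
    #     {'resources': {'port-2': {'NIC_BW_KB': 1024, 'IPV4_ADDRESS': 1}}},
    # ]) -> {'NIC_BW_KB': 3072, 'IPV4_ADDRESS': 2}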