Re: [openstack-dev] [nova][scheduling] Can VM placement consider the VM network traffic need?

2017-09-05 Thread Mooney, Sean K
Interesting timing.
I would love to talk about this at the PTG.
Comments inline.
Regards
sean

> -Original Message-
> From: Balazs Gibizer [mailto:balazs.gibi...@ericsson.com]
> Sent: Tuesday, September 5, 2017 8:23 AM
> To: OpenStack Development Mailing List (not for usage questions)
> 
> Cc: Mooney, Sean K ; mosh...@mellanox.com
> Subject: Re: [openstack-dev] [nova][scheduling] Can VM placement
> consider the VM network traffic need?
> 
> On Mon, Sep 4, 2017 at 9:11 PM, Jay Pipes  wrote:
> > On 09/01/2017 04:42 AM, Rua, Philippe (Nokia - FI/Espoo) wrote:
> > > Will it be possible to include network bandwidth as a resource in
> > Nova scheduling, for VM placement decision?
> >
> > Yes.
> >
> > See here for a related Neutron spec that mentions Placement:
> > https://review.openstack.org/#/c/396297/7/specs/pike/strict-minimum-bandwidth-support.rst
> >
> > > Context: in telecommunication applications, the network traffic is
> > an important dimension of resource usage. For example, it is often
> > important to distribute "bandwidth-greedy" VMs to different compute
> > nodes. There were some earlier discussions on this topic, but I could
> > not find a concrete outcome. [1][2][3]
> > >
> > > After some reading, I wonder whether the Custom resource classes
> > can provide a generic mechanism? [4][5][6]
> >
> > No :) Custom resource classes are antithetical to generic/standard
> > mechanisms.
> >
> > We want to add two *standard* resource classes, one called
> > NET_INGRESS_BYTES_SEC and another called NET_EGRESS_BYTES_SEC which
> > would represent the total bandwidth in bytes per second for the
> > corresponding traffic directions.
> 
> While I agree that the end goal is to have standard resource classes
> for bandwidth I think custom resource classes are generic enough to
> model bandwidth resource. If you want to play with the bandwidth based
> scheduling idea based on Pike then custom resource classes are
> available as a tool for a proof of concept.
[Mooney, Sean K] 
From a Queens perspective, Rodolfo is currently working on a spec
to introduce a standard bandwidth resource class and resource provider.
He has opened a blueprint to track this here:
https://blueprints.launchpad.net/nova/+spec/bandwidth-resource-provider
Currently, the scope we are proposing for our work is an end-to-end
minimum bandwidth guarantee for SR-IOV interfaces. In this case the bandwidth
resource provider will be a child of the PF. This could be extended
to vSwitches as well, but in the Linux bridge and OVS cases neither can support
multi-tenant minimum bandwidth guarantees at present. So from a nova perspective,
while we can make sure we do not oversubscribe bandwidth for OVS, neutron
cannot enforce the minimum bandwidth allocation on the vSwitch.
Hardware-offloaded OVS may be able to provide a minimum bandwidth guarantee
in the future, as might VPP.
> 
> >
> >
> > What would be the resource provider, though? There are at least two
> > potential answers here:
> >
> > 1) A network interface controller on the compute host
> >
> > In this case, the NIC on the host would be a child provider of the
> > compute host resource provider. It would have an inventory record of
> > resource class NET_INGRESS_BYTES_SEC with a total value representing
> > the entire bandwidth of the host NIC. Instances would consume some
> > amount of NET_INGRESS_BYTES_SEC corresponding to *either* the Nova
> > flavor (if the resources:NET_INGRESS_BYTES_SEC extra-spec is set)
> > *or* to the sum of consumed bandwidth amounts from the port profile
> > of any ports specified when launching the instance (and thus would be
> > part of the pci device request collection attached to the build
> > request).
> >
> > 2) A "network slice" of a network interface controller on the compute
> > host
> >
> > In this case, assume that the NIC on the compute host has had its
> > total bandwidth constrained via traffic control so that 50% of its
> > available ingress bandwidth is allocated to network A and 50% is
> > allocated to network B.
> >
> > There would be multiple resource providers, each with an inventory
> > record of resource class NET_INGRESS_BYTES_SEC with a total value of
> > 1/2 the total NIC bandwidth. Both of these resource providers would be
> > child providers of the compute host resource provider. One of these
> > child resource providers will be decorated with the trait
> > "CUSTOM_NETWORK_A" and the other with trait "CUSTOM_NETWORK_B".
> >
> > The scheduler wo

Re: [openstack-dev] [nova][scheduling] Can VM placement consider the VM network traffic need?

2017-09-05 Thread Balazs Gibizer

On Mon, Sep 4, 2017 at 9:11 PM, Jay Pipes  wrote:

On 09/01/2017 04:42 AM, Rua, Philippe (Nokia - FI/Espoo) wrote:
> Will it be possible to include network bandwidth as a resource in
> Nova scheduling, for VM placement decision?


Yes.

See here for a related Neutron spec that mentions Placement:
https://review.openstack.org/#/c/396297/7/specs/pike/strict-minimum-bandwidth-support.rst

> Context: in telecommunication applications, the network traffic is
> an important dimension of resource usage. For example, it is often
> important to distribute "bandwidth-greedy" VMs to different compute
> nodes. There were some earlier discussions on this topic, but I could
> not find a concrete outcome. [1][2][3]
>
> After some reading, I wonder whether the Custom resource classes
> can provide a generic mechanism? [4][5][6]


No :) Custom resource classes are antithetical to generic/standard
mechanisms.

We want to add two *standard* resource classes, one called
NET_INGRESS_BYTES_SEC and another called NET_EGRESS_BYTES_SEC which
would represent the total bandwidth in bytes per second for the
corresponding traffic directions.


While I agree that the end goal is to have standard resource classes 
for bandwidth I think custom resource classes are generic enough to 
model bandwidth resource. If you want to play with the bandwidth based 
scheduling idea based on Pike then custom resource classes are 
available as a tool for a proof of concept.





What would be the resource provider, though? There are at least two
potential answers here:

1) A network interface controller on the compute host

In this case, the NIC on the host would be a child provider of the
compute host resource provider. It would have an inventory record of
resource class NET_INGRESS_BYTES_SEC with a total value representing the
entire bandwidth of the host NIC. Instances would consume some amount of
NET_INGRESS_BYTES_SEC corresponding to *either* the Nova flavor (if the
resources:NET_INGRESS_BYTES_SEC extra-spec is set) *or* to the sum of
consumed bandwidth amounts from the port profile of any ports specified
when launching the instance (and thus would be part of the pci device
request collection attached to the build request).

2) A "network slice" of a network interface controller on the compute host

In this case, assume that the NIC on the compute host has had its total
bandwidth constrained via traffic control so that 50% of its available
ingress bandwidth is allocated to network A and 50% is allocated to
network B.

There would be multiple resource providers, each with an inventory
record of resource class NET_INGRESS_BYTES_SEC with a total value of 1/2
the total NIC bandwidth. Both of these resource providers would be child
providers of the compute host resource provider. One of these child
resource providers will be decorated with the trait "CUSTOM_NETWORK_A"
and the other with trait "CUSTOM_NETWORK_B".

The scheduler would be able to determine which resource provider to
consume the NET_INGRESS_BYTES_SEC resources from by looking for a
resource provider that has both the required amount of
NET_INGRESS_BYTES_SEC as well as the trait required by the port profile.
If, say, the port profile specifies that the port is to go on a NIC with
access to network "A", then the build request would contain a request to
the scheduler for CUSTOM_NETWORK_A trait...


The above setup can be simulated with custom resource classes and 
individual resource providers per compute node connected to the given 
compute node's resource provider via an aggregate. You most probably 
need to simulate the above network traits with individual custom 
resource classes in Pike.
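
To make the "one custom resource class per network" idea a bit more concrete, here is a minimal Python sketch. The naming scheme (CUSTOM_NET_<direction>_BYTES_SEC_<network>) and both helper functions are hypothetical and only illustrate the approach. The one thing taken from placement itself is the rule that custom resource class names must start with CUSTOM_ and contain only upper-case letters, digits and underscores.

```python
import re

# Placement only accepts custom resource class names of this shape.
CUSTOM_RC_PATTERN = re.compile(r"^CUSTOM_[A-Z0-9_]+$")


def per_network_resource_class(direction, network):
    """Build a hypothetical per-network bandwidth resource class name."""
    raw = "CUSTOM_NET_%s_BYTES_SEC_%s" % (direction, network)
    # Upper-case the name and replace anything placement would reject.
    name = re.sub(r"[^A-Z0-9_]", "_", raw.upper())
    if not CUSTOM_RC_PATTERN.match(name):
        raise ValueError("invalid resource class name: %s" % name)
    return name


def flavor_extra_specs(ingress_by_network):
    """Map {network: bytes/sec} to 'resources:<class>' flavor extra specs."""
    return {
        "resources:" + per_network_resource_class("INGRESS", net): str(amount)
        for net, amount in ingress_by_network.items()
    }


if __name__ == "__main__":
    print(flavor_extra_specs({"a": 1000000}))
```

With this scheme, a flavor that needs bandwidth on network A asks for the class that only the "network A" provider has in its inventory, which stands in for the trait match described above.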


I definitely don't think it is something I would do in production based
on Pike, for two reasons:
1) we have bugs in Pike GA that prevent nova from handling some edge cases
(especially in VM moving scenarios)
2) I agree with Jay that nested providers and neutron support will
allow us to do something much cleaner in the future.


However I think Pike is a good base to build a PoC and gather feedback. 
For example I already foresee a need to model OVS packet processing 
limits and in the long run even include the capacity of the TOR 
switches into the picture.





If you're coming to Denver, I encourage you to get with me, Sean Mooney,
Moshe Levi and others who are interested in seeing this work move forward.


@Jay: sign me up for this list.

Cheers,
gibi



Best,
-jay

> Here is what I have in mind:
> - The VM need is specified in the flavor extra-specs, e.g.
> resources:CUSTOM_BANDWIDTH=123.
> - The compute node total capacity is specified in host aggregate
> metadata, e.g. CUSTOM_BANDWIDTH=999.
> - Nova then takes care of the rest: scheduling where the free
> capacity is sufficient, and performing simple resource usage
> accounting (updating the compute node free network bandwidth capacity
> as required).
>
> Is the outline above according to current plans?
> If not, what would b

Re: [openstack-dev] [nova][scheduling] Can VM placement consider the VM network traffic need?

2017-09-04 Thread Jay Pipes

On 09/01/2017 04:42 AM, Rua, Philippe (Nokia - FI/Espoo) wrote:

Will it be possible to include network bandwidth as a resource in Nova 
scheduling, for VM placement decision?


Yes.

See here for a related Neutron spec that mentions Placement: 
https://review.openstack.org/#/c/396297/7/specs/pike/strict-minimum-bandwidth-support.rst



Context: in telecommunication applications, the network traffic is an important dimension 
of resource usage. For example, it is often important to distribute 
"bandwidth-greedy" VMs to different compute nodes. There were some earlier 
discussions on this topic, but I could not find a concrete outcome. [1][2][3]

After some reading, I wonder whether the Custom resource classes can provide a 
generic mechanism? [4][5][6]


No :) Custom resource classes are antithetical to generic/standard 
mechanisms.


We want to add two *standard* resource classes, one called 
NET_INGRESS_BYTES_SEC and another called NET_EGRESS_BYTES_SEC which 
would represent the total bandwidth in bytes per second for the
corresponding traffic directions.


What would be the resource provider, though? There are at least two 
potential answers here:


1) A network interface controller on the compute host

In this case, the NIC on the host would be a child provider of the 
compute host resource provider. It would have an inventory record of 
resource class NET_INGRESS_BYTES_SEC with a total value representing the 
entire bandwidth of the host NIC. Instances would consume some amount of 
NET_INGRESS_BYTES_SEC corresponding to *either* the Nova flavor (if the 
resources:NET_INGRESS_BYTES_SEC extra-spec is set) *or* to the sum of 
consumed bandwidth amounts from the port profile of any ports specified 
when launching the instance (and thus would be part of the pci device 
request collection attached to the build request).
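
The "either the flavor or the sum over the ports" rule can be sketched as below. This is only an illustration of the rule as described, not nova code. The function name and the 'ingress_bytes_sec' port profile key are hypothetical.

```python
def requested_ingress(flavor_extra_specs, port_profiles):
    """Return the NET_INGRESS_BYTES_SEC amount an instance would consume.

    The flavor extra spec wins when it is set; otherwise the amounts on
    the requested ports are summed. 'ingress_bytes_sec' is a hypothetical
    port profile field used only for this illustration.
    """
    key = "resources:NET_INGRESS_BYTES_SEC"
    if key in flavor_extra_specs:
        return int(flavor_extra_specs[key])
    return sum(p.get("ingress_bytes_sec", 0) for p in port_profiles)


if __name__ == "__main__":
    ports = [{"ingress_bytes_sec": 1000}, {"ingress_bytes_sec": 2500}]
    print(requested_ingress({}, ports))
    print(requested_ingress({"resources:NET_INGRESS_BYTES_SEC": "4000"}, ports))
```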


2) A "network slice" of a network interface controller on the compute host

In this case, assume that the NIC on the compute host has had its total 
bandwidth constrained via traffic control so that 50% of its available 
ingress bandwidth is allocated to network A and 50% is allocated to 
network B.


There would be multiple resource providers, each with an inventory 
record of resource class NET_INGRESS_BYTES_SEC with a total value of 1/2 
the total NIC bandwidth. Both of these resource providers would be child 
providers of the compute host resource provider. One of these child 
resource providers will be decorated with the trait "CUSTOM_NETWORK_A" 
and the other with trait "CUSTOM_NETWORK_B".


The scheduler would be able to determine which resource provider to 
consume the NET_INGRESS_BYTES_SEC resources from by looking for a 
resource provider that has both the required amount of 
NET_INGRESS_BYTES_SEC as well as the trait required by the port profile. 
If, say, the port profile specifies that the port is to go on a NIC with 
access to network "A", then the build request would contain a request to 
the scheduler for CUSTOM_NETWORK_A trait...
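
A toy model of the "network slice" case, assuming a plain in-memory representation: the Provider class, slice_nic() and pick_provider() are hypothetical names and do not correspond to the placement or scheduler code. It only shows the selection rule: enough free NET_INGRESS_BYTES_SEC and the trait required by the port profile.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """One child resource provider holding NET_INGRESS_BYTES_SEC inventory."""
    name: str
    total: int
    used: int = 0
    traits: frozenset = field(default_factory=frozenset)

    def free(self):
        return self.total - self.used


def slice_nic(nic_total):
    """Split a NIC 50/50 between network A and network B."""
    half = nic_total // 2
    return [
        Provider("nic0-net-a", half, traits=frozenset({"CUSTOM_NETWORK_A"})),
        Provider("nic0-net-b", half, traits=frozenset({"CUSTOM_NETWORK_B"})),
    ]


def pick_provider(providers, amount, required_trait):
    """Return the first provider with the trait and enough free bandwidth."""
    for provider in providers:
        if required_trait in provider.traits and provider.free() >= amount:
            return provider
    return None


if __name__ == "__main__":
    providers = slice_nic(10000)
    print(pick_provider(providers, 4000, "CUSTOM_NETWORK_A"))
    print(pick_provider(providers, 6000, "CUSTOM_NETWORK_A"))
```

A request for 6000 on network A is rejected even though the NIC as a whole has 10000, because each slice is accounted for separately.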


If you're coming to Denver, I encourage you to get with me, Sean Mooney, 
Moshe Levi and others who are interested in seeing this work move forward.


Best,
-jay


Here is what I have in mind:
- The VM need is specified in the flavor extra-specs, e.g. 
resources:CUSTOM_BANDWIDTH=123.
- The compute node total capacity is specified in host aggregate metadata, e.g. 
CUSTOM_BANDWIDTH=999.
- Nova then takes care of the rest: scheduling where the free capacity is 
sufficient, and performing simple resource usage accounting (updating the 
compute node free network bandwidth capacity as required).

Is the outline above according to current plans?
If not, what would be possible/needed in order to achieve the same result, i.e. 
consider the VM network traffic need during VM placement?

BR,
Philippe

[1] https://blueprints.launchpad.net/nova/+spec/bandwidth-as-scheduler-metric
[2] https://wiki.openstack.org/wiki/NetworkBandwidthEntitlement
[3] https://openstack.nimeyo.com/80515/openstack-scheduling-bandwidth-resources-nic_bw_kb-resource
[4] https://docs.openstack.org/nova/latest/user/placement.html
[5] http://specs.openstack.org/openstack/nova-specs/priorities/pike-priorities.html#placement
[6] https://review.openstack.org/#/c/473627/

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





Re: [openstack-dev] [nova][scheduling] Can VM placement consider the VM network traffic need?

2017-09-01 Thread Balazs Gibizer


On Fri, Sep 1, 2017 at 10:42 AM, Rua, Philippe (Nokia - FI/Espoo) wrote:
Will it be possible to include network bandwidth as a resource in 
Nova scheduling, for VM placement decision?


I think it will.




Context: in telecommunication applications, the network traffic is an 
important dimension of resource usage. For example, it is often 
important to distribute "bandwidth-greedy" VMs to different compute 
nodes. There were some earlier discussions on this topic, but I could 
not find a concrete outcome. [1][2][3]


After some reading, I wonder whether the Custom resource classes can 
provide a generic mechanism? [4][5][6]

Here is what I have in mind:
- The VM need is specified in the flavor extra-specs, e.g. 
resources:CUSTOM_BANDWIDTH=123.
- The compute node total capacity is specified in host aggregate 
metadata, e.g. CUSTOM_BANDWIDTH=999.


I'm not aware of any feature that treats an aggregate metadata key as
resource inventory. As far as I know you have to define new resource
providers for your CUSTOM_BANDWIDTH resource via the placement API, and
you have to report the 999 as inventory on those resource providers,
also via the placement API. Also don't forget to connect your resource
provider to the existing compute resource providers via an aggregate
(this is an aggregate in placement, which is different from the host
aggregate concept in nova). This review contains some test cases that
can help you see how to set things up:
https://review.openstack.org/#/c/497399
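
A rough sketch of the placement API calls this setup implies, written as Python that only builds the requests and sends nothing. The paths and bodies reflect my understanding of the Pike-era placement REST API and should be checked against the API reference for the release in use. The function name, the provider name and the compute provider UUID are placeholders. Note that a PUT on a provider's aggregates replaces its whole aggregate list, so existing associations of the compute provider would have to be included as well.

```python
import uuid


def poc_requests(compute_rp_uuid, total):
    """Return (method, path, body) tuples for a CUSTOM_BANDWIDTH PoC.

    Nothing is sent; the tuples only describe the requests.
    """
    bw_rp = str(uuid.uuid4())  # the new bandwidth resource provider
    agg = str(uuid.uuid4())    # placement aggregate linking both providers
    return [
        # 1. Create the custom resource class.
        ("POST", "/resource_classes", {"name": "CUSTOM_BANDWIDTH"}),
        # 2. Create the resource provider that will hold the inventory.
        ("POST", "/resource_providers",
         {"name": "bandwidth-" + bw_rp[:8], "uuid": bw_rp}),
        # 3. Report the capacity as inventory (generation 0 for a new provider).
        ("PUT", "/resource_providers/%s/inventories" % bw_rp,
         {"resource_provider_generation": 0,
          "inventories": {"CUSTOM_BANDWIDTH": {"total": total}}}),
        # 4. Put both providers into the same placement aggregate.
        ("PUT", "/resource_providers/%s/aggregates" % bw_rp, [agg]),
        ("PUT", "/resource_providers/%s/aggregates" % compute_rp_uuid, [agg]),
    ]


if __name__ == "__main__":
    for method, path, body in poc_requests("COMPUTE-RP-UUID-PLACEHOLDER", 999):
        print(method, path, body)
```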


- Nova then takes care of the rest: scheduling where the free 
capacity is sufficient, and performing simple resource usage 
accounting (updating the compute node free network bandwidth capacity 
as required).


With the above flavor extra spec as the request and the above resource
provider setup, nova will do the rest of the resource accounting for
your custom resource, unless you hit one of the bugs we
discovered in this area:
https://bugs.launchpad.net/nova/+bugs?field.tag=placement
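
To illustrate what that accounting amounts to with the numbers from the question (999 total, 123 per VM), here is a simplified Python model. It is not how placement is implemented: the class and function names are hypothetical, and the reserved amount and allocation ratio that real inventories carry are ignored.

```python
class BandwidthInventory:
    """Simplified inventory: a total and the amount already allocated."""

    def __init__(self, total):
        self.total = total
        self.used = 0

    def claim(self, amount):
        """Allocate `amount` if it still fits; return whether it did."""
        if self.used + amount > self.total:
            return False
        self.used += amount
        return True


def count_placements(total, per_vm):
    """Count how many VMs fit before the inventory is exhausted."""
    inventory = BandwidthInventory(total)
    placed = 0
    while inventory.claim(per_vm):
        placed += 1
    return placed


if __name__ == "__main__":
    # 8 VMs fit (8 * 123 = 984); a ninth would need 1107 > 999.
    print(count_placements(999, 123))
```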





Is the outline above according to current plans?
If not, what would be possible/needed in order to achieve the same 
result, i.e. consider the VM network traffic need during VM placement?


You might want to keep an eye on the nested-resource-provider work 
planned for Queens as it will give you better options to model your 
resources: 
https://blueprints.launchpad.net/nova/+spec/nested-resource-providers


Cheers,
gibi




BR,
Philippe

[1] https://blueprints.launchpad.net/nova/+spec/bandwidth-as-scheduler-metric
[2] https://wiki.openstack.org/wiki/NetworkBandwidthEntitlement
[3] https://openstack.nimeyo.com/80515/openstack-scheduling-bandwidth-resources-nic_bw_kb-resource
[4] https://docs.openstack.org/nova/latest/user/placement.html
[5] http://specs.openstack.org/openstack/nova-specs/priorities/pike-priorities.html#placement
[6] https://review.openstack.org/#/c/473627/



[openstack-dev] [nova][scheduling] Can VM placement consider the VM network traffic need?

2017-09-01 Thread Rua, Philippe (Nokia - FI/Espoo)
Will it be possible to include network bandwidth as a resource in Nova 
scheduling, for VM placement decision?

Context: in telecommunication applications, the network traffic is an important 
dimension of resource usage. For example, it is often important to distribute 
"bandwidth-greedy" VMs to different compute nodes. There were some earlier 
discussions on this topic, but I could not find a concrete outcome. [1][2][3]

After some reading, I wonder whether the Custom resource classes can provide a 
generic mechanism? [4][5][6]
Here is what I have in mind:
- The VM need is specified in the flavor extra-specs, e.g. 
resources:CUSTOM_BANDWIDTH=123.
- The compute node total capacity is specified in host aggregate metadata, e.g. 
CUSTOM_BANDWIDTH=999.
- Nova then takes care of the rest: scheduling where the free capacity is 
sufficient, and performing simple resource usage accounting (updating the 
compute node free network bandwidth capacity as required).
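
For the first bullet, a small Python sketch of how a "resources:" flavor extra spec maps to a resource request. The function is hypothetical and only mirrors the key=value form shown above; it is not nova's parser.

```python
def resources_from_extra_specs(extra_specs):
    """Extract {resource_class: amount} from 'resources:'-prefixed keys."""
    prefix = "resources:"
    requested = {}
    for key, value in extra_specs.items():
        if key.startswith(prefix):
            requested[key[len(prefix):]] = int(value)
    return requested


if __name__ == "__main__":
    specs = {"resources:CUSTOM_BANDWIDTH": "123", "hw:cpu_policy": "dedicated"}
    print(resources_from_extra_specs(specs))
```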

Is the outline above according to current plans?
If not, what would be possible/needed in order to achieve the same result, i.e. 
consider the VM network traffic need during VM placement?

BR,
Philippe

[1] https://blueprints.launchpad.net/nova/+spec/bandwidth-as-scheduler-metric
[2] https://wiki.openstack.org/wiki/NetworkBandwidthEntitlement
[3] https://openstack.nimeyo.com/80515/openstack-scheduling-bandwidth-resources-nic_bw_kb-resource
[4] https://docs.openstack.org/nova/latest/user/placement.html
[5] http://specs.openstack.org/openstack/nova-specs/priorities/pike-priorities.html#placement
[6] https://review.openstack.org/#/c/473627/
