-----Original Message-----
From: Jay Pipes
Sent: Monday, March 02, 2015 16:24

On 02/25/2015 06:41 AM, Daniel P. Berrange wrote:
> On Wed, Feb 25, 2015 at 02:08:32PM +0000, Gary Kotton wrote:
>> I understand that this is a high or critical bug, but I think that
>> we need to discuss it more and try to have a more robust model.
>
> What I'm not seeing from the bug description is just what part of
> the scheduler needs the ability to have total summed disk across
> every host in the cloud.

The scheduler does not need to know this information at all. One might 
say that a cloud administrator would want to know the total free disk 
space available in their cloud -- or at least get notified once the 
total free space falls below some threshold. IMO, there are better ways 
of accomplishing such a capacity-management task: use an NRPE/monitoring 
check that simply runs `df` or a similar command every so often against 
the actual filesystem backend.
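
A minimal sketch of such a check (not an existing Nagios plugin; the 
mount point and thresholds are assumptions for illustration):

    #!/usr/bin/env python
    # NRPE-style free-space check against the shared filesystem backend.
    # Assumed mount point and thresholds -- adjust for your deployment.
    import shutil
    import sys

    MOUNT = "/var/lib/nova/instances"  # hypothetical shared-storage mount
    WARN_FREE = 0.20   # warn below 20% free
    CRIT_FREE = 0.10   # critical below 10% free

    usage = shutil.disk_usage(MOUNT)
    free_ratio = usage.free / usage.total
    msg = "%.1f%% free (%d of %d bytes)" % (free_ratio * 100,
                                            usage.free, usage.total)

    if free_ratio < CRIT_FREE:
        print("CRITICAL: " + msg)
        sys.exit(2)   # Nagios CRITICAL
    elif free_ratio < WARN_FREE:
        print("WARNING: " + msg)
        sys.exit(1)   # Nagios WARNING
    print("OK: " + msg)
    sys.exit(0)       # Nagios OK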

IMHO, this isn't something that should be fronted by a 
management/admin-only REST API that has to iterate over a potentially 
huge number of compute nodes just to power some pretty graphical 
front-end showing a pie chart of available disk space.
 
[Rockyg] ++  The scheduler doesn't need to know anything about the individual 
compute nodes attached to *the same* shared storage to do placement. The 
scheduler can't increase or decrease the physical amount of storage available 
to the set of nodes. The hardware monitor for the shared storage provides the 
total amount of disk on the system, the amount already used, and the amount 
still unused. Wherever the scheduler starts a new VM in this node set, it will 
see the same amount of disk available (or not).
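
To illustrate (a sketch, not Nova code; the mount path is a hypothetical 
example): every node that mounts the same backend reports identical capacity 
numbers, because they come from the one shared filesystem, not from the node:

    import os

    def shared_capacity(mount="/var/lib/nova/instances"):
        # statvfs reflects the shared filesystem, so any node in the
        # set returns the same totals for the same mount.
        st = os.statvfs(mount)
        total = st.f_frsize * st.f_blocks
        free = st.f_frsize * st.f_bavail
        return total, free

    # Run on any node in the set; the result is the same.
    total, free = shared_capacity()
    print("total=%d free=%d" % (total, free))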

> What is the actual bad functional behaviour that results from this
> bug that means it is a high-priority issue to fix?

The higher-priority task would be to remove the wonky os-hypervisors 
REST API extension and its related cruft. This API extension is fatally 
flawed in a number of ways, including its assumptions about the 
underlying providers of disk/volume resources and the misleading 
relationship it implies between the servicegroup API and the compute 
nodes table.

[Rockyg] IMO the most important piece of information OpenStack software can 
give an operator whose set of nodes shares a storage backend is: what is the 
current total commitment (more likely, over-commitment) of the storage 
capacity on the attached set of nodes? That yields a simple go/no-go for 
starting another VM on the set, or a warning/error that the storage is 
over-committed and more must be acquired.
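
A sketch of that go/no-go decision (illustrative only; no such Nova API 
exists, and the capacity numbers would come from the shared storage's own 
monitor):

    MAX_OVERCOMMIT = 1.0  # 1.0 = no over-commit; raise to allow thin provisioning

    def can_start_vm(physical_total, committed, requested,
                     max_ratio=MAX_OVERCOMMIT):
        # Go if the new VM's disk keeps the total commitment within
        # max_ratio times the physical capacity of the shared backend.
        return (committed + requested) <= max_ratio * physical_total

    # Example: 10 TiB backend, 9.5 TiB already committed, VM wants 1 TiB.
    TiB = 1024 ** 4
    if not can_start_vm(10 * TiB, 9.5 * TiB, 1 * TiB):
        print("no-go: storage would be over-committed; acquire more")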

--Rocky



Best,
-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
