On 08/01/2018 11:32 AM, melanie witt wrote:
> I think it's definitely a significant issue that troubleshooting "No allocation
> candidates returned" from placement is so difficult. However, it's not
> straightforward to log detail in placement when the request for allocation
> candidates is essentially "SELECT * FROM nodes WHERE cpu usage < needed and disk
> usage < needed and memory usage < needed" and the result is returned from the
> API.

I think the only way to get useful info on a failure would be to break down the
huge SQL statement into subclauses and store the results of the intermediate
queries. Then, if the request failed, placement could log something like:

    hosts with enough CPU: <list 1>
    hosts that also have enough disk: <list 2>
    hosts that also have enough memory: <list 3>
    hosts that also meet extra spec host aggregate keys: <list 4>
    hosts that also meet image properties host aggregate keys: <list 5>
    hosts that also have requested PCI devices: <list 6>

And maybe we could optimize the above by only emitting logs where the list has a
length less than X (to avoid flooding the logs with hostnames in large clusters).
This would let you zero in on the things that finally caused the list to be
whittled down to nothing.
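
To make the idea concrete, here is a rough sketch of the staged filtering in plain Python. It is not placement code and it filters in-memory dicts rather than running SQL subqueries; the function name, the host fields, the stage labels and the threshold value are all made up for illustration.

```python
import logging

LOG = logging.getLogger("placement.sketch")  # hypothetical logger name
MAX_LOGGED_HOSTS = 5  # the "X" threshold from the mail; value is arbitrary


def filter_with_trace(hosts, stages, max_logged=MAX_LOGGED_HOSTS):
    """Apply (label, predicate) stages in order, recording the survivors.

    Returns (survivors, trace), where trace is a list of
    (label, [host names]) pairs, one per stage that actually ran.
    """
    trace = []
    remaining = list(hosts)
    for label, predicate in stages:
        remaining = [h for h in remaining if predicate(h)]
        names = [h["name"] for h in remaining]
        trace.append((label, names))
        # Only emit short lists, to avoid flooding the logs on big clusters.
        if len(names) < max_logged:
            LOG.debug("%s: %s", label, names)
        if not remaining:
            break  # nothing left; later stages cannot add hosts back
    return remaining, trace


# Hypothetical inventory and request, for illustration only.
HOSTS = [
    {"name": "node1", "vcpus_free": 8, "disk_gb_free": 100, "memory_mb_free": 2048},
    {"name": "node2", "vcpus_free": 2, "disk_gb_free": 500, "memory_mb_free": 16384},
    {"name": "node3", "vcpus_free": 16, "disk_gb_free": 20, "memory_mb_free": 32768},
]
REQUEST = {"vcpus": 4, "disk_gb": 50, "memory_mb": 4096}

STAGES = [
    ("hosts with enough CPU",
     lambda h: h["vcpus_free"] >= REQUEST["vcpus"]),
    ("hosts that also have enough disk",
     lambda h: h["disk_gb_free"] >= REQUEST["disk_gb"]),
    ("hosts that also have enough memory",
     lambda h: h["memory_mb_free"] >= REQUEST["memory_mb"]),
]

survivors, trace = filter_with_trace(HOSTS, STAGES)
for label, names in trace:
    print(f"{label}: {names}")
```

With this sample data the last stage that ran is the memory check, which is the one that empties the list, so that is where you would start looking.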
Chris