On Tue, Feb 26, 2013 at 3:15 PM, Sam Morrison <[email protected]> wrote:
>
> On 26/02/2013, at 2:15 PM, Chris Behrens <[email protected]> wrote:
>
>>
>> On Feb 25, 2013, at 6:39 PM, Joe Gordon <[email protected]> wrote:
>>
>>>
>>> It looks like the scheduler issues are related to the rabbitmq issues.   
>>> "host 'qh2-rcc77' ... is disabled or has not been heard from in a while"
>>>
>>> What does 'nova host-list' say?   the clocks must all be synced up?
>>
>> Good things to check.  It feels like something is spinning way too much 
>> within this filter, though.  This can also cause the above message.  The 
>> scheduler pulls all of the records before it starts filtering… and if 
>> there's a huge delay somewhere, it can start seeing a bunch of hosts as 
>> disabled.
>>
>> The filter doesn't look like a problem.. unless there's a large amount of 
>> aggregate metadata… and/or a large amount of key/values for the 
>> instance_type's extra specs.   There *is* a DB call in the filter.  If 
>> that's blocking for an extended period of time, the whole process is 
>> blocked…  But I suspect by the '100% cpu' comment, that this is not the 
>> case…  So the only thing I can think of is that it returns a tremendous 
>> amount of metadata.
>>
>> Adding some extra logging in the filter could be useful.
>>
>> - Chris
>
> Thanks Chris, I have 2 aggregates and 2 keys defined and each of the 80 hosts 
> has either one or the other. At the moment every flavour has either one or 
> the other too so I don't think it's too much data.
>
> I've tracked it down to this call:
>
> metadata = db.aggregate_metadata_get_by_host(context, host_state.host)

More debugging has got it down to a query

In db.api.aggregate_metadata_get_by_host:

    query = model_query(context, models.Aggregate).join(
            "_hosts").filter(models.AggregateHost.host == host).join(
            "_metadata")
   ......
   rows = query.all()

With query debug on this resolves to:

SELECT aggregates.created_at AS aggregates_created_at,
aggregates.updated_at AS aggregates_updated_at, aggregates.deleted_at
AS aggregates_deleted_at, aggregates.deleted AS aggregates_deleted,
aggregates.id AS aggregates_id, aggregates.name AS aggregates_name,
aggregates.availability_zone AS aggregates_availability_zone,
aggregate_hosts_1.created_at AS aggregate_hosts_1_created_at,
aggregate_hosts_1.updated_at AS aggregate_hosts_1_updated_at,
aggregate_hosts_1.deleted_at AS aggregate_hosts_1_deleted_at,
aggregate_hosts_1.deleted AS aggregate_hosts_1_deleted,
aggregate_hosts_1.id AS aggregate_hosts_1_id, aggregate_hosts_1.host
AS aggregate_hosts_1_host, aggregate_hosts_1.aggregate_id AS
aggregate_hosts_1_aggregate_id FROM aggregates INNER JOIN
aggregate_hosts AS aggregate_hosts_2 ON aggregates.id =
aggregate_hosts_2.aggregate_id AND aggregate_hosts_2.deleted = 0 AND
aggregates.deleted = 0 INNER JOIN aggregate_hosts ON
aggregate_hosts.aggregate_id = aggregates.id AND
aggregate_hosts.deleted = 0 AND aggregates.deleted = 0 INNER JOIN
aggregate_metadata AS aggregate_metadata_1 ON aggregates.id =
aggregate_metadata_1.aggregate_id AND aggregate_metadata_1.deleted = 0
AND aggregates.deleted = 0 INNER JOIN aggregate_metadata ON
aggregate_metadata.aggregate_id = aggregates.id AND
aggregate_metadata.deleted = 0 AND aggregates.deleted = 0 LEFT OUTER
JOIN aggregate_hosts AS aggregate_hosts_3 ON aggregates.id =
aggregate_hosts_3.aggregate_id AND aggregate_hosts_3.deleted = 0 AND
aggregates.deleted = 0 LEFT OUTER JOIN aggregate_hosts AS
aggregate_hosts_1 ON aggregate_hosts_1.aggregate_id = aggregates.id
AND aggregate_hosts_1.deleted = 0 AND aggregates.deleted = 0 WHERE
aggregates.deleted = 0 AND aggregate_hosts.host = 'qh2-rcc34';

Which in our case returns 328509 rows in set (25.97 sec)

Seems a bit off considering there are 80 rows in aggregate_hosts, 2
rows in aggregates and 2 rows in aggregate_metadata

In the code rows is only equal to 1 so it seems to be doing something
inside to code to do this? Don't know too much how sqlalchemy works.

Seems like a bug to me? or maybe our database has something wrong in it?

Cheers,
Sam

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

Reply via email to