Replying from my phone, so I can't look, but I wonder if we have an index missing.
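For reference, one quick way to check whether aggregate_hosts has an index covering the host column (the generated query quoted further down filters on aggregate_hosts.host) is to reflect the table's indexes with SQLAlchemy. This is only a sketch; the connection URL below is a placeholder, so adjust the driver and credentials for your deployment:

    # Reflect the indexes on aggregate_hosts and print their columns.
    # Placeholder URL -- point this at the actual Nova database.
    from sqlalchemy import create_engine
    from sqlalchemy.engine import reflection

    engine = create_engine('mysql://nova:[email protected]/nova')

    # On newer SQLAlchemy releases, sqlalchemy.inspect(engine) returns the
    # same Inspector object.
    inspector = reflection.Inspector.from_engine(engine)

    for index in inspector.get_indexes('aggregate_hosts'):
        print(index['name'], index['column_names'])

If nothing covers the host column, adding an index there would at least speed up the WHERE clause, though as the rest of the thread shows, the bigger issue is how many rows the joins produce.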
On Feb 25, 2013, at 8:54 PM, Sam Morrison <[email protected]> wrote:

> On Tue, Feb 26, 2013 at 3:15 PM, Sam Morrison <[email protected]> wrote:
>>
>> On 26/02/2013, at 2:15 PM, Chris Behrens <[email protected]> wrote:
>>>
>>> On Feb 25, 2013, at 6:39 PM, Joe Gordon <[email protected]> wrote:
>>>>
>>>> It looks like the scheduler issues are related to the rabbitmq issues.
>>>> "host 'qh2-rcc77' ... is disabled or has not been heard from in a while"
>>>>
>>>> What does 'nova host-list' say? Are the clocks all synced up?
>>>
>>> Good things to check. It feels like something is spinning way too much
>>> within this filter, though. That can also cause the above message: the
>>> scheduler pulls all of the records before it starts filtering, and if
>>> there's a huge delay somewhere, it can start seeing a bunch of hosts as
>>> disabled.
>>>
>>> The filter doesn't look like a problem, unless there's a large amount of
>>> aggregate metadata and/or a large number of key/values in the
>>> instance_type's extra specs. There *is* a DB call in the filter. If
>>> that's blocking for an extended period of time, the whole process is
>>> blocked. But I suspect from the '100% cpu' comment that this is not the
>>> case, so the only thing I can think of is that it returns a tremendous
>>> amount of metadata.
>>>
>>> Adding some extra logging in the filter could be useful.
>>>
>>> - Chris
>>
>> Thanks Chris. I have 2 aggregates and 2 keys defined, and each of the 80
>> hosts has one or the other. At the moment every flavour has one or the
>> other too, so I don't think it's too much data.
>>
>> I've tracked it down to this call:
>>
>>     metadata = db.aggregate_metadata_get_by_host(context, host_state.host)
>
> More debugging has got it down to a query.
>
> In db.api.aggregate_metadata_get_by_host:
>
>     query = model_query(context, models.Aggregate).join(
>         "_hosts").filter(models.AggregateHost.host == host).join(
>         "_metadata")
>     ......
>     rows = query.all()
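A query built this way is easy to blow up: every one-to-many collection that gets joined in multiplies the raw rows the database has to produce. Below is a minimal, self-contained sketch using toy models on an in-memory SQLite database. The model and attribute names only mimic the Nova tables (the real definitions live in nova.db.sqlalchemy.models), so treat it purely as an illustration of the fan-out:

    from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base  # sqlalchemy.orm on newer releases
    from sqlalchemy.orm import relationship, sessionmaker

    Base = declarative_base()

    class Aggregate(Base):
        __tablename__ = 'aggregates'
        id = Column(Integer, primary_key=True)
        name = Column(String)
        # Stand-ins for the _hosts / _metadata relationships on the Nova model.
        hosts = relationship('AggregateHost')
        metadata_items = relationship('AggregateMetadata')

    class AggregateHost(Base):
        __tablename__ = 'aggregate_hosts'
        id = Column(Integer, primary_key=True)
        aggregate_id = Column(Integer, ForeignKey('aggregates.id'))
        host = Column(String)

    class AggregateMetadata(Base):
        __tablename__ = 'aggregate_metadata'
        id = Column(Integer, primary_key=True)
        aggregate_id = Column(Integer, ForeignKey('aggregates.id'))
        key = Column(String)
        value = Column(String)

    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    session = sessionmaker(bind=engine)()

    # One aggregate with 69 hosts and two metadata rows (toy data).
    session.add(Aggregate(
        name='agg1',
        hosts=[AggregateHost(host='host%d' % i) for i in range(69)],
        metadata_items=[AggregateMetadata(key='type', value='compute'),
                        AggregateMetadata(key='ssd', value='true')]))
    session.commit()

    # Joining both collections -- what join("_hosts") and join("_metadata")
    # do in the Nova query -- multiplies the rows the database returns:
    query = (session.query(Aggregate)
             .join(Aggregate.hosts)
             .join(Aggregate.metadata_items))

    print(query.count())     # 138 raw rows: 69 hosts x 2 metadata rows
    print(len(query.all()))  # 1 -- the ORM collapses duplicate entities by primary key

The generated SQL quoted below joins aggregate_hosts several more times on top of that, so the multiplication is cubic rather than linear, and query.all() hides all of it because it de-duplicates the mapped Aggregate objects before handing them back.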
> With query debug on, this resolves to:
>
>     SELECT aggregates.created_at AS aggregates_created_at,
>            aggregates.updated_at AS aggregates_updated_at,
>            aggregates.deleted_at AS aggregates_deleted_at,
>            aggregates.deleted AS aggregates_deleted,
>            aggregates.id AS aggregates_id,
>            aggregates.name AS aggregates_name,
>            aggregates.availability_zone AS aggregates_availability_zone,
>            aggregate_hosts_1.created_at AS aggregate_hosts_1_created_at,
>            aggregate_hosts_1.updated_at AS aggregate_hosts_1_updated_at,
>            aggregate_hosts_1.deleted_at AS aggregate_hosts_1_deleted_at,
>            aggregate_hosts_1.deleted AS aggregate_hosts_1_deleted,
>            aggregate_hosts_1.id AS aggregate_hosts_1_id,
>            aggregate_hosts_1.host AS aggregate_hosts_1_host,
>            aggregate_hosts_1.aggregate_id AS aggregate_hosts_1_aggregate_id
>     FROM aggregates
>     INNER JOIN aggregate_hosts AS aggregate_hosts_2
>         ON aggregates.id = aggregate_hosts_2.aggregate_id
>         AND aggregate_hosts_2.deleted = 0 AND aggregates.deleted = 0
>     INNER JOIN aggregate_hosts
>         ON aggregate_hosts.aggregate_id = aggregates.id
>         AND aggregate_hosts.deleted = 0 AND aggregates.deleted = 0
>     INNER JOIN aggregate_metadata AS aggregate_metadata_1
>         ON aggregates.id = aggregate_metadata_1.aggregate_id
>         AND aggregate_metadata_1.deleted = 0 AND aggregates.deleted = 0
>     INNER JOIN aggregate_metadata
>         ON aggregate_metadata.aggregate_id = aggregates.id
>         AND aggregate_metadata.deleted = 0 AND aggregates.deleted = 0
>     LEFT OUTER JOIN aggregate_hosts AS aggregate_hosts_3
>         ON aggregates.id = aggregate_hosts_3.aggregate_id
>         AND aggregate_hosts_3.deleted = 0 AND aggregates.deleted = 0
>     LEFT OUTER JOIN aggregate_hosts AS aggregate_hosts_1
>         ON aggregate_hosts_1.aggregate_id = aggregates.id
>         AND aggregate_hosts_1.deleted = 0 AND aggregates.deleted = 0
>     WHERE aggregates.deleted = 0 AND aggregate_hosts.host = 'qh2-rcc34';
>
> which in our case returns 328509 rows in set (25.97 sec).
>
> That seems a bit off, considering there are 80 rows in aggregate_hosts,
> 2 rows in aggregates and 2 rows in aggregate_metadata.
>
> In the code, rows ends up with only 1 element, so SQLAlchemy seems to be
> doing something internally to collapse all of that. I don't know much
> about how SQLAlchemy works.
>
> Seems like a bug to me? Or maybe our database has something wrong in it?
>
> Cheers,
> Sam
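Two notes on the numbers above, both inferred from the generated SQL rather than from the database, so take them as educated guesses. First, rows coming back with a single element is expected: the ORM de-duplicates full entities by primary key, so the 328509 rows are fetched over the wire and then collapsed down to the one matching Aggregate. Second, the SQL joins aggregate_hosts three more times without the host filter and aggregate_metadata twice; if the aggregate containing qh2-rcc34 holds 69 of the 80 hosts and a single metadata row, that works out to 69 × 69 × 69 = 328509 rows, which matches the count exactly, so this looks like join fan-out rather than bad data.

One way to sidestep the fan-out is to split the lookup into two simple queries: first resolve the aggregate ids for the host, then fetch the metadata rows directly. The helper below is hypothetical and reuses the toy models and session from the earlier sketch; it illustrates the idea, not the actual Nova code:

    # Hypothetical helper: two indexed lookups instead of a six-way join.
    def aggregate_metadata_for_host(session, host, key=None):
        agg_ids = [row.aggregate_id for row in
                   session.query(AggregateHost.aggregate_id)
                          .filter(AggregateHost.host == host)]
        if not agg_ids:
            return {}

        query = session.query(AggregateMetadata).filter(
            AggregateMetadata.aggregate_id.in_(agg_ids))
        if key is not None:
            query = query.filter(AggregateMetadata.key == key)

        # Collapse to {key: set(values)} for easy lookup by the caller.
        metadata = {}
        for item in query:
            metadata.setdefault(item.key, set()).add(item.value)
        return metadata

    print(aggregate_metadata_for_host(session, 'host0'))
    # -> metadata for every aggregate that contains 'host0'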

