Re: [Openstack] Nova DB Connection Pooling
On Sep 27, 2011, at 6:11 PM, Joshua Harlow wrote:

> Is there any info on what it does aggregate and what it doesn't? Like flavors
> aren't aggregated (I would think)? But instance listing is aggregated.

I don't know if there is any formal documentation on this. But I know that host capabilities are aggregated; I'm assuming that volume availability is also either currently aggregated or is in development.

Things like 'flavors' are usually defined on a deployment-wide basis, not on individual hosts, so aggregation doesn't really come into play. This is more related to the tangent that Chris Behrens started on DB replication: if you want to offer a new instance_type, you have to somehow propagate that info to all the zones in your deployment.

-- Ed Leafe

This email may include confidential information. If you received it in error, please delete it.

Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
Re: [Openstack] Nova DB Connection Pooling
Thanks for the info :-) Is there any info on what it does aggregate and what it doesn't? Like flavors aren't aggregated (I would think)? But instance listing is aggregated.

On 9/27/11 1:54 PM, "Ed Leafe" wrote:

On Sep 27, 2011, at 12:44 PM, Joshua Harlow wrote:

> "Each deployment also gets its own Database. The Schema of each database is
> the same for all deployments at all levels. The difference between
> deployments is, depending on the services running within that Zone, not all
> the database tables may be used."
> Is there a doc that shows what is used and what is not?

It's not a defined difference; it's a matter of usage. IOW, a zone might not have any compute services running at all, but instead may serve as an aggregator of child zones, and thus wouldn't "need" the tables that support hosts. E.g., a datacenter zone might have several 'section' zones, each of which corresponds to a different area in the DC. Each of those sections may be divided into several zones that are each composed of several zones, each of which actually has compute resources. In this example, the outer zones serve only to provide horizontal scalability by aggregating their child zones. The database tables that are involved in working with compute resources would be empty, since they are not used at that level, but they would still need to be present.

> As for each zone, the X dashboards I think would be required where X is each
> zone (since the zone "stuff" is really only in the scheduler). Correct me if
> I am wrong. :-)

You're wrong. :) I think you're missing the way that compute resources roll up the zone hierarchy. If you ask for all instances for a client at a particular zone, it will return the instances present in its hosts, if any, and all of its child zones' hosts, and their children, etc. That's what I mean by zones serving as aggregators.
> Just from looking at that page, each zone has an API entry-point, and unless
> the "parent" zone does aggregation of child zone information then wouldn't
> separate dashboards be needed?

See, you answered your own question. Parents do indeed aggregate child zone information.

> I agree that a distributed DB can be more complicated, but sometimes it's
> worth it (if it makes X turn into 1 then it might be better?) :-)

We discussed the pros and cons of a distributed DB; personally, I was in favor of Cassandra rather than a replicated MySQL, but I think that the unfamiliarity of many in those early days with Cassandra ruled out that option. We instead decided on a "shared nothing" approach to zones, where a parent knew of its children, but the children were ignorant of their parents. It's sort of a fractal design: you can drill down the nested zones, and at each level, they function independently and identically.

-- Ed Leafe
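The roll-up behavior Ed describes, where a zone returns its own instances plus everything from its children, recursively, can be sketched as a toy model. The `Zone` class, attribute names, and sample data below are illustrative assumptions for this sketch, not Nova's actual classes:

```python
# Toy model of how instance listings roll up a zone hierarchy.
# All names here are illustrative, not Nova's real API.

class Zone:
    def __init__(self, name, instances=None, children=None):
        self.name = name
        self.instances = instances or []   # instances hosted directly in this zone
        self.children = children or []     # child zones this zone aggregates

    def list_instances(self):
        """Return this zone's instances plus everything rolled up from children."""
        result = list(self.instances)
        for child in self.children:
            result.extend(child.list_instances())
        return result

# A "section" zone with no compute resources of its own still aggregates
# its children -- which is why its compute tables can sit empty.
leaf_a = Zone("rack-a", instances=["vm-1", "vm-2"])
leaf_b = Zone("rack-b", instances=["vm-3"])
section = Zone("section-1", children=[leaf_a, leaf_b])
dc = Zone("dc-east", children=[section])

print(dc.list_instances())  # ['vm-1', 'vm-2', 'vm-3']
```

Asking the outermost zone yields every instance in the tree, which is why a single dashboard pointed at the top-level zone suffices.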
Re: [Openstack] Nova DB Connection Pooling
On Sep 27, 2011, at 12:44 PM, Joshua Harlow wrote:

> "Each deployment also gets its own Database. The Schema of each database is
> the same for all deployments at all levels. The difference between
> deployments is, depending on the services running within that Zone, not all
> the database tables may be used."
> Is there a doc that shows what is used and what is not?

It's not a defined difference; it's a matter of usage. IOW, a zone might not have any compute services running at all, but instead may serve as an aggregator of child zones, and thus wouldn't "need" the tables that support hosts. E.g., a datacenter zone might have several 'section' zones, each of which corresponds to a different area in the DC. Each of those sections may be divided into several zones that are each composed of several zones, each of which actually has compute resources. In this example, the outer zones serve only to provide horizontal scalability by aggregating their child zones. The database tables that are involved in working with compute resources would be empty, since they are not used at that level, but they would still need to be present.

> As for each zone, the X dashboards I think would be required where X is each
> zone (since the zone "stuff" is really only in the scheduler). Correct me if
> I am wrong. :-)

You're wrong. :) I think you're missing the way that compute resources roll up the zone hierarchy. If you ask for all instances for a client at a particular zone, it will return the instances present in its hosts, if any, and all of its child zones' hosts, and their children, etc. That's what I mean by zones serving as aggregators.

> Just from looking at that page, each zone has an API entry-point, and unless
> the "parent" zone does aggregation of child zone information then wouldn't
> separate dashboards be needed?

See, you answered your own question. Parents do indeed aggregate child zone information.
> I agree that a distributed DB can be more complicated, but sometimes it's
> worth it (if it makes X turn into 1 then it might be better?) :-)

We discussed the pros and cons of a distributed DB; personally, I was in favor of Cassandra rather than a replicated MySQL, but I think that the unfamiliarity of many in those early days with Cassandra ruled out that option. We instead decided on a "shared nothing" approach to zones, where a parent knew of its children, but the children were ignorant of their parents. It's sort of a fractal design: you can drill down the nested zones, and at each level, they function independently and identically.

-- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
I was more of just commenting on the line in the multi-cluster doc that says:

"Each deployment also gets its own Database. The Schema of each database is the same for all deployments at all levels. The difference between deployments is, depending on the services running within that Zone, not all the database tables may be used."

Is there a doc that shows what is used and what is not?

As for each zone, the X dashboards I think would be required where X is each zone (since the zone "stuff" is really only in the scheduler). Correct me if I am wrong. :-)

Just from looking at that page, each zone has an API entry-point, and unless the "parent" zone does aggregation of child zone information then wouldn't separate dashboards be needed? Or you would have to switch the dashboard between different zones (this might make more sense but still seems sort of clunky). Similarly this seems to be the same with volume management, where now you have X volume entry-points. Not that this is bad, it just seems like it starts to get very complicated :-)

I agree that a distributed DB can be more complicated, but sometimes it's worth it (if it makes X turn into 1 then it might be better?) :-) I just wouldn't want to manage this: 200 Huddle Zones/DC == 200 DBs per datacenter. :-(

On 9/27/11 12:21 AM, "Chris Behrens" wrote:

Not sure the 'data that are shared' is the right wording below, but I think I get the point. The one thing that jumps out to me as a current 'issue' is the fact that the instance_types table must be kept in sync. I can't think of anything else at the moment, but there might be something.

And the comment (not quoted below) about 'X' dashboards *I think* is invalid. I don't know much about it... does it talk to more than the API? If not, just point it at the top-level zone.

I'm really not a fan of distributed DB architectures if they can be avoided. They add a lot of unneeded complexity and a lot more potential for breakage. The instance type issue can be solved differently.
That said, it can be an interesting discussion topic. :)

- Chris

On Sep 26, 2011, at 8:53 PM, Ed Leafe wrote:

> On Sep 26, 2011, at 10:14 PM, Joshua Harlow wrote:
>
>> It seems like it would be good to talk about this during the conference,
>> since it seems sort of odd to have pieces of data that are shared across
>> zones along with pieces of data that are not shared across zones.
>
> What pieces of data are shared across zones?
>
> -- Ed Leafe
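The instance_types sync problem Chris raises, that every zone's otherwise independent database must carry the same flavor definitions, can be sketched as a depth-first push down the zone tree. The dict-based zone shape and the helper name below are assumptions for illustration; Nova had no such helper:

```python
# Sketch of the propagation problem: adding a new instance_type means
# writing it into every zone's (otherwise independent) database.
# The zone/flavor structures and function name are illustrative only.

def propagate_instance_type(zone, flavor):
    """Record a flavor in this zone's 'database', then push it to all children."""
    zone["instance_types"][flavor["name"]] = flavor
    for child in zone["children"]:
        propagate_instance_type(child, flavor)

# A two-level deployment: one parent zone with one child zone.
root = {
    "instance_types": {},
    "children": [
        {"instance_types": {}, "children": []},
    ],
}

propagate_instance_type(root, {"name": "m1.tiny", "ram_mb": 512})
```

In a real deployment each write would be an inter-zone API call rather than a dict update, which is exactly the internal-only functionality Ed argues should stay out of the public API.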
Re: [Openstack] Nova DB Connection Pooling
On Sep 27, 2011, at 10:10 AM, Sandy Walsh wrote:

> POST .../zone/boot has been removed and integrated into POST .../servers/boot
> now, so there isn't really anything that shouldn't be public now. zone/select
> is still required for bursting scenarios.

I was referring to the database sync across zones task that Chris brought up: that's something that would be useful for OpenStack, but not applicable to a public API.

-- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
POST .../zone/boot has been removed and integrated into POST .../servers/boot now, so there isn't really anything that shouldn't be public now. zone/select is still required for bursting scenarios.

From: openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net on behalf of Ed Leafe [ed.le...@rackspace.com]
Sent: Tuesday, September 27, 2011 10:40 AM
To: Chris Behrens
Cc: openstack
Subject: Re: [Openstack] Nova DB Connection Pooling

On Sep 27, 2011, at 2:21 AM, Chris Behrens wrote:

> Not sure the 'data that are shared' is the right wording below, but I think I
> get the point. The one thing that jumps out to me as a current 'issue' is
> the fact that the instance_types table must be kept in sync. I can't think
> of anything else at the moment, but there might be something.

That's true, but a completely separate discussion from the current one concerning scalability. This is a matter of ongoing maintenance, and one which I discussed a while ago when I proposed separating the internal-only, inter-zone part of novaclient from the public API part. What you've described is one of the use cases I pointed out where we would have a need for inter-zone functionality that should never be exposed to a public API.

-- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
On Sep 27, 2011, at 2:21 AM, Chris Behrens wrote:

> Not sure the 'data that are shared' is the right wording below, but I think I
> get the point. The one thing that jumps out to me as a current 'issue' is
> the fact that the instance_types table must be kept in sync. I can't think
> of anything else at the moment, but there might be something.

That's true, but a completely separate discussion from the current one concerning scalability. This is a matter of ongoing maintenance, and one which I discussed a while ago when I proposed separating the internal-only, inter-zone part of novaclient from the public API part. What you've described is one of the use cases I pointed out where we would have a need for inter-zone functionality that should never be exposed to a public API.

-- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
Certainly nothing is carved in stone as requirements change, but for the purpose of background: the design for zones was based on the core nova design tenets and feedback from other core members: http://wiki.openstack.org/BasicDesignTenets

Early spec: http://wiki.openstack.org/MultiClusterZones

-S

From: Joshua Harlow [harlo...@yahoo-inc.com]
Sent: Tuesday, September 27, 2011 12:14 AM
To: Sandy Walsh; Devin Carlen; Soren Hansen
Cc: openstack
Subject: Re: [Openstack] Nova DB Connection Pooling

It seems like it would be good to talk about this during the conference, since it seems sort of odd to have pieces of data that are shared across zones along with pieces of data that are not shared across zones. It seems like it might be better to provide a unified view of the zones (from a management and operational standpoint)? I wouldn't want to manage X DBs with X dashboards.

Keystone seems to help with the auth and glance with image management, but then you still have this nova DB usage that doesn't quite fit in the puzzle (in my opinion). I would personally rather have a distributed data-store act as the DB; this can then be the "single DB", thus making everything fit better (or at least a db service so that this could be a possibility for users with a large number of distributed compute nodes in different data centers). Imposing a single DB deployment per zone seems too restrictive, instead of, say, imposing a nova-db service (as an example) that could talk to mysql (for those who want a simple solution) or to riak [$or other nosql here$] (for those who want a distributed yet "single db-like" solution).

On 9/26/11 7:26 PM, "Sandy Walsh" wrote:

Sure ... was there something in particular you wanted to know about? The overview: the assumption with Zones is that there is a single DB deployment per Zone. When I say "single DB", that could be clustered/HA as need be. But the intention is no sharing of DB between zones.
This, of course, has caused us some problems with respect to Instance/Flavor/User IDs being shared across zones. But these have largely been mitigated with the use of UUIDs, Glance & Keystone. Not sure how Networks and Volumes will behave.

Data collected from child zones includes encrypted blobs of data from the child that may contain IDs or zone-local information, but it's not generally available to the parent zones. They're ephemeral magic cookies. We don't do a lot of disk access in the distributed scheduler. Most stuff is in-memory and transient.

-S

From: openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net on behalf of Devin Carlen [devin.car...@gmail.com]
Sent: Monday, September 26, 2011 10:26 PM
To: Soren Hansen
Cc: openstack@lists.launchpad.net
Subject: Re: [Openstack] Nova DB Connection Pooling

We really need to hear from Sandy Walsh on this thread so he can elaborate on how the distributed scheduling works (with multiple mysql databases).

Devin

On Sep 26, 2011, at 6:41 AM, Soren Hansen wrote:

> 2011/9/26 Pitucha, Stanislaw Izaak:
>> The pain starts when your max memory usage crosses what you have available.
>> Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially
>> the comments which calculate the needed memory for N connections for both
>> innodb and isam. (mysqltuner.pl will also calculate that for you)
>>
>> Hundreds of connections should be ok. Thousands... you should rethink it ;)
>
> Hm.. It doesn't take many racks full of blade servers to get into 4
> digit numbers of compute nodes. Certainly fewer than I was expecting
> to see in a garden variety Nova zone.
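Sandy's "ephemeral magic cookies" idea, a child zone handing its parent an opaque blob of zone-local details that only the child can interpret, can be illustrated with a toy round-trip. Here base64 stands in for the real encryption Nova used; the function names are inventions of this sketch:

```python
import base64
import json

# Toy illustration of the "ephemeral magic cookie" idea: a child zone wraps
# zone-local details into an opaque blob that the parent stores but never
# interprets.  base64 is a stand-in for real encryption, and these helper
# names are illustrative, not Nova's actual code.

def child_make_blob(local_info):
    """Child zone: serialize zone-local info into an opaque string."""
    return base64.b64encode(json.dumps(local_info).encode()).decode()

def child_read_blob(blob):
    """Child zone: recover the original info when the blob comes back."""
    return json.loads(base64.b64decode(blob))

blob = child_make_blob({"instance_id": 42, "host": "compute-7"})

# The parent only ever sees an opaque string (no zone-local names leak):
assert "compute-7" not in blob

# ...and hands it back to the child, which can decode it:
assert child_read_blob(blob)["host"] == "compute-7"
```

The point of the pattern is that parents can route requests back down to the right child without the parent's database ever needing the child's local IDs.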
Re: [Openstack] Nova DB Connection Pooling
Not sure the 'data that are shared' is the right wording below, but I think I get the point. The one thing that jumps out to me as a current 'issue' is the fact that the instance_types table must be kept in sync. I can't think of anything else at the moment, but there might be something.

And the comment (not quoted below) about 'X' dashboards *I think* is invalid. I don't know much about it... does it talk to more than the API? If not, just point it at the top-level zone.

I'm really not a fan of distributed DB architectures if they can be avoided. They add a lot of unneeded complexity and a lot more potential for breakage. The instance type issue can be solved differently. That said, it can be an interesting discussion topic. :)

- Chris

On Sep 26, 2011, at 8:53 PM, Ed Leafe wrote:

> On Sep 26, 2011, at 10:14 PM, Joshua Harlow wrote:
>
>> It seems like it would be good to talk about this during the conference,
>> since it seems sort of odd to have pieces of data that are shared across
>> zones along with pieces of data that are not shared across zones.
>
> What pieces of data are shared across zones?
>
> -- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
On Sep 26, 2011, at 10:14 PM, Joshua Harlow wrote:

> It seems like it would be good to talk about this during the conference,
> since it seems sort of odd to have pieces of data that are shared across
> zones along with pieces of data that are not shared across zones.

What pieces of data are shared across zones?

-- Ed Leafe
Re: [Openstack] Nova DB Connection Pooling
It seems like it would be good to talk about this during the conference, since it seems sort of odd to have pieces of data that are shared across zones along with pieces of data that are not shared across zones. It seems like it might be better to provide a unified view of the zones (from a management and operational standpoint)? I wouldn't want to manage X DBs with X dashboards.

Keystone seems to help with the auth and glance with image management, but then you still have this nova DB usage that doesn't quite fit in the puzzle (in my opinion). I would personally rather have a distributed data-store act as the DB; this can then be the "single DB", thus making everything fit better (or at least a db service so that this could be a possibility for users with a large number of distributed compute nodes in different data centers). Imposing a single DB deployment per zone seems too restrictive, instead of, say, imposing a nova-db service (as an example) that could talk to mysql (for those who want a simple solution) or to riak [$or other nosql here$] (for those who want a distributed yet "single db-like" solution).

On 9/26/11 7:26 PM, "Sandy Walsh" wrote:

Sure ... was there something in particular you wanted to know about? The overview: the assumption with Zones is that there is a single DB deployment per Zone. When I say "single DB", that could be clustered/HA as need be. But the intention is no sharing of DB between zones.

This, of course, has caused us some problems with respect to Instance/Flavor/User IDs being shared across zones. But these have largely been mitigated with the use of UUIDs, Glance & Keystone. Not sure how Networks and Volumes will behave.

Data collected from child zones includes encrypted blobs of data from the child that may contain IDs or zone-local information, but it's not generally available to the parent zones. They're ephemeral magic cookies. We don't do a lot of disk access in the distributed scheduler.
Most stuff is in-memory and transient.

-S

From: openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net on behalf of Devin Carlen [devin.car...@gmail.com]
Sent: Monday, September 26, 2011 10:26 PM
To: Soren Hansen
Cc: openstack@lists.launchpad.net
Subject: Re: [Openstack] Nova DB Connection Pooling

We really need to hear from Sandy Walsh on this thread so he can elaborate on how the distributed scheduling works (with multiple mysql databases).

Devin

On Sep 26, 2011, at 6:41 AM, Soren Hansen wrote:

> 2011/9/26 Pitucha, Stanislaw Izaak:
>> The pain starts when your max memory usage crosses what you have available.
>> Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially
>> the comments which calculate the needed memory for N connections for both
>> innodb and isam. (mysqltuner.pl will also calculate that for you)
>>
>> Hundreds of connections should be ok. Thousands... you should rethink it ;)
>
> Hm.. It doesn't take many racks full of blade servers to get into 4
> digit numbers of compute nodes. Certainly fewer than I was expecting
> to see in a garden variety Nova zone.
Re: [Openstack] Nova DB Connection Pooling
Sure ... was there something in particular you wanted to know about?

The overview: the assumption with Zones is that there is a single DB deployment per Zone. When I say "single DB", that could be clustered/HA as need be. But the intention is no sharing of DB between zones.

This, of course, has caused us some problems with respect to Instance/Flavor/User IDs being shared across zones. But these have largely been mitigated with the use of UUIDs, Glance & Keystone. Not sure how Networks and Volumes will behave.

Data collected from child zones includes encrypted blobs of data from the child that may contain IDs or zone-local information, but it's not generally available to the parent zones. They're ephemeral magic cookies.

We don't do a lot of disk access in the distributed scheduler. Most stuff is in-memory and transient.

-S

From: openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net on behalf of Devin Carlen [devin.car...@gmail.com]
Sent: Monday, September 26, 2011 10:26 PM
To: Soren Hansen
Cc: openstack@lists.launchpad.net
Subject: Re: [Openstack] Nova DB Connection Pooling

We really need to hear from Sandy Walsh on this thread so he can elaborate on how the distributed scheduling works (with multiple mysql databases).

Devin

On Sep 26, 2011, at 6:41 AM, Soren Hansen wrote:

> 2011/9/26 Pitucha, Stanislaw Izaak:
>> The pain starts when your max memory usage crosses what you have available.
>> Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially
>> the comments which calculate the needed memory for N connections for both
>> innodb and isam. (mysqltuner.pl will also calculate that for you)
>>
>> Hundreds of connections should be ok. Thousands... you should rethink it ;)
>
> Hm.. It doesn't take many racks full of blade servers to get into 4
> digit numbers of compute nodes. Certainly fewer than I was expecting
> to see in a garden variety Nova zone.
Re: [Openstack] Nova DB Connection Pooling
We really need to hear from Sandy Walsh on this thread so he can elaborate on how the distributed scheduling works (with multiple mysql databases).

Devin

On Sep 26, 2011, at 6:41 AM, Soren Hansen wrote:

> 2011/9/26 Pitucha, Stanislaw Izaak:
>> The pain starts when your max memory usage crosses what you have available.
>> Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially
>> the comments which calculate the needed memory for N connections for both
>> innodb and isam. (mysqltuner.pl will also calculate that for you)
>>
>> Hundreds of connections should be ok. Thousands... you should rethink it ;)
>
> Hm.. It doesn't take many racks full of blade servers to get into 4
> digit numbers of compute nodes. Certainly fewer than I was expecting
> to see in a garden variety Nova zone.
Re: [Openstack] Nova DB Connection Pooling
2011/9/26 Pitucha, Stanislaw Izaak:

> The pain starts when your max memory usage crosses what you have available.
> Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially
> the comments which calculate the needed memory for N connections for both
> innodb and isam. (mysqltuner.pl will also calculate that for you)
>
> Hundreds of connections should be ok. Thousands... you should rethink it ;)

Hm.. It doesn't take many racks full of blade servers to get into 4-digit numbers of compute nodes. Certainly fewer than I was expecting to see in a garden variety Nova zone.

--
Soren Hansen | http://linux2go.dk/
Ubuntu Developer | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/
Re: [Openstack] Nova DB Connection Pooling
The pain starts when your max memory usage crosses what you have available. Check http://dev.mysql.com/doc/refman/5.1/en/memory-use.html - especially the comments which calculate the needed memory for N connections for both innodb and isam. (mysqltuner.pl will also calculate that for you)

Hundreds of connections should be ok. Thousands... you should rethink it ;)

If you want to improve connection pooling, look at the "MySQL Proxy" project.

Regards,
Stanisław Pitucha
Cloud Services, Hewlett Packard
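The calculation those doc comments describe is essentially: global buffers allocated once, plus per-connection buffers multiplied by the connection count. A back-of-the-envelope version can be sketched as below; the buffer sizes are illustrative sample values for this sketch, not tuning recommendations:

```python
# Back-of-the-envelope MySQL memory estimate: global buffers (allocated
# once) plus worst-case per-connection buffers times max connections.
# The sizes below are illustrative sample values, not recommendations.

MB = 1024 * 1024

GLOBAL_BUFFERS = {                     # allocated once for the server
    "innodb_buffer_pool_size": 1024 * MB,
    "key_buffer_size": 64 * MB,
}
PER_CONNECTION = {                     # potentially allocated per connection
    "sort_buffer_size": 2 * MB,
    "read_buffer_size": 128 * 1024,
    "join_buffer_size": 128 * 1024,
    "thread_stack": 256 * 1024,
}

def worst_case_memory(max_connections):
    """Worst-case bytes if every connection maxes out its buffers."""
    return (sum(GLOBAL_BUFFERS.values())
            + max_connections * sum(PER_CONNECTION.values()))

for n in (100, 1000, 5000):
    print(n, "connections ->", worst_case_memory(n) // MB, "MB")
# With these sample sizes: 1338 MB, 3588 MB and 13588 MB respectively.
```

With 2.5 MB of potential per-connection buffers, the jump from hundreds to thousands of connections is what moves the estimate from "fine" to "rethink it".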
Re: [Openstack] Nova DB Connection Pooling
2011/9/26 Monty Taylor:

>> I'm not a MySQL guru by any means, but can you explain this to me?
>> I've never read anywhere that MySQL "doesn't really like having a
>> bunch of unused connections sitting around for long lifecycles". It
>> seems pretty logical to me to have at least 2 persistent connections
>> to the database to avoid being completely blocked on database calls.
>
> So I should probably phrase that a little differently - and some of it
> is a question of scale. 2 connections, yes - 15 probably not.

I've not run MySQL at this scale before. How well will it handle a couple of thousand persistent connections? When does the pain start to kick in?

Anyway, this cloud stuff is all about *horizontal* scalability. It does seem increasingly odd (to me, at least) to have the architecture include this central datastore that everything needs to connect to, regardless of how well this datastore scales *vertically*. Something like Riak was designed to scale extremely well horizontally (in much the same way as Swift). Using it will require us to rethink our datastore access quite considerably, but the benefit is painless horizontal scalability (and unicorns and ponies, of course).

--
Soren Hansen | http://linux2go.dk/
Ubuntu Developer | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/
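The trade-off being debated, a small fixed set of persistent connections shared by callers instead of one connection per caller, is the classic pooling pattern. A minimal queue-based sketch follows; this is a generic illustration, not eventlet's db_pool or any real driver (the `connect` callable is a stand-in):

```python
import queue

# Minimal connection pool: keep `size` connections alive and let callers
# check them out and return them.  `connect` stands in for a real DB
# driver's connect() -- an assumption of this sketch.

class Pool:
    def __init__(self, connect, size=2):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(connect())

    def acquire(self):
        # Blocks if every connection is currently checked out, which is
        # what bounds the number of server-side threads MySQL must juggle.
        return self._q.get()

    def release(self, conn):
        self._q.put(conn)

# Usage with a dummy "connection" factory:
made = []
pool = Pool(lambda: made.append(object()) or made[-1], size=2)
c = pool.acquire()
pool.release(c)
```

The pool caps the server at `size` threads per service regardless of how many greenthreads are doing DB work, which is the property the thread is arguing about sizing correctly.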
Re: [Openstack] Nova DB Connection Pooling
If our real goal is scaling the api, it seems like we can scale much more easily by simply running more copies of nova-api and load balancing across them. Then we don't need any background connection pooling magic. If it really is limited to issues with eventlet then utils.synchronized might solve our problems in the near term, but we still need the with_lockmode for allocations on multiple machines to work. It would be great if we could figure out how the system is failing, but I have to admit debugging has been pretty challenging. Vish
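The load-balancing idea can be sketched in a few lines; the addresses are made up, and a real deployment would put something like haproxy or round-robin DNS in front of the stateless nova-api workers rather than doing client-side selection:

```python
from itertools import cycle

# Hypothetical nova-api endpoints; each worker is stateless, so any
# request can go to any of them.
API_NODES = ["10.0.0.1:8774", "10.0.0.2:8774", "10.0.0.3:8774"]
_next_node = cycle(API_NODES)

def pick_api_node():
    """Round-robin selection across identical nova-api workers."""
    return next(_next_node)

first_four = [pick_api_node() for _ in range(4)]
print(first_four)
# → ['10.0.0.1:8774', '10.0.0.2:8774', '10.0.0.3:8774', '10.0.0.1:8774']
```

Adding capacity then means adding another worker and another list entry, rather than tuning per-process connection pools.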
Re: [Openstack] Nova DB Connection Pooling
The MySQL that folks are likely to be running creates an OS-level thread for every connection to the db - which means if you have 500 compute nodes each with 2 long-lived connections to the db, you're going to have 1000 OS threads that the MySQL scheduler is going to (poorly) deal with. (we're using extremes here for illustration purposes) On the other hand, the MySQL connection protocol is extremely cheap (there's an extraneous round trip, but we can live with it) - so depending on the exact architecture, you could actually potentially see better performance if each of your nodes connected when they need to talk and disconnected when they were done - keeping the scheduler pressure down at the db side. (still - it all depends, so really just has to be looked at/tuned - the exact opposite could also be true. :) Additionally - per-thread memory that's allocated in a connection context does not get freed until the threads are reaped (this means falling all of the way out of the thread_cache) - so depending on what's going on, you could have memory pressure issues if you had 1000 threads that stuck around forever, because if any of the queries on any of those connections happened to max out that connection's sort_buffer, for instance - well, you're just going to keep that allocated sort_buffer laying around for a while. In any case - as with all things it's about a happy medium - so technically solving the greenthreads/MySQL connection issue is important - but also consider figuring out ways to go ahead and disconnect if you're detecting that you're going to be idle for a while. Alternately - make sure that the db_pool gracefully handles server-initiated disconnects ... because that way you could just have a normal pool setup for small to medium sized installations, and for larger ones the dba could set connection_timeout to something low, like 2 seconds, and then idle stuff would get reaped as it was idle. I am honestly curious to see how https://launchpad.net/myconnpy fares. 
sqlalchemy has support for it - I think you add +myconnpy to the db url - so like mysql+myconnpy:// ... blah. Monty > > -Original Message- From: "Monty Taylor" > Sent: Sunday, September 25, 2011 8:48pm To: > openstack@lists.launchpad.net Subject: Re: [Openstack] Nova DB > Connection Pooling > > What was the intent of the connection pooling? That is, what was it > trying to fix? > > Running the script a single time caused 16 connection threads to be > spawned. This seems a bit excessive. > > When I tried spawning five copies at the same time, I wound up with > 60 connected threads, plus some connect timeouts, plus some of the > tracebacks you're seeing below. > > Increasing the thread_cache_size from the default on ubuntu (which is > 8) helped things, but things still seemed to be going to the bad > place. > > More vexing is that all of these queries were doing select ... for > update (makes sense why) - but that just means that we're stacking up > on locks in the db trying to get these resources... one of those > situations where greater parallelism actually isn't better. > > Without really knowing more, my vote would be not to have app-level > connection pooling by default. MySQL specifically doesn't really > like having a bunch of unused connection sitting around for long > lifecycles (with a few exceptions - the exceptions always prove the > rule, of course) > > Of course- I could be wrong... which is why I'd like to know more > about what the issue was that incited connection pooling. > > Monty > > On 09/25/2011 01:53 PM, Vishvananda Ishaya wrote: >> Hey everyone, >> >> I'm a bit concerned with the connection pooling in the db. It >> seems that things are not getting cleaned up properly. I have a >> repro-case that causes failures that we have seen before. if I >> revert the nova/db/sqlalchemy/session.py to before the eventlet db >> pool was added I get no failures. If you want to see the issue, >> try the attached code. 
You will need to run from the nova >> directory or do python setup.py develop. You will also need to >> create a mysql database called test and edit the sql_connection >> string if you have a mysql password, etc. Please check this code. >> If we can't come up with a fix, I htink we need to revert back to >> no connection pooling. >> >> Run the attached script at least 3 times The code below runs fine >> the first couple of times, Then it starts to fail with the >> following error: >> >> 2011-09-24 12:51:02,799 INFO >> sqlalchemy.engine.base.Engine.0x...36d0 [-] ROLLBACK Traceback >> (most recent call last): File >> "/Library/Python/2.7/site-packages/eventlet/hu
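Monty's suggestion to disconnect when idle can also be done client-side. A minimal sketch, with a fake connection class so it is self-contained (none of these names are the eventlet db_pool API): a holder that closes its connection after an idle window and transparently reconnects on the next use.

```python
import time

class IdleClosingConnection(object):
    """Drops the wrapped connection once it has sat idle longer than
    idle_timeout seconds, and reconnects on the next use."""

    def __init__(self, connect, idle_timeout=2.0):
        self._connect = connect
        self._idle_timeout = idle_timeout
        self._conn = None
        self._last_used = 0.0

    def get(self):
        now = time.time()
        if self._conn is not None and now - self._last_used > self._idle_timeout:
            self._conn.close()      # reap the idle connection
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()
        self._last_used = now
        return self._conn

class FakeConn(object):
    """Stand-in for a real DB connection, so the sketch runs anywhere."""
    opened = 0
    def __init__(self):
        FakeConn.opened += 1
        self.closed = False
    def close(self):
        self.closed = True

holder = IdleClosingConnection(FakeConn, idle_timeout=0.1)
c1 = holder.get()
c2 = holder.get()        # within the idle window: same connection reused
time.sleep(0.2)
c3 = holder.get()        # idle too long: c1 closed, fresh connection opened
print(FakeConn.opened, c1 is c2, c1.closed, c3 is c1)  # 2 True True False
```

This keeps the thread count on the MySQL side proportional to active work rather than to the number of services deployed.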
Re: [Openstack] Nova DB Connection Pooling
Hey Monty/All, The original goal of my eventlet connection pooling patch was to increase overall throughput of the OpenStack API. By itself, SQLAlchemy provides a lot of nifty features such as connection pooling and connection timeout limits, but all of these were being lost on us because of our use of eventlet in the API. To elaborate, when multiple users query the API simultaneously, what we'd expect to happen is for a greenthread to be created for each request. This happens as expected, however since SQLAlchemy uses the MySQLdb module by default all database calls block *all* greenthreads. This is because MySQLdb is written in C and thus can't be monkey patched by eventlet. As a result, the API basically does all SQL queries in serial and we're obviously (?) going to have to support multiple concurrent connections in the API. It was very evident in load testing of the API that something needed to change. The patch introduced db_pool (http://eventlet.net/doc/modules/db_pool.html), which more or less uses threads to overcome the limitation of using MySQLdb in conjunction with eventlet. Long story short, my patch ended up creating a lot of issues and I absolutely agree that something needs to change ASAP. Monty, I can try to answer your questions/concerns: > Running the script a single time caused 16 connection threads to be > spawned. This seems a bit excessive. There are two flags that define how many connections should be maintained per service. These flags are 'sql_min_pool_size' and 'sql_max_pool_size'. Unfortunately I set both of these to 10 by default. When I ran the test script provided by Vish I saw 15 connections created. What seems to be happening is that initially 10 connections are pooled and 5 additional connections are being created (one per greenlet it's spawning). Long story short here is that something is wrong because at most you should be seeing 10 connections, so that's one outstanding issue. 
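The trick db_pool relies on (pushing the blocking C-level call into real OS threads so the dispatcher stays responsive) can be illustrated with just the standard library; slow_query here is a stand-in for a blocking MySQLdb call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_query(n):
    time.sleep(0.1)          # stand-in for a blocking C-level DB call
    return n * n

# Five blocking calls dispatched to five worker threads run overlapped,
# not serialized, even though nothing about slow_query is async-aware.
pool = ThreadPoolExecutor(max_workers=5)
start = time.time()
results = list(pool.map(slow_query, range(5)))
elapsed = time.time() - start
pool.shutdown()

print(results)               # [0, 1, 4, 9, 16]
print(elapsed < 0.4)         # True: roughly 0.1 s total, not 5 x 0.1 s
```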
> When I tried spawning five copies at the same time, I wound up with 60 > connected threads, plus some connect timeouts, plus some of the > tracebacks you're seeing below. Absolutely. With the current defaults you'll hit limits quickly. > More vexing is that all of these queries were doing select ... for > update (makes sense why) - but that just means that we're stacking up on > locks in the db trying to get these resources... one of those situations > where greater parallelism actually isn't better. Absolutely, a fix for this could be to place a nova.utils 'synchronized' lock on methods like these to make sure that they're never run in parallel. While this might be "vexing", that's exactly what this script was designed to show...a worst case (SELECT .. FOR UPDATE). > Without really knowing more, my vote would be not to have app-level > connection pooling by default. MySQL specifically doesn't really like > having a bunch of unused connection sitting around for long lifecycles > (with a few exceptions - the exceptions always prove the rule, of course) I'm not a MySQL guru by any means, but can you explain this to me? I've never read anywhere that MySQL "doesn't really like having a bunch of unused connection sitting around for long lifecycles". It seems pretty logical to me to have at least 2 persistent connections to the database to avoid being completely blocked on database calls. Brian -----Original Message----- From: "Monty Taylor" Sent: Sunday, September 25, 2011 8:48pm To: openstack@lists.launchpad.net Subject: Re: [Openstack] Nova DB Connection Pooling What was the intent of the connection pooling? That is, what was it trying to fix? Running the script a single time caused 16 connection threads to be spawned. This seems a bit excessive. When I tried spawning five copies at the same time, I wound up with 60 connected threads, plus some connect timeouts, plus some of the tracebacks you're seeing below. 
Increasing the thread_cache_size from the default on ubuntu (which is 8) helped things, but things still seemed to be going to the bad place. More vexing is that all of these queries were doing select ... for update (makes sense why) - but that just means that we're stacking up on locks in the db trying to get these resources... one of those situations where greater parallelism actually isn't better. Without really knowing more, my vote would be not to have app-level connection pooling by default. MySQL specifically doesn't really like having a bunch of unused connection sitting around for long lifecycles (with a few exceptions - the exceptions always prove the rule, of course) Of course- I could be wrong... which is why I'd like to know more about what the issue was that incited connection pooling. Monty On 09/25/2011 01:53 PM, Vishvananda I
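The 'synchronized' lock Brian mentions can be sketched as a named-lock decorator in the spirit of nova.utils.synchronized (this is an illustration, not the actual nova code):

```python
import functools
import threading

_locks = {}

def synchronized(name):
    """All functions decorated with the same name share one lock, so
    their bodies never run in parallel."""
    lock = _locks.setdefault(name, threading.Lock())
    def wrap(func):
        @functools.wraps(func)
        def inner(*args, **kwargs):
            with lock:
                return func(*args, **kwargs)
        return inner
    return wrap

counter = {"value": 0}

@synchronized("fixed_ip")
def bump():
    # read-modify-write: racy without the lock, safe with it
    value = counter["value"]
    counter["value"] = value + 1

threads = [threading.Thread(target=lambda: [bump() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])      # 4000: no lost updates
```

Note that a process-local lock like this only serializes callers within one service; coordinating allocations across machines still needs the database-level with_lockmode that Vish mentioned.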
Re: [Openstack] Nova DB Connection Pooling
Because I'm just going to spam the list all afternoon... Out of curiosity, I ramped up the numbers in the script to get a more sustained attack on the db. With the old db code (pre pool) - running 10 concurrent copies of dbrepro.py gave me 10 db connections (as you'd expect) and a consistent sustained throughput of 45 qps for a while. When I pulled forward to trunk, well - the first time I hit out of connections, because I was configured with max_connections of 150. Once I upped that and restarted mysql, I got 185 concurrent connections to the database (why 185? I don't know) with bursts of 10 queries a second once every 5 seconds with a single burst on the front end of 101 queries in a second. (oh, and btw - with trunk and 10 concurrent copies I started getting that traceback again) fwiw On 09/25/2011 05:59 PM, Monty Taylor wrote: > Hrm. It's not piling on with the locking like I originally thought - > from the db's perspective there's not a whole hell of a lot going on, > actually - even two copies of this script run concurrently cause them to > totally get in to some app-level spinning that ends up with at least one > script erroring. I'll keep poking, mainly just because it's interesting > - but it's certainly not happy. > > On 09/25/2011 01:53 PM, Vishvananda Ishaya wrote: >> Hey everyone, >> >> I'm a bit concerned with the connection pooling in the db. It seems that >> things are not getting cleaned up properly. I have a repro-case that causes >> failures that we have seen before. if I revert the >> nova/db/sqlalchemy/session.py to before the eventlet db pool was added I get >> no failures. If you want to see the issue, try the attached code. You will >> need to run from the nova directory or do python setup.py develop. You will >> also need to create a mysql database called test and edit the sql_connection >> string if you have a mysql password, etc. Please check this code. 
If we >> can't come up with a fix, I htink we need to revert back to no connection >> pooling. >> >> Run the attached script at least 3 times The code below runs fine the first >> couple of times, Then it starts to fail with the following error: >> >> 2011-09-24 12:51:02,799 INFO sqlalchemy.engine.base.Engine.0x...36d0 [-] >> ROLLBACK >> Traceback (most recent call last): >> File "/Library/Python/2.7/site-packages/eventlet/hubs/hub.py", line 336, in >> fire_timers >>timer() >> File "/Library/Python/2.7/site-packages/eventlet/hubs/timer.py", line 56, >> in __call__ >>cb(*args, **kw) >> File "/Library/Python/2.7/site-packages/eventlet/event.py", line 163, in >> _do_send >>waiter.switch(result) >> File "/Library/Python/2.7/site-packages/eventlet/greenthread.py", line 192, >> in main >>result = function(*args, **kwargs) >> File "dbrepro.py", line 44, in associate >>ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val) >> File "/Users/vishvananda/os/nova/nova/db/api.py", line 352, in >> fixed_ip_associate_pool >>instance_id, host) >> File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 102, in >> wrapper >>return f(*args, **kwargs) >> File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 725, in >> fixed_ip_associate_pool >>filter_by(host=None).\ >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line >> 1496, in first >>ret = list(self[0:1]) >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line >> 1405, in __getitem__ >>return list(res) >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line >> 1669, in instances >>fetch = cursor.fetchall() >> File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line >> 2383, in fetchall >>l = self.process_rows(self._fetchall_impl()) >> File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line >> 2366, in process_rows >>keymap = metadata._keymap >> AttributeError: 'NoneType' object has no attribute '_keymap' >> >> >> >> >> 
Re: [Openstack] Nova DB Connection Pooling
Hrm. It's not piling on with the locking like I originally thought - from the db's perspective there's not a whole hell of a lot going on, actually - even two copies of this script run concurrently cause them to totally get in to some app-level spinning that ends up with at least one script erroring. I'll keep poking, mainly just because it's interesting - but it's certainly not happy. On 09/25/2011 01:53 PM, Vishvananda Ishaya wrote: > Hey everyone, > > I'm a bit concerned with the connection pooling in the db. It seems that > things are not getting cleaned up properly. I have a repro-case that causes > failures that we have seen before. if I revert the > nova/db/sqlalchemy/session.py to before the eventlet db pool was added I get > no failures. If you want to see the issue, try the attached code. You will > need to run from the nova directory or do python setup.py develop. You will > also need to create a mysql database called test and edit the sql_connection > string if you have a mysql password, etc. Please check this code. If we > can't come up with a fix, I htink we need to revert back to no connection > pooling. 
> > Run the attached script at least 3 times The code below runs fine the first > couple of times, Then it starts to fail with the following error: > > 2011-09-24 12:51:02,799 INFO sqlalchemy.engine.base.Engine.0x...36d0 [-] > ROLLBACK > Traceback (most recent call last): > File "/Library/Python/2.7/site-packages/eventlet/hubs/hub.py", line 336, in > fire_timers >timer() > File "/Library/Python/2.7/site-packages/eventlet/hubs/timer.py", line 56, in > __call__ >cb(*args, **kw) > File "/Library/Python/2.7/site-packages/eventlet/event.py", line 163, in > _do_send >waiter.switch(result) > File "/Library/Python/2.7/site-packages/eventlet/greenthread.py", line 192, > in main >result = function(*args, **kwargs) > File "dbrepro.py", line 44, in associate >ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val) > File "/Users/vishvananda/os/nova/nova/db/api.py", line 352, in > fixed_ip_associate_pool >instance_id, host) > File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 102, in > wrapper >return f(*args, **kwargs) > File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 725, in > fixed_ip_associate_pool >filter_by(host=None).\ > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1496, > in first >ret = list(self[0:1]) > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1405, > in __getitem__ >return list(res) > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1669, > in instances >fetch = cursor.fetchall() > File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line > 2383, in fetchall >l = self.process_rows(self._fetchall_impl()) > File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line > 2366, in process_rows >keymap = metadata._keymap > AttributeError: 'NoneType' object has no attribute '_keymap' > > > > > > > > > ___ > Mailing list: https://launchpad.net/~openstack > Post to : openstack@lists.launchpad.net > Unsubscribe : 
https://launchpad.net/~openstack > More help : https://help.launchpad.net/ListHelp
Re: [Openstack] Nova DB Connection Pooling
Whoops! Ok- turns out I was running against system installed libs instead of trunk. doh. I can no longer reproduce your issues with trunk. I spawned off 10 concurrent copies of that program. Now, some of them weren't able to get ips any more (since it's only configured for a pool of five of them) - but I can no longer trigger the AttributeError: 'NoneType' object has no attribute '_keymap' traceback. On 09/25/2011 05:48 PM, Monty Taylor wrote: > What was the intent of the connection pooling? That is, what was it > trying to fix? > > Running the script a single time caused 16 connection threads to be > spawned. This seems a bit excessive. > > When I tried spawning five copies at the same time, I wound up with 60 > connected threads, plus some connect timeouts, plus some of the > tracebacks you're seeing below. > > Increasing the thread_cache_size from the default on ubuntu (which is 8) > helped things, but things still seemed to be going to the bad place. > > More vexing is that all of these queries were doing select ... for > update (makes sense why) - but that just means that we're stacking up on > locks in the db trying to get these resources... one of those situations > where greater parallelism actually isn't better. > > Without really knowing more, my vote would be not to have app-level > connection pooling by default. MySQL specifically doesn't really like > having a bunch of unused connection sitting around for long lifecycles > (with a few exceptions - the exceptions always prove the rule, of course) > > Of course- I could be wrong... which is why I'd like to know more about > what the issue was that incited connection pooling. > > Monty > > On 09/25/2011 01:53 PM, Vishvananda Ishaya wrote: >> Hey everyone, >> >> I'm a bit concerned with the connection pooling in the db. It seems that >> things are not getting cleaned up properly. I have a repro-case that causes >> failures that we have seen before. 
if I revert the >> nova/db/sqlalchemy/session.py to before the eventlet db pool was added I get >> no failures. If you want to see the issue, try the attached code. You will >> need to run from the nova directory or do python setup.py develop. You will >> also need to create a mysql database called test and edit the sql_connection >> string if you have a mysql password, etc. Please check this code. If we >> can't come up with a fix, I htink we need to revert back to no connection >> pooling. >> >> Run the attached script at least 3 times The code below runs fine the first >> couple of times, Then it starts to fail with the following error: >> >> 2011-09-24 12:51:02,799 INFO sqlalchemy.engine.base.Engine.0x...36d0 [-] >> ROLLBACK >> Traceback (most recent call last): >> File "/Library/Python/2.7/site-packages/eventlet/hubs/hub.py", line 336, in >> fire_timers >>timer() >> File "/Library/Python/2.7/site-packages/eventlet/hubs/timer.py", line 56, >> in __call__ >>cb(*args, **kw) >> File "/Library/Python/2.7/site-packages/eventlet/event.py", line 163, in >> _do_send >>waiter.switch(result) >> File "/Library/Python/2.7/site-packages/eventlet/greenthread.py", line 192, >> in main >>result = function(*args, **kwargs) >> File "dbrepro.py", line 44, in associate >>ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val) >> File "/Users/vishvananda/os/nova/nova/db/api.py", line 352, in >> fixed_ip_associate_pool >>instance_id, host) >> File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 102, in >> wrapper >>return f(*args, **kwargs) >> File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 725, in >> fixed_ip_associate_pool >>filter_by(host=None).\ >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line >> 1496, in first >>ret = list(self[0:1]) >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line >> 1405, in __getitem__ >>return list(res) >> File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 
>> 1669, in instances >>fetch = cursor.fetchall() >> File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line >> 2383, in fetchall >>l = self.process_rows(self._fetchall_impl()) >> File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line >> 2366, in process_rows >>keymap = metadata._keymap >> AttributeError: 'NoneType' object has no attribute '_keymap'
Re: [Openstack] Nova DB Connection Pooling
What was the intent of the connection pooling? That is, what was it trying to fix? Running the script a single time caused 16 connection threads to be spawned. This seems a bit excessive. When I tried spawning five copies at the same time, I wound up with 60 connected threads, plus some connect timeouts, plus some of the tracebacks you're seeing below. Increasing the thread_cache_size from the default on ubuntu (which is 8) helped things, but things still seemed to be going to the bad place. More vexing is that all of these queries were doing select ... for update (makes sense why) - but that just means that we're stacking up on locks in the db trying to get these resources... one of those situations where greater parallelism actually isn't better. Without really knowing more, my vote would be not to have app-level connection pooling by default. MySQL specifically doesn't really like having a bunch of unused connection sitting around for long lifecycles (with a few exceptions - the exceptions always prove the rule, of course) Of course- I could be wrong... which is why I'd like to know more about what the issue was that incited connection pooling. Monty On 09/25/2011 01:53 PM, Vishvananda Ishaya wrote: > Hey everyone, > > I'm a bit concerned with the connection pooling in the db. It seems that > things are not getting cleaned up properly. I have a repro-case that causes > failures that we have seen before. if I revert the > nova/db/sqlalchemy/session.py to before the eventlet db pool was added I get > no failures. If you want to see the issue, try the attached code. You will > need to run from the nova directory or do python setup.py develop. You will > also need to create a mysql database called test and edit the sql_connection > string if you have a mysql password, etc. Please check this code. If we > can't come up with a fix, I htink we need to revert back to no connection > pooling. 
> > Run the attached script at least 3 times The code below runs fine the first > couple of times, Then it starts to fail with the following error: > > 2011-09-24 12:51:02,799 INFO sqlalchemy.engine.base.Engine.0x...36d0 [-] > ROLLBACK > Traceback (most recent call last): > File "/Library/Python/2.7/site-packages/eventlet/hubs/hub.py", line 336, in > fire_timers >timer() > File "/Library/Python/2.7/site-packages/eventlet/hubs/timer.py", line 56, in > __call__ >cb(*args, **kw) > File "/Library/Python/2.7/site-packages/eventlet/event.py", line 163, in > _do_send >waiter.switch(result) > File "/Library/Python/2.7/site-packages/eventlet/greenthread.py", line 192, > in main >result = function(*args, **kwargs) > File "dbrepro.py", line 44, in associate >ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val) > File "/Users/vishvananda/os/nova/nova/db/api.py", line 352, in > fixed_ip_associate_pool >instance_id, host) > File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 102, in > wrapper >return f(*args, **kwargs) > File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 725, in > fixed_ip_associate_pool >filter_by(host=None).\ > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1496, > in first >ret = list(self[0:1]) > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1405, > in __getitem__ >return list(res) > File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1669, > in instances >fetch = cursor.fetchall() > File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line > 2383, in fetchall >l = self.process_rows(self._fetchall_impl()) > File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line > 2366, in process_rows >keymap = metadata._keymap > AttributeError: 'NoneType' object has no attribute '_keymap' > > > > > > > > > ___ > Mailing list: https://launchpad.net/~openstack > Post to : openstack@lists.launchpad.net > Unsubscribe : 
https://launchpad.net/~openstack > More help : https://help.launchpad.net/ListHelp
[Openstack] Nova DB Connection Pooling
Hey everyone,

I'm a bit concerned with the connection pooling in the db. It seems that things are not getting cleaned up properly. I have a repro case that causes failures that we have seen before. If I revert nova/db/sqlalchemy/session.py to before the eventlet db pool was added I get no failures. If you want to see the issue, try the attached code. You will need to run from the nova directory or do python setup.py develop. You will also need to create a mysql database called test and edit the sql_connection string if you have a mysql password, etc. Please check this code. If we can't come up with a fix, I think we need to revert back to no connection pooling.

Run the attached script at least 3 times. The code below runs fine the first couple of times, then it starts to fail with the following error:

2011-09-24 12:51:02,799 INFO sqlalchemy.engine.base.Engine.0x...36d0 [-] ROLLBACK
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/eventlet/hubs/hub.py", line 336, in fire_timers
    timer()
  File "/Library/Python/2.7/site-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/Library/Python/2.7/site-packages/eventlet/event.py", line 163, in _do_send
    waiter.switch(result)
  File "/Library/Python/2.7/site-packages/eventlet/greenthread.py", line 192, in main
    result = function(*args, **kwargs)
  File "dbrepro.py", line 44, in associate
    ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val)
  File "/Users/vishvananda/os/nova/nova/db/api.py", line 352, in fixed_ip_associate_pool
    instance_id, host)
  File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 102, in wrapper
    return f(*args, **kwargs)
  File "/Users/vishvananda/os/nova/nova/db/sqlalchemy/api.py", line 725, in fixed_ip_associate_pool
    filter_by(host=None).\
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1496, in first
    ret = list(self[0:1])
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1405, in __getitem__
    return list(res)
  File "/Library/Python/2.7/site-packages/sqlalchemy/orm/query.py", line 1669, in instances
    fetch = cursor.fetchall()
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 2383, in fetchall
    l = self.process_rows(self._fetchall_impl())
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/base.py", line 2366, in process_rows
    keymap = metadata._keymap
AttributeError: 'NoneType' object has no attribute '_keymap'

The attached dbrepro.py:

import eventlet
eventlet.monkey_patch()

from nova import context
from nova import db
from nova import flags
from nova import log as logging
from nova.db import migration
from nova.network import manager as network_manager
from sqlalchemy import exc

FLAGS = flags.FLAGS
FLAGS.sql_connection = 'mysql://root:@localhost/test'
from nova.tests import fake_flags

logging.setup()
ctxt = context.get_admin_context()

def setup():
    migration.db_sync()
    network = network_manager.VlanManager()
    bridge_interface = FLAGS.flat_interface or FLAGS.vlan_interface
    network.create_networks(ctxt, label='test',
                            cidr=FLAGS.fixed_range,
                            multi_host=FLAGS.multi_host,
                            num_networks=FLAGS.num_networks,
                            network_size=FLAGS.network_size,
                            cidr_v6=FLAGS.fixed_range_v6,
                            gateway_v6=FLAGS.gateway_v6,
                            bridge=FLAGS.flat_network_bridge,
                            bridge_interface=bridge_interface,
                            vpn_start=FLAGS.vpn_start,
                            vlan_start=FLAGS.vlan_start,
                            dns1=FLAGS.flat_network_dns)
    for net in db.network_get_all(ctxt):
        network.set_network_host(ctxt, net)
    for i in range(5):
        db.instance_create(ctxt, {})

import time

def associate(val):
    ip = db.fixed_ip_associate_pool(ctxt, 1, instance_id=val)
    time.sleep(1.0)
    db.fixed_ip_disassociate(ctxt, ip)
    return ip

try:
    result = db.fixed_ip_get_all(ctxt)
except exc.ProgrammingError:
    setup()

pile = eventlet.GreenPile()
jobs = []
for i in range(5):
    pile.spawn(associate, i)
    #p = multiprocessing.Process(target=associate, args=(i,))
    #jobs.append(p)
    #p.start()

for result in pile:
    print result