Re: [Openstack] Database replacement?
+1

From: openstack-bounces+dedutta=cisco@lists.launchpad.net [mailto:openstack-bounces+dedutta=cisco@lists.launchpad.net] On Behalf Of Joshua Harlow
Sent: Monday, September 26, 2011 11:20 AM
To: Monty Taylor; Brian Lamar
Cc: openstack
Subject: Re: [Openstack] Database replacement?

Well, I think part of it is just having a good abstraction and having the possibility of hooking in nosql engines or sql engines (maybe this isn't, in the end, using SQLAlchemy? Or at least not directly). It would be neat if this was possible.

At least for a company like Yahoo, at its scale, I can pretty much guarantee a simple centralized DB model wouldn't work. The concept provided in http://wiki.openstack.org/MultiClusterZones might help, but that isn't there yet (afaik). So this was more of just a thought exercise on how, or if, it's possible to abstract out that part.

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
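The abstraction Josh describes, where either SQL or NoSQL engines could be hooked in behind one interface, might look roughly like the sketch below. It is only an illustration: `ServiceStore`, `InMemoryStore`, `SqliteStore`, and `get_store` are hypothetical names, not part of nova, and the in-memory dict merely stands in for a NoSQL engine.

```python
import abc
import sqlite3
from typing import Optional


class ServiceStore(abc.ABC):
    """Hypothetical storage interface (an assumption, not nova's API)."""

    @abc.abstractmethod
    def record_heartbeat(self, host: str, timestamp: float) -> None:
        """Store the most recent heartbeat time for a host."""

    @abc.abstractmethod
    def last_heartbeat(self, host: str) -> Optional[float]:
        """Return the last recorded heartbeat time, or None if unknown."""


class InMemoryStore(ServiceStore):
    """Stand-in for a key-value / NoSQL engine."""

    def __init__(self) -> None:
        self._data = {}

    def record_heartbeat(self, host: str, timestamp: float) -> None:
        self._data[host] = timestamp

    def last_heartbeat(self, host: str) -> Optional[float]:
        return self._data.get(host)


class SqliteStore(ServiceStore):
    """SQL-backed implementation of the same interface."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS services "
            "(host TEXT PRIMARY KEY, updated_at REAL)"
        )

    def record_heartbeat(self, host: str, timestamp: float) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO services (host, updated_at) VALUES (?, ?)",
            (host, timestamp),
        )
        self._conn.commit()

    def last_heartbeat(self, host: str) -> Optional[float]:
        row = self._conn.execute(
            "SELECT updated_at FROM services WHERE host = ?", (host,)
        ).fetchone()
        return row[0] if row else None


# Callers pick a backend by name and never touch the engine directly.
BACKENDS = {"memory": InMemoryStore, "sqlite": SqliteStore}


def get_store(name: str) -> ServiceStore:
    return BACKENDS[name]()
```

With this shape, swapping the engine is a one-word configuration change for the caller; whether such an interface can stay this narrow for nova's real queries is exactly the open question in the thread.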
Re: [Openstack] Database replacement?
On 09/24/2011 10:50 AM, Brian Lamar wrote:
> Hey Josh,
>
>> Has there been any thought on having a nova-db service that responds to requests for information from the db (or something like a db).
>
> No plans that I'm aware of; there is a Database-as-a-Service project called 'Red Dwarf' which might fit this bill, however. I honestly haven't looked too much into it.
>
>> This could be useful for companies that don't necessarily want to have a limiting factor being a database. Since when you scale past a certain number of compute nodes the database connections themselves may become a bottleneck (especially the heartbeat mechanism which updates a table every X seconds).
>
> Not sure what you mean by this. Currently the OpenStack architecture was built to allow hundreds and thousands (maybe?) of compute nodes in the same environment. The key is to group compute nodes into clusters as outlined here: http://wiki.openstack.org/MultiClusterZones
>
> Long story short, the database isn't being shared between all compute clusters, but instead a hierarchy of clusters is formed (something I, in a pinch, would consider akin to a distributed Map/Reduce model of data sharing).

What are the actual scaling concerns? Have you seen scaling problems, or are you just concerned that they might be hit? I'm not seeing any mention of numbers here that would even come close to exceeding the MySQL-scales-that-far-without-breaking-a-sweat range of things... so I'd love to try and help address specific problems rather than re-architect something before we even know what the problems we're trying to solve are.

> Does something like this help out with your scaling concerns? I do know that personally I'd be interested in a CouchDB/NoSQL alternative to the Nova database layer...but what we have right now seems to conceptually work for scaling out to many hundreds of compute nodes.

Again - to what end? What is it that the current db setup isn't providing that CouchDB would do a better job of?

>> It would be interesting if these types of request could go to the message queue instead
>
> 110% agree. Hopefully this is something we can talk about at the upcoming conference in Boston. :)

I will definitely agree that message queues can be a way of adding scalability (async systems are often able to provide for interesting parallelism) ... but at the end of the day the unit of work still has to get accomplished, and if the request for data to the underlying message store is still slow (sql or nosql, whatever) - under extremely high load, if your disk and/or cpu are saturated on the db infrastructure, async or sync is going to make a flip's worth of difference.

So I'm going to be really annoying and again ask: to solve what actual problem?

Example queries and/or any logging/capturing of system information during scaling issues would be a great start ... we can take a stab at solving any current problems that are there - and as part of solving those problems we can of course discuss approaches such as async message queues or nosql alternatives.

Monty
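One way to put numbers on the heartbeat concern is the steady-state write rate: if each of N compute nodes updates its row every X seconds, the database sees about N / X writes per second. The helper below is a back-of-the-envelope sketch; the function name and the sample figures (1000 nodes, 10-second interval) are assumptions for illustration, not measurements from any deployment.

```python
def heartbeat_writes_per_second(compute_nodes: int, interval_seconds: float) -> float:
    """Approximate steady-state heartbeat write rate.

    Assumes every node writes exactly once per interval and that the
    writes are spread evenly over time (both are simplifications).
    """
    if compute_nodes < 0:
        raise ValueError("compute_nodes must be non-negative")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return compute_nodes / interval_seconds


# Illustrative figures only: 1000 nodes, one heartbeat every 10 seconds.
rate = heartbeat_writes_per_second(1000, 10)
print(f"{rate:.0f} heartbeat writes/second")
```

Comparing a figure like this against what the database is actually observed to sustain is the kind of concrete data the message above asks for.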
Re: [Openstack] Database replacement?
Hey Monty,

The only thing I might add is a thought around the horizontal scalability of MySQL in a single zone. I know a lot of work has gone into various MySQL clustering technologies, but as we continue down the road of durable and distributed services, MySQL ends up being the red-headed stepchild of the deployment. For me the question isn't whether MySQL can handle the load but rather, in a standard and repeatable deployment, when is it detrimental to add another MySQL instance to a running cluster due to the replication requirements?

Sorry for not responding inline, on my phone!
Re: [Openstack] Database replacement?
Hey Josh,

> Has there been any thought on having a nova-db service that responds to requests for information from the db (or something like a db).

No plans that I'm aware of; there is a Database-as-a-Service project called 'Red Dwarf' which might fit this bill, however. I honestly haven't looked too much into it.

> This could be useful for companies that don't necessarily want to have a limiting factor being a database. Since when you scale past a certain number of compute nodes the database connections themselves may become a bottleneck (especially the heartbeat mechanism which updates a table every X seconds).

Not sure what you mean by this. Currently the OpenStack architecture was built to allow hundreds and thousands (maybe?) of compute nodes in the same environment. The key is to group compute nodes into clusters as outlined here: http://wiki.openstack.org/MultiClusterZones

Long story short, the database isn't being shared between all compute clusters, but instead a hierarchy of clusters is formed (something I, in a pinch, would consider akin to a distributed Map/Reduce model of data sharing).

Does something like this help out with your scaling concerns? I do know that personally I'd be interested in a CouchDB/NoSQL alternative to the Nova database layer...but what we have right now seems to conceptually work for scaling out to many hundreds of compute nodes.

> It would be interesting if these types of request could go to the message queue instead

110% agree. Hopefully this is something we can talk about at the upcoming conference in Boston. :)

-Brian

This email may include confidential information. If you received it in error, please delete it.
Re: [Openstack] Database replacement?
On Sep 24, 2011, at 12:57 PM, Brian Lamar brian.la...@rackspace.com wrote:
> No plans that I'm aware of, there is a Database-as-a-Service project called 'Red Dwarf' which might fit this bill however. I honestly haven't looked too much into it.

Hi, I'm a developer on the Database-as-a-Service project. The reddwarf project is aimed at deploying MySQL (and more databases to come) in a virtual environment. It uses nova to deploy the virtual MySQL instances. It is not related to the database replacement issue.
[Openstack] Database replacement?
Howdy all, congrats on the diablo release!

Has there been any thought on having a nova-db service that responds to requests for information from the db (or something like a db)? This could be useful for companies that don't necessarily want the database to be a limiting factor. When you scale past a certain number of compute nodes, the database connections themselves may become a bottleneck (especially the heartbeat mechanism, which updates a table every X seconds). It would be interesting if these types of requests could go to the message queue instead, and then the db backing could be swapped out with something more scalable (or still use mysql/sqlite...).

Any thoughts?

-Josh
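The idea of routing heartbeats through a message queue rather than having every compute node write to the database directly could be sketched as follows. This is a minimal illustration under stated assumptions: the standard-library `queue.Queue` stands in for a real message broker, a plain dict stands in for the database table, and `drain_heartbeats` and `flush_to_store` are hypothetical names, not nova functions.

```python
import queue
from typing import Dict


def drain_heartbeats(q: "queue.Queue") -> Dict[str, float]:
    """Pull every pending (host, timestamp) message off the queue.

    Only the newest timestamp per host is kept, so several queued
    heartbeats from one host collapse into a single pending update.
    """
    latest: Dict[str, float] = {}
    while True:
        try:
            host, timestamp = q.get_nowait()
        except queue.Empty:
            break
        if timestamp >= latest.get(host, float("-inf")):
            latest[host] = timestamp
    return latest


def flush_to_store(latest: Dict[str, float], store: Dict[str, float]) -> int:
    """Apply the coalesced heartbeats in one batch; returns rows written.

    `store` is a dict standing in for the heartbeat table (an assumption).
    """
    store.update(latest)
    return len(latest)


# Compute nodes publish heartbeats; a single consumer owns the database.
heartbeats = queue.Queue()
heartbeats.put(("compute-1", 10.0))
heartbeats.put(("compute-2", 11.0))
heartbeats.put(("compute-1", 20.0))  # supersedes the first message

table: Dict[str, float] = {}
written = flush_to_store(drain_heartbeats(heartbeats), table)
```

In this arrangement only the consumer holds a database connection, which addresses the connection-count concern; the writes themselves still have to happen, and coalescing only reduces them when more than one heartbeat per host is waiting in the queue.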
Re: [Openstack] Database replacement?
This is a good idea. Actually it might be a very good idea to think of scalable/distributed nosql engines to interface with nova and other OpenStack projects.

Regards
debo
Re: [Openstack] Database replacement?
Actually I see people have done Django and Cassandra already, so it might be doable without a huge churn:
http://stackoverflow.com/questions/2369793/how-to-use-cassandra-in-django-framework

debo

From: Joshua Harlow [mailto:harlo...@yahoo-inc.com]
Sent: Friday, September 23, 2011 3:32 PM
To: Debo Dutta (dedutta); openstack
Subject: Re: [Openstack] Database replacement?

Ya, that would be the ideal: make it more modular so that nosql engines could be hooked in (if applicable). It might also provide someone an opportunity to refactor https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py which seems hairy (4000 lines??)
Re: [Openstack] Database replacement?
I know from talking to both DataStax and 10gen that there is interest in doing this.