Re: [Openstack] Database replacement?

2011-09-26 Thread Debo Dutta (dedutta)
+1

 

From: openstack-bounces+dedutta=cisco@lists.launchpad.net
[mailto:openstack-bounces+dedutta=cisco@lists.launchpad.net] On
Behalf Of Joshua Harlow
Sent: Monday, September 26, 2011 11:20 AM
To: Monty Taylor; Brian Lamar
Cc: openstack
Subject: Re: [Openstack] Database replacement?

 

Well, I think part of it is just having a good abstraction and having
the possibility of hooking in nosql engines or sql engines (maybe this
isn't in the end using SQLAlchemy? Or at least not directly).
It would be neat if this was possible. At least for a company
like yahoo, at its scale I can pretty much guarantee a simple
centralized DB model wouldn't work.
The concept provided in http://wiki.openstack.org/MultiClusterZones
might help, but that isn't there yet (afaik).
So this was more of just a thought exercise on how or if it's possible
to abstract out that part.
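
A minimal sketch of the kind of abstraction being described, assuming a hypothetical `ServiceStore` interface (the class names, the `mark_up` helper and the `services` table are all made up for illustration and are not nova code). The point is only that callers talk to the interface, so a key-value engine and a SQL engine can be swapped without touching them:

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class ServiceStore(ABC):
    """Hypothetical storage interface; the name and methods are illustrative."""

    @abstractmethod
    def put(self, host: str, record: dict) -> None:
        """Create or overwrite the record for a host."""

    @abstractmethod
    def get(self, host: str) -> Optional[dict]:
        """Return the record for a host, or None if there is none."""


class InMemoryStore(ServiceStore):
    """Dict-backed stand-in for a key-value (nosql-style) engine."""

    def __init__(self) -> None:
        self._rows = {}

    def put(self, host, record):
        self._rows[host] = dict(record)

    def get(self, host):
        row = self._rows.get(host)
        return dict(row) if row is not None else None


class SqliteStore(ServiceStore):
    """SQL-backed implementation of the same interface."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS services "
            "(host TEXT PRIMARY KEY, record TEXT NOT NULL)"
        )

    def put(self, host, record):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO services (host, record) VALUES (?, ?)",
                (host, json.dumps(record)),
            )

    def get(self, host):
        row = self._conn.execute(
            "SELECT record FROM services WHERE host = ?", (host,)
        ).fetchone()
        return json.loads(row[0]) if row else None


def mark_up(store: ServiceStore, host: str) -> None:
    """Caller code depends only on the interface, not on the engine."""
    store.put(host, {"state": "up"})
```

Calling `mark_up(InMemoryStore(), "compute-1")` and `mark_up(SqliteStore(), "compute-1")` exercises the same caller against both engines. Whether a real interface could stay this narrow is exactly the open question in the thread.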


Re: [Openstack] Database replacement?

2011-09-25 Thread Monty Taylor


On 09/24/2011 10:50 AM, Brian Lamar wrote:
 Hey Josh,
 
 Has there been any thought on having a nova-db service that
 responds to requests for information from the db (or something
 like a db).
 
 No plans that I'm aware of, there is a Database-as-a-Service project
 called 'Red Dwarf' which might fit this bill however. I honestly
 haven't looked too much into it.
 
 This could be useful for companies that don't necessarily want to
 have a limiting factor being a database. Since when u scale past
 a certain number of compute nodes the database connections
 themselves may become a bottleneck (especially the heartbeat 
 mechanism which updates a table every X seconds).
 
 Not sure what you mean by this. Currently the OpenStack architecture
 was built to allow hundreds and thousands (maybe?) of compute nodes
 in the same environment. The key is to group compute nodes into
 clusters as outlined here:
 
 http://wiki.openstack.org/MultiClusterZones
 
 Long story short the database isn't being shared between all compute
 clusters, but instead a hierarchy of clusters is formed (something I,
 in a pinch, would consider akin to a distributed Map/Reduce model of
 data sharing).

What are the actual scaling concerns? Have you seen scaling problems, or
are you just concerned that they might be hit? I'm not seeing any
mention of numbers here that would even come close to exceeding the
MySQL-scales-that-far-without-breaking-a-sweat range of things... so I'd
love to try and help address specific problems rather than re-architect
something before we even know what the problems we're trying to solve are.

 Does something like this help out with your scaling concerns? I do
 know that personally I'd be interested in a CouchDB/NoSQL alternative
 to the Nova database layer...but what we have right now seems to
 conceptually work for scaling out to many hundreds of compute nodes.

Again - to what end? What is it that the current db setup isn't
providing that CouchDB would do a better job of?

 It would be interesting if these types of request could go to the
 message queue instead
 
 110% agree. Hopefully this is something we can talk about at the
 upcoming conference in Boston. :)

I will definitely agree that message queues can be a way of adding
scalability (async systems are often able to provide for interesting
parallelism) ... but at the end of the day the unit of work still has to
get accomplished, and if the request for data to the underlying message
store is still slow (sql or nosql, whatever) - under extremely high load
if your disk and/or cpu are saturated on the db infrastructure, async or
sync is going to make very little difference. So I'm going to be
really annoying and again ask: to solve what actual problem? Example
queries and/or any logging/capturing of system information during
scaling issues would be a great start ... we can take a stab at solving
any current problems that are there - and as part of solving those
problems we can of course discuss approaches such as async message
queues or nosql alternatives.
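
The capture being asked for here could start as small as the sketch below. It assumes a hypothetical `services` table with an `updated_at` column and a made-up `heartbeat_update` function, with in-memory sqlite standing in for the real database. It only shows how to wrap a query so that its duration is recorded. It is not a benchmark of anything in nova.

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Append (label, elapsed_seconds) to log when the block exits."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.append((label, time.perf_counter() - start))


def heartbeat_update(conn, host, now):
    """Hypothetical heartbeat write: one upsert per host per interval."""
    with conn:  # one transaction per heartbeat
        conn.execute(
            "INSERT OR REPLACE INTO services (host, updated_at) VALUES (?, ?)",
            (host, now),
        )


# In-memory sqlite stands in for the real database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (host TEXT PRIMARY KEY, updated_at REAL)")

timings = []
for i in range(100):
    with timed("heartbeat_update", timings):
        heartbeat_update(conn, f"compute-{i}", time.time())

slowest = max(seconds for _, seconds in timings)
print(f"{len(timings)} updates, slowest took {slowest * 1000:.3f} ms")
```

Pointed at a real deployment, numbers like these (plus the slow queries themselves) are what would show whether the database is actually the bottleneck.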

Monty

 
 

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Database replacement?

2011-09-25 Thread Cole
Hey Monty,

The only thing I might add is a thought around the horizontal scalability of
mysql in a single zone. I know a lot of work has gone into various mysql
clustering technologies but as we continue down the road of durable and
distributed services mysql ends up being the red headed stepchild of the
deployment.

For me the question isn't whether mysql can handle the load but rather, in a
standard and repeatable deployment, when is it detrimental to add another
mysql instance to a running cluster due to the replication requirements?

Sorry for not responding inline, on my phone!





Re: [Openstack] Database replacement?

2011-09-24 Thread Brian Lamar
Hey Josh,

 Has there been any thought on having a nova-db service that responds to
 requests for information from the db (or something like a db).

No plans that I'm aware of. There is a Database-as-a-Service project called
'Red Dwarf' which might fit this bill, however. I honestly haven't looked too
much into it.

 This could be useful for companies that don't necessarily want to have a
 limiting factor being a database. Since when u scale past a certain number
 of compute nodes the database connections themselves may become a
 bottleneck (especially the heartbeat mechanism which updates a table every
 X seconds).

Not sure what you mean by this. Currently the OpenStack architecture was built
to allow hundreds and thousands (maybe?) of compute nodes in the same
environment. The key is to group compute nodes into clusters as outlined here:

http://wiki.openstack.org/MultiClusterZones

Long story short the database isn't being shared between all compute clusters, 
but instead a hierarchy of clusters is formed (something I, in a pinch, would 
consider akin to a distributed Map/Reduce model of data sharing).
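
To make the hierarchy idea concrete, here is a rough sketch under stated assumptions: a hypothetical `Zone` class where each zone keeps its own private store (a dict here, standing in for that zone's database) and a parent answers a query by fanning out to its children and merging the results. This is not how MultiClusterZones is implemented. It only illustrates why no single database has to be shared by every compute cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Zone:
    """Hypothetical zone: owns its own data and knows only its children."""

    name: str
    # instance id -> host; stands in for this zone's private database
    instances: Dict[str, str] = field(default_factory=dict)
    children: List["Zone"] = field(default_factory=list)

    def find_instances(self) -> Dict[str, str]:
        """Map step: read the local store. Reduce step: merge child results."""
        found = {
            instance_id: f"{self.name}/{host}"
            for instance_id, host in self.instances.items()
        }
        for child in self.children:
            found.update(child.find_instances())
        return found


east = Zone("east", {"i-1": "host-a"})
west = Zone("west", {"i-2": "host-b"})
root = Zone("root", children=[east, west])
print(root.find_instances())
```

The parent zone never opens a connection to a child's database. It only asks the child, which is what keeps the number of database connections bounded per zone.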

Does something like this help out with your scaling concerns? I do know that
personally I'd be interested in a CouchDB/NoSQL alternative to the Nova
database layer...but what we have right now seems to conceptually work for
scaling out to many hundreds of compute nodes.

 It would be interesting if these types of request could go to the message
 queue instead

110% agree. Hopefully this is something we can talk about at the upcoming 
conference in Boston. :)


-Brian




Re: [Openstack] Database replacement?

2011-09-24 Thread Michael Basnight
On Sep 24, 2011, at 12:57 PM, Brian Lamar brian.la...@rackspace.com wrote:

 No plans that I'm aware of, there is a Database-as-a-Service project called 
 'Red Dwarf' which might fit this bill however. I honestly haven't looked too 
 much into it.

Hi, I'm a developer on the database as a service project. The reddwarf project
is aimed at deploying MySQL (and more databases to come) in a virtual
environment. It uses nova to deploy the virtual MySQL instances. It is not
related to the database replacement issue.
This email may include confidential information. If you received it in error, 
please delete it.




[Openstack] Database replacement?

2011-09-23 Thread Joshua Harlow
Howdy all, congrats on the diablo release!

Has there been any thought on having a nova-db service that responds to
requests for information from the db (or something like a db)?

This could be useful for companies that don't necessarily want the database
to be a limiting factor, since when you scale past a certain number of
compute nodes the database connections themselves may become a bottleneck
(especially the heartbeat mechanism which updates a table every X seconds). It
would be interesting if these types of requests could go to the message queue
instead and then the db backing could be swapped out with something more
scalable (or still use mysql/sqlite...).
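
One way to picture this, as a sketch only: compute nodes publish heartbeats to a queue, and a single consumer (the hypothetical nova-db service) drains the queue and writes to the database. Here `queue.Queue` stands in for the message queue and in-memory sqlite for the db backing. The `heartbeats` table and both function names are assumptions made for the example.

```python
import queue
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create the stand-in database with a hypothetical heartbeats table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE heartbeats (host TEXT PRIMARY KEY, last_seen REAL NOT NULL)"
    )
    return conn


def drain_heartbeats(q: queue.Queue, conn: sqlite3.Connection,
                     max_batch: int = 1000) -> int:
    """Take up to max_batch messages, keep the newest per host, write once.

    Returns the number of hosts written.
    """
    latest = {}
    for _ in range(max_batch):
        try:
            host, seen_at = q.get_nowait()
        except queue.Empty:
            break
        latest[host] = max(seen_at, latest.get(host, seen_at))
    with conn:  # a single transaction for the whole batch
        conn.executemany(
            "INSERT OR REPLACE INTO heartbeats (host, last_seen) VALUES (?, ?)",
            list(latest.items()),
        )
    return len(latest)
```

With this shape only the consumer holds a database connection, and repeated heartbeats from one host collapse into one write per drain. The writes still have to land somewhere, so this moves and batches the load rather than removing it.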

Any thoughts?

-Josh


Re: [Openstack] Database replacement?

2011-09-23 Thread Debo Dutta (dedutta)
This is a good idea. Actually it might be a very good idea to think of
scalable/distributed nosql engines to interface with nova and other
Openstack projects. 

 

Regards

debo

 



Re: [Openstack] Database replacement?

2011-09-23 Thread Debo Dutta (dedutta)
Actually I see people have done django and Cassandra already so it might
be doable without a huge churn

http://stackoverflow.com/questions/2369793/how-to-use-cassandra-in-django-framework

 

debo

 

From: Joshua Harlow [mailto:harlo...@yahoo-inc.com] 
Sent: Friday, September 23, 2011 3:32 PM
To: Debo Dutta (dedutta); openstack
Subject: Re: [Openstack] Database replacement?

 

Ya, that would be the ideal, make it more modular so that nosql engines
could be hooked in (if applicable).

It might also provide someone an opportunity to re-factor
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py
which seems hairy (4000 lines??)



Re: [Openstack] Database replacement?

2011-09-23 Thread Cole
I know from talking to datastax and 10gen both that there is interest in
doing this.
On Sep 23, 2011 3:40 PM, Debo Dutta (dedutta) dedu...@cisco.com wrote:
 This is a good idea. Actually it might be a very good idea to think of
 scalable/distributed nosql engines to interface with nova and other
 Openstack projects.


