strange couch error under load

2009-03-27 Thread Shaun Lindsay
Hey all,

We've been seeing this strange error in the couch logs and we're not sure what 
to make of it.  It seems like once this error starts happening, it continues to 
occur and corresponds to a general unresponsiveness of that particular couch 
node.  My first thought was that it was running out of file descriptors, but 
that doesn't seem to be the case (it still happens with a very high ulimit).

In this error the database it's trying to open is 'profiles22', but the error 
doesn't seem tied to a particular database (we see it happening on all of our 
profiles shards).

We built this couch instance from trunk r713244, if that helps.

Anyone have any insight on what's going on?

[Fri, 27 Mar 2009 18:21:42 GMT] [error] [<0.14547.0>] {error_report,<0.21.0>,
 {<0.14547.0>,crash_report,
  [[{pid,<0.14547.0>},
    {registered_name,[]},
    {error_info,
     {exit,
      {timeout,
       {gen_server,call,
        [couch_server,{open,profiles22,[{creds,{[]}}]}]}},
      [{gen_server,call,2},
       {couch_httpd_db,do_db_req,2},
       {couch_httpd,handle_request,3},
       {mochiweb_http,headers,4},
       {proc_lib,init_p,5}]}},
    {initial_call,
     {mochiweb_socket_server,acceptor_loop,
      [{<0.59.0>,#Port<0.149>,#Fun<mochiweb_http.1.62821608>}]}},
    {ancestors,[couch_httpd,<0.53.0>,couch_server_sup,<0.1.0>]},
    {messages,[]},
    {links,[<0.59.0>,#Port<0.14390>]},
    {dictionary,[]},
    {trap_exit,false},
    {status,running},
    {heap_size,2584},
    {stack_size,23},
    {reductions,3300}],
   []]}}

Thanks,
Shaun



Lounge clustering framework

2009-02-24 Thread Shaun Lindsay
Hey all,
We've been discussing the best way to handle releasing the Lounge code and
we have some questions that you, the couch devs, might be able to help out
with:

1. What license is preferred?  Since Couch is an Apache project, the Apache
license is probably appropriate.  However, since the Lounge is more or less a
separate entity, we can probably release under any license.  Any
preferences?

2. Project hosting?  Again, since this is separate from Couch, it probably
doesn't make sense to have it in the Couch repo.  We were thinking Google
Code (since we use svn), but I'm open to whatever.  Thoughts?

Once we settle on a license, we'll need to run it by our lawyer to make sure
we're solid and, assuming that goes well, we should be good to give out the
code.

Thanks,
Shaun Lindsay
Meebo.com


Re: Couch clustering/partitioning Re: CouchSpray - Thoughts?

2009-02-20 Thread Shaun Lindsay
Due to one of the key people being sick, we pushed our meeting to discuss
releasing the code to Monday.  I'll send out an update then.

On Fri, Feb 20, 2009 at 2:17 AM, Jan Lehnardt j...@apache.org wrote:


 On 20 Feb 2009, at 02:34, Shaun Lindsay wrote:

  Hi all,
 So, a couple months ago we implemented almost exactly the couch
 clustering/partitioning solution described below.


 Shaun, this sounds fantastic! :) I hope you can release the code for
 this.

 Cheers
 Jan
 --




 The couch cluster (which we called 'The Lounge') sits behind nginx
 running a custom module that farms out the GETs and PUTs to the
 appropriate node/shard and the views to a python proxy daemon which
 handles reducing the view results from the individual shards and
 returning the full view.  We have replication working between the
 cluster nodes so the shards exist in multiple places and, in the case
 of one of the nodes going down, the various proxies fail over to the
 backup shards.

 This clustering setup has been running in full production for several
 months now with minimal problems.

 We're looking to release all the code back to the community, but we need
 to clear it with our legal team first to make sure we're not compromising
 any of our more business-specific, proprietary code.

 In total, we have:
 an nginx module specifically set up for sharding databases
 a 'smartproxy', written in Python/Twisted, for sharding views
 and a few other ancillary pieces (replication notification, view updating,
 etc)

 Mainly, I just wanted to keep people from duplicating the work we've done
 -- hopefully we can release something back to the community in the next
 several weeks.

 We're having a meeting tomorrow morning to figure out what we can release
 right now (probably the nginx module, at the least).  I'll let everyone
 know what our timeline looks like.

 --Shaun Lindsay
 Meebo.com

 On Thu, Feb 19, 2009 at 4:48 PM, Chris Anderson jch...@apache.org
 wrote:

  On Thu, Feb 19, 2009 at 4:35 PM, Ben Browning ben...@gmail.com wrote:

 So, I started thinking about partitioning with CouchDB and realized
 that since views are just map/reduce, we can do some magic that's
 harder if not impossible with other database systems. The idea in a
 nutshell is to create a proxy that sits in front of multiple servers
 and sprays the view queries to all servers, merging the results -
 hence CouchSpray. This would give us storage and processing
 scalability and could, with some extra logic, provide data redundancy
 and failover.


 There are plans in CouchDB's future to take care of data partitioning,
 as well as querying views from a cluster. Theoretically, it should be
 pretty simple. There are a few small projects that have started down
 the road of writing code in this area.

 https://code.launchpad.net/~dreid/sectional/trunk

 Sectional is an Erlang http proxy that implements consistent hashing
 for docs. I'm not sure how it handles view queries.

 There's also a project to provide partitioning around the basic
 key/value PUT and GET store using Nginx:

 http://github.com/dysinger/nginx/tree/nginx_upstream_hash

 If you're interested in digging into this stuff, please join d...@. We
 plan to include clustering in CouchDB, so if you're interested in
 implementing it, we could use your help.

 Chris

 --
 Chris Anderson
 http://jchris.mfdz.com





Couch clustering/partitioning Re: CouchSpray - Thoughts?

2009-02-19 Thread Shaun Lindsay
Hi all,
So, a couple months ago we implemented almost exactly the couch
clustering/partitioning solution described below.  The couch cluster (which
we called 'The Lounge') sits behind nginx running a custom module that farms
out the GETs and PUTs to the appropriate node/shard and the views to a
python proxy daemon which handles reducing the view results from the
individual shards and returning the full view.  We have replication working
between the cluster nodes so the shards exist in multiple places and, in the
case of one of the nodes going down, the various proxies fail over to the
backup shards.
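
As a rough illustration of the view step described above (each shard returns
its own view rows and the proxy combines them into the full view), here is a
minimal Python sketch.  It is not the Lounge smartproxy code: the function
names are hypothetical, the reduce is assumed to be a simple sum, and keys are
assumed to be plain strings so that Python's ordering can stand in for
CouchDB's view collation.

```python
import heapq
from itertools import groupby
from operator import itemgetter

by_key = itemgetter("key")


def merge_map_rows(shard_results):
    """Merge map-view rows from several shards into one key-sorted list.

    Each element of shard_results is one shard's rows, already sorted by key.
    """
    return list(heapq.merge(*shard_results, key=by_key))


def rereduce_sum(shard_results):
    """Combine per-shard reduce rows by summing the values that share a key.

    Assumes the view's reduce function is a sum; other reduce functions
    need their own combining step.
    """
    merged = heapq.merge(*shard_results, key=by_key)
    return [
        {"key": key, "value": sum(row["value"] for row in rows)}
        for key, rows in groupby(merged, key=by_key)
    ]
```

Because every shard already returns its rows in key order, the proxy only has
to do a streaming merge rather than a full sort of the combined result.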

This clustering setup has been running in full production for several months
now with minimal problems.

We're looking to release all the code back to the community, but we need to
clear it with our legal team first to make sure we're not compromising any
of our more business-specific, proprietary code.

In total, we have:
an nginx module specifically set up for sharding databases
a 'smartproxy', written in Python/Twisted, for sharding views
and a few other ancillary pieces (replication notification, view updating,
etc)

Mainly, I just wanted to keep people from duplicating the work we've done --
hopefully we can release something back to the community in the next several
weeks.

We're having a meeting tomorrow morning to figure out what we can release
right now (probably the nginx module, at the least).  I'll let everyone know
what our timeline looks like.

--Shaun Lindsay
Meebo.com

On Thu, Feb 19, 2009 at 4:48 PM, Chris Anderson jch...@apache.org wrote:

 On Thu, Feb 19, 2009 at 4:35 PM, Ben Browning ben...@gmail.com wrote:
  So, I started thinking about partitioning with CouchDB and realized
  that since views are just map/reduce, we can do some magic that's
  harder if not impossible with other database systems. The idea in a
  nutshell is to create a proxy that sits in front of multiple servers
  and sprays the view queries to all servers, merging the results -
  hence CouchSpray. This would give us storage and processing
  scalability and could, with some extra logic, provide data redundancy
  and failover.

 There are plans in CouchDB's future to take care of data partitioning,
 as well as querying views from a cluster. Theoretically, it should be
 pretty simple. There are a few small projects that have started down
 the road of writing code in this area.

 https://code.launchpad.net/~dreid/sectional/trunk

 Sectional is an Erlang http proxy that implements consistent hashing
 for docs. I'm not sure how it handles view queries.

 There's also a project to provide partitioning around the basic
 key/value PUT and GET store using Nginx:

 http://github.com/dysinger/nginx/tree/nginx_upstream_hash

 If you're interested in digging into this stuff, please join d...@. We
 plan to include clustering in CouchDB, so if you're interested in
 implementing it, we could use your help.

 Chris

 --
 Chris Anderson
 http://jchris.mfdz.com



Re: Couch clustering/partitioning Re: CouchSpray - Thoughts?

2009-02-19 Thread Shaun Lindsay
Our current cluster is running on 4 nodes, on some slower, leftover
hardware.  As of right now, it's handling about 300 queries/sec, with about
1/3 of that being view requests.  As for documents, we're looking at ~12M
docs taking up around 90 GB of space on disk.
We have the cluster split into 48 shards, with 12 shards per node -- this
lets us add more nodes later, up to 48, before we need to mess with
resharding the data (or, more likely, treeing out underneath the first level
of nodes).
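
To make the shard arithmetic concrete, here is a minimal Python sketch of a
fixed shard count with a separate shard-to-node map.  The CRC32 hash, the
round-robin assignment, and all names are assumptions for illustration only;
the actual Lounge hashing and placement scheme is not described here.

```python
import zlib

NUM_SHARDS = 48  # fixed up front; documents never change shard


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document id to a shard number (CRC32 is an assumed hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def assign_shards(nodes, num_shards=NUM_SHARDS):
    """Spread shards round-robin over the nodes; returns {shard: node}."""
    if not 0 < len(nodes) <= num_shards:
        raise ValueError("need between 1 and num_shards nodes")
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


def node_for(doc_id, shard_map):
    """Look up the node that currently hosts the document's shard."""
    return shard_map[shard_for(doc_id, len(shard_map))]


# Four nodes give 12 shards each; the node names are placeholders.
shard_map = assign_shards(["node1", "node2", "node3", "node4"])
```

Since the document-to-shard mapping depends only on the fixed shard count,
adding a node means moving whole shards to it and updating the map; no
document has to be rehashed until there are more nodes than shards.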

We've also implemented some caching for the most often hit views in the
smartproxy -- map/reducing over the 48 shards isn't trivial and, at 100 view
queries a second, caching is necessary for the cluster to actually work.
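
A minimal sketch of that kind of caching, assuming a simple time-based expiry:
a cached view result is reused until it is older than a fixed number of
seconds, so repeated queries for a hot view do not fan out to all 48 shards.
The class name, the TTL value, and the expiry policy are assumptions, not the
smartproxy's actual behaviour.

```python
import time


class ViewCache:
    """Cache merged view results for a short, fixed time-to-live."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # view_key -> (time stored, result)

    def get_or_compute(self, view_key, compute):
        """Return a fresh cached result, or call compute() and store it."""
        now = self._clock()
        entry = self._entries.get(view_key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        result = compute()  # e.g. query every shard and merge the rows
        self._entries[view_key] = (now, result)
        return result
```

The trade-off is staleness: a cached view can lag behind the shards by up to
the TTL, which is usually acceptable for frequently hit views.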

--Shaun Lindsay
Meebo.com

On Thu, Feb 19, 2009 at 5:40 PM, Chris Anderson jch...@apache.org wrote:

 On Thu, Feb 19, 2009 at 5:34 PM, Shaun Lindsay sh...@meebo.com wrote:
  Hi all,
  So, a couple months ago we implemented almost exactly the couch
  clustering/partitioning solution described below.  The couch cluster
  (which we called 'The Lounge') sits behind nginx running a custom module
  that farms out the GETs and PUTs to the appropriate node/shard and the
  views to a python proxy daemon which handles reducing the view results
  from the individual shards and returning the full view.  We have
  replication working between the cluster nodes so the shards exist in
  multiple places and, in the case of one of the nodes going down, the
  various proxies fail over to the backup shards.
 
  This clustering setup has been running in full production for several
  months now with minimal problems.
 
  We're looking to release all the code back to the community, but we need
  to clear it with our legal team first to make sure we're not compromising
  any of our more business-specific, proprietary code.
 
  In total, we have:
  an nginx module specifically set up for sharding databases
  a 'smartproxy', written in Python/Twisted, for sharding views
  and a few other ancillary pieces (replication notification, view
  updating, etc)
 
  Mainly, I just wanted to keep people from duplicating the work we've done
  -- hopefully we can release something back to the community in the next
  several weeks.
 
  We're having a meeting tomorrow morning to figure out what we can release
  right now (probably the nginx module, at the least).  I'll let everyone
  know what our timeline looks like.
 

 This is really cool. It sounds like a tool people could get started
 with right away to build big CouchDB clusters. Is there anything you
 can tell us about the size of clusters you've used?

 The smartproxy code will probably be a really good illustration of
 what we'll probably want to implement in Erlang for CouchDB,
 eventually. Again, +1 for software people can use today.

 Hopefully it turns out to be easy to release!

 Cheers,
 Chris

 --
 Chris Anderson
 http://jchris.mfdz.com