strange couch error under load
Hey all,

We've been seeing this strange error in the couch logs and we're not sure what to make of it. It seems like once this error starts happening, it continues to occur and corresponds to a general unresponsiveness of that particular couch node. My first thought was that it was running out of file descriptors, but that doesn't seem to be the case (it still happens with a very high ulimit). In this error the database it's trying to open is 'profiles22', but the error doesn't seem tied to a particular database (we see it happening on all of our profiles shards). We built this couch instance from trunk r713244, if that helps. Anyone have any insight on what's going on?

[Fri, 27 Mar 2009 18:21:42 GMT] [error] [0.14547.0] {error_report,0.21.0,
  {0.14547.0,crash_report,
    [[{pid,0.14547.0},
      {registered_name,[]},
      {error_info,
        {exit,
          {timeout,
            {gen_server,call,
              [couch_server,{open,profiles22,[{creds,{[]}}]}]}},
          [{gen_server,call,2},
           {couch_httpd_db,do_db_req,2},
           {couch_httpd,handle_request,3},
           {mochiweb_http,headers,4},
           {proc_lib,init_p,5}]}},
      {initial_call,
        {mochiweb_socket_server,acceptor_loop,
          [{0.59.0,#Port0.149,#Funmochiweb_http.1.62821608}]}},
      {ancestors,[couch_httpd,0.53.0,couch_server_sup,0.1.0]},
      {messages,[]},
      {links,[0.59.0,#Port0.14390]},
      {dictionary,[]},
      {trap_exit,false},
      {status,running},
      {heap_size,2584},
      {stack_size,23},
      {reductions,3300}],
     []]}}

Thanks,
Shaun
Lounge clustering framework
Hey all,

We've been discussing the best way to handle releasing the Lounge code, and we have some questions that you, the couch devs, might be able to help with:

1. What license is preferred? Since Couch is an Apache project, the Apache License is probably appropriate. However, since the Lounge is more or less a separate entity, we can probably release under any license. Any preferences?

2. Project hosting? Again, since this is separate from Couch, it probably doesn't make sense to have it in the Couch repo. We were thinking Google Code (since we use svn), but I'm open to whatever. Thoughts?

Once we settle on a license, we'll need to run it by our lawyer to make sure we're solid and, assuming that goes well, we should be good to give out the code.

Thanks,
Shaun Lindsay
Meebo.com
Re: Couch clustering/partitioning Re: CouchSpray - Thoughts?
Due to one of the key people being sick, we pushed our meeting to discuss releasing the code to Monday. I'll send out an update then.

On Fri, Feb 20, 2009 at 2:17 AM, Jan Lehnardt j...@apache.org wrote:

On 20 Feb 2009, at 02:34, Shaun Lindsay wrote:

Hi all,

So, a couple months ago we implemented almost exactly the couch clustering/partitioning solution described below.

Shaun, this sounds fantastic! :) I hope you can release the code for this.

Cheers
Jan
--

The couch cluster (which we called 'The Lounge') sits behind nginx running a custom module that farms out the GETs and PUTs to the appropriate node/shard, and the views to a Python proxy daemon which handles reducing the view results from the individual shards and returning the full view. We have replication working between the cluster nodes so the shards exist in multiple places and, in the case of one of the nodes going down, the various proxies fail over to the backup shards. This clustering setup has been running in full production for several months now with minimal problems.

We're looking to release all the code back to the community, but we need to clear it with our legal team first to make sure we're not compromising any of our more business-specific, proprietary code.

In total, we have:

an nginx module specifically set up for sharding databases
a 'smartproxy', written in Python/Twisted, for sharding views
and a few other ancillary pieces (replication notification, view updating, etc.)

Mainly, I just wanted to keep people from duplicating the work we've done -- hopefully we can release something back to the community in the next several weeks. We're having a meeting tomorrow morning to figure out what we can release right now (probably the nginx module, at the least). I'll let everyone know what our timeline looks like.
--Shaun Lindsay
Meebo.com

On Thu, Feb 19, 2009 at 4:48 PM, Chris Anderson jch...@apache.org wrote:

On Thu, Feb 19, 2009 at 4:35 PM, Ben Browning ben...@gmail.com wrote:

So, I started thinking about partitioning with CouchDB and realized that since views are just map/reduce, we can do some magic that's harder if not impossible with other database systems. The idea in a nutshell is to create a proxy that sits in front of multiple servers and sprays the view queries to all servers, merging the results - hence CouchSpray. This would give us storage and processing scalability and could, with some extra logic, provide data redundancy and failover.

There are plans in CouchDB's future to take care of data partitioning, as well as querying views from a cluster. Theoretically, it should be pretty simple. There are a few small projects that have started down the road of writing code in this area.

https://code.launchpad.net/~dreid/sectional/trunk

Sectional is an Erlang http proxy that implements consistent hashing for docs. I'm not sure how it handles view queries.

There's also a project to provide partitioning around the basic key/value PUT and GET store using Nginx:

http://github.com/dysinger/nginx/tree/nginx_upstream_hash

If you're interested in digging into this stuff, please join d...@. We plan to include clustering in CouchDB, so if you're interested in implementing it, we could use your help.

Chris

--
Chris Anderson
http://jchris.mfdz.com
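For readers following the thread: the failover behaviour mentioned in the quoted message (proxies falling back to backup shards when a node goes down) could look roughly like the Python sketch below. The Lounge code had not been released at this point, so this is only an illustration under stated assumptions: the function names, the host names, and the use of `ConnectionError` to signal an unreachable node are all hypothetical, and the HTTP client is replaced by a stand-in function.

```python
from typing import Callable, Sequence


def fetch_with_failover(replicas: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Try each copy of a shard in order and return the first success.

    `replicas` lists hypothetical base URLs, primary first. `fetch` is any
    callable that returns a response body, or raises ConnectionError when
    the node cannot be reached.
    """
    last_error = None
    for url in replicas:
        try:
            return fetch(url)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next copy
    raise RuntimeError("all replicas failed") from last_error


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP client: node1 is 'down', node2 answers."""
    if "node1" in url:
        raise ConnectionError("node1 unreachable")
    return "response from " + url


result = fetch_with_failover(
    ["http://node1:5984/profiles22", "http://node2:5984/profiles22"],
    fake_fetch,
)
print(result)
```

A real proxy would probably also remember which nodes are down for a while instead of retrying the primary on every request, but the thread does not say how the Lounge handles that.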
Couch clustering/partitioning Re: CouchSpray - Thoughts?
Hi all,

So, a couple months ago we implemented almost exactly the couch clustering/partitioning solution described below.

The couch cluster (which we called 'The Lounge') sits behind nginx running a custom module that farms out the GETs and PUTs to the appropriate node/shard, and the views to a Python proxy daemon which handles reducing the view results from the individual shards and returning the full view. We have replication working between the cluster nodes so the shards exist in multiple places and, in the case of one of the nodes going down, the various proxies fail over to the backup shards. This clustering setup has been running in full production for several months now with minimal problems.

We're looking to release all the code back to the community, but we need to clear it with our legal team first to make sure we're not compromising any of our more business-specific, proprietary code.

In total, we have:

an nginx module specifically set up for sharding databases
a 'smartproxy', written in Python/Twisted, for sharding views
and a few other ancillary pieces (replication notification, view updating, etc.)

Mainly, I just wanted to keep people from duplicating the work we've done -- hopefully we can release something back to the community in the next several weeks. We're having a meeting tomorrow morning to figure out what we can release right now (probably the nginx module, at the least). I'll let everyone know what our timeline looks like.

--Shaun Lindsay
Meebo.com

On Thu, Feb 19, 2009 at 4:48 PM, Chris Anderson jch...@apache.org wrote:

On Thu, Feb 19, 2009 at 4:35 PM, Ben Browning ben...@gmail.com wrote:

So, I started thinking about partitioning with CouchDB and realized that since views are just map/reduce, we can do some magic that's harder if not impossible with other database systems. The idea in a nutshell is to create a proxy that sits in front of multiple servers and sprays the view queries to all servers, merging the results - hence CouchSpray.
This would give us storage and processing scalability and could, with some extra logic, provide data redundancy and failover.

There are plans in CouchDB's future to take care of data partitioning, as well as querying views from a cluster. Theoretically, it should be pretty simple. There are a few small projects that have started down the road of writing code in this area.

https://code.launchpad.net/~dreid/sectional/trunk

Sectional is an Erlang http proxy that implements consistent hashing for docs. I'm not sure how it handles view queries.

There's also a project to provide partitioning around the basic key/value PUT and GET store using Nginx:

http://github.com/dysinger/nginx/tree/nginx_upstream_hash

If you're interested in digging into this stuff, please join d...@. We plan to include clustering in CouchDB, so if you're interested in implementing it, we could use your help.

Chris

--
Chris Anderson
http://jchris.mfdz.com
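To make the two ideas in this message concrete (routing each document to a shard, and "spraying" a view query across shards and merging the results), here is a small Python sketch. It is not the Lounge or CouchSpray code, neither of which is shown in the thread. The assumptions: documents are placed with a plain CRC32-modulo hash (simpler than the consistent hashing Sectional is said to use), each shard returns rows already sorted by key, keys compare with ordinary Python ordering rather than CouchDB's collation rules, and the reduce function is a sum. All function names are hypothetical.

```python
import heapq
import zlib

NUM_SHARDS = 48  # illustrative; any fixed shard count works


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard index with a stable hash (CRC32 modulo)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def merge_map_rows(per_shard_rows):
    """Merge map-view rows that every shard has already sorted by key."""
    return list(heapq.merge(*per_shard_rows, key=lambda row: row["key"]))


def merge_sum_reduce(per_shard_rows):
    """Combine per-shard reduce output for a sum-style reduce (a rereduce)."""
    totals = {}
    for rows in per_shard_rows:
        for row in rows:
            totals[row["key"]] = totals.get(row["key"], 0) + row["value"]
    return [{"key": key, "value": value} for key, value in sorted(totals.items())]


# Two shards answering the same view query.
shard_a = [{"key": "alice", "value": 2}, {"key": "carol", "value": 1}]
shard_b = [{"key": "alice", "value": 3}, {"key": "bob", "value": 5}]

merged = merge_map_rows([shard_a, shard_b])
reduced = merge_sum_reduce([shard_a, shard_b])

print(shard_for("user:1234"))
print(merged)
print(reduced)
```

The reduce step only merges cleanly when the reduce function can be re-applied to its own output, which is what makes map/reduce views a good fit for this kind of proxy.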
Re: Couch clustering/partitioning Re: CouchSpray - Thoughts?
Our current cluster is running on 4 nodes, on some slower, leftover hardware. As of right now, it's handling about 300 queries/sec, with about 1/3 of that being view requests. As for documents, we're looking at ~12M docs taking up around 90 GB of space on disk. We have the cluster split into 48 shards, with 12 shards per node -- this lets us add more nodes later, up to 48, before we need to mess with resharding the data (or, more likely, treeing out underneath the first level of nodes). We've also implemented some caching for the most often hit views in the smartproxy -- map/reducing over the 48 shards isn't trivial and, at 100 view queries a second, caching is necessary for the cluster to actually work.

--Shaun Lindsay
Meebo.com

On Thu, Feb 19, 2009 at 5:40 PM, Chris Anderson jch...@apache.org wrote:

On Thu, Feb 19, 2009 at 5:34 PM, Shaun Lindsay sh...@meebo.com wrote:

Hi all,

So, a couple months ago we implemented almost exactly the couch clustering/partitioning solution described below.

The couch cluster (which we called 'The Lounge') sits behind nginx running a custom module that farms out the GETs and PUTs to the appropriate node/shard, and the views to a Python proxy daemon which handles reducing the view results from the individual shards and returning the full view. We have replication working between the cluster nodes so the shards exist in multiple places and, in the case of one of the nodes going down, the various proxies fail over to the backup shards. This clustering setup has been running in full production for several months now with minimal problems.

We're looking to release all the code back to the community, but we need to clear it with our legal team first to make sure we're not compromising any of our more business-specific, proprietary code.
In total, we have:

an nginx module specifically set up for sharding databases
a 'smartproxy', written in Python/Twisted, for sharding views
and a few other ancillary pieces (replication notification, view updating, etc.)

Mainly, I just wanted to keep people from duplicating the work we've done -- hopefully we can release something back to the community in the next several weeks. We're having a meeting tomorrow morning to figure out what we can release right now (probably the nginx module, at the least). I'll let everyone know what our timeline looks like.

This is really cool. It sounds like a tool people could get started with right away to build big CouchDB clusters. Is there anything you can tell us about the size of clusters you've used?

The smartproxy code will probably be a really good illustration of what we'll probably want to implement in Erlang for CouchDB, eventually.

Again, +1 for software people can use today. Hopefully it turns out to be easy to release!

Cheers,
Chris

--
Chris Anderson
http://jchris.mfdz.com
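The numbers in Shaun's reply at the top of this message (48 shards, 4 nodes, about 300 queries/sec with roughly a third being views) can be worked through in a short Python sketch. The figures come from the message; everything else is an assumption made for illustration: the round-robin shard placement, the idea that every uncached view query touches all 48 shards, and the time-based `ViewCache` class, which is a hypothetical stand-in for whatever caching the smartproxy actually does.

```python
import time

NUM_SHARDS = 48                               # from the message
QUERIES_PER_SEC = 300                         # from the message
VIEW_QUERIES_PER_SEC = QUERIES_PER_SEC // 3   # "about 1/3" are view requests


def layout(num_shards, num_nodes):
    """Assign shards to nodes round-robin (an assumed placement policy)."""
    if not 1 <= num_nodes <= num_shards:
        raise ValueError("need between 1 and num_shards nodes")
    nodes = {node: [] for node in range(num_nodes)}
    for shard in range(num_shards):
        nodes[shard % num_nodes].append(shard)
    return nodes


current = layout(NUM_SHARDS, 4)   # today: 12 shards on each of 4 nodes
grown = layout(NUM_SHARDS, 8)     # same 48 shards spread over 8 nodes

# Worst case without a cache: every view query fans out to every shard.
SHARD_REQUESTS_PER_SEC = VIEW_QUERIES_PER_SEC * NUM_SHARDS


class ViewCache:
    """Tiny time-based cache for merged view results (hypothetical design)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (time stored, value)

    def get(self, key, compute):
        """Return a fresh cached value, or call compute() and store the result."""
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = compute()
        self.entries[key] = (now, value)
        return value


print({node: len(shards) for node, shards in current.items()})
print(SHARD_REQUESTS_PER_SEC)
```

Under these assumptions, 100 view queries a second would turn into 4800 shard requests a second if nothing were cached, which gives a feel for why the message calls caching necessary.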