On Mon, Aug 16, 2010 at 1:54 PM, Miles Fidelman <mfidel...@meetinghouse.net> wrote:
> Hi Folks,
>
> I wonder if someone might share some insight into why Erlang was chosen for
> CouchDB.
>
> Don't get me wrong, I think Erlang is a really cool language/environment;
> I'm a big fan of designs that spawn lots of independent processes, and
> communicating via messages. But... it doesn't seem like CouchDB takes
> advantage of all that much of Erlang's unique capabilities.
>
> Hence, I'm sort of wondering why Erlang for CouchDB, and if there are any
> visions of taking more advantage of Erlang down the road.
>
> Thanks,
>
> Miles Fidelman
>
> --
> In theory, there is no difference between theory and practice.
> In<fnord> practice, there is. .... Yogi Berra

Miles,

Firstly, I'd like to reemphasize that CouchDB does use Erlang in very Erlangy ways. There's quite a bit more to the language than just message passing, though in the end this thread has seemed to focus on the use of message passing (or rather, the lack thereof) in the replication protocol.

I can't speak for Damien on why exactly he decided to use HTTP for the replicator, but I can say that if I were going to design it from scratch I would probably make very similar choices. Partly that's for the reasons others have given, in that it's ubiquitous and does very well with firewall traversal, but those aren't the main points by a long shot.

The biggest thing that an HTTP replicator has going for it is its simplicity. The entire protocol can be summed up in as little as "open an HTTP connection, stream documents edited after the last replication." Even with that simple idea, a very large amount of engineering has gone into it. We have to take into account Erlang's memory model, exponential backoff when links go wonky, resumption when they come back, tracking replication histories, filtered replication, continuous replication, authentication, etc. And those are just the points I know from listening to the discussion. I bet Adam Kocoloski and Filipe Manana could go on for hours on the details I just glossed over.

I don't think switching the replicator to a more advanced protocol is really in the cards, given the problem that the current replication scheme is meant to solve. I think that implementing a solution that uses P2P/UUCP/multicast discovery would be an awesome feature, but not something I would see going into the 'core' CouchDB distribution until someone steps up with a long-term commitment to supporting it.

Also of interest is that once you get to the scale of hundreds or thousands of nodes, you're probably not going to want to use Erlang's native message passing.
Either you're going to be in a datacenter, which means you'll want fine-grained control over network utilization, or you're going to be distributed, in which case epmd/messages will have the usual firewall/NAT issues. Some other interesting points are mentioned in a recent thread [1] on erlang-questions.

Whether the replicator breaks HTTP is rather more of a philosophical debate best left for when I've had a few beers. I don't discount your points that SOAP/XML-RPC suck hard, but I don't think they have any bearing on the replication protocol given how it's implemented.

HTH,
Paul Davis

[1] http://www.erlang.org/cgi-bin/ezmlm-cgi?4:msp:52886:ecobpklllbhjdniiklhn