Adam,

Adam Kocoloski wrote:
On Apr 23, 2010, at 8:52 AM, Miles Fidelman wrote:
- notes on the replication process (step-by-step, what happens when replication 
is invoked - what code modules are involved and so forth), and/or,
couch_rep_* modules handle replication.  How familiar are you with Erlang/OTP?  couch_rep_sup is a 
supervisor for all replications, each of which has a couch_rep gen_server and changes_feed, 
missing_revs, reader, and writer processes.  Each of those processes handles one part of the 
"conversation" on the slide I pointed out to you two days ago.  Data flows from 
changes_feed ->  missing_revs ->  reader ->  writer.
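
(A minimal sketch of the data flow described above, written in Python rather than Erlang purely for illustration. It is not CouchDB code: the function names only mirror the process names mentioned, and the in-memory "source" and "target" structures are assumptions made up for this example.)

```python
# Toy model of the flow: changes_feed -> missing_revs -> reader -> writer.
# Hypothetical data shapes (assumptions, not CouchDB's):
#   source: list of (doc_id, rev, body) tuples in update order
#   target: dict mapping doc_id -> {rev: body}

def changes_feed(source):
    """Yield (doc_id, rev) for every change recorded on the source."""
    for doc_id, rev, _body in source:
        yield doc_id, rev

def missing_revs(changes, target):
    """Pass along only the revisions the target does not already hold."""
    for doc_id, rev in changes:
        if rev not in target.get(doc_id, {}):
            yield doc_id, rev

def reader(missing, source):
    """Fetch the body of each missing revision from the source."""
    bodies = {(doc_id, rev): body for doc_id, rev, body in source}
    for doc_id, rev in missing:
        yield doc_id, rev, bodies[(doc_id, rev)]

def writer(docs, target):
    """Store each fetched revision on the target; return the number written."""
    written = 0
    for doc_id, rev, body in docs:
        target.setdefault(doc_id, {})[rev] = body
        written += 1
    return written

def replicate(source, target):
    """Chain the four stages so data flows through them in order."""
    changes = changes_feed(source)
    missing = missing_revs(changes, target)
    docs = reader(missing, source)
    return writer(docs, target)
```

In this sketch each stage is a generator that consumes the previous one, which loosely stands in for separate processes passing messages; the real modules are separate Erlang processes under a supervisor and will differ in every detail.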

Pretty familiar with Erlang at a conceptual/system level; starting to take the time to get fluent in programming. Haven't done functional languages in a long time.

- an overview of the code for someone new to the project - what lives in what 
modules, how they string together - anything that might shortcut having to read 
through every module and make sense of things from scratch

Anything - handwritten notes, slides from a code walkthrough, that kind of 
thing.
Hi Miles, not to sound critical, but I don't think such a broad request will 
get you very far.  If you have specific questions I'll be happy to answer them.
With all due respect, lots of projects maintain documentation of internals, particularly efforts focused on platform technologies intended for long-term and broad-based application. Certainly in the world of commercial software development it's the rare project that doesn't have documentation providing a high-level view of a large software system; without it, it's pretty hard either to bring new team members on board or to perform long-term maintenance of code. Granted, it's a bit harder to maintain this level of documentation on open-source projects without steady funding, but I will point at some examples:

- Linux Kernel Internals: somewhat old (2.4), but http://tldp.org/LDP/lki/index.html (I know there are updates)
- Apache HTTPD: http://httpd.apache.org/docs/2.2/developer/
- MongoDB, documentation of replication internals: http://www.mongodb.org/display/DOCS/Replication+Internals
- or even http://wiki.github.com/erlang/otp/routemap-source-tree, providing a basic overview of Erlang's internals

Please, take a shot at reading the code for the part you're interested in.  If 
you come across something you don't understand, send an email or join #couchdb 
on IRC.  Many of the devs hang out there regularly and can walk you through the 
code.  Best,

It doesn't seem that unreasonable to at least ask whether Couch has some similar documentation floating around - if only at the level of notes put together by an individual developer, or for discussion among developers.

Couch is certainly aiming at long-term viability as a platform for broad-based use, and seems to be aiming at being a broad-based open-source effort. To succeed over the long term, it will NEED to have a good set of developer-level documentation. "Read the code" is not a long-term solution.

Re. replication specifically, the couch_rep_* modules do not contain much in the way of comments.

Personally, I've been involved in a LOT of network protocol-related work (BBN, back to the ARPANET days). I've yet to see any kind of protocol work where someone hasn't jotted down at least a sequence diagram and some kind of dataflow diagram showing how all the pieces fit together. More common is a full-blown ASN.1 description, and eventually an RFC in full gory detail.

It does not seem unreasonable to ask if someone has jotted down notes about the full set of steps executed, and code modules involved, when Couch receives a "POST /_replicate" transaction.

At the very least, it sure would be helpful to have something like:
http://httpd.apache.org/docs/2.2/developer/request.html, or
http://www.apachetutor.org/dev/request
to detail the sequence of events and code involved in request processing.

If, in fact, that kind of information has never been put on "paper," and lives only in the source code and a few people's heads, that scares me a lot vis-a-vis committing to Couch as a platform for any kind of serious project.

Miles Fidelman

--
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra

