Adam,
Adam Kocoloski wrote:
On Apr 23, 2010, at 8:52 AM, Miles Fidelman wrote:
- notes on the replication process (step-by-step, what happens when replication
is invoked - what code modules are involved and so forth), and/or,
couch_rep_* modules handle replication. How familiar are you with Erlang/OTP? couch_rep_sup is a
supervisor for all replications, each of which has a couch_rep gen_server and changes_feed,
missing_revs, reader, and writer processes. Each of those processes handles one part of the
"conversation" on the slide I pointed out to you two days ago. Data flows from
changes_feed -> missing_revs -> reader -> writer.
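For readers following the thread, the data flow described above can be pictured roughly as below. This is only an illustrative Python sketch, not CouchDB's Erlang code: the four function names simply mirror the process names mentioned, and the dict-based "databases", the revision strings, and the replicate() helper are all hypothetical.

```python
# Hypothetical model: a "database" is a dict of doc_id -> {rev: body}.
# Each stage is a generator feeding the next, mirroring
# changes_feed -> missing_revs -> reader -> writer.

def changes_feed(source):
    """Yield (doc_id, rev) for every revision known to the source."""
    for doc_id, revs in source.items():
        for rev in revs:
            yield doc_id, rev

def missing_revs(changes, target):
    """Pass through only the revisions the target does not have yet."""
    for doc_id, rev in changes:
        if rev not in target.get(doc_id, {}):
            yield doc_id, rev

def reader(missing, source):
    """Fetch the body of each missing revision from the source."""
    for doc_id, rev in missing:
        yield doc_id, rev, source[doc_id][rev]

def writer(docs, target):
    """Store each fetched revision on the target; return how many were written."""
    written = 0
    for doc_id, rev, body in docs:
        target.setdefault(doc_id, {})[rev] = body
        written += 1
    return written

def replicate(source, target):
    """Chain the four stages together (hypothetical helper)."""
    changes = changes_feed(source)
    missing = missing_revs(changes, target)
    docs = reader(missing, source)
    return writer(docs, target)
```

In the real system each stage is a separate Erlang process under the couch_rep gen_server, so the stages run concurrently rather than as lazily chained generators.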
Pretty familiar with Erlang at a conceptual/system level; starting to
take the time to get fluent in programming. Haven't done functional
languages in a long time.
- an overview of the code for someone new to the project - what lives in what
modules, how they string together - anything that might shortcut having to read
through every module and make sense of things from scratch
Anything - handwritten notes, slides from a code walkthrough, that kind of
thing.
Hi Miles, not to sound critical, but I don't think such a broad request will
get you very far. If you have specific questions I'll be happy to answer them.
With all due respect... lots of projects maintain documentation of
internals, particularly efforts focused on platform technologies
intended for long-term and broad-based application. Certainly in the
world of commercial software development it's the rare project that
doesn't have documentation providing a high level view of a large
software system -- it's pretty hard to either bring new team members on
board, or to perform long-term maintenance of code. Granted that it's a
bit harder to maintain this level of documentation on open-source
projects without steady funding, but I will point at some examples:
- Linux Kernel Internals: somewhat old (2.4), but
http://tldp.org/LDP/lki/index.html (I know there are updates)
- Apache HTTPD: http://httpd.apache.org/docs/2.2/developer/
- MongoDB, documentation of replication internals:
http://www.mongodb.org/display/DOCS/Replication+Internals
- or even http://wiki.github.com/erlang/otp/routemap-source-tree -
providing a basic overview of Erlang's internals
Please, take a shot at reading the code for the part you're interested in. If
you come across something you don't understand, send an email or join #couchdb
on IRC. Many of the devs hang out there regularly and can walk you through the
code. Best,
It doesn't seem that unreasonable to at least ask whether Couch has some
similar documentation floating around - if only at the level of notes
put together by an individual developer, or for discussion among developers.
Couch is certainly aiming at long-term viability as a platform for
broad-based use, and seems to be aiming at being a broad-based
open-source effort. To succeed over the long term, it will NEED to have
a good set of developer-level documentation. "Read the code" is not a
long-term solution.
Re. replication specifically, the couch_rep_* modules do not contain
much in the way of comments.
Personally, I've been involved in a LOT of network protocol-related work
(BBN, back to the ARPANET days). I've yet to see any kind of protocol
work where someone hasn't jotted down at least a sequence diagram and
some kind of dataflow diagram showing how all the pieces fit together.
More common is a full-blown ASN.1 description, and eventually an RFC in
full gory detail.
It does not seem unreasonable to ask if someone has jotted down notes
about the full set of steps executed, and code modules involved, when
Couch receives a "POST /_replicate" transaction.
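To be concrete about the transaction in question, here is a minimal Python sketch that only builds the request and does not send it. The helper name, the server address (port 5984 is assumed here as the usual CouchDB default), and the database names are placeholders for illustration; the JSON body with "source" and "target" fields is the form the replicator accepts.

```python
import json
import urllib.request

def build_replicate_request(server, source, target):
    """Build (but do not send) a POST /_replicate request.

    server, source and target are caller-supplied placeholders;
    this helper is hypothetical and not part of CouchDB.
    """
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example with made-up names; urllib.request.urlopen(req) would send it.
req = build_replicate_request(
    "http://localhost:5984", "db_a", "http://example.org:5984/db_b"
)
```

What happens inside Couch between receiving that body and the first document landing on the target is exactly the part I'd like to see written down.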
At the very least, it sure would be helpful to have something like:
http://httpd.apache.org/docs/2.2/developer/request.html, or
http://www.apachetutor.org/dev/request
to detail the sequence of events and code involved in request processing.
If, in fact, that kind of information has never been put on "paper," and
lives only in the source code and a few people's heads, that scares me a
lot vis-a-vis committing to Couch as a platform for any kind of serious
project.
Miles Fidelman
--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra