Re: Understanding the CouchDB file format

2011-12-25 Thread Paul Davis
On Thu, Dec 22, 2011 at 8:53 PM, Riyad Kalla wrote: > Randall, > > Spot on; we are on the same page now. I'll go through the post again > tomorrow morning and reword it a bit so the thoughts aren't so fragmented. > > Best, > Riyad > > P.S.> Are there discussions anywhere on alternative file format

Re: Understanding the CouchDB file format

2011-12-22 Thread Riyad Kalla
Randall, Spot on; we are on the same page now. I'll go through the post again tomorrow morning and reword it a bit so the thoughts aren't so fragmented. Best, Riyad P.S.> Are there discussions anywhere on alternative file formats for the indices that the Couch community has considered in the pas

Re: Understanding the CouchDB file format

2011-12-22 Thread Randall Leeds
On Thu, Dec 22, 2011 at 12:11, Riyad Kalla wrote: > On Thu, Dec 22, 2011 at 12:38 PM, Robert Newson wrote: > >> There are a >> few parts of the article that are inaccurate (like the assertion we >> have good locality for the id and seq trees. If this were true we >> wouldn't have seen such a huge

Re: Understanding the CouchDB file format

2011-12-22 Thread Riyad Kalla
On Thu, Dec 22, 2011 at 12:38 PM, Robert Newson wrote: > It reads well as an article but needs some polish before it could be a > great wiki page. I suggest that if it does go up, it is clearly marked > as a draft, and we all chip in to sculpt it into shape. > Great idea. Agreed that it is no wh

Re: Understanding the CouchDB file format

2011-12-22 Thread Robert Newson
It reads well as an article but needs some polish before it could be a great wiki page. I suggest that if it does go up, it is clearly marked as a draft, and we all chip in to sculpt it into shape. Particularly, the author is very enthusiastic but this mars the article (the all-caps, the ad-hoc me

Re: Understanding the CouchDB file format

2011-12-22 Thread Riyad Kalla
Jan, Thank you and yes, I'd be happy to contribute this to the wiki. I made some edits early this morning after some feedback. If a few folks in-the-know could give it a quick read-through to make sure I didn't get anything wrong then I'm happy to work it up on the wiki as well (or send it along

Re: Understanding the CouchDB file format

2011-12-22 Thread Jan Lehnardt
Good writeup! Would you consider contributing it to the CouchDB Wiki? http://wiki.apache.org/couchdb/ Cheers Jan -- On Dec 21, 2011, at 21:28, Riyad Kalla wrote: > Bob, > > Really appreciate the link; Rick has a handful of articles that helped a > lot. > > Alongside all the CouchDB readi

Re: Understanding the CouchDB file format

2011-12-21 Thread Riyad Kalla
Thank you Robert, fixed. On Wed, Dec 21, 2011 at 1:42 PM, Robert Dionne wrote: > Riyad, > > You're welcome. At a quick glance your post has one error: internal nodes do > contain values (from the reductions). The appendix in the CouchDB book also > makes this error[1], which I've opened a ticket fo

Re: Understanding the CouchDB file format

2011-12-21 Thread Robert Dionne
Riyad, You're welcome. At a quick glance your post has one error: internal nodes do contain values (from the reductions). The appendix in the CouchDB book also makes this error[1], which I've opened a ticket for. Cheers, Bob [1] https://github.com/oreilly/couchdb-guide/issues/450 On Dec 21,

Re: Understanding the CouchDB file format

2011-12-21 Thread Riyad Kalla
Bob, Really appreciate the link; Rick has a handful of articles that helped a lot. Alongside all the CouchDB reading, I've been looking at SSD-optimized data storage mechanisms and tried to coalesce all of this information into this post on Couch's file storage format: https://plus.google.com/u/0

Re: Understanding the CouchDB file format

2011-12-21 Thread Robert Dionne
I think this is largely correct, Riyad. I dug out an old article[1] by Rick Ho that you may also find helpful, though it might be slightly dated. Generally the best performance will be had if the ids are sequential and updates are done in bulk. Write-heavy applications will eat up a lot of space a

Re: Understanding the CouchDB file format

2011-12-21 Thread Riyad Kalla
Adding to this conversation, I found this set of slides by Chris explaining the append-only index update format: http://www.slideshare.net/jchrisa/btree-nosql-oak?from=embed Specifically slides 16, 17 and 18. Using this example tree, rewriting the updated path (in reverse order) appended to the e
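
The append-only path rewrite mentioned above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not CouchDB's actual on-disk format: the `log` list stands in for the database file, list indexes stand in for byte offsets, and the record tuples and the `append`, `update`, and `lookup` helpers are hypothetical names made up for the illustration. Node splitting and rebalancing are left out.

```python
# Toy append-only tree: nothing in `log` is ever modified in place.
log = []  # stand-in for the file; a list index plays the role of a byte offset


def append(record):
    """Append a record to the end of the log and return its offset."""
    log.append(record)
    return len(log) - 1


def update(offset, key, value):
    """Write a changed copy of every node on the path to `key`.

    The recursion appends the new leaf first and the new root last,
    so the newest root is always the final record in the log.
    """
    kind, body = log[offset]
    if kind == "leaf":
        new_body = dict(body)          # copy, never mutate the old leaf
        new_body[key] = value
        return append(("leaf", new_body))
    # Interior node: body is a list of (highest_key_in_child, child_offset).
    new_body = list(body)
    for i, (high_key, child) in enumerate(body):
        if key <= high_key or i == len(body) - 1:
            new_child = update(child, key, value)
            new_body[i] = (max(high_key, key), new_child)
            break
    return append(("node", new_body))


def lookup(offset, key):
    """Read a value starting from the root stored at `offset`."""
    kind, body = log[offset]
    if kind == "leaf":
        return body.get(key)
    for i, (high_key, child) in enumerate(body):
        if key <= high_key or i == len(body) - 1:
            return lookup(child, key)
    return None


# Build a two-level tree: two leaves and one root.
left = append(("leaf", {"a": 1, "b": 2}))
right = append(("leaf", {"c": 3, "d": 4}))
old_root = append(("node", [("b", left), ("d", right)]))

# Updating "c" appends a new right leaf and then a new root.
new_root = update(old_root, "c", 30)
```

In this sketch the untouched left leaf is shared by both roots, and a reader still holding `old_root` keeps seeing the old value, which is the property an append-only layout is after.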

Re: Understanding the CouchDB file format

2011-12-20 Thread Riyad Kalla
@Filipe - I was just not clear on how CouchDB operated; you and Robert cleared that up for me. Thank you. @Robert - The writeup is excellent so far (I am not familiar with Erlang, so there is a bit of stickiness there), thank you for taking the time to put this together! At this point I am curiou

Re: Understanding the CouchDB file format

2011-12-20 Thread Filipe David Manana
On Tue, Dec 20, 2011 at 8:27 PM, Riyad Kalla wrote: > Filipe, > > Thank you for the reply. > > Maybe I am misunderstanding exactly what couch is writing out; the docs > I've read say that it "rewrites the root node" -- I can't tell if the docs > mean the parent node of the child doc that was chang

Re: Understanding the CouchDB file format

2011-12-20 Thread Robert Dionne
On Dec 20, 2011, at 3:27 PM, Riyad Kalla wrote: > Filipe, > > Thank you for the reply. > > Maybe I am misunderstanding exactly what couch is writing out; the docs > I've read say that it "rewrites the root node" --

Re: Understanding the CouchDB file format

2011-12-20 Thread Riyad Kalla
Filipe, Thank you for the reply. Maybe I am misunderstanding exactly what couch is writing out; the docs I've read say that it "rewrites the root node" -- I can't tell if the docs mean the parent node of the child doc that was changed (as one of the b+ leaves) or if it means the direct path, from

Re: Understanding the CouchDB file format

2011-12-20 Thread Filipe David Manana
On Tue, Dec 20, 2011 at 6:24 PM, Riyad Kalla wrote: > I've been reading everything I can find on the CouchDB file format[1] and > am getting bits and pieces here and there, but not a great, concrete, > step-by-step explanation of the process. > > I'm clear on the use of B+ trees and after reading

Understanding the CouchDB file format

2011-12-20 Thread Riyad Kalla
I've been reading everything I can find on the CouchDB file format[1] and am getting bits and pieces here and there, but not a great, concrete, step-by-step explanation of the process. I'm clear on the use of B+ trees and after reading a few papers on the benefits of log-structured file formats, I