[ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151401#comment-13151401 ]

Jan Lehnardt commented on COUCHDB-1342:
---------------------------------------

Good review Paul, thanks! :)

Let me quote and reply inline to each of your points:

> This patch looks like it's a mishmash of a couple patches that are applied 
> slightly differently. Not sure if that's a branch thing or what. But for 
> instance, there are a lot of id_btree(Db) -> Db#db.couch type changes.

These are deliberate. Currently we maintain an Fd/couch_file pair for reads and 
writes in a rather clumsy way, in the #btree.fd record and the couch_file 
module, and do a switcheroo between the two for writes vs. reads. Now all of 
that moves into couch_file, and couch_db no longer has to concern itself with 
the details of efficient file access.
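
To illustrate the direction (field names here are made up, the actual record in 
the patch may differ): couch_file keeps the read fd and the writer child in its 
own state, so callers like couch_db only ever hold a single couch_file handle.

  %% Illustrative sketch only, not the patch's actual record:
  -record(file_state, {
      fd,        %% fd the couch_file process uses to serve preads
      writer,    %% pid of the child process that performs all writes
      eof = 0    %% next append position, so writes can reply immediately
  }).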

> There's also a switch between couch_file:sync(Filepath) and 
> couch_file:sync(Fd). I'm not sure if that's on purpose or not because we 
> switched to Filepath on purpose. I see it as being possible but I don't see a 
> reference to it.

We used to pass in the filepath to allow the fsync() to be asynchronous, so it 
wouldn't block any readers or writers. That behaviour is now obsolete, as 
couch_file passes the fsync() request on to its writer child process.
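
As a rough sketch of the idea (module and message names are invented; the real 
writer also batches chunks and tracks positions): because the writer processes 
its mailbox in order, an fsync is only performed after all previously queued 
writes have hit the file.

  -module(writer_sync_sketch).
  -export([start_writer/1, sync/1]).

  %% The writer child owns the fd for writes.
  start_writer(Fd) ->
      spawn_link(fun() -> writer_loop(Fd) end).

  %% Ask the writer to fsync; it only gets to this message after it has
  %% handled every write queued before it.
  sync(Writer) ->
      Writer ! {sync, self()},
      receive {synced, Writer} -> ok end.

  writer_loop(Fd) ->
      receive
          {write, Bin} ->
              ok = file:write(Fd, Bin),
              writer_loop(Fd);
          {sync, From} ->
              ok = file:sync(Fd),
              From ! {synced, self()},
              writer_loop(Fd)
      end.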


> I'm really not a fan of exposing the inner workings of couch_file to every 
> client as can be seen by the massive number of flush calls that this creates. 
> I find it quite likely that we'll eventually end up missing one of these 
> flush calls and causing a nasty bug because the error would look like data 
> hadn't been written. I'd want to see things rewritten so that the usage of 
> couch_file doesn't change. Easiest solution I see would be to automatically 
> flush buffers if a read request comes in for something that's not on disk.

Good point, API-wise this is a bit leaky and we should definitely look into 
making it better, but I don't think it blocks the bulk of the improvements this 
patch introduces. I'm happy to open a new ticket about this. Good starting 
points are calling flush() inside couch_file 1) after writing a header and 2) 
when a read fails with {error, eof} (FWIW, Damien tried the latter before but 
removed it again; the details elude me).
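
For the latter, something along these lines could work (function and message 
names are made up, and it assumes the writer answers a flush request once its 
write queue is drained):

  %% Illustrative only: if a read runs past end-of-file because the data is
  %% still queued in the writer, flush once and retry the read.
  pread_with_flush(Fd, Writer, Pos, Len) ->
      case file:pread(Fd, Pos, Len) of
          eof ->
              Writer ! {flush, self()},
              receive {flushed, Writer} -> ok end,
              file:pread(Fd, Pos, Len);
          Other ->
              Other
      end.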

> I'm pretty sure we talked about this at one point, but can someone describe 
> the differences between this and Erlang's delayed_write mode for files?

The difference is that delayed_write does buffering, whereas we want to keep 
writing to the file all the time; delayed_write would not write until its 
buffer is full or the timeout is hit. In addition, with delayed_write we 
couldn't reliably tell, for delayed_commits=false writes, when the data has 
actually hit the file.
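
For reference, this is roughly what delayed_write buffering looks like in stock 
Erlang (the file name and buffer parameters are placeholders):

  %% Writes are held back until the buffer fills up or the delay elapses;
  %% there is no way to ask whether a particular write has reached the file.
  {ok, Fd} = file:open("db.couch", [append, binary,
                                    {delayed_write, 64 * 1024, 2000}]),
  ok = file:write(Fd, <<"some chunk">>),  %% may sit in the buffer for a while
  ok = file:close(Fd).                    %% remaining buffer is flushed here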

> couch_file becomes much more complicated with this patch. I'd prefer to see 
> it simplified if at all possible. I'm not entirely certain how it'd look but 
> I might start by making a couch_file_reader and couch_file_writer gen_servers 
> instead of having bare processes in couch_file.

The whole point of this patch is to make couch_file smarter, move the 
reader/writer abstraction further down the stack, and reap the associated 
performance benefits. Unless we have an alternate patch, we can't really 
compare the two approaches.

> I have to say that this patch scares me a bit. If we make the switch to 
> something like this then we're opening up a whole new type of failure 
> scenario where file "writes" appear to succeed but then end up failing after 
> the caller has moved on. While there are certainly cases where I can see this 
> being a fine tradeoff (view updaters, compactors, etc) it worries me a bit 
> for the main database file. The fact is that I can't (yet) reasonably 
> consider all of the possible new failure modes and convince myself that 
> things are correct. I've already seen some super weird reactions to things 
> like running out of disk space, it seems like not knowing about such an error 
> until you've "written" a megabyte of data seems awkward.

This doesn't change the error scenarios. We already rely on monitoring to tell 
the request process that a couch_file write didn't work: 

  https://github.com/fdmanana/couchdb/blob/master/src/couchdb/couch_db.erl#L819 
ff. (825 in particular)
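
That is, the standard monitor pattern, roughly (the message shapes below are 
illustrative, not the exact couch_db code):

  %% The caller monitors the couch_file pid, so a crash during an async write
  %% surfaces as a 'DOWN' message instead of a silently "successful" write.
  call_file(FilePid, Request, Timeout) ->
      Ref = erlang:monitor(process, FilePid),
      FilePid ! {request, self(), Ref, Request},
      receive
          {reply, Ref, Result} ->
              erlang:demonitor(Ref, [flush]),
              Result;
          {'DOWN', Ref, process, FilePid, Reason} ->
              {error, Reason}
      after Timeout ->
          erlang:demonitor(Ref, [flush]),
          {error, timeout}
      end.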
  
   
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically, it replies immediately
> to the process asking to write something to the file,
> with the position where the chunks will be written,
> while a dedicated child process keeps collecting chunks
> and writing them to the file (batching them when
> possible). After issuing a series of write requests to
> the file module, the caller can call its 'flush'
> function, which blocks until all the chunks it requested
> to write are effectively written to the file.
> This maximizes use of the IO subsystem: for example,
> while the updater is traversing and modifying the btrees
> and doing CPU-bound tasks, the writes happen in parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: 
> https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
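
A minimal sketch of the append/flush protocol described above, with invented 
names (the real couch_file additionally batches chunks and propagates write 
errors):

  -module(async_append_sketch).
  -export([start/1, append/2, flush/1]).

  %% Open the file and spawn the writer child that owns all writes.
  start(Path) ->
      {ok, Fd} = file:open(Path, [append, binary]),
      {ok, Eof} = file:position(Fd, eof),
      Writer = spawn_link(fun() -> loop(Fd) end),
      {ok, {Writer, Eof}}.

  %% Reply immediately with the position this chunk will occupy; the actual
  %% disk write happens later in the writer process.
  append({Writer, Eof}, Bin) ->
      Writer ! {write, Bin},
      {ok, Eof, {Writer, Eof + byte_size(Bin)}}.

  %% Block until every chunk queued before this call has been written.
  flush({Writer, _Eof}) ->
      Writer ! {flush, self()},
      receive {flushed, Writer} -> ok end.

  loop(Fd) ->
      receive
          {write, Bin} ->
              ok = file:write(Fd, Bin),
              loop(Fd);
          {flush, From} ->
              From ! {flushed, self()},
              loop(Fd)
      end.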


        
