[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-16 Thread Jan Lehnardt (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743856#action_12743856
 ] 

Jan Lehnardt commented on COUCHDB-449:
--

For completeness: I turned on delayed commits for the test suite in r804727.

 Turn off delayed commits by default
 ---

 Key: COUCHDB-449
 URL: https://issues.apache.org/jira/browse/COUCHDB-449
 Project: CouchDB
  Issue Type: Bug
  Components: Database Core
Affects Versions: 0.9, 0.9.1
Reporter: Jan Lehnardt
Assignee: Adam Kocoloski
Priority: Blocker
 Fix For: 0.10

 Attachments: delayed_commits_v1.patch


 Delayed commits make CouchDB significantly faster. They also open a
 one-second window for data loss. In 0.9 and trunk, delayed commits are enabled
 by default and can be overridden with HTTP headers and an explicit API call to
 flush the write buffer. I suggest turning off delayed commits by default and
 using the same overrides to enable them per request. A per-database option is
 possible, too.
 One concern is developer workflow speed. The setting affects test suite
 performance significantly. I'd opt to change couch.js to set the appropriate
 header to enable delayed commits for tests.
 CouchDB should guarantee data safety first and speed second, with sensible
 overrides.
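
As a rough illustration of the per-request overrides described above, the Python sketch below builds (but does not send) a write that carries the X-Couch-Full-Commit header, plus an explicit flush request. It assumes the flush call is a POST to the database's _ensure_full_commit resource; the host, the database name and the helper names build_write and build_flush are made up for the example.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:5984"  # assumed local CouchDB address


def build_write(db, doc, full_commit=False):
    """Build a POST that creates `doc`; optionally ask for a full commit."""
    headers = {"Content-Type": "application/json"}
    if full_commit:
        # Per-request override: ask the server to commit before responding.
        headers["X-Couch-Full-Commit"] = "true"
    return urllib.request.Request(
        f"{BASE}/{db}",
        data=json.dumps(doc).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def build_flush(db):
    """Build the explicit flush call (assumed: POST /db/_ensure_full_commit)."""
    return urllib.request.Request(
        f"{BASE}/{db}/_ensure_full_commit",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


safe_write = build_write("test", {}, full_commit=True)
fast_write = build_write("test", {})
flush = build_flush("test")
# To actually send a request: urllib.request.urlopen(safe_write)
```

Keeping request construction separate from sending makes it easy to see which writes ask for durability and which rely on whatever the server default happens to be.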

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-06 Thread Brian Candler (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739942#action_12739942
 ] 

Brian Candler commented on COUCHDB-449:
---


Just to be clear: is _changes supposed to update only after a commit has
taken place, and not after a write?

If so, I cannot demonstrate it. If I write a document and then immediately
read _changes, it always appears. See below at (*).

Furthermore, the same is true if I run

$ curl http://127.0.0.1:5984/test/_changes?feed=continuous

in another window. As soon as I add a document in the first window, it
appears in the _changes feed.

My very rough scan of the source suggests that a delayed commit should take
place after 1 second:

Delay and (Db#db.waiting_delayed_commit == nil) ->
    Db#db{waiting_delayed_commit=
        erlang:send_after(1000, self(), delayed_commit)};

So if that's right, and what you say is true, then I would expect not to see
the document in _changes for this long.
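
To make that reading of the source concrete, here is a toy Python model of the timer logic. It is only a sketch of what the quoted guard appears to do, not CouchDB's implementation: the class, its fields and the simulated clock are all invented for illustration, with commit_due_at standing in for waiting_delayed_commit.

```python
class DelayedCommitDb:
    """Toy model: a commit is scheduled 1000 ms after the first uncommitted write."""

    DELAY_MS = 1000

    def __init__(self):
        self.update_seq = 0        # bumped on every write
        self.committed_seq = 0     # last sequence assumed to be on disk
        self.commit_due_at = None  # plays the role of waiting_delayed_commit

    def write(self, now_ms):
        self.update_seq += 1
        # Only arm the timer if none is pending, mirroring the "== nil" guard.
        if self.commit_due_at is None:
            self.commit_due_at = now_ms + self.DELAY_MS
        return self.update_seq

    def tick(self, now_ms):
        """Advance the simulated clock; fire the delayed commit if it is due."""
        if self.commit_due_at is not None and now_ms >= self.commit_due_at:
            self.committed_seq = self.update_seq
            self.commit_due_at = None


db = DelayedCommitDb()
db.write(now_ms=0)
db.write(now_ms=400)   # a timer is already pending, so it is not re-armed
db.tick(now_ms=999)    # too early: nothing committed yet
early = db.committed_seq
db.tick(now_ms=1000)   # timer fires and both writes are committed together
late = db.committed_seq
```

In this model a write can stay uncommitted for up to one second, which is the window in which it would be expected to be missing from a commit-driven feed.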

OTOH, with batch=ok the commit is delayed indefinitely. (I have raised this
as a separate ticket, COUCHDB-454.)

All tested with HEAD (git commit aebdb31001126dab6b579b8cc2e605ef7ec499c6)
and 12b5 under Jaunty.

Regards,

Brian.

(*)
$ curl -X DELETE http://127.0.0.1:5984/test
{"ok":true}
$ curl -X PUT http://127.0.0.1:5984/test
{"ok":true}
$ curl http://127.0.0.1:5984/test/_changes
{"results":[

],
"last_seq":0}

$ curl -X POST -d'{}' http://127.0.0.1:5984/test; curl http://127.0.0.1:5984/test/_changes
{"ok":true,"id":"70708dcbc2977b759365f9731f27","rev":"1-967a00dff5e02add41819138abb3284d"}
{"results":[
{"seq":1,"id":"70708dcbc2977b759365f9731f27","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]}
],
"last_seq":1}

$ curl -X POST -d'{}' http://127.0.0.1:5984/test; curl http://127.0.0.1:5984/test/_changes
{"ok":true,"id":"1d4596c1cb715c0da9f99980fea0a3a2","rev":"1-967a00dff5e02add41819138abb3284d"}
{"results":[
{"seq":1,"id":"70708dcbc2977b759365f9731f27","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":2,"id":"1d4596c1cb715c0da9f99980fea0a3a2","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]}
],
"last_seq":2}

$ curl -X POST -d'{}' http://127.0.0.1:5984/test; curl http://127.0.0.1:5984/test/_changes
{"ok":true,"id":"a2feeaaca391446bb7a0f24c359ff79e","rev":"1-967a00dff5e02add41819138abb3284d"}
{"results":[
{"seq":1,"id":"70708dcbc2977b759365f9731f27","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":2,"id":"1d4596c1cb715c0da9f99980fea0a3a2","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":3,"id":"a2feeaaca391446bb7a0f24c359ff79e","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]}
],
"last_seq":3}

$ curl -X POST -d'{}' http://127.0.0.1:5984/test; curl -X POST -d'{}' http://127.0.0.1:5984/test; curl -X POST -d'{}' http://127.0.0.1:5984/test; curl http://127.0.0.1:5984/test/_changes
{"ok":true,"id":"a2262a5904690aec5c64bb61f44903ed","rev":"1-967a00dff5e02add41819138abb3284d"}
{"ok":true,"id":"26fdac7e139531e0f4352a089d4db7f4","rev":"1-967a00dff5e02add41819138abb3284d"}
{"ok":true,"id":"f6bb36540484788becd54391dbc6189b","rev":"1-967a00dff5e02add41819138abb3284d"}
{"results":[
{"seq":1,"id":"70708dcbc2977b759365f9731f27","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":2,"id":"1d4596c1cb715c0da9f99980fea0a3a2","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":3,"id":"a2feeaaca391446bb7a0f24c359ff79e","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":4,"id":"a2262a5904690aec5c64bb61f44903ed","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":5,"id":"26fdac7e139531e0f4352a089d4db7f4","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]},
{"seq":6,"id":"f6bb36540484788becd54391dbc6189b","changes":[{"rev":"1-967a00dff5e02add41819138abb3284d"}]}
],
"last_seq":6}






[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-06 Thread Adam Kocoloski (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740093#action_12740093
 ] 

Adam Kocoloski commented on COUCHDB-449:


+1 on turning off delayed commits by default
+1 for enabling them on a per-DB basis
+0 for making the threshold configurable

We should add a DB-level configuration facility at some point.  It'd be nice to 
be able to edit this setting (and others, like continuous replication) without 
server-level admin privileges.




[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-05 Thread Jason Davies (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739403#action_12739403
 ] 

Jason Davies commented on COUCHDB-449:
--

+1. Can we make this a config setting? delayed_commits = false by default, but
it can be turned on for a node by speed junkies.




[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-05 Thread Jan Lehnardt (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739406#action_12739406
 ] 

Jan Lehnardt commented on COUCHDB-449:
--

Good idea, Jason. Or a new section:

[delayed_commits]
dbname = true
dbname2  = false
...
...

That way you can have a safe db for your app and a fast db for, say, logging.
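
A minimal Python sketch of how such a per-database section might be read, assuming an ini-style file like the one proposed here. The [delayed_commits] section is only a suggestion in this thread, not an existing CouchDB option, and the database names and the helper delayed_commits_enabled are hypothetical.

```python
import configparser

# Hypothetical ini fragment following the proposal; not an existing CouchDB option.
INI = """
[delayed_commits]
app_db = false
logging_db = true
"""


def delayed_commits_enabled(config, dbname, default=False):
    """Return the per-database flag, falling back to the safe default (off)."""
    return config.getboolean("delayed_commits", dbname, fallback=default)


config = configparser.ConfigParser()
config.read_string(INI)

app_fast = delayed_commits_enabled(config, "app_db")
logging_fast = delayed_commits_enabled(config, "logging_db")
unlisted_fast = delayed_commits_enabled(config, "some_other_db")
```

Databases that are not listed fall back to the default, so the safe behaviour applies unless someone opts a database in explicitly.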




[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-05 Thread Brian Candler (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739430#action_12739430
 ] 

Brian Candler commented on COUCHDB-449:
---


Or perhaps you could set a different periodic flush interval for each
database, with 0 equivalent to no delayed commit.

For me, the specific question is: what guarantees does CouchDB give clients
about the safety of their data, and when? For example, at the point where
they get an HTTP response?

There are at least three different scenarios that I'm aware of at the
moment.
1. client supplies 'batch=ok' URL parameter
2. client supplies no special parameters
3. client supplies 'X-Couch-Full-Commit: true' header

From the client's perspective, I can see no difference between (1) and (2).
After receiving a HTTP response, the data is likely to make it to disk at
some time in the future, but it could be lost if the plug is pulled in the
next few seconds.

In case (3), the document is guaranteed to be on disk after the HTTP
response is returned [as long as drive internal write cache is disabled].
This is equivalent to QOS level 1 in the MQTT protocol:
http://publib.boulder.ibm.com/infocenter/wmbhelp/v6r0m0/index.jsp?topic=/com.ibm.etools.mft.doc/ac10850_.htm

However, it also forces writes of everything received up to this point, so
it's very inefficient if you are doing lots of writes with this header on.

Sometimes, you don't require data to be written to disk immediately, but you
do want to be notified *when* it has been written to disk in order to take
some subsequent action (such as acknowledging the successful save to a
downstream consumer).

I would like to propose an alternative approach similar to TCP sequence
numbers. We already have a sequence number which counts documents added to
the database (update_seq). I suggest we keep a separate watermark which is
the sequence number when the database was last flushed to disk (say
flush_seq).

Now:

- when you PUT a document, send the update_seq as part of the response
  (let's call it doc_seq)

- update_seq may continue to increment as more documents are updated

- at some point in the future, when data is flushed to disk, set
  flush_seq := update_seq

- if the client is interested to know when its document has been flushed
  to disk, it can poll mydb to check for flush_seq >= doc_seq (a later
  flush may also cover documents written after this one)

- it could be an option in the HTTP request to delay the response until
  flush_seq >= doc_seq

That means you would get the benefit of knowing that the document had been
committed to disk, without the cost of having to commit it. Rather, you wait
until someone else wants to force a full commit, or the periodic full commit
takes place.
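
The bookkeeping in this proposal can be sketched in a few lines of Python. This is a toy in-memory model, not CouchDB code: update_seq, flush_seq and doc_seq are the names used in the proposal, while the WatermarkDb class and its methods are invented for the example.

```python
class WatermarkDb:
    """Toy model of the proposed update_seq / flush_seq bookkeeping."""

    def __init__(self):
        self.update_seq = 0  # counts document updates
        self.flush_seq = 0   # value of update_seq at the last flush to disk

    def put(self, doc):
        """Accept a write and return its sequence number (doc_seq)."""
        self.update_seq += 1
        return self.update_seq

    def flush(self):
        """Stand-in for a full commit: everything written so far is durable."""
        self.flush_seq = self.update_seq

    def is_durable(self, doc_seq):
        # Compare with >=, since later writes may be flushed in the same commit.
        return self.flush_seq >= doc_seq


db = WatermarkDb()
first = db.put({"n": 1})
second = db.put({"n": 2})
before = db.is_durable(first)  # no flush has happened yet
db.flush()
after = db.is_durable(first)   # covered by the flush that also covered `second`
```

A client that kept the returned doc_seq would only need to compare it against the watermark, either by polling or by having the server hold the response until the watermark catches up.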

Then the only per-database tunable you need is the periodic commit interval.
Set it to 5 seconds for logging databases; 0.2 for RADIUS accounting (where
you want to generate a response within 200ms); and 0 if you want every
single document to be committed as soon as it arrives.

Thoughts?

Something like this is doable at present, but requires a buffering proxy.
For example, you can receive RADIUS accounting updates into a buffer, then
every 200ms do a POST to _bulk_docs with X-Couch-Full-Commit: true and
return success to all the clients.
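
A rough Python sketch of the batching half of such a proxy is shown below. It only builds the bulk request and does not send it; the host, the database name "radius", the document contents and the BufferingProxy class are assumptions for illustration, and the 200 ms timer and client acknowledgements are left out.

```python
import json
import urllib.request

BULK_URL = "http://127.0.0.1:5984/radius/_bulk_docs"  # assumed host and database


class BufferingProxy:
    """Collects documents and turns each batch into one fully committed bulk write."""

    def __init__(self):
        self.pending = []

    def receive(self, doc):
        self.pending.append(doc)

    def build_flush_request(self):
        """Drain the buffer into a single _bulk_docs POST, or None if it is empty."""
        if not self.pending:
            return None
        batch, self.pending = self.pending, []
        return urllib.request.Request(
            BULK_URL,
            data=json.dumps({"docs": batch}).encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "X-Couch-Full-Commit": "true",
            },
            method="POST",
        )


proxy = BufferingProxy()
proxy.receive({"session": "a"})
proxy.receive({"session": "b"})
request = proxy.build_flush_request()
# A real proxy would call urllib.request.urlopen(request) every 200 ms and
# acknowledge the buffered clients only after the response arrives.
```

One full commit then covers the whole batch, which is where the saving over a full commit per document comes from.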

Since CouchDB has to buffer these documents in the VFS cache anyway, it
would be convenient (and more efficient) to let it handle the periodic
flushing too.

Regards,

Brian.





[jira] Commented: (COUCHDB-449) Turn off delayed commits by default

2009-08-05 Thread Jan Lehnardt (JIRA)

[ 
https://issues.apache.org/jira/browse/COUCHDB-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739462#action_12739462
 ] 

Jan Lehnardt commented on COUCHDB-449:
--

Brian, thanks for your thoughts :)

The "just write and let me know when things have been committed" case can
already be handled with the _changes feed. No need for a separate sequence id.
