Re: delayed_commits false
On Wed, Jul 7, 2010 at 2:29 PM, Jan Lehnardt wrote:
>> Anyway rereading the doc we provide, I think delayed_commit don't
>> break the promise since db is always in consistent state.
>
> "consistent state" means CouchDB can start up and use that db
> without any fixup phase (cf. myisamchk, or InnoDB log-replay).

yes, that's why I'm ok with the promise :)

- benoit
Re: delayed_commits false
On Jul 6, 2010, at 6:56 PM, Robert Newson wrote:
> I had started a page to capture the nuances of these settings at
> http://wiki.apache.org/couchdb/Durability_Matrix but never finished
> it. It's possible some of the prose could be reshaped into a concise
> summary of the difficult balancing act we're attempting here.

This looks good to me. (Not perfect, but it's on topic, and can be
brought up to the quality we care about.)

The page doesn't actually mention the delayed_commits config setting.
Adding a discussion of that, and doing a basic page cleanup, should be
simple.

Committed the docs patch and I'm ready for 1.0.

Chris
Re: delayed_commits false
On 07.07.2010 14:20, Benoit Chesneau wrote:
> On Wed, Jul 7, 2010 at 1:40 PM, Jan Lehnardt wrote:
>> On 7 Jul 2010, at 08:31, Benoit Chesneau wrote:
>>> I dislike to have too much options though.
>>>
>>> @damien
>>> I don't understand this "keep it for 1.0" mantra. Since it's more a
>>> "philosophical" change than a technical one, I would prefer that
>>> change on 1.0 whatever this number means. How do people use CouchDB
>>> in production ? Is delayed_commit turned off most of the time ?
>>
>> I am with Damien, all other releases ship with the current default
>> setting. Changing it *a day* before 1.0 does not sound like the way
>> to go.
>
> right, it may be too late to do that now.
>
> Anyway rereading the doc we provide, I think delayed_commit don't
> break the promise since db is always in consistent state.
>
> But we may need like you said, to clarify some docs imo and maybe give
> to users some tricks to help them to flush to disk before any machine
> shutdown or such ?
>
>>> About the use on laptop and co, laptops are likely less stable than
>>> server machines, and we tend to shutdown them more often too. With
>>> delayed_commit=True, when someone shutdown his laptop and forget to
>>> apply delayed commit (and most of the time, if we don't automatize
>>> that, I bet he will), data in memory will be lost.
>>
>> I remember two instances of data loss with CouchDB 0.7 on Mac OS X
>> when Erlang wasn't using the fully reliable FULLFSYNC option to flush
>> data to disk (which is btw more reliable than on any other system).
>> After we patched Erlang to do the right thing on Mac OS X, I haven't
>> heard of a single instance of data loss on a laptop or anywhere else.
>
> There was a recent discussion about couchdb shutdown that was raised
> by this default settings. In my case I never lost data but just
> because I set couchdb as the last to shutdown.
>
>>> As a user of openbsd, one of the reasons I use this system (except
>>> its simplicity) is that it is secured by default on the contrary most
>>> linuxes/bsds aren't. Most of the openbsd users know that security
>>> will impact performances. I think I would prefer to have a completely
>>> safe couchdb even if performances decreased.
>>
>> This brings up an interesting point: we ship the CouchDB source.
>> Most users use CouchDB through some distribution (apt-get, rpm,
>> CouchDBX etc.). The distributors should decide for their users which
>> setting is best.
>
> That's indeed a way to handle that, this is again a doc problem I guess.
>
>> Again, I'd like to keep the default we had forever and commence
>> with the release of 1.0.
>
> for both reasons here, you have my +1 :)
>
> - benoit

Good points everyone. Here's my +1 for keeping the current default.

I'd say, I've opened the case, now I close the case.

Cheers,
Volker
Re: delayed_commits false
On 7 July 2010 13:20, Benoit Chesneau wrote:
> On Wed, Jul 7, 2010 at 1:40 PM, Jan Lehnardt wrote:
>> On 7 Jul 2010, at 08:31, Benoit Chesneau wrote:
>>> I dislike to have too much options though.
>>>
>>> @damien
>>> I don't understand this "keep it for 1.0" mantra. Since it's more a
>>> "philosophical" change than a technical one, I would prefer that
>>> change on 1.0 whatever this number means. How do people use CouchDB
>>> in production ? Is delayed_commit turned off most of the time ?
>>
>> I am with Damien, all other releases ship with the current default
>> setting. Changing it *a day* before 1.0 does not sound like the way
>> to go.
>
> right, it may be too late to do that now.
>
> Anyway rereading the doc we provide, I think delayed_commit don't
> break the promise since db is always in consistent state.
>
> But we may need like you said, to clarify some docs imo and maybe give
> to users some tricks to help them to flush to disk before any machine
> shutdown or such ?
>
>>> About the use on laptop and co, laptops are likely less stable than
>>> server machines, and we tend to shutdown them more often too. With
>>> delayed_commit=True, when someone shutdown his laptop and forget to
>>> apply delayed commit (and most of the time, if we don't automatize
>>> that, I bet he will), data in memory will be lost.
>>
>> I remember two instances of data loss with CouchDB 0.7 on Mac OS X
>> when Erlang wasn't using the fully reliable FULLFSYNC option to flush
>> data to disk (which is btw more reliable than on any other system).
>> After we patched Erlang to do the right thing on Mac OS X, I haven't
>> heard of a single instance of data loss on a laptop or anywhere else.
>
> There was a recent discussion about couchdb shutdown that was raised
> by this default settings. In my case I never lost data but just
> because I set couchdb as the last to shutdown.

For reference, https://issues.apache.org/jira/browse/COUCHDB-791.

- Matt

>>> As a user of openbsd, one of the reasons I use this system (except
>>> its simplicity) is that it is secured by default on the contrary most
>>> linuxes/bsds aren't. Most of the openbsd users know that security
>>> will impact performances. I think I would prefer to have a completely
>>> safe couchdb even if performances decreased.
>>
>> This brings up an interesting point: we ship the CouchDB source.
>> Most users use CouchDB through some distribution (apt-get, rpm,
>> CouchDBX etc.). The distributors should decide for their users which
>> setting is best.
>
> That's indeed a way to handle that, this is again a doc problem I guess.
>
>> Again, I'd like to keep the default we had forever and commence
>> with the release of 1.0.
>
> for both reasons here, you have my +1 :)
>
> - benoit
Re: delayed_commits false
On 7 Jul 2010, at 14:20, Benoit Chesneau wrote:
> On Wed, Jul 7, 2010 at 1:40 PM, Jan Lehnardt wrote:
>> On 7 Jul 2010, at 08:31, Benoit Chesneau wrote:
>>> I dislike to have too much options though.
>>>
>>> @damien
>>> I don't understand this "keep it for 1.0" mantra. Since it's more a
>>> "philosophical" change than a technical one, I would prefer that
>>> change on 1.0 whatever this number means. How do people use CouchDB
>>> in production ? Is delayed_commit turned off most of the time ?
>>
>> I am with Damien, all other releases ship with the current default
>> setting. Changing it *a day* before 1.0 does not sound like the way
>> to go.
>
> right, it may be too late to do that now.
>
> Anyway rereading the doc we provide, I think delayed_commit don't
> break the promise since db is always in consistent state.

"consistent state" means CouchDB can start up and use that db
without any fixup phase (cf. myisamchk, or InnoDB log-replay).

> But we may need like you said, to clarify some docs imo and maybe give
> to users some tricks to help them to flush to disk before any machine
> shutdown or such ?
>
>>> About the use on laptop and co, laptops are likely less stable than
>>> server machines, and we tend to shutdown them more often too. With
>>> delayed_commit=True, when someone shutdown his laptop and forget to
>>> apply delayed commit (and most of the time, if we don't automatize
>>> that, I bet he will), data in memory will be lost.
>>
>> I remember two instances of data loss with CouchDB 0.7 on Mac OS X
>> when Erlang wasn't using the fully reliable FULLFSYNC option to flush
>> data to disk (which is btw more reliable than on any other system).
>> After we patched Erlang to do the right thing on Mac OS X, I haven't
>> heard of a single instance of data loss on a laptop or anywhere else.
>
> There was a recent discussion about couchdb shutdown that was raised
> by this default settings. In my case I never lost data but just
> because I set couchdb as the last to shutdown.

Are you referring to the init script (ours?) that should send a
POST /_ensure_full_commit before killing CouchDB? If it is our
script, we need to fix it (not a blocker, I'd say). If it is somebody
else's script, we should get in touch and help getting it fixed.

>>> As a user of openbsd, one of the reasons I use this system (except
>>> its simplicity) is that it is secured by default on the contrary most
>>> linuxes/bsds aren't. Most of the openbsd users know that security
>>> will impact performances. I think I would prefer to have a completely
>>> safe couchdb even if performances decreased.
>>
>> This brings up an interesting point: we ship the CouchDB source.
>> Most users use CouchDB through some distribution (apt-get, rpm,
>> CouchDBX etc.). The distributors should decide for their users which
>> setting is best.
>
> That's indeed a way to handle that, this is again a doc problem I guess.
>
>> Again, I'd like to keep the default we had forever and commence
>> with the release of 1.0.
>
> for both reasons here, you have my +1 :)

Hooray :)

Cheers
Jan
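A minimal sketch of the flush step such an init script could run before stopping CouchDB. It assumes the server listens on http://127.0.0.1:5984, that no admin credentials are required, and that the list of database names is known in advance; the helper names are made up for illustration. Note that the commit endpoint is per database, i.e. POST /{db}/_ensure_full_commit.

```python
import json
import urllib.parse
import urllib.request


def build_full_commit_request(base_url, db_name):
    """Build (but do not send) a POST /{db}/_ensure_full_commit request."""
    db = urllib.parse.quote(db_name, safe="")
    url = f"{base_url.rstrip('/')}/{db}/_ensure_full_commit"
    return urllib.request.Request(
        url,
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def flush_all(base_url, db_names, timeout=10):
    """Ask CouchDB to commit each database; return {db_name: ok flag}."""
    results = {}
    for name in db_names:
        req = build_full_commit_request(base_url, name)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            results[name] = bool(json.load(resp).get("ok", False))
    return results
```

A shutdown hook would call flush_all("http://127.0.0.1:5984", ["mydb"]) and only then signal the CouchDB process; the database name here is a placeholder.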
Re: delayed_commits false
On Wed, Jul 7, 2010 at 1:40 PM, Jan Lehnardt wrote:
> On 7 Jul 2010, at 08:31, Benoit Chesneau wrote:
>> I dislike to have too much options though.
>>
>> @damien
>> I don't understand this "keep it for 1.0" mantra. Since it's more a
>> "philosophical" change than a technical one, I would prefer that
>> change on 1.0 whatever this number means. How do people use CouchDB
>> in production ? Is delayed_commit turned off most of the time ?
>
> I am with Damien, all other releases ship with the current default
> setting. Changing it *a day* before 1.0 does not sound like the way
> to go.

right, it may be too late to do that now.

Anyway rereading the doc we provide, I think delayed_commit don't
break the promise since db is always in consistent state.

But we may need like you said, to clarify some docs imo and maybe give
to users some tricks to help them to flush to disk before any machine
shutdown or such ?

>> About the use on laptop and co, laptops are likely less stable than
>> server machines, and we tend to shutdown them more often too. With
>> delayed_commit=True, when someone shutdown his laptop and forget to
>> apply delayed commit (and most of the time, if we don't automatize
>> that, I bet he will), data in memory will be lost.
>
> I remember two instances of data loss with CouchDB 0.7 on Mac OS X
> when Erlang wasn't using the fully reliable FULLFSYNC option to flush
> data to disk (which is btw more reliable than on any other system).
> After we patched Erlang to do the right thing on Mac OS X, I haven't
> heard of a single instance of data loss on a laptop or anywhere else.

There was a recent discussion about couchdb shutdown that was raised
by this default settings. In my case I never lost data but just
because I set couchdb as the last to shutdown.

>> As a user of openbsd, one of the reasons I use this system (except
>> its simplicity) is that it is secured by default on the contrary most
>> linuxes/bsds aren't. Most of the openbsd users know that security
>> will impact performances. I think I would prefer to have a completely
>> safe couchdb even if performances decreased.
>
> This brings up an interesting point: we ship the CouchDB source.
> Most users use CouchDB through some distribution (apt-get, rpm,
> CouchDBX etc.). The distributors should decide for their users which
> setting is best.

That's indeed a way to handle that, this is again a doc problem I guess.

> Again, I'd like to keep the default we had forever and commence
> with the release of 1.0.

for both reasons here, you have my +1 :)

- benoit
Re: delayed_commits false
On 7 Jul 2010, at 08:31, Benoit Chesneau wrote:
> I dislike to have too much options though.
>
> @damien
> I don't understand this "keep it for 1.0" mantra. Since it's more a
> "philosophical" change than a technical one, I would prefer that
> change on 1.0 whatever this number means. How do people use CouchDB
> in production ? Is delayed_commit turned off most of the time ?

I am with Damien, all other releases ship with the current default
setting. Changing it *a day* before 1.0 does not sound like the way
to go.

> About the use on laptop and co, laptops are likely less stable than
> server machines, and we tend to shutdown them more often too. With
> delayed_commit=True, when someone shutdown his laptop and forget to
> apply delayed commit (and most of the time, if we don't automatize
> that, I bet he will), data in memory will be lost.

I remember two instances of data loss with CouchDB 0.7 on Mac OS X
when Erlang wasn't using the fully reliable FULLFSYNC option to flush
data to disk (which is btw more reliable than on any other system).
After we patched Erlang to do the right thing on Mac OS X, I haven't
heard of a single instance of data loss on a laptop or anywhere else.

> As a user of openbsd, one of the reasons I use this system (except
> its simplicity) is that it is secured by default on the contrary most
> linuxes/bsds aren't. Most of the openbsd users know that security will
> impact performances. I think I would prefer to have a completely safe
> couchdb even if performances decreased.

This brings up an interesting point: we ship the CouchDB source.
Most users use CouchDB through some distribution (apt-get, rpm,
CouchDBX etc.). The distributors should decide for their users which
setting is best.

Again, I'd like to keep the default we had forever and commence
with the release of 1.0.

Cheers
Jan
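To make the FULLFSYNC point concrete: on Mac OS X a plain fsync() does not necessarily force the drive to empty its own write cache, while fcntl with F_FULLFSYNC asks for that. The sketch below shows the idea in Python, not the Erlang patch itself. The function name is made up, and the fallback to fsync() when F_FULLFSYNC is missing or refused is an assumption about reasonable behaviour.

```python
import fcntl
import os


def durable_flush(fd):
    """Flush fd to stable storage and return the mechanism that was used."""
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)  # defined on macOS only
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            pass  # some filesystems reject it; fall back to fsync()
    os.fsync(fd)
    return "fsync"
```

A caller would write, flush the userspace buffer, and then call durable_flush(f.fileno()) before acknowledging the write.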
Re: delayed_commits false
On 7 Jul 2010, at 00:46, Volker Mische wrote:
> On 07.07.2010 00:06, Damien Katz wrote:
>> On Jul 5, 2010, at 8:49 AM, Volker Mische wrote:
>>> Hi All,
>>>
>>> delayed_commits were enabled to have better performance especially
>>> for single writers. The price you pay for is that you potentially
>>> lose up to one second of writes in case of a crash.
>>>
>>> Such a setting makes sense, though in my opinion it shouldn't be
>>> enabled by default. I expect* that people running into performance
>>> issues at least take a look at the README or a FAQ section
>>> somewhere. There the delayed_commit setting could be pointed out.
>>>
>>> I'd like to be able to say that on a vanilla CouchDB it's hard to
>>> lose data, but I can't atm. I'm also well aware that there will be
>>> plenty of performance tests when 1.0 is released and people will
>>> complain (if delayed_commits would be set to false by default) that
>>> it is horribly slow. Though safety of the data is more important
>>> for me.
>>>
>>> If the only reason why delayed_commits is true by default are the
>>> performance tests of some noobs, I really don't think it's a price
>>> worth paying.
>>>
>>> *I know that in reality people don't
>>>
>>> I would like to see delayed_commits=false for 1.0
>>
>> Last year we turned off delayed commits by default. We got lots of
>> complaints, the performance impact was too great. So we switched it
>> back. We aren't the first storage engine to go around on this. For
>> example, Apple's core data switched to using full fsyncs for each
>> write in 10.4, but then switched it back for 10.5:
>>
>> http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/CoreData/Articles/cdPersistentStores.html
>>
>> "Important: The default behaviors in Mac OS X v10.4 and 10.5 are
>> different. In Mac OS X v10.4, SQLite uses FULL_FSYNC by default; in
>> Mac OS X v10.5 it does not."
>>
>> Anyway, we can improve the documentation warnings, etc, but we
>> should stay with the current defaults.
>>
>> -Damien
>
> As 1.0 is approaching fast, I think this discussion is pretty
> important. Especially this thread showed that there are people that
> prefer setting delayed_commits to false. Although sometimes someone
> has to make the last call, and there is probably no one better than
> the creator of the project, I think in this case the decision should
> be made by more people.
>
> For *me personally* the authority of Apache CouchDB are the
> committers. I would love to see them vote on this topic (being it
> public or private doesn't matter).

(just clarifying procedure)

By Apache policy, every voice on dev@ needs to be considered. The
final call for a release (the release vote) is up to the Project’s
Management Committee (PMC) which is Damien, Noah, J Chris,
Christopher and myself.

Cheers
Jan
Re: delayed_commits false
I would prefer to leave it like it is now: set to true by default.

a+

On Wed, Jul 7, 2010 at 8:50 AM, Damien Katz wrote:
> On Jul 6, 2010, at 11:31 PM, Benoit Chesneau wrote:
>> On Wed, Jul 7, 2010 at 3:56 AM, Robert Newson wrote:
>>> I had started a page to capture the nuances of these settings at
>>> http://wiki.apache.org/couchdb/Durability_Matrix but never finished
>>> it. It's possible some of the prose could be reshaped into a concise
>>> summary of the difficult balancing act we're attempting here.
>>>
>>> For what it's worth, I'd prefer to keep this setting as-is for 1.0.
>>> Having several 'durability profiles' to choose from would be very
>>> neat, though, and displaying the current profile prominently in
>>> Futon should convey the message far better than docs or wiki.
>>> Consider how often the 'admin party' text gets people thinking
>>> about locking down their server...
>>>
>>> B.
>>
>> I dislike to have too much options though.
>>
>> @damien
>> I don't understand this "keep it for 1.0" mantra. Since it's more a
>> "philosophical" change than a technical one, I would prefer that
>> change on 1.0 whatever this number means. How do people use CouchDB
>> in production ? Is delayed_commit turned off most of the time ?
>
> I don't know the answer to this, but we've shipped version 0.8, 0.9,
> 0.10 and 0.11 with the current default.
>
>> About the use on laptop and co, laptops are likely less stable than
>> server machines, and we tend to shutdown them more often too. With
>> delayed_commit=True, when someone shutdown his laptop and forget to
>> apply delayed commit (and most of the time, if we don't automatize
>> that, I bet he will), data in memory will be lost.
>
> I don't recall any real world complaints caused by the 1 sec delay
> where people were losing data. The one time we turned it off in
> trunk, there were complaints about the slowness and how unusable it
> was. I personally had to always turn it on for the servers to be
> usable.
>
>> As a user of openbsd, one of the reasons I use this system (except
>> its simplicity) is that it is secured by default on the contrary most
>> linuxes/bsds aren't. Most of the openbsd users know that security
>> will impact performances. I think I would prefer to have a completely
>> safe couchdb even if performances decreased.
>
> You have that option already.
>
> -Damien
>
>> - benoit.

--
Filipe David Manana, fdman...@apache.org

"Reasonable men adapt themselves to the world. Unreasonable men adapt
the world to themselves. That's why all progress depends on
unreasonable men."
Re: delayed_commits false
On Jul 6, 2010, at 11:31 PM, Benoit Chesneau wrote:
> On Wed, Jul 7, 2010 at 3:56 AM, Robert Newson wrote:
>> I had started a page to capture the nuances of these settings at
>> http://wiki.apache.org/couchdb/Durability_Matrix but never finished
>> it. It's possible some of the prose could be reshaped into a concise
>> summary of the difficult balancing act we're attempting here.
>>
>> For what it's worth, I'd prefer to keep this setting as-is for 1.0.
>> Having several 'durability profiles' to choose from would be very
>> neat, though, and displaying the current profile prominently in Futon
>> should convey the message far better than docs or wiki. Consider how
>> often the 'admin party' text gets people thinking about locking down
>> their server...
>>
>> B.
>
> I dislike to have too much options though.
>
> @damien
> I don't understand this "keep it for 1.0" mantra. Since it's more a
> "philosophical" change than a technical one, I would prefer that
> change on 1.0 whatever this number means. How do people use CouchDB
> in production ? Is delayed_commit turned off most of the time ?

I don't know the answer to this, but we've shipped version 0.8, 0.9,
0.10 and 0.11 with the current default.

> About the use on laptop and co, laptops are likely less stable than
> server machines, and we tend to shutdown them more often too. With
> delayed_commit=True, when someone shutdown his laptop and forget to
> apply delayed commit (and most of the time, if we don't automatize
> that, I bet he will), data in memory will be lost.

I don't recall any real world complaints caused by the 1 sec delay
where people were losing data. The one time we turned it off in trunk,
there were complaints about the slowness and how unusable it was. I
personally had to always turn it on for the servers to be usable.

> As a user of openbsd, one of the reasons I use this system (except
> its simplicity) is that it is secured by default on the contrary most
> linuxes/bsds aren't. Most of the openbsd users know that security will
> impact performances. I think I would prefer to have a completely safe
> couchdb even if performances decreased.

You have that option already.

-Damien

> - benoit.
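For readers wondering how to take that option: the setting lives in the [couchdb] section of local.ini as delayed_commits = false, and as far as I know the 1.x-era /_config HTTP API can change it at runtime as well. The sketch below only builds the request. The helper name is made up, the default port 5984 is assumed, and a server with admins configured would also need credentials.

```python
import json
import urllib.request


def build_delayed_commits_request(base_url, enabled):
    """Build (but do not send) PUT /_config/couchdb/delayed_commits."""
    url = f"{base_url.rstrip('/')}/_config/couchdb/delayed_commits"
    # The config API takes the new value as a JSON string, e.g. "false".
    body = json.dumps("true" if enabled else "false").encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Sending it is a matter of urllib.request.urlopen(build_delayed_commits_request("http://127.0.0.1:5984", False)).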
Re: delayed_commits false
On Wed, Jul 7, 2010 at 3:56 AM, Robert Newson wrote:
> I had started a page to capture the nuances of these settings at
> http://wiki.apache.org/couchdb/Durability_Matrix but never finished
> it. It's possible some of the prose could be reshaped into a concise
> summary of the difficult balancing act we're attempting here.
>
> For what it's worth, I'd prefer to keep this setting as-is for 1.0.
> Having several 'durability profiles' to choose from would be very
> neat, though, and displaying the current profile prominently in Futon
> should convey the message far better than docs or wiki. Consider how
> often the 'admin party' text gets people thinking about locking down
> their server...
>
> B.

I dislike to have too much options though.

@damien
I don't understand this "keep it for 1.0" mantra. Since it's more a
"philosophical" change than a technical one, I would prefer that
change on 1.0 whatever this number means. How do people use CouchDB
in production ? Is delayed_commit turned off most of the time ?

About the use on laptop and co, laptops are likely less stable than
server machines, and we tend to shutdown them more often too. With
delayed_commit=True, when someone shutdown his laptop and forget to
apply delayed commit (and most of the time, if we don't automatize
that, I bet he will), data in memory will be lost.

As a user of openbsd, one of the reasons I use this system (except
its simplicity) is that it is secured by default on the contrary most
linuxes/bsds aren't. Most of the openbsd users know that security will
impact performances. I think I would prefer to have a completely safe
couchdb even if performances decreased.

- benoit.
Re: delayed_commits false
I had started a page to capture the nuances of these settings at
http://wiki.apache.org/couchdb/Durability_Matrix but never finished
it. It's possible some of the prose could be reshaped into a concise
summary of the difficult balancing act we're attempting here.

For what it's worth, I'd prefer to keep this setting as-is for 1.0.
Having several 'durability profiles' to choose from would be very
neat, though, and displaying the current profile prominently in Futon
should convey the message far better than docs or wiki. Consider how
often the 'admin party' text gets people thinking about locking down
their server...

B.

On Tue, Jul 6, 2010 at 9:18 PM, J Chris Anderson wrote:
>> Maybe the thing to do is to put a note about this config item
>> somewhere prominent in Futon. The Configuration page is obvious, but
>> not prominent enough.
>
> I've got a nice little paragraph sitting on the config page in Futon
> (good enough, I think)
>
>     For the strongest consistency guarantees, delayed_commits should
>     be set to false. The default value of true is designed for
>     single-user performance. For more details see the CouchDB wiki on
>     Delayed Commits.
>
> The only issue is that there is no such wiki page (or at least I
> can't find one.)
>
> Anyone care to summarize the full-commit tradeoff in a user-friendly
> way? If you are up for doing that, but aren't up to editing the wiki,
> even just replying to this thread with the language would be helpful,
> and then I can put the page up and I'll add this documentation to
> trunk for 1.0
>
> Chris
Re: delayed_commits false
>> Maybe the thing to do is to put a note about this config item
>> somewhere prominent in Futon. The Configuration page is obvious, but
>> not prominent enough.

I've got a nice little paragraph sitting on the config page in Futon
(good enough, I think)

    For the strongest consistency guarantees, delayed_commits should
    be set to false. The default value of true is designed for
    single-user performance. For more details see the CouchDB wiki on
    Delayed Commits.

The only issue is that there is no such wiki page (or at least I can't
find one.)

Anyone care to summarize the full-commit tradeoff in a user-friendly
way? If you are up for doing that, but aren't up to editing the wiki,
even just replying to this thread with the language would be helpful,
and then I can put the page up and I'll add this documentation to
trunk for 1.0

Chris
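One way to picture the tradeoff is a toy append-only log with the two commit policies. This is only an illustration and not how CouchDB implements delayed commits: the class, its parameters and the write-triggered check are invented here, and the one-second interval is taken from the "up to one second of writes" figure mentioned in this thread. With delayed_commits on, acknowledged writes pile up unsynced until the interval has passed; with it off, every write pays for an fsync before returning.

```python
import os
import time


class AppendLog:
    """Toy append-only log illustrating full vs. delayed commits."""

    def __init__(self, path, delayed_commits=True, interval=1.0,
                 clock=time.monotonic):
        self._f = open(path, "ab")
        self.delayed_commits = delayed_commits
        self.interval = interval
        self._clock = clock
        self._last_sync = clock()
        self.unsynced = 0  # writes acknowledged but not yet fsynced

    def write(self, record):
        self._f.write(record + b"\n")
        self.unsynced += 1
        due = self._clock() - self._last_sync >= self.interval
        if not self.delayed_commits or due:
            self.commit()

    def commit(self):
        """Force everything written so far onto disk."""
        self._f.flush()
        os.fsync(self._f.fileno())
        self.unsynced = 0
        self._last_sync = self._clock()

    def close(self):
        self.commit()
        self._f.close()
```

Whatever unsynced counts at the moment of a crash is what would be lost; calling commit() explicitly plays the role of an ensure-full-commit request.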
Re: delayed_commits false
> My general theory (extrapolating from my own behavior) is that no one
> reads documentation.

because there isn't any, zing!

j/k i actually agree with this for the most part but it doesn't mean
that there shouldn't be any for the people that use google to figure
out things :)

> Maybe the thing to do is to put a note about this config item
> somewhere prominent in Futon. The Configuration page is obvious, but
> not prominent enough.

Or even better, Futon could have a drop down of different
configurations for "development", "single user", "multi user
production" that set all the config options to something
pre-determined for that use case.

> I agree we shouldn't change the default right before 1.0, but it
> wouldn't be hard to slip a note or warning into Futon, with perhaps a
> link to a wiki page.
>
> Anyone have any brilliant ideas about the best place to put this
> note?
>
> Chris
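A rough sketch of what such a drop-down could map to. Nothing like this exists in Futon; the profile names come from the message above, and the values are an assumption drawn from positions voiced in this thread (delayed commits on for developers and single users, off for production). Only delayed_commits is covered because it is the only setting discussed here.

```python
# Hypothetical profiles: names from the thread, values are assumptions.
PROFILES = {
    "development": {"delayed_commits": "true"},
    "single user": {"delayed_commits": "true"},
    "multi user production": {"delayed_commits": "false"},
}


def render_local_ini(profile):
    """Return the [couchdb] section of local.ini for the chosen profile."""
    settings = PROFILES[profile]
    lines = ["[couchdb]"]
    lines += [f"{key} = {value}" for key, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"
```

Selecting a profile would then amount to writing render_local_ini("multi user production") into local.ini, or pushing the same values through the config API.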
Re: delayed_commits false
On Jul 6, 2010, at 4:13 PM, Robert Dionne wrote:
> Perhaps it's a matter of documentation. Most users aren't going to
> think about the finer points of fsync and so forth, but will care
> about perceived out of the box performance.
>
> However there will be scenarios where ACID will matter very much and
> good documentation will help these users make the tradeoff. I think a
> key point is that users who do care are not harmed by the defaults.

My general theory (extrapolating from my own behavior) is that no one
reads documentation.

Maybe the thing to do is to put a note about this config item
somewhere prominent in Futon. The Configuration page is obvious, but
not prominent enough.

I agree we shouldn't change the default right before 1.0, but it
wouldn't be hard to slip a note or warning into Futon, with perhaps a
link to a wiki page.

Anyone have any brilliant ideas about the best place to put this note?

Chris
Re: delayed_commits false
Perhaps it's a matter of documentation. Most users aren't going to
think about the finer points of fsync and so forth, but will care
about perceived out of the box performance.

However there will be scenarios where ACID will matter very much and
good documentation will help these users make the tradeoff. I think a
key point is that users who do care are not harmed by the defaults.

On Jul 6, 2010, at 6:58 PM, Volker Mische wrote:
> I have to admit that the point, that the main audience of a tarball
> are developers is a good one. Perhaps people that do binary
> distributions of CouchDB (like all the linux distros) could be
> encouraged to turn it to false (though I have no idea what their
> general policy about changing defaults is).
>
> Cheers,
> Volker
>
> On 07.07.2010 00:52, Mikeal Rogers wrote:
>> I think there is a balance that we can find here between user
>> experience and durability.
>>
>> I think the biggest question for me is, who is the primary target of
>> the tarball download?
>>
>> If it's developers, I think we should leave it on.
>>
>> If it's people who are going to put it up, vanilla, in to
>> production, we should turn them off.
>>
>> I know that I would certainly advocate keeping them off in the
>> CouchDBX build.
>>
>> -Mikeal
>>
>> On Tue, Jul 6, 2010 at 3:46 PM, Volker Mische wrote:
>>> On 07.07.2010 00:06, Damien Katz wrote:
>>>> On Jul 5, 2010, at 8:49 AM, Volker Mische wrote:
>>>>> Hi All,
>>>>>
>>>>> delayed_commits were enabled to have better performance
>>>>> especially for single writers. The price you pay for is that you
>>>>> potentially lose up to one second of writes in case of a crash.
>>>>>
>>>>> Such a setting makes sense, though in my opinion it shouldn't be
>>>>> enabled by default. I expect* that people running into
>>>>> performance issues at least take a look at the README or a FAQ
>>>>> section somewhere. There the delayed_commit setting could be
>>>>> pointed out.
>>>>>
>>>>> I'd like to be able to say that on a vanilla CouchDB it's hard to
>>>>> lose data, but I can't atm. I'm also well aware that there will
>>>>> be plenty of performance tests when 1.0 is released and people
>>>>> will complain (if delayed_commits would be set to false by
>>>>> default) that it is horribly slow. Though safety of the data is
>>>>> more important for me.
>>>>>
>>>>> If the only reason why delayed_commits is true by default are the
>>>>> performance tests of some noobs, I really don't think it's a
>>>>> price worth paying.
>>>>>
>>>>> *I know that in reality people don't
>>>>>
>>>>> I would like to see delayed_commits=false for 1.0
>>>>
>>>> Last year we turned off delayed commits by default. We got lots of
>>>> complaints, the performance impact was too great. So we switched
>>>> it back. We aren't the first storage engine to go around on this.
>>>> For example, Apple's core data switched to using full fsyncs for
>>>> each write in 10.4, but then switched it back for 10.5:
>>>>
>>>> http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/CoreData/Articles/cdPersistentStores.html
>>>>
>>>> "Important: The default behaviors in Mac OS X v10.4 and 10.5 are
>>>> different. In Mac OS X v10.4, SQLite uses FULL_FSYNC by default;
>>>> in Mac OS X v10.5 it does not."
>>>>
>>>> Anyway, we can improve the documentation warnings, etc, but we
>>>> should stay with the current defaults.
>>>>
>>>> -Damien
>>>
>>> As 1.0 is approaching fast, I think this discussion is pretty
>>> important. Especially this thread showed that there are people that
>>> prefer setting delayed_commits to false. Although sometimes someone
>>> has to make the last call, and there is probably no one better than
>>> the creator of the project, I think in this case the decision
>>> should be made by more people.
>>>
>>> For *me personally* the authority of Apache CouchDB are the
>>> committers. I would love to see them vote on this topic (being it
>>> public or private doesn't matter).
>>>
>>> Cheers,
>>> Volker
Re: delayed_commits false
+1 on delaying a decision on this until after 1.0. It's a big change, and if we do make it we should let it sit in trunk and steep for a while.

But JIRA is a terrible place to have a discussion, so I'd rather we continue to use the mailing list.

-Mikeal

On Tue, Jul 6, 2010 at 4:03 PM, Damien Katz wrote:
> This issue has been discussed already. A change this big right before a 1.0
> release is a very bad idea. If we decided to change it, we'd need to wait a
> good amount of time to understand how it affects downstream projects that
> take the defaults.
>
> Here is a bug report that talks about it. There is more discussion in the
> mailing list as well.
>
> https://issues.apache.org/jira/browse/COUCHDB-449
>
> -Damien
Re: delayed_commits false
The difference when you do a couchapp push with delayed commits on versus off increases drastically when you have a lot of binary attachments. In some of my apps it's the difference between sub-second and 20 seconds.

-Mikeal

On Tue, Jul 6, 2010 at 4:01 PM, Volker Mische wrote:
> (memo to myself, don't send mails late at night)
>
> On the other hand, do developers care about performance? And if they do,
> they would read the documentation.
>
> Cheers,
> Volker
Re: delayed_commits false
This issue has been discussed already. A change this big right before a 1.0 release is a very bad idea. If we decided to change it, we'd need to wait a good amount of time to understand how it affects downstream projects that take the defaults.

Here is a bug report that talks about it. There is more discussion in the mailing list as well.

https://issues.apache.org/jira/browse/COUCHDB-449

-Damien

On Jul 6, 2010, at 3:58 PM, Volker Mische wrote:
> I have to admit that the point that the main audience of a tarball is
> developers is a good one. Perhaps people who do binary distributions of
> CouchDB (like all the Linux distros) could be encouraged to turn it to false
> (though I have no idea what their general policy about changing defaults is).
>
> Cheers,
> Volker
Re: delayed_commits false
(memo to myself, don't send mails late at night)

On the other hand, do developers care about performance? And if they do, they would read the documentation.

Cheers,
Volker

On 07.07.2010 00:58, Volker Mische wrote:
> I have to admit that the point that the main audience of a tarball is
> developers is a good one. Perhaps people who do binary distributions of
> CouchDB (like all the Linux distros) could be encouraged to turn it to false
> (though I have no idea what their general policy about changing defaults is).
>
> Cheers,
> Volker
Re: delayed_commits false
I have to admit that the point that the main audience of a tarball is developers is a good one. Perhaps people who do binary distributions of CouchDB (like all the Linux distros) could be encouraged to turn it to false (though I have no idea what their general policy about changing defaults is).

Cheers,
Volker

On 07.07.2010 00:52, Mikeal Rogers wrote:
> I think there is a balance that we can find here between user experience and
> durability.
>
> I think the biggest question for me is, who is the primary target of the
> tarball download?
>
> If it's developers, I think we should leave it on.
>
> If it's people who are going to put it up, vanilla, into production, we
> should turn them off.
>
> I know that I would certainly advocate keeping them off in the CouchDBX
> build.
>
> -Mikeal
Re: delayed_commits false
I think there is a balance that we can find here between user experience and durability.

I think the biggest question for me is, who is the primary target of the tarball download?

If it's developers, I think we should leave it on.

If it's people who are going to put it up, vanilla, into production, we should turn them off.

I know that I would certainly advocate keeping them off in the CouchDBX build.

-Mikeal

On Tue, Jul 6, 2010 at 3:46 PM, Volker Mische wrote:
> As 1.0 is approaching fast, I think this discussion is pretty important.
> Especially since this thread showed that there are people who prefer setting
> delayed_commits to false. Although sometimes someone has to make the last
> call, and there is probably no one better than the creator of the project, I
> think in this case the decision should be made by more people.
>
> For *me personally* the authority of Apache CouchDB is the committers. I
> would love to see them vote on this topic (whether in public or in private
> doesn't matter).
>
> Cheers,
> Volker
Re: delayed_commits false
On 07.07.2010 00:06, Damien Katz wrote:
> Last year we turned off delayed commits by default. We got lots of
> complaints; the performance impact was too great. So we switched it back. We
> aren't the first storage engine to go around on this. For example, Apple's
> Core Data switched to using full fsyncs for each write in 10.4, but then
> switched it back for 10.5:
>
> http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/CoreData/Articles/cdPersistentStores.html
>
> "Important: The default behaviors in Mac OS X v10.4 and 10.5 are different.
> In Mac OS X v10.4, SQLite uses FULL_FSYNC by default; in Mac OS X v10.5 it
> does not."
>
> Anyway, we can improve the documentation warnings, etc., but we should
> stay with the current defaults.
>
> -Damien

As 1.0 is approaching fast, I think this discussion is pretty important. Especially since this thread showed that there are people who prefer setting delayed_commits to false. Although sometimes someone has to make the last call, and there is probably no one better than the creator of the project, I think in this case the decision should be made by more people.

For *me personally* the authority of Apache CouchDB is the committers. I would love to see them vote on this topic (whether in public or in private doesn't matter).

Cheers,
Volker
Re: delayed_commits false
On Jul 5, 2010, at 8:49 AM, Volker Mische wrote:

> Hi All,
>
> delayed_commits were enabled to have better performance, especially for
> single writers. The price you pay is that you potentially lose up to one
> second of writes in case of a crash.
>
> Such a setting makes sense, though in my opinion it shouldn't be enabled by
> default. I expect* that people running into performance issues at least take
> a look at the README or a FAQ section somewhere. There the delayed_commits
> setting could be pointed out.
>
> I'd like to be able to say that on a vanilla CouchDB it's hard to lose data,
> but I can't atm. I'm also well aware that there will be plenty of performance
> tests when 1.0 is released and people will complain (if delayed_commits were
> set to false by default) that it is horribly slow. Though safety of the
> data is more important to me.
>
> If the only reason why delayed_commits is true by default is the performance
> tests of some noobs, I really don't think it's a price worth paying.
>
> *I know that in reality people don't
>
> I would like to see delayed_commits=false for 1.0

Last year we turned off delayed commits by default. We got lots of complaints; the performance impact was too great. So we switched it back. We aren't the first storage engine to go around on this. For example, Apple's Core Data switched to using full fsyncs for each write in 10.4, but then switched it back for 10.5:

http://developer.apple.com/mac/library/documentation/Cocoa/Conceptual/CoreData/Articles/cdPersistentStores.html

"Important: The default behaviors in Mac OS X v10.4 and 10.5 are different. In Mac OS X v10.4, SQLite uses FULL_FSYNC by default; in Mac OS X v10.5 it does not."

Anyway, we can improve the documentation warnings, etc., but we should stay with the current defaults.

-Damien

>
> Cheers,
> Volker
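
For readers who want to see the trade-off in miniature, here is a rough sketch of the two policies being argued over: fsync on every write versus one fsync for a batch of writes. It is a toy append-only log written for illustration, not CouchDB's storage engine. The class `AppendLog`, its `delayed_commits` and `commit_interval` parameters, and the helper `count_fsyncs` are all made-up names, and the one-second interval is taken from the figure quoted in this thread. Unlike a real implementation, the toy only checks the interval when a write arrives rather than using a timer.

```python
import os
import tempfile
import time


class AppendLog:
    """Toy append-only log (hypothetical; not CouchDB's implementation)."""

    def __init__(self, path, delayed_commits=True, commit_interval=1.0):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.delayed_commits = delayed_commits
        self.commit_interval = commit_interval  # seconds between batched fsyncs
        self.fsync_calls = 0
        self._dirty = False
        self._last_commit = time.monotonic()

    def append(self, record: bytes) -> None:
        # The write reaches the OS cache here, but not necessarily the disk.
        os.write(self._fd, record + b"\n")
        self._dirty = True
        if not self.delayed_commits:
            # Durable mode: pay for an fsync on every single write.
            self.commit()
        elif time.monotonic() - self._last_commit >= self.commit_interval:
            # Delayed mode: one fsync covers everything written since the last one.
            self.commit()

    def commit(self) -> None:
        if self._dirty:
            os.fsync(self._fd)
            self.fsync_calls += 1
            self._dirty = False
        self._last_commit = time.monotonic()

    def close(self) -> None:
        self.commit()
        os.close(self._fd)


def count_fsyncs(delayed_commits: bool, n_writes: int = 100) -> int:
    """Write n_writes small records and report how many fsyncs the writes caused."""
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendLog(os.path.join(tmp, "toy.log"), delayed_commits)
        for i in range(n_writes):
            log.append(b"doc-%d" % i)
        calls = log.fsync_calls  # read before close(), which adds a final fsync
        log.close()
        return calls
```

With `delayed_commits=False` the fsync count equals the number of writes, which is where the single-writer slowdown comes from. With `delayed_commits=True` a quick burst of writes triggers few or no fsyncs until the interval elapses, and those are the writes a crash could take with it.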
Re: delayed_commits false
---[Sorry for the noise and not properly quoting previously]---

Mikeal, I suspect you mixed up local data consistency and durability...

In fact, I think durability is the only one of the ACID properties that's affected. Basically, the only difference that delayed_commits=true makes is that if there is, e.g., a power outage, the results of write operations that were committed in the last three or so seconds might be lost.

I know, there is no question that local data consistency and the other consistency property you refer to as runtime consistency are not affected by the delayed_commits setting.

Cheers,
Klaus

On Tue, 2010-07-06 at 13:28 -0700, Mikeal Rogers wrote:
> Just for reference, most SQL databases ship with the fsync to their log on a
> 1s or longer cycle, it's pretty standard.
>
> delayed-commits on doesn't reduce durability because the writes to log are
> still append-only and can survive invalid writes and crashes and all that.
> Also, they don't reduce runtime consistency because the response isn't
> returned until the document is available. All they do is give the client a
> *different* consistency guarantee (available rather than persisted to disc).
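
The durability window described above can be made concrete with a small model. This is only a sketch under simplifying assumptions: flushes are assumed to happen at fixed multiples of a commit interval, and a write counts as durable once a flush has happened at or after its acknowledgement time. The function `lost_writes` and its parameters are hypothetical names used for illustration; the thread itself mentions both "up to one second" and "three or so seconds" for the window, so the interval is left as a parameter.

```python
def lost_writes(ack_times, commit_interval, crash_time):
    """Return the acknowledged writes that would not survive a crash.

    Toy model (an assumption, not CouchDB's actual scheduling): the log is
    flushed at t = commit_interval, 2 * commit_interval, and so on.
    commit_interval <= 0 models delayed_commits=false, where every write is
    flushed before it is acknowledged.
    """
    if commit_interval <= 0:
        return []
    # Time of the last flush that completed before the crash.
    last_flush = (crash_time // commit_interval) * commit_interval
    # Anything acknowledged after that flush was only in memory.
    return [t for t in ack_times if last_flush < t <= crash_time]


acks = [0.2, 0.9, 1.4, 1.8]          # seconds at which clients got a response
print(lost_writes(acks, 1.0, 1.9))   # writes acknowledged after the flush at t=1.0
print(lost_writes(acks, 0, 1.9))     # nothing lost when every write is flushed
```

In this model the database file stays consistent either way; the only thing that changes is which acknowledged writes are still there after a restart, which is the durability point being made.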
Re: delayed_commits false
Mikeal, I suspect you mixed up local data consistency and durability...

In fact, I think durability is the only one of the ACID properties that's affected. Basically, the only difference that delayed_commits=true makes is that if there is, e.g., a power outage, the results of write operations that were committed in the last three or so seconds might be lost.

I know, there is no question that local data consistency and the other consistency property you refer to as runtime consistency are not affected by the delayed_commits setting.

Cheers,
Klaus

> still append-only and can survive invalid writes and crashes and all that.
> Also, they don't reduce runtime consistency because the response isn't
> returned until the document is available. All they do is give the client a
> *different* consistency guarantee (available rather than persisted to disc).
Re: delayed_commits false
The difference between delayed-commits on and off is not the biggest difference in consistency and durability between mongo and couch.

MongoDB doesn't return a response by default on a write. You just write to the socket and hope that it's available.

MongoDB lets the kernel decide to flush to disc whenever. Newer versions force an fsync every minute.

MongoDB writes to its file format "in place" and doesn't keep *anything* append-only around, which is why it suffers from these kinds of long-term corruption bugs where data can't be recovered.

These things don't just set MongoDB apart from CouchDB in terms of consistency and durability, they set it apart from all other modern databases. Even Redis has more consistency and durability than this.

-Mikeal

On Tue, Jul 6, 2010 at 1:22 PM, Zachary Zolton wrote:
> To Klaus's point, we have to choose our FUD:
> "CouchDB is sooo slow" or "CouchDB will lose your data"
>
> Would the latter cause more harm than the former? I don't know, but
> Google already includes the phrase "mongodb losing data" in its search
> suggestions.
>
> I'd hate for CouchDB to end up in the same boat.
Re: delayed_commits false
Just for reference, most SQL databases ship with the fsync to their log on a 1s or longer cycle; it's pretty standard.

Turning delayed_commits on doesn't reduce durability, because the writes to the log are still append-only and can survive invalid writes, crashes and all that. It also doesn't reduce runtime consistency, because the response isn't returned until the document is available. All it does is give the client a *different* consistency guarantee (available rather than persisted to disk).

Personally, I think it's better to default the guarantee to persisted, but I can see why it's advantageous for some configurations to opt for the other.

The real question here is which guarantee we want to ship with by default: one that optimizes single-writer performance, or one that offers a better guarantee and is still performant under concurrent load.

-Mikeal

On Tue, Jul 6, 2010 at 1:07 PM, Klaus Trainer wrote:
> Just to put my two cents in...
>
> It's a matter of to be or not to be (ACID (by default)).
>
> With delayed_commits=true, data operations aren't durable anymore.
> Consequently, if it's the default setting, some people might say that
> CouchDB does not meet ACID requirements. People mostly tend to simplify.
> We will hardly be able to eliminate (sometimes vicious) superficiality,
> but we rather should face the fact that such partially true assertions
> may be harmful.
>
> Take MySQL as an example: there are (still) people stating that MySQL is
> not ACID compliant and doesn't support transactions. They are partially
> right with that: it's true for the default storage engine (MyISAM), but
> definitely not for InnoDB.
>
> With choosing stronger guarantees by default--as long as it goes in line
> with basic design decisions--we just eliminate any room for such
> confusion (or maybe even FUD that is spread by competitors). Basically,
> when comparing with other databases, that's way more important than a
> higher rank in an inadequate performance benchmark (pissing contest).
>
> - Klaus
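The append-only argument in this message (a crash can lose the most recent writes but cannot corrupt what was already committed) can be illustrated with a toy model. This is only a sketch: the commit marker, the record layout and the function names are hypothetical and are not CouchDB's actual file format. It assumes that recovery simply ignores everything written after the last commit marker.

```python
from typing import List

# Hypothetical commit marker; CouchDB's real on-disk format is different.
MARK = b"\x00COMMIT"


def append(log: bytearray, doc: str) -> None:
    """Append one record to the end of the log; nothing is ever overwritten."""
    log += doc.encode("utf-8") + b"\n"


def commit(log: bytearray) -> None:
    """Append a commit marker (stands in for 'write header, then fsync')."""
    log += MARK + b"\n"


def recover(log: bytes) -> List[str]:
    """Return the records written before the last commit marker."""
    data = bytes(log)
    end = data.rfind(MARK + b"\n")
    if end == -1:
        return []  # nothing was ever committed
    lines = data[:end].split(b"\n")
    return [line.decode("utf-8") for line in lines if line and line != MARK]


log = bytearray()
append(log, "doc-1")
commit(log)
append(log, "doc-2")
commit(log)
append(log, "doc-3")       # written, but the commit has not happened yet
log += b"doc-4 (torn wri"  # simulated crash in the middle of a write

print(recover(log))  # ['doc-1', 'doc-2']
```

In this model the uncommitted tail (doc-3 and the torn write) is lost, which corresponds to the "up to one second of writes" discussed in the thread, while everything before the last marker is still readable without any repair step.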
Re: delayed_commits false
To Klaus's point, we have to choose our FUD:

"CouchDB is sooo slow"
or
"CouchDB will lose your data"

Would the latter cause more harm than the former? I don't know, but Google already includes the phrase "mongodb losing data" in its search suggestions. I'd hate for CouchDB to end up in the same boat.

On Tue, Jul 6, 2010 at 3:07 PM, Klaus Trainer wrote:
> Just to put my two cents in...
>
> It's a matter of to be or not to be (ACID (by default)).
>
> With delayed_commits=true, data operations aren't durable anymore.
> Consequently, if it's the default setting, some people might say that
> CouchDB does not meet ACID requirements. People mostly tend to simplify.
> We will hardly be able to eliminate (sometimes vicious) superficiality,
> but we rather should face the fact that such partially true assertions
> may be harmful.
>
> Take MySQL as an example: there are (still) people stating that MySQL is
> not ACID compliant and doesn't support transactions. They are partially
> right with that: it's true for the default storage engine (MyISAM), but
> definitely not for InnoDB.
>
> With choosing stronger guarantees by default--as long as it goes in line
> with basic design decisions--we just eliminate any room for such
> confusion (or maybe even FUD that is spread by competitors). Basically,
> when comparing with other databases, that's way more important than a
> higher rank in an inadequate performance benchmark (pissing contest).
>
> - Klaus
Re: delayed_commits false
Just to put my two cents in...

It's a matter of to be or not to be (ACID (by default)).

With delayed_commits=true, data operations aren't durable anymore. Consequently, if it's the default setting, some people might say that CouchDB does not meet ACID requirements. People mostly tend to simplify. We will hardly be able to eliminate (sometimes vicious) superficiality, but we should rather face the fact that such partially true assertions may be harmful.

Take MySQL as an example: there are (still) people stating that MySQL is not ACID compliant and doesn't support transactions. They are partially right about that: it's true for the default storage engine (MyISAM), but definitely not for InnoDB.

By choosing stronger guarantees by default--as long as that is in line with basic design decisions--we simply eliminate any room for such confusion (or maybe even FUD spread by competitors). Basically, when comparing with other databases, that's way more important than a higher rank in an inadequate performance benchmark (pissing contest).

- Klaus

On Tue, 2010-07-06 at 16:43 +0200, Benoit Chesneau wrote:
> I would prefer to put delayed_commits=false too, to keep the promise
> we give to our users. We can't say on one side we are better than
> mongo for this while a simple power failure may result in lost of data
> by default (even if we are better since dbs won't be corrupted).
>
> The default should be the strongest imo. Like every os should be
> secure by default, we should let the user know, we do the best *by
> default* to make sure data are safe on the disk. While they still have
> the possibility to bypass this "security" . But in this case, this is
> a choice.
>
> For those who worry about the marketing, this is also a good point of
> differentiation compared to others dbs. (/me remove his marketing hat)
> .
>
> - benoit
Re: delayed_commits false
On Mon, Jul 5, 2010 at 5:49 PM, Volker Mische wrote:
> Hi All,
>
> delayed_commits were enabled to have better performance especially for
> single writers. The price you pay for is that you potentially lose up to one
> second of writes in case of a crash.
>
> Such a setting makes sense, though in my opinion it shouldn't be enabled by
> default. I expect* that people running into performance issues at least take
> a look at the README or a FAQ section somewhere. There the delayed_commit
> setting could be pointed out.
>
> I'd like to be able to say that on a vanilla CouchDB it's hard to lose data,
> but I can't atm. I'm also well aware that there will be plenty of
> performance tests when 1.0 is released and people will complain (if
> delayed_commits would be set to false by default) that it is horrible slow.
> Though safety of the data is more important for me.
>
> If the only reason why delayed_commits is true by default are the
> performance tests of some noobs, I really don't think it's a price worth
> paying.
>
> *I know that in reality people don't
>
> I would like to see delayed_commits=false for 1.0
>
> Cheers,
> Volker
>

I would prefer to set delayed_commits=false too, to keep the promise we give to our users. We can't say on the one hand that we are better than mongo at this while a simple power failure may result in loss of data by default (even if we are better, since dbs won't be corrupted).

The default should be the strongest, imo. Just as every OS should be secure by default, we should let users know that we do our best *by default* to make sure data is safe on disk. They still have the possibility to bypass this "security", but in that case it is a choice.

For those who worry about marketing, this is also a good point of differentiation compared to other dbs. (/me removes his marketing hat.)

- benoit
Re: delayed_commits false
The more I think about this, the more I think we need to write up some specific use cases and ideal configurations for each.

For DesktopCouch, and maybe even mobile CouchDB builds, which tend to support a couple of clients each talking to one database, delayed_commits is a better user experience.

But delayed_commits is currently terrible for a production multi-user CouchDB.

The more features we add, the bigger a problem this will become. I really don't want to end up with a page like the one Postgres has, with a bunch of technical points about config options. I think we should target use cases and write up ideal configurations for those users.

-Mikeal

On Mon, Jul 5, 2010 at 11:17 AM, J Chris Anderson wrote:
> For a relatively sane look at the tradeoff's we're talking about, this is a
> good resource:
>
> http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html
>
> I wish it was simple to write a heuristic which would detect single
> serialized client workloads, and delay commits, but I don't think it is.
>
> I lean (slightly) toward leaving delayed_commits = true because the worst
> case scenario, even in the case of a crash, isn't data corruption, just lost
> data from the most recent activity.
>
> It is also worth noting that there is an API to ensure_full_commit aside
> from the configuration value, so if you have high-value data you are
> writing, you can call ensure_full_commit (or use a header value to make the
> last PUT or POST operation force full commit)
>
> I think this is worth discussing. I'm not strongly in favor of the
> delayed_commit=true setting, but I do think it is slightly more
> user-friendly...
>
> Chris
Re: delayed_commits false
For a relatively sane look at the tradeoffs we're talking about, this is a good resource:

http://developer.postgresql.org/pgdocs/postgres/runtime-config-wal.html

I wish it were simple to write a heuristic that would detect single serialized client workloads and delay commits, but I don't think it is.

I lean (slightly) toward leaving delayed_commits = true, because the worst-case scenario, even in the case of a crash, isn't data corruption, just lost data from the most recent activity.

It is also worth noting that there is an API, ensure_full_commit, aside from the configuration value. So if you are writing high-value data, you can call ensure_full_commit (or use a header value to make the last PUT or POST operation force a full commit).

I think this is worth discussing. I'm not strongly in favor of the delayed_commits = true setting, but I do think it is slightly more user-friendly...

Chris

On Jul 5, 2010, at 10:02 AM, Mikeal Rogers wrote:
> For the concurrent performance tests I wrote in relaximation it's actually
> better to run with delayed_commits off because it measures the roundtrip
> time of all the concurrent clients.
>
> The reason it's enabled by default is because of apache-bench and other
> single writer performance test tools. From what I've seen, it doesn't
> actually improve write performance under concurrent load and leads to a kind
> of blocking behavior when you start throwing too many writes at it than it
> can fsync in a second. The degradation in performance is pretty huge with
> this "blocking" in my concurrent tests.
>
> I don't know of a lot of good concurrent performance test tools which is why
> I went and wrote one. But, it only tests CouchDB and people love to pick up
> one of these tools that tests a bunch of other dbs (poorly) and be like
> "CouchDB is slow" because they are using a single writer.
>
> But, IMHO it's better to ship with more guarantees about consistency than
> optimized for crappy perf tools.
>
> -Mikeal
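For reference, the two per-request options mentioned in this message can be sketched as below. The endpoint POST /{db}/_ensure_full_commit and the X-Couch-Full-Commit request header are part of CouchDB's HTTP API; the base URL, the database name and the helper function names are assumptions made up for illustration. The sketch only builds the requests and does not contact a server.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a CouchDB instance listening on the default local address.
BASE_URL = "http://127.0.0.1:5984"


def full_commit_request(db: str) -> urllib.request.Request:
    """Build a request asking CouchDB to flush pending writes for `db` to disk."""
    return urllib.request.Request(
        f"{BASE_URL}/{db}/_ensure_full_commit",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def durable_put_request(db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build a document PUT that asks for a full commit for this one write."""
    quoted_id = urllib.parse.quote(doc_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/{db}/{quoted_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Couch-Full-Commit": "true",
        },
        method="PUT",
    )


# With a running server, either request could be sent like this:
#   with urllib.request.urlopen(full_commit_request("mydb")) as resp:
#       print(resp.status, resp.read())
```

The header variant is the narrower tool: it leaves delayed_commits on for ordinary traffic and pays the fsync cost only for the writes that matter.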
Re: delayed_commits false
For the concurrent performance tests I wrote in relaximation, it's actually better to run with delayed_commits off, because it measures the roundtrip time of all the concurrent clients.

The reason it's enabled by default is apache-bench and other single-writer performance test tools. From what I've seen, it doesn't actually improve write performance under concurrent load, and it leads to a kind of blocking behavior when you start throwing more writes at it than it can fsync in a second. The degradation in performance with this "blocking" is pretty huge in my concurrent tests.

I don't know of a lot of good concurrent performance test tools, which is why I went and wrote one. But it only tests CouchDB, and people love to pick up one of these tools that tests a bunch of other dbs (poorly) and be like "CouchDB is slow" because they are using a single writer.

But IMHO it's better to ship with more guarantees about consistency than to be optimized for crappy perf tools.

-Mikeal

On Mon, Jul 5, 2010 at 8:49 AM, Volker Mische wrote:
> Hi All,
>
> delayed_commits were enabled to have better performance especially for
> single writers. The price you pay for is that you potentially lose up to one
> second of writes in case of a crash.
>
> Such a setting makes sense, though in my opinion it shouldn't be enabled by
> default. I expect* that people running into performance issues at least take
> a look at the README or a FAQ section somewhere. There the delayed_commit
> setting could be pointed out.
>
> I'd like to be able to say that on a vanilla CouchDB it's hard to lose
> data, but I can't atm. I'm also well aware that there will be plenty of
> performance tests when 1.0 is released and people will complain (if
> delayed_commits would be set to false by default) that it is horrible slow.
> Though safety of the data is more important for me.
>
> If the only reason why delayed_commits is true by default are the
> performance tests of some noobs, I really don't think it's a price worth
> paying.
>
> *I know that in reality people don't
>
> I would like to see delayed_commits=false for 1.0
>
> Cheers,
> Volker
>
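The single-writer versus concurrent-load point can be made concrete with some back-of-the-envelope arithmetic. This is a deliberately crude model, not a measurement: the latencies are invented, and it assumes that writes arriving together can share one fsync. It also ignores the "blocking" degradation described in the message above.

```python
def serialized_writer_rate(write_ms: float, fsync_ms: float) -> float:
    """Writes/sec for one client that waits for each response before the next write."""
    return 1000.0 / (write_ms + fsync_ms)


def batched_rate(clients: int, write_ms: float, fsync_ms: float) -> float:
    """Writes/sec when `clients` concurrent writes share a single fsync (assumed)."""
    return clients * 1000.0 / (clients * write_ms + fsync_ms)


# Invented numbers: 1 ms to append a document, 10 ms per fsync.
WRITE_MS, FSYNC_MS = 1.0, 10.0

# One serialized client that pays for an fsync on every write.
print(round(serialized_writer_rate(WRITE_MS, FSYNC_MS), 1))  # 90.9

# The same client when the fsync is taken off the request path.
print(round(serialized_writer_rate(WRITE_MS, 0.0), 1))  # 1000.0

# Twenty concurrent clients with a full commit, sharing each fsync.
print(round(batched_rate(20, WRITE_MS, FSYNC_MS), 1))  # 666.7
```

Under these assumptions a single-writer benchmark shows roughly an 11x penalty for full commits, while the penalty under concurrent load is much smaller, which is why a tool like apache-bench makes the safe setting look worse than a multi-client workload would.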
delayed_commits false
Hi All,

delayed_commits was enabled to get better performance, especially for single writers. The price you pay is that you potentially lose up to one second of writes in case of a crash.

Such a setting makes sense, though in my opinion it shouldn't be enabled by default. I expect* that people running into performance issues at least take a look at the README or a FAQ section somewhere. The delayed_commits setting could be pointed out there.

I'd like to be able to say that on a vanilla CouchDB it's hard to lose data, but I can't atm. I'm also well aware that there will be plenty of performance tests when 1.0 is released, and that people will complain (if delayed_commits is set to false by default) that it is horribly slow. Still, the safety of the data is more important to me.

If the only reason delayed_commits is true by default is the performance tests of some noobs, I really don't think it's a price worth paying.

*I know that in reality people don't.

I would like to see delayed_commits=false for 1.0.

Cheers,
Volker
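For readers who want to flip the setting themselves, here is a minimal sketch of editing it in an ini file with Python's configparser. It assumes the option lives in the [couchdb] section as delayed_commits, as in the configuration files of this era; the helper name is made up, and where local.ini lives depends on the installation. Note that configparser drops comments when it rewrites a file, so editing a real local.ini by hand may be preferable.

```python
import configparser
import io


def set_delayed_commits(ini_text: str, enabled: bool) -> str:
    """Return `ini_text` with [couchdb] delayed_commits set to true or false."""
    # interpolation=None so that any '%' characters in other values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("couchdb"):
        parser.add_section("couchdb")
    parser.set("couchdb", "delayed_commits", "true" if enabled else "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


current = "[couchdb]\ndelayed_commits = true\n"
print(set_delayed_commits(current, enabled=False))
```

With the option set to false, the server is expected to fsync before acknowledging a write, which is the stronger default argued for in this message.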