On Thu, Aug 25, 2016 at 6:59 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Aug 25, 2016 at 11:21 AM, Simon Riggs <si...@2ndquadrant.com>
> wrote:
> > On 25 August 2016 at 02:31, Robert Haas <robertmh...@gmail.com> wrote:
> >> Furthermore, there is an enforced, synchronous fsync at the end of
> >> every segment, which actually does hurt performance on write-heavy
> >> workloads.[2] Of course, if that were the only reason to consider
> >> increasing the segment size, it would probably make more sense to just
> >> try to push that extra fsync into the background, but that's not
> >> really the case. From what I hear, the gigantic number of files is a
> >> bigger pain point.
> >
> > I think we should fully describe the problem before finding a solution.
>
> Sure, that's usually a good idea. I attempted to outline all of the
> possible issues of which I am aware in my original email, but of
> course you may know of considerations which I overlooked.
>
> > This is too big a change to just tweak a value without discussing the
> > actual issue.
>
> Again, I tried to make sure I was discussing the actual issues in my
> original email. In brief: having to run archive_command multiple
> times per second imposes very tight latency requirements on it;
> directories with hundreds of thousands or millions of files are hard
> to manage; enforced synchronous fsyncs at the end of each segment hurt
> performance.
>
> > And if the problem is as described, how can a change of x4 be enough
> > to make it worth the pain of change? I think you're already admitting
> > it can't be worth it by discussing initdb configuration.
>
> I guess it depends on how much pain of change you think there will be.
> I would expect a change from 16MB -> 64MB to be fairly painless, but
> (1) it might break tools that aren't designed to cope with differing
> segment sizes and (2) it will increase disk utilization for people who
> have such low velocity systems that they never end up with more than 2
> WAL segments, and now those segments are bigger.
> If you know of other impacts or have reason to believe those problems
> will be serious, please fill in the details.
>
> Despite the fact that initdb configuration has dominated this thread,
> I mentioned it only in the very last sentence of my email and only as
> a possibility. I believe that a 4x change will be good enough for the
> majority of people for whom this is currently a pain point. However,
> yes, I do believe that there are some people for whom it won't be
> sufficient. And I believe that as we continue to enhance PostgreSQL
> to support higher and higher transaction rates, the number of people
> who need an extra-large WAL segment size will increase. As I see it,
> there are four options here:
>
> 1. Do nothing. So far, I don't see anybody arguing for that.
>
> 2. Change the default to 64MB and call it good. This idea seems to
> have considerable support.
>
> 3. Allow initdb-time configurability but keep the default at 16MB. I
> don't see any support for this. There is clearly support for
> configurability, but I don't see anyone arguing that the current
> default is preferable, unless that is what you are arguing.
>
> 4. Change the default to 64MB and also allow initdb-time
> configurability. This option also appears to enjoy substantial
> support, perhaps more than #2. Magnus seemed to be arguing that this
> is preferable to #2, because then it's easier for people to change the
> setting back if someone discovers a case where the higher default is a
> problem; Tom, on the other hand, seems to think this is overkill.
> Personally, I believe option #4 is for the best. I believe that the
> great majority of users will be better off with 64MB than with 16MB,
> but I like the idea of allowing for smaller values (for people with
> really low-velocity instances) and larger ones (for people with really
> high-velocity instances).
>

I was not arguing for #4 over #2, at least not strongly. I think #2 is
fine, and I think #4 is fine.
#4 allows a way out, but it's not *that* important unless we go
*beyond* 64MB.

I was mainly arguing that we can't claim "it has a configure switch so
it's kinda configurable" as a way out. If we want it configurable *at
all*, it should be an initdb switch. If we are confident in our
defaults, it doesn't have to be.

I agree that #4 is best. I'm not sure it's worth the cost. I'm not
worried at all about the risk of the master/slave sync thing, per my
previous statement. But if it does have performance implications, per
Andres' suggestion, then making it configurable at initdb time probably
comes with a cost that's not worth paying.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/