> > Data loss would be an issue when there is a server crash or pg_xlog crash > > etc. That many number of pg_xlog files (1000) would contribute to huge > > data > > loss (data changes not synced to the base are not guaranteed). Of-course, > > this is not related to the current situation. Normally we calculate the > > checkpoint completion time, IO pressure, CPU load and the threat to the > > data loss when we configure checkpoint_segments. > > So you're saying that by using small number of checkpoint segments you > limit the data loss when the WAL gets corrupted/lost? That's a bit like > buying a Maseratti and then not going faster than 10mph because you might > crash at higher speeds ... >
No. I am not saying that checkpoint_segments must be lower. I was just trying to explain the IO over-head on putting high (as high as 1000) checkpoint segments. Lower number of checkpoint segments will lead to more frequent IOs which is not good. Agreed. > > The problem here is that the WAL is usually placed on more reliable drives > (compared to the data files) or a RAID1 array and as it's just writing > data sequentially, so the usage pattern is much less likely to cause > data/drive corruption (compared to data files that need to handle a lot of > random I/O, etc.). > Agreed. > So while it possible the WAL might get corrupted, the probability of data > file corruption is much higher. And the corruption might easily happen > silently during a checkpoint, so there won't be any WAL segments no matter > how many of them you keep ... > Agreed. When corruption occurs, it really does not matter how many WAL segments are kept in pg_xlog. But, at any point of time if PG needs > And by using low number of checkpoint segments it actually gets worse, > because it means more frequent checkpoints -> more I/O on the drives -> > more wearout of the drives etc. > Completely agreed. As mentioned above. I choose checkpoint_segments and checkpoint_timeout once i observe the checkpoint behavior. If you need to protect yourself against this, you need to keep a WAL > archive (prefferably on a separate machine) and/or a hot standby for > failover. > WAL archiving is a different situation where-in you need to backup the pg_xlog files by enabling archiving. I was referring to an exclusive situation, where-in pg_xlogs are not archived and data is not yet been synced to base files (by bgwriter) and the system crashed, then PG would depend on pg_xlog to recover and reach the consistent state, if the pg_xlog is also not available, then there would be a data loss and this depends on how much data is present in pg_xlog files. Thanks, VB