On Fri, Dec 16, 2016 at 7:08 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Thu, Dec 15, 2016 at 11:25:10AM +0200, Heikki Linnakangas wrote: > > On 12/15/2016 10:44 AM, Magnus Hagander wrote: > > > I wonder if there might be more corner cases like this, but in this > > > particular one it seems easy enough to just say that failing to rename > > > recovery.conf because it didn't exist is safe. > > > > Yeah. It's unexpected though, so I think erroring out is quite > reasonable. > > If the recovery.conf file went missing, who knows what else is wrong. > > i'd rather not mess up with the exiting behavior and just complain to the > pilot to not do that. This enters in the category of not removing the > backup_label file after taking a backup... > I'm happy to say that people shouldn't do that. However, we have a system that is unnecessarily fragile, by making something like that so easy to break. This could equally happen in other ways. For example, someone could accidentally provision a recovery.conf with the wrong permissions. Reducing the fragility of such an important part of the system is a big improvement, IMO. Of course, we have to be extra careful when touching those parts. Bottom line, I'm perfectly fine with failing in such a scenario. I'm not fine with leaving the user with a corrupt cluster if we can avoid it. As for your comparison, we fixed the backup_label part by adding support for taking backups without using the fragile system. So it's not like we didn't recognize the problem. > > > > But in the case of failing to rename recovery.conf for example because > of > > > permissions errors, we don't want to ignore it. But we also really > don't > > > want to end up with this kind of inconsistent data directory IMO. I > don't > > > know that code well enough to suggest how to fix it though -- hoping > for > > > input for someone who knows it closer? > > > > Hmm. Perhaps we should write the timeline history file only after > renaming > > recovery.conf. In general, we should keep the window between writing the > > timeline history file and writing the end-of-recovery record as small as > > possible. I don't think we can eliminate it completely, but it makes > sense > > to minimize it. Something like the attached (completely untested). > I haven't looked at the code either, but the reason is definitely solid - let's keep the time as short as possible. In particular, keeping it recoverable for things that are caused by humans, because they will keep making mistakes. And editing recovery.conf is a normal thing for a DBA to do (sure, not removing it -- but it's one of the few files in the data directory that the DBA is actually supposed to handle manually, so that increases the risk). And also by doing things in an order that will at least make it less likely to end up corrupt if people manage to do it anyway. I did some tests. And after a lookup it looks like a good thing to book > the timeline number at an earlier step and let other nodes know about > it. So +1 on it. > > > Looking at PrescanPreparedTransactions(), I am thinking as well that it > would > be better to get a hard failure when bumping on a corrupted 2PC file. > Future > files are one thing, but corrupted files should be treated more carefully. > Again without looking at it, I agree (so much easier that way :P). Ignoring corruption is generally a bad idea. Failing hard makes the user notice the error, and makes it possible to initiate recovery from a standby or from backups or something, or to *intentionally* remove/clear/ignore it. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/