On Mon, Jul 16, 2018 at 11:21 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> Overall, I think it's clear the main risk associated with this patch is the
> decode group code - it touches PROC entries, so a bug may cause trouble
> pretty easily. So I've focused on this part, for now.
I agree. As a general statement, I think the idea of trying to prevent
transactions from aborting is really scary. It's almost an axiom of the
system that we're always allowed to abort, and I think there could be a
lot of unintended and difficult-to-fix consequences of undermining that
guarantee. I think it will be very difficult to create a sound system
for delaying transaction aborts, and I doubt very much that the
proposed system is sound. In particular:

- The do_wait loop contains a CHECK_FOR_INTERRUPTS(). If that makes the
wait interruptible, then it's possible for the abort to complete before
the decoding processes have aborted. If that can happen, then this
whole mechanism is completely pointless, because it fails to actually
achieve the guarantee which is its central goal. On the other hand, if
you don't make this abort interruptible, then you significantly
increase the risk that a backend could get stuck in the abort path for
an unbounded period of time. If the aborting backend holds any
significant resources at this point, such as heavyweight locks, then
you risk creating a deadlock that cannot be broken until the decoding
process manages to abort, and if that process is involved in the
deadlock, then you risk creating an unbreakable deadlock.

- BackendXidGetProc() seems to be called in multiple places without any
lock held. I don't see how that can be safe, because AFAICS it must
inevitably introduce a race condition: the answer can change after that
value is returned but before it is used. (A stand-alone sketch of the
kind of check-then-use race I mean is at the end of this mail.) There's
a bunch of recheck logic that looks like it is trying to cope with this
problem, but I'm not sure it's very solid. For example,
AssignDecodeGroupLeader reads proc->decodeGroupLeader without holding
any lock; we have historically assumed that reads cannot be torn only
for values 4 bytes wide or narrower, not for pointer-width values.
There are no comments about the locking hazards here, and no real
explanation of how the recheck algorithm tries to patch things up:

+       leader = BackendXidGetProc(xid);
+       if (!leader || leader != proc)
+       {
+               LWLockRelease(leader_lwlock);
+               return NULL;
+       }

Can leader be non-NULL yet unequal to proc? I don't understand how that
can happen: surely once the PGPROC that has that XID aborts, the same
XID can't possibly be assigned to a different PGPROC.

- The code for releasing PGPROCs in ProcKill looks completely unsafe to
me. With locking groups for parallel query, a process always enters a
lock group of its own volition. It can safely use
(MyProc->lockGroupLeader != NULL) as a race-free test because no other
process can modify that value. But in this implementation of decoding
groups, one process can put another process into a decoding group,
which means this test has a race condition. If there's some reason this
is safe, the comments sure don't explain it.

I don't want to overplay my hand, but I think this code is a very long
way from being committable, and I am concerned that the fundamental
approach of blocking transaction aborts may be unsalvageably broken or
at least exceedingly dangerous.
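To make the check-then-use hazard concrete, here is a minimal
stand-alone sketch. It is not patch code: "Leader", "shared_leader",
and the pthread mutex are invented stand-ins for the PGPROC, the
decodeGroupLeader pointer, and whatever LWLock is supposed to protect
it. The point is only that a lock held across the lookup alone buys
nothing if the object can be torn down between the lookup and the use:

/*
 * Stand-alone illustration of a check-then-use race.  Not patch code:
 * "Leader", "shared_leader", and "leader_lock" are invented stand-ins
 * for the PGPROC, proc->decodeGroupLeader, and the lock involved.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Leader
{
    int         xid;        /* stand-in for the leader's transaction id */
} Leader;

static Leader *shared_leader;               /* set up in main() */
static pthread_mutex_t leader_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Racy pattern: the lock covers only the lookup, not the use.  Between
 * the unlock and the printf, the writer below may free the object, so
 * the dereference can touch a leader that no longer exists.
 */
static void
racy_reader(void)
{
    Leader     *leader;

    pthread_mutex_lock(&leader_lock);
    leader = shared_leader;                 /* the "BackendXidGetProc()" step */
    pthread_mutex_unlock(&leader_lock);

    if (leader != NULL)                     /* stale answer is possible here */
        printf("racy reader saw xid %d\n", leader->xid);
}

/*
 * Safe pattern: hold the lock across both the lookup and the use, so
 * the writer cannot tear the object down in between.
 */
static void
safe_reader(void)
{
    pthread_mutex_lock(&leader_lock);
    if (shared_leader != NULL)
        printf("safe reader saw xid %d\n", shared_leader->xid);
    pthread_mutex_unlock(&leader_lock);
}

/* Stand-in for the aborting/exiting backend tearing the group down. */
static void *
writer(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&leader_lock);
    free(shared_leader);
    shared_leader = NULL;
    pthread_mutex_unlock(&leader_lock);
    return NULL;
}

int
main(void)
{
    pthread_t   tid;

    shared_leader = malloc(sizeof(Leader));
    shared_leader->xid = 42;

    pthread_create(&tid, NULL, writer, NULL);
    racy_reader();                          /* may use a just-freed leader */
    pthread_join(tid, NULL);

    safe_reader();                          /* sees NULL, does nothing */
    return 0;
}

Unless something else pins the PGPROC, every place that dereferences
the result of BackendXidGetProc() seems to need the equivalent of the
second pattern.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company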