On 2014-01-21 19:23:57 -0500, Tom Lane wrote:
> Andres Freund <and...@2ndquadrant.com> writes:
> > On 2014-01-21 18:59:13 -0500, Tom Lane wrote:
> >> Another thing to think about is whether we couldn't put a hard limit on
> >> WAL record size somehow.  Multi-megabyte WAL records are an abuse of the
> >> design anyway, when you get right down to it.  So for example maybe we
> >> could split up commit records, with most of the bulky information dumped
> >> into separate records that appear before the "real commit".  This would
> >> complicate replay --- in particular, if we abort the transaction after
> >> writing a few such records, how does the replayer realize that it can
> >> forget about those records?  But that sounds probably surmountable.
>
> > I think removing the list of subtransactions from commit records would
> > essentially require not truncating pg_subtrans after a restart anymore.
>
> I'm not suggesting that we stop providing that information!  I'm just
> saying that we perhaps don't need to store it all in one WAL record,
> if instead we put the onus on WAL replay to be able to reconstruct what
> it needs from a series of WAL records.
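For illustration, the replay scheme Tom describes could look roughly like the
toy model below. This is not PostgreSQL code; the record kinds and payloads are
made up, and it only shows the bookkeeping: subsidiary records are buffered
per-xid until the atomic commit record arrives, and are simply forgotten on
abort or at end of recovery.

```python
def replay(records):
    """Toy replayer: records are (kind, xid, payload) tuples.

    "SUBSIDIARY" records (e.g. chunks of a subxact list) are buffered
    per transaction; a "COMMIT" makes the buffered chunks durable
    together with the commit payload, an "ABORT" (or never seeing a
    commit at all) just drops them.
    """
    pending = {}   # xid -> list of buffered subsidiary payloads
    applied = []
    for kind, xid, payload in records:
        if kind == "SUBSIDIARY":
            pending.setdefault(xid, []).append(payload)
        elif kind == "COMMIT":
            applied.append((xid, pending.pop(xid, []) + [payload]))
        elif kind == "ABORT":
            pending.pop(xid, None)   # replayer can forget these records
    # anything left in `pending` belongs to transactions that never
    # committed before the end of WAL and can be discarded
    return applied

wal = [
    ("SUBSIDIARY", 7, "subxacts 8-9"),
    ("SUBSIDIARY", 7, "subxact 10"),
    ("COMMIT",     7, "commit of 7"),
    ("SUBSIDIARY", 8, "subxact 11"),
    ("ABORT",      8, "abort of 8"),
]
```

The memory cost of `pending` is exactly the reassembly overhead the reply
below objects to.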
That'd likely require something similar to the incomplete actions used in
btrees (and until recently in more places). I think that is/was a disaster
I really don't want to extend.

> > We could relatively easily split off logging the dropped files from
> > commit records and log them in groups afterwards; we already have
> > several races that allow us to leak files.
>
> I was thinking the other way around: emit the subsidiary records before
> the atomic commit or abort record, indeed before we've actually committed.
> Part of the point is to reduce the risk that lack of WAL space would
> prevent us from fully committing.
>
> Replay would then involve either accumulating the subsidiary records in
> memory, or being willing to go back and re-read them when the real commit
> or abort record is seen.

Well, the reason I suggested doing it the other way round is that we
wouldn't need to reassemble anything (outside of cache invalidations,
which I don't know how to handle that way), which I think is a significant
increase in robustness and decrease in complexity.

> Also, writing those records afterwards increases the risk of a
> post-commit failure, which is a bad thing.

Well, most of those writes could be done outside of a critical section,
possibly just FATALing out on failure. That beats PANICing.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers