I wrote:
> What is happening of course is that more than 16K subtransaction IDs
> won't fit in a commit record (since XLOG records have a 16-bit length
> field).  We're gonna have to rethink the representation of subxact
> commit in XLOG.
After some further thought, I think there are basically two ways to attack this:

1. Allow XLOG records to be larger than 64K.

2. Split transaction commit into multiple XLOG records when there are many subtransactions.

#2 looks pretty painful because of the need to ensure that transaction commit is still an atomic action.  It's probably doable in principle with something similar to the solution we are using for btree page split logging (ie, record enough info so that the replay logic can complete the commit even if the later records aren't recoverable from the log).  But I don't see all the details right off, and it sure seems risky.

I'm inclined to go with #1.  There are various ways we could do it, but the most straightforward would be to just widen the xl_len field to 32 bits.  This would cost either 4 or 8 bytes per XLOG record (because of MAXALIGN restrictions), but we could more than buy that back by eliminating the xl_prev and/or xl_xact_prev fields, which have no use in the current system.  (They were intended to support UNDO, but it seems clear that we will never do that.)

Or we could assign an rmgr value to represent an "extension" record that is to be merged with a following "normal" record.  This is kinda klugy, but would avoid wasting bits on xl_len in the vast majority of records.  Also we'd not have to force an initdb, since the file format would remain upward-compatible.

Thoughts?

			regards, tom lane