On Sun, Nov 15, 2020 at 11:24 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Nov 13, 2020 at 6:02 PM Antonin Houska <a...@cybertec.at> wrote: > > > > Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > On Thu, Nov 12, 2020 at 2:45 PM Antonin Houska <a...@cybertec.at> wrote: > > > > > > > > > > > > No background undo > > > > ------------------ > > > > > > > > Reduced complexity of the patch seems to be the priority at the moment. > > > > Amit > > > > suggested that cleanup of an orphaned relation file is simple enough to > > > > be > > > > done on foreground and I agree. > > > > > > > > > > Yeah, I think we should try and see if we can make it work but I > > > noticed that there are few places like AbortOutOfAnyTransaction where > > > we have the assumption that undo will be executed in the background. > > > We need to deal with it. > > > > I think this is o.k. if we always check for unapplied undo during startup. > > > > Hmm, how it is ok to leave undo (and rely on startup) unless it is a > PANIC error. IIRC, this path is invoked in non-panic errors as well. > Basically, we won't be able to discard such an undo which doesn't seem > like a good idea. > > > > > "undo worker" is still there, but it only processes undo requests after > > > > server > > > > restart because relation data can only be changed in a transaction - it > > > > seems > > > > cleaner to launch a background worker for this than to hack the startup > > > > process. > > > > > > > > > > But, I see there are still multiple undoworkers that are getting > > > launched and I am not sure if that works correctly because a > > > particular undoworker is connected to a database and then it starts > > > processing all the pending undo. > > > > Each undo worker applies only transactions for its own database, see > > ProcessExistingUndoRequests(): > > > > /* We can only process undo of the database we are connected to. */ > > if (xact_hdr.dboid != MyDatabaseId) > > continue; > > > > Nevertheless, as I've just mentioned in my response to Thomas, I admit that > > we > > should try to live w/o the undo worker altogether. > > > > Okay, but keep in mind that there could be a large amount of undo > (unlike redo which has some limit as we can replay it from the last > checkpoint) which needs to be processed but it might be okay to live > with that for now. Another thing is that it seems we need to connect > to the database to perform it which might appear a bit odd that we > don't allow users to connect to the database but internally we are > connecting it. These are just some points to consider while finalizing > the solution to this. > > > > > Discarding > > > > ---------- > > > > > > > > There's no discard worker for the URS infrastructure yet. I thought > > > > about > > > > discarding the undo log during checkpoint, but checkpoint should > > > > probably do > > > > more straightforward tasks than the calculation of a new discard > > > > pointer for > > > > each undo log, so a background worker is needed. A few notes on that: > > > > > > > > * until the zheap AM gets added, only the transaction that creates > > > > the undo > > > > records needs to access them. This assumption should make the > > > > discarding > > > > algorithm a bit simpler. Note that with zheap, the other > > > > transactions need > > > > to look for old versions of tuples, so the concept of > > > > oldestXidHavingUndo > > > > variable is needed there. > > > > > > > > * it's rather simple to pass pointer the URS pointer to the discard > > > > worker > > > > when transaction either committed or the undo has been executed. > > > > > > > > > > Why can't we have a separate discard worker which keeps on scanning > > > the undorecords and discard accordingly? Giving the onus of foreground > > > process might be tricky because say discard worker is not up to speed > > > and we ran out of space to pass such information for each commit/abort > > > request. > > > > Sure, there should be a discard worker. The question is how to make its work > > efficient. The initial run after restart probably needs to scan everything > > between 'discard' and 'insert' pointers, > > > > Yeah, such an initial scan would be helpful to identify pending aborts > and allow them to be processed. > > > but then it should process only the > > parts created by individual transactions. > > > > Yeah, it needs to process transaction-by-transaction to see which all > we can discard. Also, note that in Single-User mode we need to discard > undo after commit. I think we also need to maintain > oldestXidHavingUndo for CLOG truncation and transaction-wraparound. We > can't allow CLOG truncation for the transaction whose undo is not > discarded as that could be required by some other transaction. For > similar reasons, we can't allow transaction-wraparound and we need to > integrate this into the existing xid-allocation mechanism. I have > found one of the old patch > (Allow-execution-and-discard-of-undo-by-background-wo) attached >
oops, forgot to attach the patch, doing now. -- With Regards, Amit Kapila.
Allow-execution-and-discard-of-undo-by-background-wo.patch
Description: Binary data