Re: [HACKERS] Performance and WAL on big inserts/updates

Marty Scholes Thu, 11 Mar 2004 19:40:26 -0800

> The point is that with redo logging, you can just blindly apply the
> log to the data pages in question, without even really restarting the
> database.

I also am not a recovery expert, but I have watched it happen more than once.

You bring up a good point. My (perhaps false) understanding with recovery/redo was that the last checkpointed state was recreated, then log changes that finished with a commit were applied and the rest discarded.

Actually, this approach would not be ideal, since the last checkpointed state would be unavailable if any data file writes took place between the checkpoint and the crash.

Further, post-checkpoint log entries would surely contain multiple copies of the same data block, and there is no point in applying any but the last log entry for a particular block.

Ok. To recap:

* Logging parsed statements don't apply to light updates and most installations live mostly on light updates * Logging parsed statements would not work if the checkpoint state could not be recreated, which is likely the case.

So, this soluition won't work, and even if it did, would not apply to the vast majority of users.

Hmmm... No wonder it hasn't been implemented yet. ;-)

Thanks again,
Marty

Sailesh Krishnamurthy wrote:

(Just a note: my comments are not pg-specific .. indeed I don't know
much about pg recovery).
"Marty" == Marty Scholes <[EMAIL PROTECTED]> writes:
    Marty> If the DB state cannot be put back to a consistent state
    Marty> prior to a SQL statement in the log, then NO amount of
    Marty> logging will help.  The idea is that the state can be put
    Marty> back to what it was prior to a particular log entry, be it
    Marty> raw datafile blocks or a SQL statement.
The point is that with redo logging, you can just blindly apply the
log to the data pages in question, without even really restarting the
database.
Note that in ARIES, recovery follows: (1) analysis, (2) redo _everything_ since last checkpoint, (3) undo losers.

So logging carefully will indeed help get the system to a consistent state - actually after phase (2) above the system will be in precisely the state during the crash .. and all that's left to do is undo all the live transactions (losers).
BTW, logging raw datafile blocks would be pretty gross (physical
logging) and so ARIES logs the changes to each tuple in "logical"
fashion .. so if only one column changes only that value (before and
after images) are logged. This is what's called "physiological
logging".
    Marty> See above.  If this cannot be resolved prior to
    Marty> re-executing a statement in the log, then the problem is
    Marty> beyond ANY subsequent logging.
Not true ! By applying the log entries carefully you should be able to restore the system to a consistent state.

>> Having said that, page-oriented undo logging can be a pain when >> B-tree pages split. For higher concurrency, ARIES uses logical >> undo logging. In this case, the logs are akin to your "parsed >> statement" idea. >>
    Marty> Yes, my experience exactly.  Maybe we are the only company
    Marty> on the planet that experiences this sort of thing.  Maybe
Well, logical undo is still at a much lower level than parsed statements. Each logical undo log is something like "delete key 5 from index xyz".
    Marty> Maybe this is not a "traditional" RDBMS app, but I am not
    Marty> in the mood to write my own storage infrastructure for it.
I agree that your app has a lot of updates .. it's just that I'm not convinced that logical logging is a clean solution.

I also don't have a solution for your problem :-)


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
     subscribe-nomail command to [EMAIL PROTECTED] so that your
     message can get through to the mailing list cleanly

Re: [HACKERS] Performance and WAL on big inserts/updates

Reply via email to