Re: [jira] Commented: (DERBY-298) rollforward will not work correctly if the system happens to crash immediately after rollforward backup.

Øystein Grøvlen Thu, 26 May 2005 15:33:26 -0700

>>>>> "MM" == Mike Matrigali <[EMAIL PROTECTED]> writes:


    MM> As you suggest I think there needs to be some way for recovery to
    MM> determine a successful log switch has happened and to not log to the
    MM> old file.  I don't think a log record for the switch works as you need
    MM> to read the log to apply the record, but I think this is problematical
    MM> as you are trying to determine validity of the log by looking at a
    MM> record in the log. I think some approach as you and suresh suggest
    MM> with markers in the header of the individual log files is a better
    MM> approach.

Assuming sync write, the steps to do a log switch are:

1. Check if file with next file number exists.  If so, delete it
2. Open new log file
3. Initialize file (write header) and sync
4. Write end marker to old file 
5. Flush old file to disk
6. Close old file
7. Preallocate space in new log file and sync
8. Close new file
9. Reopen new file in RWS mode
10. Go to current end position (after header).
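
A minimal Java sketch of the ten steps above, for illustration only.
It is not Derby's actual code: the class name, the header layout (a
magic int plus the file number), the end marker value, the file naming
and the preallocation size are all assumptions made up for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class LogSwitchSketch {
    // All constants are hypothetical, chosen only for this sketch.
    static final int MAGIC = 0x4C4F4746;        // header tag
    static final int HEADER_SIZE = 4 + 8;       // magic int + file number long
    static final int END_MARKER = 0;            // marks end of the old file
    static final int PREALLOC_BYTES = 64 * 1024;

    static File logFile(File dir, long number) {
        return new File(dir, "log" + number + ".dat");
    }

    /**
     * Switches from oldLog (open and positioned at its end) to the file
     * numbered next. Returns the new file, positioned after its header.
     */
    static RandomAccessFile switchLog(File dir, RandomAccessFile oldLog, long next)
            throws IOException {
        File f = logFile(dir, next);
        // 1. Delete a stale file with the next file number, if any.
        if (f.exists() && !f.delete()) {
            throw new IOException("cannot delete stale log file " + f);
        }
        // 2. Open the new log file.
        RandomAccessFile newLog = new RandomAccessFile(f, "rw");
        // 3. Initialize the file (write header) and sync.
        newLog.writeInt(MAGIC);
        newLog.writeLong(next);
        newLog.getFD().sync();
        // 4. Write the end marker to the old file.
        oldLog.writeInt(END_MARKER);
        // 5. Flush the old file to disk.
        oldLog.getFD().sync();
        // 6. Close the old file.
        oldLog.close();
        // 7. Preallocate space in the new file (zero-filled here) and sync.
        newLog.write(new byte[PREALLOC_BYTES]);
        newLog.getFD().sync();
        // 8. Close the new file.
        newLog.close();
        // 9. Reopen the new file in RWS mode (synchronous writes).
        newLog = new RandomAccessFile(f, "rws");
        // 10. Go to the current end position (right after the header).
        newLog.seek(HEADER_SIZE);
        return newLog;
    }
}
```

Note that nothing in these steps records that the switch finished; a
crash anywhere after step 3 leaves a file with a valid header.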

Basically, if a crash occurs before step 3 is finished, the new log
file will just be ignored on recovery because the validation of the
header fails.  Should the crash occur after step 3, redo processing
will try to access the first log record of the new file.  If there is
none, the last good log record pointer will not be updated and logging
will continue in the previous file.  This is OK if the crash occurred
before the log switch was completed, since then the operation that
initiated the log switch would not have completed either (e.g., a
backup would fail).  As we have agreed, we need something to tell
recovery that the log switch was completed.

However, there is currently nothing that prevents recovery from
interpreting garbage following the header as a log record and
wrongly concluding that the log switch was completed.  Even if it is
not very likely that this would happen, I suggest that to be safe we
write an integer 0 after the header as part of step 3. In a new step
11, we could then write a log switch log record whose length field
would overwrite the 0 written in step 3.  This way, depending on the
step during which the crash occurred, the following would happen
during the redo scan using the existing recovery implementation:

Crash during steps 1 to 3: New file is not valid, continue to use old log file
Crash during steps 4 to 11: New file is empty, continue to use old log file
Crash after step 11: Process log switch log record, next log record
                     will be allocated after this record.

I think this is the behavior we want.
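
The three outcomes above can be sketched in Java as follows.  This is
only an illustration under assumptions, not the existing recovery
code: the class, the enum, the header layout (magic int plus file
number) and the helper methods are hypothetical.  The only idea taken
from the proposal is the integer 0 written after the header in step 3
and overwritten by the record length in step 11.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class RedoScanSketch {
    enum NewFileState { INVALID_HEADER, EMPTY, SWITCH_RECORD_PRESENT }

    static final int MAGIC = 0x4C4F4746;   // hypothetical header tag
    static final int HEADER_SIZE = 4 + 8;  // magic int + file number long

    /** Step 3 as proposed: header followed by an integer 0, then sync. */
    static void initialize(File f, long number) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(f, "rw")) {
            out.writeInt(MAGIC);
            out.writeLong(number);
            out.writeInt(0);               // "no log record yet"
            out.getFD().sync();
        }
    }

    /** Proposed step 11: the record length overwrites the 0 from step 3. */
    static void writeSwitchRecord(File f, byte[] payload) throws IOException {
        // Simplification: length and payload are two separate writes here.
        try (RandomAccessFile out = new RandomAccessFile(f, "rws")) {
            out.seek(HEADER_SIZE);
            out.writeInt(payload.length);
            out.write(payload);
        }
    }

    /** What a redo scan would conclude about the new file. */
    static NewFileState classify(File f, long expectedNumber) throws IOException {
        // Missing or truncated file: step 3 never finished.
        if (!f.exists() || f.length() < HEADER_SIZE + 4) {
            return NewFileState.INVALID_HEADER;
        }
        try (RandomAccessFile in = new RandomAccessFile(f, "r")) {
            if (in.readInt() != MAGIC || in.readLong() != expectedNumber) {
                return NewFileState.INVALID_HEADER;
            }
            // A positive length means the log switch record was written.
            int length = in.readInt();
            return length > 0 ? NewFileState.SWITCH_RECORD_PRESENT
                              : NewFileState.EMPTY;
        }
    }
}
```

With INVALID_HEADER or EMPTY, recovery would keep logging to the old
file; with SWITCH_RECORD_PRESENT, it would continue after the switch
record in the new file.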

-- 
Øystein
