[ 
https://issues.apache.org/jira/browse/DERBY-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jørgen Løland updated DERBY-3071:
---------------------------------

    Attachment: derby-3071_continuous-recovery_1c.diff

Thank's for reviewing the patch, Øystein. There is a lot
happening here since two threads are working on the same object
and requires access to the same information but must not interfere 
with one another.

Patch 1c addresses most of your comments.

I see that some of the intricacies arise because I tried to split
the "big patch" into multiple smaller patches to ease the review
process. I may have achieved the opposite, especially regarding
comment 3. I'll do my best to explain:

> 1 a), b), c), e), 4 a), 5

Changed as requested. 

>    d) I do not quite understand why you need to set logFileNumber to
>       bootTimeLogFileNumber at end of recovery since the next thing
>       you will normally do is to set it to something else based on
>       logEnd.

After the redo pass? logFileNumber is used in the                               
"if (logEnd == LogCounter.INVALID_LOG_INSTANT)" block; it would not be 
necessary otherwise. 

>    f) Comments for these fields are (partly?) below the declaration.
>       This is a bit unorthodox, and I feel some empty lines could be
>       added in order to make clear whether comments for declaration of
>       fields apply to field above or below comment.

Fixed for bootTimeLogFileNumber, not the existing comment for
logFileNumber

>    g) I think it would be good with I comment that states why you
>       have to wait for the first log file to be complete in the
>       beginning of recover().  I guess it because
>       getLogFileAtBeginning is not called on this file.  Hence, the
>       normal stop mechanism will not work.  Given that you have this
>       test, is the test for slaveRecoveryBlocked really needed?

You are right. slaveRecoveryBlocked can be removed from
the "slave-block" in the beginning of recover(). Additionaly: no
longer needed in getLogFileAtBeginning. The new patch rather
exploits the fact that recovery does not call
getLogFileAtBeginning until allowedToReadFileNumber has been
updated for the first time. Also added comment as suggested.

> 2. What is preventing someone from executing slave start before boot
>    has completed?  I would think that could create some nasty race
>    conditions.  (E.g., if initializeSlaveReplication() uses
>    logFileNumber before it has been initialized by boot()).  Maybe it
>    is the responsibility of the caller to synchronize this, but then
>    the javadoc for initializeSlaveReplication() should say so.

Good point. The race condition will not occur since the log
factory is booted before the slave controller (and thereby the
slave thread) is started in RawStore#boot(). This "factory
booting" is single-threaded. I added a comment as suggested.

> 3. What happens if failover is initiated before the first log file
>    switch?  Will not that require a notification and that
>    allowedToReadFileNumber is updated in order to get recovery going?

It will require a notify. This notify will be added in a followup
patch containing a "LogToFile#stopRepliationSlaveRole()" method.
The allowedToReadFileNumber variable does not need to be updated
because inReplicationSlaveMode will be false after failover.

> 4 b) Is the MT comment in the javadoc still valid?

Yes


> Replication: Modify logging subsystem for slave replication mode
> ----------------------------------------------------------------
>
>                 Key: DERBY-3071
>                 URL: https://issues.apache.org/jira/browse/DERBY-3071
>             Project: Derby
>          Issue Type: Sub-task
>          Components: Services, Store
>            Reporter: Jørgen Løland
>            Assignee: Jørgen Løland
>         Attachments: derby-3071_continuous-recovery_1a.diff, 
> derby-3071_continuous-recovery_1a.stat, 
> derby-3071_continuous-recovery_1b.diff, derby-3071_continuous-recovery_1c.diff
>
>
> When a database is booted in slave replication mode, it should apply log 
> records received from the master but must not generate log by itself. As 
> described in the functional specification (see DERBY-2872), a database booted 
> in slave mode should enter LogToFile#recover, but not leave this method until 
> the database is no longer in slave mode. 
> The current plan for this issue is to modify LogToFile the following ways:
> * LogToFile is put in slave mode automatically during boot (if property 
> SlaveFactory.SLAVE_MODE is set, see DERBY-3021), but a method is needed to 
> take LogToFile out of recovery mode.
> * SlaveFactory (DERBY-3021) will receive log records from the master and use 
> LogToFile#appendLogRecord to write these to disk. While in slave mode, only 
> SlaveFactory will be allowed to append log records.
> * The thread running LogToFile#recover will recover (redo) one log file at a 
> time (like now), but will not be allowed to open a log file X until that file 
> is no longer being written to. Thus, while appenLogFile writes to logX.dat, 
> recover will be allowed to read all log files up to and including logX-1.dat 
> but will then have to wait until appendLogRecord starts writing to logX+1.dat.
> All the described changes will only apply when in slave mode

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to