Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document

Giridhar Addepalli Sun, 28 Sep 2014 02:19:06 -0700

Hi All,

I am going through Quorum Journal Design document.


It is mentioned in Section 2.8 - In Accept Recovery RPC section
"
If the current on-disk log is missing, or a *different length *than the
proposed recovery, the JN downloads the log from the provided URI,
replacing any current copy of the log segment.
"

I can see it that the code follows above design

Source :: Journal.java
             ....

  public synchronized void acceptRecovery(RequestInfo reqInfo,
      SegmentStateProto segment, URL fromUrl)
      throws IOException {

      ....
      if (currentSegment == null ||
        currentSegment.getEndTxId() != segment.getEndTxId()) {
      ....
      } else {
      LOG.info("Skipping download of log " +
          TextFormat.shortDebugString(segment) +
          ": already have up-to-date logs");
      }
      ....
  }
....

My question is what if on-disk log is present and is of *same length *as
the proposed recovery

If JournalNode is skipping download because the logs are of same length,
then we could end up in a situation where finalized log segments contain
different data !

This could happen if we follow example 2.10.6

As per that example we write transactions (151-153 ) on JN1
then when recovery proceeded with only JN2 & JN3 let us assume that we
write again *different transactions* as (151-153) . Then after the crash
when we run recovery , JN1 will skip downloading correct segment from
JN2/JN3 as it thinks it has correct segment( as per the code pasted above).
This will result in a situation where finalized segment ( edits_151-153 )
on JN1 is different from finalized segment edits_151-153 on JN2/JN3.

Please let me know if i have gone wrong some where, and this situation is
taken care of.

Thanks,
Giridhar.

Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document

Reply via email to