Hi

A developer should answer this, but a quick look at an edits file with od suggests that records are not fixed length. So the likelihood of the situation you describe may be so low that there is no need to check more than the file size.

Ulul

On 28/09/2014 11:17, Giridhar Addepalli wrote:
Hi All,

I am going through the Quorum Journal design document.

Section 2.8, in the Accept Recovery RPC part, says:
"
If the current on-disk log is missing, or a /different length/ than the proposed recovery, the JN downloads the log from the provided URI, replacing any current copy of the log segment.
"

I can see that the code follows the above design:

Source :: Journal.java
    ....

    public synchronized void acceptRecovery(RequestInfo reqInfo,
        SegmentStateProto segment, URL fromUrl)
        throws IOException {

      ....
      if (currentSegment == null ||
          currentSegment.getEndTxId() != segment.getEndTxId()) {
        ....
      } else {
        LOG.info("Skipping download of log " +
            TextFormat.shortDebugString(segment) +
            ": already have up-to-date logs");
      }
      ....
    }
    ....

My question is: what if the on-disk log is present and is of the /same length/ as the proposed recovery?

If the JournalNode skips the download because the logs are of the same length, then we could end up in a situation where finalized log segments contain different data!

This could happen if we follow example 2.10.6.

As per that example, we write transactions (151-153) on JN1.
Then, when recovery proceeds with only JN2 & JN3, let us assume that /different transactions/ are written as (151-153). After the crash, when we run recovery, JN1 will skip downloading the correct segment from JN2/JN3 because it thinks it already has the correct segment (as per the code pasted above). This results in a situation where the finalized segment (edits_151-153) on JN1 differs from the finalized segment edits_151-153 on JN2/JN3.
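
To make the scenario concrete, here is a minimal sketch of the comparison in isolation. It is not the real Journal.java: the Segment class, shouldDownload and sameContents are hypothetical names invented for illustration, and the parts of acceptRecovery elided above may contain other safeguards that this sketch does not model. It only shows that a check based on the end transaction ID alone cannot tell apart two segments that cover the same range but hold different bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SegmentSkipSketch {

    // Hypothetical stand-in for a log segment: a transaction ID range plus raw bytes.
    static final class Segment {
        final long startTxId;
        final long endTxId;
        final byte[] data;

        Segment(long startTxId, long endTxId, byte[] data) {
            this.startTxId = startTxId;
            this.endTxId = endTxId;
            this.data = data;
        }
    }

    // Mirrors only the condition visible in the excerpt: download when there is
    // no local segment or when the end transaction IDs differ.
    static boolean shouldDownload(Segment current, Segment proposed) {
        return current == null || current.endTxId != proposed.endTxId;
    }

    // Byte-for-byte comparison, used here only to show what the check above misses.
    static boolean sameContents(Segment a, Segment b) {
        return Arrays.equals(a.data, b.data);
    }

    public static void main(String[] args) {
        // Assumed contents: JN1 holds the first write of 151-153,
        // JN2/JN3 hold a different write of the same range.
        Segment onJn1 = new Segment(151, 153,
            "old-151 old-152 old-153".getBytes(StandardCharsets.UTF_8));
        Segment onJn2 = new Segment(151, 153,
            "new-151 new-152 new-153".getBytes(StandardCharsets.UTF_8));

        System.out.println("download needed: " + shouldDownload(onJn1, onJn2)); // false
        System.out.println("contents equal: " + sameContents(onJn1, onJn2));    // false
    }
}
```

In this toy model JN1 would skip the download even though the contents differ, which is exactly the situation I am asking about.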

Please let me know if I have gone wrong somewhere, or whether this situation is already taken care of.

Thanks,
Giridhar.
