keith-turner opened a new issue #542: Could add stateful checks to WAL recovery code
URL: https://github.com/apache/accumulo/issues/542
 
 
   Data is written to WALs in temporal order.  Mutations are written to a WAL 
with per-tablet sequence numbers.  A tablet's sequence number does not change 
until a minor compaction occurs.  Each minor compaction is recorded in the 
WAL.  
   
   Below is an example of a WAL, in the order it was written, preceded by an 
explanation of its contents.
    * Defines tablet `2<<` with id `5`.  Everything else in the log will use 
the id `5`.
    * A mutation to set row `r1` column `f1:q1` to `v1`.  This mutation has a 
seq of `1` in the WAL.  
    * A mutation setting `r1 f1:q2=v2` with a seq of `1`
    * Compaction start event with seq `2`
    * Compaction finish event with seq `3`
    * A mutation setting `r1 f1:q1=v3` with a seq of `3`
    * Etc
   
   ```
   DEFINE_TABLET 5 1 2<<
   
   MANY_MUTATIONS 5 1
   1 mutations:
     r1
         f1:q1 [system]:1529685833137 [] v1
   
   MANY_MUTATIONS 5 1
   1 mutations:
     r1
         f1:q2 [system]:1529685833149 [] v2
   
   COMPACTION_START 5 2 hdfs://localhost:8020/accumulo/tables/2/default_tablet/F0000005.rf
   
   COMPACTION_FINISH 5 3
   
   MANY_MUTATIONS 5 3
   1 mutations:
     r1
         f1:q1 [system]:1529685849576 [] v3
   
   COMPACTION_START 5 4 hdfs://localhost:8020/accumulo/tables/2/default_tablet/F0000006.rf
   
   COMPACTION_FINISH 5 5
   
   MANY_MUTATIONS 5 5
   1 mutations:
     r1
         f1:q1 [system]:1529685856321 [] v4
   
   MANY_MUTATIONS 5 5
   1 mutations:
     r1
         f1:q2 [system]:1529685867727 [] v5
   ```
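
To make the layout above concrete, here is a minimal Python sketch that pulls the event headers out of a textual dump like this one. It assumes each header line has the form `EVENT tablet_id seq`, as the example suggests. The names `WalEvent`, `parse_wal_dump` and `SAMPLE` are hypothetical and are not part of Accumulo.

```python
import re
from typing import List, NamedTuple


class WalEvent(NamedTuple):
    kind: str       # e.g. "MANY_MUTATIONS" or "COMPACTION_START"
    tablet_id: int  # id assigned by DEFINE_TABLET
    seq: int        # per-tablet sequence number


# Assumed header layout: EVENT <tablet id> <seq> [extra text].
# Mutation detail lines ("1 mutations:", "r1", ...) do not match and are skipped.
_HEADER = re.compile(
    r"^\s*(DEFINE_TABLET|MANY_MUTATIONS|COMPACTION_START|COMPACTION_FINISH)"
    r"\s+(\d+)\s+(\d+)\b"
)


def parse_wal_dump(text: str) -> List[WalEvent]:
    """Return the event headers found in a textual WAL dump, in file order."""
    events = []
    for line in text.splitlines():
        match = _HEADER.match(line)
        if match:
            events.append(
                WalEvent(match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return events


# Abridged copy of the example above.
SAMPLE = """
DEFINE_TABLET 5 1 2<<

MANY_MUTATIONS 5 1
1 mutations:
  r1
      f1:q1 [system]:1529685833137 [] v1

COMPACTION_START 5 2 hdfs://localhost:8020/accumulo/tables/2/default_tablet/F0000005.rf

COMPACTION_FINISH 5 3

MANY_MUTATIONS 5 3
1 mutations:
  r1
      f1:q1 [system]:1529685849576 [] v3
"""

if __name__ == "__main__":
    for event in parse_wal_dump(SAMPLE):
        print(event)
```

The sketch keeps only the event type, tablet id and seq, since those are the fields the ordering discussion depends on.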
   
   Given the example above, it would be odd to see mutations in a WAL with 
sequence numbers `X`, `X+2`, and `X+4` without seeing corresponding compaction 
events between the mutations.  So we could add two types of sanity checks to 
the recovery code:
   
    * Check that compaction start and finish events increment in an orderly 
way.  They should increment by one.  Seeing them jump by more than one may 
indicate data is missing.
    * Check that mutation seq numbers in the WAL are in an expected range.  Not 
completely sure, but that range may be: `[min_compaction_finish-2, 
max_compaction_finish]`.  
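
The two checks could be sketched roughly as follows. This is an illustration in Python, not the recovery code itself. It assumes events arrive as `(kind, tablet_id, seq)` tuples in WAL order and uses the tentative range `[min_compaction_finish-2, max_compaction_finish]` from above, which may turn out to be wrong. The name `check_wal_events` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int, int]  # (kind, tablet_id, seq), in WAL order


def check_wal_events(events: Iterable[Event]) -> List[str]:
    """Return a description of each failed check; an empty list means none failed."""
    problems: List[str] = []
    last_compaction_seq: Dict[int, int] = {}  # tablet -> seq of latest compaction event
    finishes: Dict[int, List[int]] = {}       # tablet -> compaction finish seqs
    mutations: Dict[int, List[int]] = {}      # tablet -> mutation seqs

    for kind, tablet_id, seq in events:
        if kind in ("COMPACTION_START", "COMPACTION_FINISH"):
            # Check 1: consecutive compaction events should increment by one.
            prev = last_compaction_seq.get(tablet_id)
            if prev is not None and seq != prev + 1:
                problems.append(
                    f"tablet {tablet_id}: {kind} seq {seq} follows compaction seq {prev}"
                )
            last_compaction_seq[tablet_id] = seq
            if kind == "COMPACTION_FINISH":
                finishes.setdefault(tablet_id, []).append(seq)
        elif kind == "MANY_MUTATIONS":
            mutations.setdefault(tablet_id, []).append(seq)

    # Check 2: mutation seqs should fall inside the tentative range.
    for tablet_id, seqs in mutations.items():
        finished = finishes.get(tablet_id)
        if not finished:
            continue  # no compaction finish seen, so the range is undefined
        low, high = min(finished) - 2, max(finished)
        for seq in seqs:
            if not low <= seq <= high:
                problems.append(
                    f"tablet {tablet_id}: mutation seq {seq} outside [{low}, {high}]"
                )
    return problems


if __name__ == "__main__":
    # Event headers from the example WAL above.
    example = [
        ("MANY_MUTATIONS", 5, 1), ("MANY_MUTATIONS", 5, 1),
        ("COMPACTION_START", 5, 2), ("COMPACTION_FINISH", 5, 3),
        ("MANY_MUTATIONS", 5, 3),
        ("COMPACTION_START", 5, 4), ("COMPACTION_FINISH", 5, 5),
        ("MANY_MUTATIONS", 5, 5), ("MANY_MUTATIONS", 5, 5),
    ]
    print(check_wal_events(example))
```

The sketch collects problems instead of raising on the first one, so a caller can decide whether a violation should abort recovery or only be logged. It also skips the range check for tablets with no compaction finish event, because the proposed range is undefined in that case.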
   
   The retry behavior when writing to WALs and its effect on seq numbers, if 
any, needs to be looked into.
