Uma Maheswara Rao G created HDDS-12864:
------------------------------------------

             Summary: All commit semantics in replication writes 
                 Key: HDDS-12864
                 URL: https://issues.apache.org/jira/browse/HDDS-12864
             Project: Apache Ozone
          Issue Type: New Feature
          Components: Ozone Client, Ozone Datanode
            Reporter: Uma Maheswara Rao G
            Assignee: Swaminathan Balachandran


Ozone replication (Raft-based) has majority-commit semantics. While this has 
the advantage of not suffering from a slow node, it also allows the system to 
move forward with only majority replication for a short time, without all 3 
replicas being consistently committed. Catching up a slow replica then depends 
on Raft.

Since Ozone blocks have variable length, it is fine to close the 
container/block when some of the replicas do not acknowledge in time. So, it 
does not need to recopy the content to new nodes; instead, it can just move 
forward with a new pipeline.
With this advantage, we should provide all-commit semantics to make sure all 
replicas are consistently committed up to the length that the client got 
acknowledgments for.
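
A minimal sketch of the decision described above, assuming the writer knows how 
many replicas acknowledged a write before the watch timed out. The class, enum, 
and method names are hypothetical illustrations, not Ozone or Ratis APIs.

```java
/** Hypothetical sketch; not part of Ozone or Ratis. */
public final class AllCommitPolicy {

    /** What the writer does once the wait for acknowledgments ends. */
    public enum Action { KEEP_WRITING, CLOSE_BLOCK_AND_USE_NEW_PIPELINE }

    private AllCommitPolicy() { }

    /**
     * Under all-commit semantics the write is acknowledged only when every
     * replica has acknowledged it. Otherwise the block is closed and the
     * writer continues on a new pipeline instead of recopying data.
     */
    public static Action onWatchResult(int replicasAcked, int replicaCount) {
        if (replicasAcked < 0 || replicasAcked > replicaCount) {
            throw new IllegalArgumentException("invalid ack count: " + replicasAcked);
        }
        return replicasAcked == replicaCount
            ? Action.KEEP_WRITING
            : Action.CLOSE_BLOCK_AND_USE_NEW_PIPELINE;
    }
}
```

With 3 replicas, 3 acknowledgments keep the writer on the current pipeline, 
while 2 acknowledgments (a majority) still cause the block to be closed.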

There are three areas where we rely on majority commits today:

1. The client falls back to majority commit in watchForCommit if the 
all-commit watch fails.
2. The leader DN always waits only for the majority quorum for transactions.
3. The leader only waits for its own applyTransaction completion.
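
To illustrate the difference between the two semantics in the list above, here 
is a small sketch that computes the committed index from per-replica 
acknowledged indexes for a given quorum size. It is a generic Raft-style 
calculation under assumed names (CommitIndexSketch, committedIndex, majority), 
not the Ratis implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the Ratis implementation. */
public final class CommitIndexSketch {

    private CommitIndexSketch() { }

    /** Majority quorum size for n replicas, e.g. 2 of 3. */
    public static int majority(int n) {
        return n / 2 + 1;
    }

    /**
     * Highest log index acknowledged by at least `quorum` replicas.
     * quorum == majority(n) gives majority commit;
     * quorum == n gives all commit (the minimum across replicas).
     */
    public static long committedIndex(long[] ackedIndexes, int quorum) {
        if (quorum < 1 || quorum > ackedIndexes.length) {
            throw new IllegalArgumentException("invalid quorum: " + quorum);
        }
        long[] sorted = ackedIndexes.clone();
        Arrays.sort(sorted); // ascending order
        // The quorum-th largest value has been reached by `quorum` replicas.
        return sorted[sorted.length - quorum];
    }
}
```

For acknowledged indexes {10, 8, 10}, majority commit yields index 10 while all 
commit yields index 8, the point up to which every replica is consistent.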

Streamlining the above scenarios and achieving all-commit can bring all 
replicas into a consistent state at any point in time, on write acks.

As a side note: today, in EC, we already use an ALL COMMIT-like protocol in the 
write path. This avoids the QUASI_CLOSED state altogether, as replicas always 
have at least the length of data that was acknowledged to the clients.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)