xichen01 opened a new pull request, #5018:
URL: https://github.com/apache/ozone/pull/5018

   ## What changes were proposed in this pull request?
   
   Currently, when SCM leadership is transferred manually, the old SCM leader 
may allocate IDs that duplicate those allocated by the new leader. This can 
cause serious problems: if the SCM hands out the same `Block` ID for two 
CreateKey requests and both blocks are allocated to the same Container, the 
blocks overwrite each other's chunk files and data is lost.
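
To make the failure mode concrete, here is a toy model of the overwrite. It is only a sketch: `DuplicateBlockIdSketch` and its methods are hypothetical names, the container is modelled as an in-memory map, and the `<blockId>_chunk_<index>` naming is inferred from the DN log line quoted in this description rather than from the datanode code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Ozone code): a container storing chunk data under a name derived from the block ID. */
class DuplicateBlockIdSketch {
  // Stand-in for the chunk files of a single container.
  private final Map<String, byte[]> chunkFiles = new HashMap<>();

  /** Assumed naming pattern, taken from the DN log line: <blockId>_chunk_<index>. */
  static String chunkName(long blockId, int chunkIndex) {
    return blockId + "_chunk_" + chunkIndex;
  }

  /** Stores the chunk and returns true if an existing chunk was silently replaced. */
  boolean writeChunk(long blockId, int chunkIndex, String data) {
    byte[] previous = chunkFiles.put(
        chunkName(blockId, chunkIndex), data.getBytes(StandardCharsets.UTF_8));
    return previous != null;
  }

  /** Returns the stored chunk content, or null if no such chunk exists. */
  String readChunk(long blockId, int chunkIndex) {
    byte[] bytes = chunkFiles.get(chunkName(blockId, chunkIndex));
    return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    DuplicateBlockIdSketch container = new DuplicateBlockIdSketch();
    long duplicatedId = 109611007626090607L; // same ID handed out twice
    container.writeChunk(duplicatedId, 1, "data of key A");
    boolean overwritten = container.writeChunk(duplicatedId, 1, "data of key B");
    System.out.println("overwritten=" + overwritten
        + ", content=" + container.readChunk(duplicatedId, 1));
  }
}
```

In this model the second write replaces the first because both keys resolve to the same chunk name, which is the data-loss scenario described above.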
   
   ### Reproduce
   - A simple way to reproduce this issue:
   generate a constant, fast write load and transfer SCM leadership with the 
transfer command. The DN then logs a message like the following:
   ```
   2023-07-03 16:17:31,278 [ChunkWriter-227-0] WARN 
org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils: Duplicate write 
chunk request. Chunk overwrite without explicit request. 
ChunkInfo{chunkName='109611007626090607_chunk_1, offset=0, len=4096}
   ```
   - Command used to generate the constant write load:
   ```
   ozone freon ommg --operation CREATE_KEY -n 100000 -t 20 --runtime 3000 
--timebase --size=4096
   ```
   
   ### Root Cause
   The root cause is that `batch.lastId` is updated before 
`stateManager.allocateBatch` has executed successfully:
   
   
https://github.com/apache/ozone/blob/dd25740d245b911efd28c29c3b314bc584dab0d0/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SequenceIdGenerator.java#L124
   
   As a result, subsequent requests from other threads can get an 
illegitimate ID, that is, one from a batch that was never successfully 
allocated:
   
https://github.com/apache/ozone/blob/dd25740d245b911efd28c29c3b314bc584dab0d0/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SequenceIdGenerator.java#L114-L116
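
The ordering problem can be sketched with a simplified model. This is not the `SequenceIdGenerator` code and not necessarily how the patch is written: `SequenceIdSketch`, `BatchAllocator`, `reserve`, `secondCallAfterFailure`, and the batch size of 1000 are assumptions made for illustration. The allocator stands in for `stateManager.allocateBatch` and always fails here, as it might on an SCM that has just lost leadership.

```java
/** Simplified, hypothetical model of a batched ID generator (not the Ozone class). */
class SequenceIdSketch {
  /** Stand-in for the replicated allocateBatch call; may throw if this node is no longer leader. */
  interface BatchAllocator {
    boolean allocateBatch(long expectedLastId, long newLastId);
  }

  static final long BATCH_SIZE = 1000; // assumed value, for illustration only

  private final BatchAllocator allocator;
  private final boolean updateBeforeCommit;
  private long nextId = 1;
  private long lastId = 0;

  SequenceIdSketch(BatchAllocator allocator, boolean updateBeforeCommit) {
    this.allocator = allocator;
    this.updateBeforeCommit = updateBeforeCommit;
  }

  synchronized long getNextId() {
    if (nextId <= lastId) {
      return nextId++; // serve from the batch this node believes it has reserved
    }
    long prevLastId = lastId;
    long newLastId = Math.addExact(prevLastId, BATCH_SIZE);
    if (updateBeforeCommit) {
      // Problematic ordering: local state claims the batch before it is committed.
      lastId = newLastId;
      nextId = prevLastId + 1;
      reserve(prevLastId, newLastId);
    } else {
      // Safer ordering: touch local state only after the reservation succeeded.
      reserve(prevLastId, newLastId);
      lastId = newLastId;
      nextId = prevLastId + 1;
    }
    return nextId++;
  }

  private void reserve(long prevLastId, long newLastId) {
    if (!allocator.allocateBatch(prevLastId, newLastId)) {
      throw new IllegalStateException("batch was not reserved");
    }
  }

  /** Calls getNextId twice against an allocator that always fails; reports the second outcome. */
  static String secondCallAfterFailure(boolean updateBeforeCommit) {
    SequenceIdSketch gen = new SequenceIdSketch(
        (expected, proposed) -> { throw new IllegalStateException("not leader"); },
        updateBeforeCommit);
    try {
      gen.getNextId();
    } catch (IllegalStateException e) {
      // The first call fails in both variants.
    }
    try {
      return "id=" + gen.getNextId();
    } catch (IllegalStateException e) {
      return "rejected";
    }
  }

  public static void main(String[] args) {
    System.out.println("update before commit: " + secondCallAfterFailure(true));
    System.out.println("update after commit:  " + secondCallAfterFailure(false));
  }
}
```

With the early update, the second caller is served an ID from a batch that was never reserved. With the update deferred until after a successful reservation, the second caller fails the same way the first one did and no unreserved ID leaves the generator.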
   
   ## What is the link to the Apache JIRA?
   https://issues.apache.org/jira/browse/HDDS-8139
   
   ## How was this patch tested?
   
   unit test
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

