runzhiwang opened a new pull request #709: HDDS-3244. Improve write efficiency 
by opening RocksDB only once
URL: https://github.com/apache/hadoop-ozone/pull/709
 
 
   ## What changes were proposed in this pull request?
   
   What's the problem?
   1. The problem occurs when a datanode creates a container. To show where 
the time goes in the Jaeger UI, I split `HddsDispatcher.WriteChunk` into 
`HddsDispatcher.WriteData` and `HddsDispatcher.CommitData`, as the code shows.
   
![image](https://user-images.githubusercontent.com/51938049/77373666-cd51ac00-6da3-11ea-9c77-8d6864f05aac.png)
   
   
   2. Each time a datanode creates a container, a new RocksDB instance is 
[created](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L76)
 in `HddsDispatcher.WriteData` and then 
[closed](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L83).
 It is not 
[opened](https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/utils/ContainerCache.java#L123)
 again until `HddsDispatcher.PutBlock`, so RocksDB is opened twice on each 
datanode, even though it is not used before `HddsDispatcher.PutBlock`.
    
   3. In addition, as the image shows, when the leader datanode opens RocksDB 
in `HddsDispatcher.WriteData`, the 2 follower datanodes cannot open RocksDB 
until the leader has finished. So the whole write spends 
3 * cost(RocksDB.open) = 600ms opening RocksDB.
   
![image](https://user-images.githubusercontent.com/51938049/77320657-f0507180-6d4b-11ea-8188-acb26785f608.png)
   
   4. When uploading a 3KB file five times, the average cost is 912ms.
   
![image](https://user-images.githubusercontent.com/51938049/77320748-170ea800-6d4c-11ea-8131-81a1180a349a.png)
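
The arithmetic in item 3 can be written down as a tiny cost model. This is only an illustrative sketch, not Ozone code: the class and method names are hypothetical, and the 200ms per open is the value implied by the 600ms total above. Opens that wait on each other cost the sum of the individual opens, while opens that run concurrently cost only as much as the slowest one.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative cost model only; class and method names are hypothetical.
final class OpenCostModel {

    // Opens that must wait for each other: total time is the sum.
    static long serialMillis(List<Long> openCosts) {
        long total = 0;
        for (long cost : openCosts) {
            total += cost;
        }
        return total;
    }

    // Opens that run concurrently: total time is the slowest single open.
    static long parallelMillis(List<Long> openCosts) {
        long slowest = 0;
        for (long cost : openCosts) {
            slowest = Math.max(slowest, cost);
        }
        return slowest;
    }

    public static void main(String[] args) {
        // Assumed 200ms per open on the leader and on each of the 2 followers.
        List<Long> costs = Arrays.asList(200L, 200L, 200L);
        System.out.println("serial   = " + serialMillis(costs) + "ms");   // 600ms
        System.out.println("parallel = " + parallelMillis(costs) + "ms"); // 200ms
    }
}
```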
   
   
   How to fix it?
   1. Open RocksDB in `HddsDispatcher.CommitData` rather than in 
`HddsDispatcher.WriteData`, because the leader datanode and the 2 follower 
datanodes can open RocksDB in parallel in `HddsDispatcher.CommitData`.
   2. Put the RocksDB handle into the cache after opening it in 
`HddsDispatcher.CommitData`, to avoid opening it again in 
`HddsDispatcher.PutBlock`.
   3. The whole write then spends only 1 * cost(RocksDB.open) = 200ms opening 
RocksDB.
   
![image](https://user-images.githubusercontent.com/51938049/77321385-17f40980-6d4d-11ea-925c-d3960342ed06.png)
   
   4. When uploading a 3KB file five times, the average cost is 516ms, an 
improvement of about 43% over 912ms.
   
![image](https://user-images.githubusercontent.com/51938049/77321360-0dd20b00-6d4d-11ea-84b3-efec6734db74.png)
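
The "open once, then reuse from the cache" idea in item 2 can be sketched as follows. This is a minimal stand-alone illustration under stated assumptions, not the actual `ContainerCache` implementation: `DbHandleCacheSketch`, `DbHandle`, `getOrOpen`, and the example path are all hypothetical, and the real RocksDB open call is replaced by a counter so the example runs without RocksDB.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a handle cache; not Ozone's ContainerCache API.
final class DbHandleCacheSketch {

    // Stand-in for an open RocksDB handle (hypothetical type).
    static final class DbHandle {
        final String path;

        DbHandle(String path) {
            this.path = path;
        }
    }

    private final ConcurrentHashMap<String, DbHandle> cache = new ConcurrentHashMap<>();
    private final AtomicInteger openCount = new AtomicInteger();

    // Returns the cached handle, opening the DB only on the first request.
    DbHandle getOrOpen(String containerDbPath) {
        return cache.computeIfAbsent(containerDbPath, path -> {
            openCount.incrementAndGet(); // a real implementation would open RocksDB here
            return new DbHandle(path);
        });
    }

    int openCount() {
        return openCount.get();
    }

    public static void main(String[] args) {
        DbHandleCacheSketch cache = new DbHandleCacheSketch();
        String path = "/tmp/container-1/db"; // hypothetical path

        DbHandle atCommit = cache.getOrOpen(path);   // commit step: first open
        DbHandle atPutBlock = cache.getOrOpen(path); // put-block step: cache hit

        System.out.println("same handle: " + (atCommit == atPutBlock));
        System.out.println("opens: " + cache.openCount());
    }
}
```

Using `computeIfAbsent` keeps the open atomic per key, so two threads asking for the same container at once would still trigger a single open in this sketch.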
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/projects/HDDS/issues/HDDS-3244
   
   ## How was this patch tested?
   
   I will update the unit tests so that CI passes.
   
