runzhiwang opened a new pull request #709: HDDS-3244. Improve write efficiency 
by opening RocksDB only once
URL: https://github.com/apache/hadoop-ozone/pull/709
 
 
   ## What changes were proposed in this pull request?
   
   What's the problem?
   1. When a datanode creates a container, a new RocksDB instance is 
[created|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L76]
 and then immediately 
[closed|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java#L83].
 The RocksDB is not 
[opened|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/utils/ContainerCache.java#L123]
 again until PutBlock, so it is opened twice. 
[RocksDB.open|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/RocksDBStore.java#L68]
 costs about 150ms-200ms, so opening it twice is wasteful.
   2. Besides, as the image shows, when the leader datanode opens RocksDB in 
WriteData, the followers cannot open RocksDB until the leader finishes. So 
the whole `ObjectPoint.Put` stage spends 3 * cost(RocksDB.open) = 600ms 
opening RocksDB.
   
![image](https://user-images.githubusercontent.com/51938049/77320657-f0507180-6d4b-11ea-8188-acb26785f608.png)
   
   3. When uploading a 3KB file five times, the average cost is 912ms.
   
![image](https://user-images.githubusercontent.com/51938049/77320707-04946e80-6d4c-11ea-96a4-048f13b65d5b.png)
   
![image](https://user-images.githubusercontent.com/51938049/77320748-170ea800-6d4c-11ea-8131-81a1180a349a.png)
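   The arithmetic behind point 2 can be sketched as a tiny cost model. This is only an illustration, not code from the patch: the class `OpenCostModel` and its methods are hypothetical, and 200ms is the upper end of the `RocksDB.open` cost quoted above. Opens that have to wait for one another add up, whereas opens that could overlap would cost only as much as the slowest one.

```java
import java.util.Arrays;

/** Hypothetical cost model; not part of Ozone. */
public class OpenCostModel {

  /** Opens that must wait for one another: the latencies add up. */
  public static long serialCostMs(long... openCostsMs) {
    return Arrays.stream(openCostsMs).sum();
  }

  /** Opens that run concurrently: the slowest one determines the cost. */
  public static long parallelCostMs(long... openCostsMs) {
    return Arrays.stream(openCostsMs).max().orElse(0L);
  }

  public static void main(String[] args) {
    long open = 200; // assumed cost of one RocksDB.open, in ms
    System.out.println("serialized opens: " + serialCostMs(open, open, open) + "ms");
    System.out.println("overlapping opens: " + parallelCostMs(open, open, open) + "ms");
  }
}
```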
   
   
   How to fix it?
   1. Open RocksDB in CommitData rather than WriteData, because the leader 
datanode and the 2 follower datanodes can open RocksDB in parallel in CommitData.
   2. Put the RocksDB handle into the cache after opening it in CommitData, to 
avoid opening it again in PutBlock.
   3. The whole `ObjectPoint.Put` stage then spends only 1 * cost(RocksDB.open) = 
200ms opening RocksDB.
   
![image](https://user-images.githubusercontent.com/51938049/77321385-17f40980-6d4d-11ea-925c-d3960342ed06.png)
   
   4. When uploading a 3KB file five times, the average cost is 516ms, an 
improvement of about 44%.
   
![image](https://user-images.githubusercontent.com/51938049/77321360-0dd20b00-6d4d-11ea-84b3-efec6734db74.png)
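   The idea of step 2 (open once, then serve later requests from the cache) can be illustrated with a minimal sketch. It is not the actual `ContainerCache` change in this patch: the class `HandleCache`, its methods, and the path used in the usage note are hypothetical, and a plain `Object` stands in for a real RocksDB handle so the example stays self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache of open DB handles, keyed by the container DB path. */
public class HandleCache<H> {
  private final ConcurrentHashMap<String, H> handles = new ConcurrentHashMap<>();
  private final Function<String, H> opener;          // the expensive open call
  private final AtomicInteger openCount = new AtomicInteger();

  public HandleCache(Function<String, H> opener) {
    this.opener = opener;
  }

  /**
   * Returns the cached handle for dbPath, opening it only if it is absent.
   * computeIfAbsent runs the opener at most once per missing key, so a
   * handle opened early is reused by every later caller.
   */
  public H getOrOpen(String dbPath) {
    return handles.computeIfAbsent(dbPath, path -> {
      openCount.incrementAndGet();
      return opener.apply(path);
    });
  }

  /** Number of times the opener actually ran. */
  public int openCount() {
    return openCount.get();
  }
}
```

   With such a cache, a first call like `cache.getOrOpen("/hypothetical/container1/db")` pays the open cost, and a second call with the same path returns the same handle without opening again. A real cache would also need eviction and reference counting before a handle is closed, which this sketch leaves out.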
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/projects/HDDS/issues/HDDS-3244
   
   ## How was this patch tested?
   
   I will update the unit tests so that CI passes.
   
