[GitHub] [hadoop-ozone] captainzmc opened a new pull request #716: HDDS-3155. Improved ozone client flush implementation to make it faster.

GitBox Sat, 18 Apr 2020 05:07:14 -0700

captainzmc opened a new pull request #716: HDDS-3155. Improved ozone client 
flush implementation to make it faster.
URL: https://github.com/apache/hadoop-ozone/pull/716
 
 
   ## What changes were proposed in this pull request?
   
   When we run an MR job (with 1000 maps) on OzoneFileSystem, the appmaster 
pauses for more than 40 minutes after map and reduce have reached 100%.
   `20/03/05 14:43:33 INFO mapreduce.Job: map 100% reduce 100% `
   `20/03/05 15:29:52 INFO mapreduce.Job: Job job_1583385253878_0002 completed successfully`
   It turns out that the appmaster writes the task events to the log one at a 
time and calls flush after each one. This operation is very time-consuming in 
Ozone.
   
   HDFS currently has two flush interfaces, flush() and hflush().
   flush(): moves the data from the client buffer into the client packet 
(dfs.write.packet.size, default 64k). If the packet is not full, it is not 
sent to the datanode.
   hflush(): each invocation sends the data in the buffer to the datanode.
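
To illustrate why the difference matters for many small writes, here is a minimal sketch of the two behaviours as described above. It is a toy model, not HDFS code: the class `FlushVsHflushSketch`, its fields, and the 100-byte event size are all hypothetical, and a "send" is just a counter increment.

```java
import java.io.ByteArrayOutputStream;

/** Toy model of flush() vs hflush() as described above; not HDFS code. */
public class FlushVsHflushSketch {
    private final int packetSize; // stands in for the 64k client packet
    private final ByteArrayOutputStream packet = new ByteArrayOutputStream();
    private int sends = 0;        // simulated sends to the datanode

    public FlushVsHflushSketch(int packetSize) {
        this.packetSize = packetSize;
    }

    public void write(byte[] data) {
        packet.write(data, 0, data.length);
    }

    /** flush(): data is sent only once the packet is full. */
    public void flush() {
        if (packet.size() >= packetSize) {
            send();
        }
    }

    /** hflush(): every call sends whatever is buffered. */
    public void hflush() {
        if (packet.size() > 0) {
            send();
        }
    }

    private void send() {
        sends++;
        packet.reset();
    }

    public int sends() {
        return sends;
    }

    /** Writes `events` small records, calling flush or hflush after each one. */
    public static int run(int events, boolean useHflush) {
        FlushVsHflushSketch out = new FlushVsHflushSketch(64 * 1024);
        byte[] event = new byte[100]; // hypothetical event size
        for (int i = 0; i < events; i++) {
            out.write(event);
            if (useHflush) {
                out.hflush();
            } else {
                out.flush();
            }
        }
        return out.sends();
    }

    public static void main(String[] args) {
        System.out.println("sends with flush():  " + run(1000, false));
        System.out.println("sends with hflush(): " + run(1000, true));
    }
}
```

In this model, one send per hflush() call is what makes per-event flushing expensive when each send is a round trip to a datanode.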
   
   Currently, Ozone's flush behaves more like HDFS's hflush. This PR adds a 
flush implementation similar to HDFS's flush, controlled by 
ozone.client.stream.buffer.flush.delay (disabled by default). When it is 
enabled, a call to flush() checks whether the data in the current buffer is 
greater than ozone.client.stream.buffer.size. If it is, the data is sent to the 
datanode; otherwise it is not sent.
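
The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the class `DelayedFlushSketch` and its members are hypothetical, the buffer size used in `run` is an arbitrary small value, treating a buffer that has exactly reached the configured size as full is a choice of this sketch, and sending the remaining data on close() is assumed so that nothing is lost.

```java
import java.io.ByteArrayOutputStream;

/** Toy model of the flush-delay switch; not the Ozone client implementation. */
public class DelayedFlushSketch {
    private final boolean flushDelay; // models ozone.client.stream.buffer.flush.delay
    private final int bufferSize;     // models ozone.client.stream.buffer.size
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int sends = 0;            // simulated sends to the datanode

    public DelayedFlushSketch(boolean flushDelay, int bufferSize) {
        this.flushDelay = flushDelay;
        this.bufferSize = bufferSize;
    }

    public void write(byte[] data) {
        buffer.write(data, 0, data.length);
    }

    /**
     * With the delay disabled, every flush sends the buffered data.
     * With it enabled, data is sent only once the buffer has filled.
     */
    public void flush() {
        if (buffer.size() == 0) {
            return;
        }
        if (!flushDelay || buffer.size() >= bufferSize) {
            send();
        }
    }

    /** Assumption of this sketch: close() always sends what is left. */
    public void close() {
        if (buffer.size() > 0) {
            send();
        }
    }

    private void send() {
        sends++;
        buffer.reset();
    }

    public int sends() {
        return sends;
    }

    /** Writes 1000 records of 200 bytes, flushing after each, then closes. */
    public static int run(boolean flushDelay) {
        DelayedFlushSketch out = new DelayedFlushSketch(flushDelay, 4096);
        byte[] event = new byte[200]; // hypothetical event size
        for (int i = 0; i < 1000; i++) {
            out.write(event);
            out.flush();
        }
        out.close();
        return out.sends();
    }

    public static void main(String[] args) {
        System.out.println("sends, delay disabled: " + run(false));
        System.out.println("sends, delay enabled:  " + run(true));
    }
}
```

The trade-off is that with the delay enabled, data written since the last send stays on the client until the buffer fills or the stream is closed, which is the same weaker guarantee that HDFS's flush() gives compared with hflush().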
   
   In testing, flush performance improved significantly. The job is no longer 
blocked; it exits about 1 second after map and reduce finish.
   `20/03/25 11:04:04 INFO mapreduce.Job:  map 100% reduce 100%`
   `20/03/25 11:04:05 INFO mapreduce.Job: Job job_1585104739905_0002 completed successfully`
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3155
   
   ## How was this patch tested?
   
   Run YARN on Ozone and run the TestDFSIO job below with 1000 maps, then 
check how long the job takes to exit after map and reduce reach 100%.
   `hadoop jar /path/of/hadoop-mapreduce-client-jobclient-2.8.5-tests.jar TestDFSIO -write -nrFiles 1000 -fileSize 1KB -resFile /tmp/dfsio-write.out`
   
   Add the following configuration to ozone-site.xml and repeat the above 
command to compare the exit time.
   `<property>`
    `   <name>ozone.client.stream.buffer.flush.delay</name>`
    `   <value>true</value>`
    `</property>`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

