[ 
https://issues.apache.org/jira/browse/HADOOP-18989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793989#comment-17793989
 ] 

ASF GitHub Bot commented on HADOOP-18989:
-----------------------------------------

hfutatzhanghb commented on code in PR #6294:
URL: https://github.com/apache/hadoop/pull/6294#discussion_r1418253986


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java:
##########
@@ -308,25 +353,49 @@ private void createControlFile(FileSystem fs,
 
     fs.delete(controlDir, true);
 
-    for(int i=0; i < nrFiles; i++) {
+    List<Future<String>> futureList = new ArrayList<>();
+    for (int i = 0; i < nrFiles; i++) {
       String name = getFileName(i);
       Path controlFile = new Path(controlDir, "in_file_" + name);
       SequenceFile.Writer writer = null;
       try {
         writer = SequenceFile.createWriter(fs, config, controlFile,
                                            Text.class, LongWritable.class,
                                            CompressionType.NONE);
-        writer.append(new Text(name), new LongWritable(nrBytes));
+        Runnable controlFileCreateTask = new ControlFileCreateTask(writer, name, nrBytes);
+        Future<String> createFuture = completionService.submit(controlFileCreateTask, "success");
+        futureList.add(createFuture);

Review Comment:
   Sorry, sir, that was leftover test code that I forgot to delete. I have removed it now.





> Use thread pool to improve the speed of creating control files in TestDFSIO
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-18989
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18989
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: benchmarks, common
>    Affects Versions: 3.3.6
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>
> When we use the TestDFSIO tool to test the throughput of HDFS clusters, we found 
> that the stage that creates the control files is very slow. 
> After referring to the source code, we found that the method createControlFile 
> creates the control files serially. This can be improved by using a thread pool.
> After this optimization, the TestDFSIO tool runs faster than before.
>  
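
As a rough illustration of the idea described in the issue, the sketch below creates a batch of control files through a fixed-size thread pool and an ExecutorCompletionService. It is not the code from PR #6294: the class name ControlFileSketch, the method createControlFiles, the nrThreads parameter, and the "test_io_" name prefix are assumptions made for this example, and plain local files written with java.nio stand in for the SequenceFile control files so that the example runs without a Hadoop dependency.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: local text files stand in for the SequenceFile
// control files that TestDFSIO writes. All names here are illustrative.
final class ControlFileSketch {

  // Creates nrFiles control files under controlDir using nrThreads workers.
  // Returns the logical file names in the order the tasks completed.
  static List<String> createControlFiles(Path controlDir, int nrFiles,
      long nrBytes, int nrThreads) throws IOException, InterruptedException {
    Files.createDirectories(controlDir);
    ExecutorService pool = Executors.newFixedThreadPool(nrThreads);
    CompletionService<String> completionService =
        new ExecutorCompletionService<>(pool);
    try {
      // Submit one task per control file instead of writing them serially.
      for (int i = 0; i < nrFiles; i++) {
        final String name = "test_io_" + i; // assumed naming scheme
        completionService.submit(() -> {
          Path controlFile = controlDir.resolve("in_file_" + name);
          String record = name + "\t" + nrBytes + "\n";
          Files.write(controlFile, record.getBytes(StandardCharsets.UTF_8));
          return name;
        });
      }
      // Wait for every task and surface the first failure, if any.
      List<String> created = new ArrayList<>();
      for (int i = 0; i < nrFiles; i++) {
        try {
          created.add(completionService.take().get());
        } catch (ExecutionException e) {
          throw new IOException("Control file creation failed", e.getCause());
        }
      }
      return created;
    } finally {
      pool.shutdown();
    }
  }
}
```

A caller might invoke ControlFileSketch.createControlFiles(dir, 8, 1024L, 4) on a scratch directory. Collecting every future before returning matters in this design: a failed write then aborts the run instead of leaving a control file silently missing.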



