[ https://issues.apache.org/jira/browse/HADOOP-18989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793989#comment-17793989 ]
ASF GitHub Bot commented on HADOOP-18989:
-----------------------------------------

hfutatzhanghb commented on code in PR #6294:
URL: https://github.com/apache/hadoop/pull/6294#discussion_r1418253986


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java:
##########
@@ -308,25 +353,49 @@ private void createControlFile(FileSystem fs,

     fs.delete(controlDir, true);

-    for(int i=0; i < nrFiles; i++) {
+    List<Future<String>> futureList = new ArrayList<>();
+    for (int i = 0; i < nrFiles; i++) {
       String name = getFileName(i);
       Path controlFile = new Path(controlDir, "in_file_" + name);
       SequenceFile.Writer writer = null;
       try {
         writer = SequenceFile.createWriter(fs, config, controlFile,
                                            Text.class, LongWritable.class,
                                            CompressionType.NONE);
-        writer.append(new Text(name), new LongWritable(nrBytes));
+        Runnable controlFileCreateTask = new ControlFileCreateTask(writer, name, nrBytes);
+        Future<String> createFuture = completionService.submit(controlFileCreateTask, "success");
+        futureList.add(createFuture);

Review Comment:
   Sorry, sir, that was test code which I forgot to delete. I have deleted it now.


> Use thread pool to improve the speed of creating control files in TestDFSIO
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-18989
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18989
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: benchmarks, common
>    Affects Versions: 3.3.6
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>
> When we used the TestDFSIO tool to test the throughput of HDFS clusters, we found
> that the control file creation stage was very slow.
> After referring to the source code, we found that the method createControlFile
> creates the control files serially. This can be improved by using a thread pool.
> After this optimization, the TestDFSIO tool runs faster than before.
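
To illustrate the idea in the issue description (replacing a serial creation loop with tasks submitted to a thread pool), here is a minimal, self-contained sketch. It is not the code from PR #6294: it writes plain local files with java.nio instead of Hadoop SequenceFiles, and the class name ParallelControlFiles, the method createControlFiles, the nrThreads parameter, and the "test_io_" name prefix are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the control file creation step; uses local files, not HDFS.
public class ParallelControlFiles {

  /** Creates nrFiles small control files under controlDir using a fixed-size thread pool. */
  static List<Path> createControlFiles(Path controlDir, int nrFiles, long nrBytes, int nrThreads)
      throws IOException, InterruptedException {
    Files.createDirectories(controlDir);
    ExecutorService pool = Executors.newFixedThreadPool(nrThreads);
    try {
      // Submit one task per control file instead of writing them one after another.
      List<Future<Path>> futures = new ArrayList<>();
      for (int i = 0; i < nrFiles; i++) {
        final String name = "test_io_" + i; // assumed naming scheme, for illustration only
        futures.add(pool.submit(() -> {
          Path controlFile = controlDir.resolve("in_file_" + name);
          // Each file records the target file name and the number of bytes to process.
          Files.write(controlFile, (name + "\t" + nrBytes + "\n").getBytes(StandardCharsets.UTF_8));
          return controlFile;
        }));
      }
      // Wait for every task so that a failure in any of them is reported to the caller.
      List<Path> created = new ArrayList<>();
      for (Future<Path> future : futures) {
        try {
          created.add(future.get());
        } catch (ExecutionException e) {
          throw new IOException("Control file creation failed", e.getCause());
        }
      }
      return created;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("io_control");
    List<Path> files = createControlFiles(dir, 8, 1024L * 1024L, 4);
    System.out.println("created " + files.size() + " control files");
  }
}
```

Each task in this sketch opens, writes, and closes its own file, so no writer is shared between threads. Collecting the futures and calling get() on each one keeps the behaviour of the serial loop in one respect: the method returns only after every file exists, and it fails if any single creation fails.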