[ https://issues.apache.org/jira/browse/HADOOP-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-910:
------------------------------

    Attachment: HADOOP-910-review.patch

This patch performs somewhat worse than the trunk when the ram-filesystem merge and the local-filesystem merge co-exist. I suspect that is due to interference between the two merge threads, which is evident from the logs. Here is an analysis of the logs:

||local fs merge time (min)||num interfering ramfs merge threads||
|4.24155|26|
|0.0887667|6|
|0.201867|8|
|0.311233|8|
|0.0618333|6|
|3.12602|48|
|4.00395|48|
|0.0716333|6|
|0.02535|8|
|0.0760667|6|
|4.6852|38|
|6.95463|58|
|1.07183|34|
|3.35935|60|
|1.46228|6|

----

Here are the results of running the benchmarks with {{fs.inmemory.size.mb=0}} (i.e. no ramfs) on 100 nodes:

|| ||total runtime||avg shuffle time||
|patched|1h 25m 42s|26m 7s|
|trunk|1h 58m 59s|30m 21s|

----

Comments?

> Reduces can do merges for the on-disk map output files in parallel with their copying
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-910
>                 URL: https://issues.apache.org/jira/browse/HADOOP-910
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-910-review.patch
>
> Proposal to extend the parallel in-memory merge/copying being done as part of HADOOP-830 to the on-disk files.
> Today, the Reduces dump the map output files to disk, and the final merge happens only after all the map outputs have been collected. It might make sense to parallelize this part: whenever a Reduce has collected io.sort.factor segments on disk, it initiates a merge of those and creates one big segment. If the rate of copying is faster than the merge, we can probably have multiple threads doing parallel merges of independent sets of io.sort.factor segments. If the rate of copying is not as fast as the merge, we stand to gain a lot: at the end of copying all the map outputs, we will be left with a small number of segments for the final merge, which hopefully will feed the reduce directly (via the RawKeyValueIterator) without having to hit the disk to write additional output segments.
> If the disk bandwidth is higher than the network bandwidth, we have a good story, I guess, for doing such a thing.
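----

To make the proposal concrete, here is a minimal sketch of the merge-trigger logic described above. The {{DiskSegment}} type and {{mergeSegments()}} helper are hypothetical names for illustration only; the actual patch would hook into the reduce-side copier and the sequence-file merge machinery, which are omitted here.

{code:java}
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch only: whenever io.sort.factor map-output segments have
 * accumulated on disk during the copy phase, a background thread merges that
 * batch into one larger segment, which re-enters the pool. Several such
 * merges can run concurrently if copying outpaces merging.
 */
public class ParallelDiskMergeSketch {

  /** Hypothetical stand-in for one on-disk map-output file. */
  static class DiskSegment {
    final String path;
    DiskSegment(String path) { this.path = path; }
  }

  private final int ioSortFactor; // value of io.sort.factor
  private final List<DiskSegment> onDisk = new ArrayList<DiskSegment>();

  public ParallelDiskMergeSketch(int ioSortFactor) {
    this.ioSortFactor = ioSortFactor;
  }

  /** Called by the copier each time a map output is written to disk. */
  public synchronized void segmentCopied(DiskSegment s) {
    onDisk.add(s);
    maybeStartMerge();
  }

  /** A merged segment goes back into the pool, shrinking the final merge. */
  private synchronized void segmentMerged(DiskSegment merged) {
    onDisk.add(merged);
  }

  /** Kick off a background merge of an independent set of segments. */
  private synchronized void maybeStartMerge() {
    if (onDisk.size() < ioSortFactor) {
      return;
    }
    // Hand an independent batch of io.sort.factor segments to its own
    // thread, so multiple merges can proceed in parallel with copying.
    final List<DiskSegment> batch =
        new ArrayList<DiskSegment>(onDisk.subList(0, ioSortFactor));
    onDisk.subList(0, ioSortFactor).clear();
    new Thread(new Runnable() {
      public void run() {
        segmentMerged(mergeSegments(batch)); // disk-to-disk merge off-thread
      }
    }).start();
  }

  /** Placeholder for the real sequence-file merge of the batch. */
  private DiskSegment mergeSegments(List<DiskSegment> batch) {
    return new DiskSegment("merged-" + batch.size() + "-segments");
  }
}
{code}

Note that {{segmentMerged()}} feeds the merged output back into the same pool, so at the end of the copy phase the final merge sees fewer, larger segments, which is the intent described in the proposal.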