[ https://issues.apache.org/jira/browse/SPARK-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen updated SPARK-5581:
------------------------------
    Assignee: Josh Rosen

> When writing sorted map output file, avoid open / close between each partition
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-5581
>                 URL: https://issues.apache.org/jira/browse/SPARK-5581
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 1.3.0
>            Reporter: Sandy Ryza
>            Assignee: Josh Rosen
>             Fix For: 2.1.0
>
>
> {code}
>       // Bypassing merge-sort; get an iterator by partition and just write everything directly.
>       for ((id, elements) <- this.partitionedIterator) {
>         if (elements.hasNext) {
>           val writer = blockManager.getDiskWriter(
>             blockId, outputFile, ser, fileBufferSize,
>             context.taskMetrics.shuffleWriteMetrics.get)
>           for (elem <- elements) {
>             writer.write(elem)
>           }
>           writer.commitAndClose()
>           val segment = writer.fileSegment()
>           lengths(id) = segment.length
>         }
>       }
> {code}
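
The quoted loop creates and closes a disk writer for every non-empty partition. The issue title asks for the opposite: keep the output open across partitions. A minimal sketch of that idea follows, using only plain java.io rather than Spark's BlockManager or disk writer API. The object name, the method signature, and the newline-delimited String records are all assumptions made for illustration; this is not the change that resolved the issue.

```scala
import java.io.{BufferedOutputStream, File, FileOutputStream}
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of Spark: writes all partitions through one stream.
object SingleStreamPartitionWriter {

  /** Returns the number of bytes written for each partition id. */
  def writePartitions(
      outputFile: File,
      numPartitions: Int,
      partitions: Iterator[(Int, Iterator[String])]): Array[Long] = {
    val lengths = new Array[Long](numPartitions)
    // Open the file once for the whole task instead of once per partition.
    val out = new BufferedOutputStream(new FileOutputStream(outputFile))
    try {
      for ((id, elements) <- partitions) {
        // Count bytes per partition ourselves so no close/reopen is needed
        // to learn the segment length.
        var written = 0L
        for (elem <- elements) {
          val bytes = (elem + "\n").getBytes(StandardCharsets.UTF_8)
          out.write(bytes)
          written += bytes.length
        }
        lengths(id) = written
      }
    } finally {
      out.close() // single close for the whole file
    }
    lengths
  }

  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("shuffle-sketch", ".data")
    file.deleteOnExit()
    val partitions: Iterator[(Int, Iterator[String])] = Iterator(
      0 -> Iterator("a", "bb"),
      1 -> Iterator[String](),
      2 -> Iterator("ccc"))
    val lengths = writePartitions(file, 3, partitions)
    println(lengths.mkString(",")) // prints 5,0,4
  }
}
```

Because the partitions are written back to back, each partition's starting offset in the file is the running sum of the preceding lengths, so the per-partition segments can still be located without separate writers.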