[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207826#comment-14207826 ]
zzc commented on SPARK-2468:
----------------------------

Hi Aaron Davidson, I am sure that I ran my last test with patch #3155 applied.

Configuration:
spark.shuffle.consolidateFiles true
spark.storage.memoryFraction 0.2
spark.shuffle.memoryFraction 0.2
spark.shuffle.file.buffer.kb 100
spark.reducer.maxMbInFlight 48
spark.shuffle.blockTransferService netty
spark.shuffle.io.mode nio
spark.shuffle.io.connectionTimeout 120
spark.shuffle.manager SORT
spark.shuffle.io.preferDirectBufs true
spark.shuffle.io.maxRetries 3
spark.shuffle.io.retryWaitMs 5000
spark.shuffle.io.maxUsableCores 3

Command:
--num-executors 17 --executor-memory 12g --executor-cores 3

With spark.shuffle.io.preferDirectBufs=false, it works fine.

> Netty-based block server / client module
> ----------------------------------------
>
>                 Key: SPARK-2468
>                 URL: https://issues.apache.org/jira/browse/SPARK-2468
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>            Priority: Critical
>             Fix For: 1.2.0
>
>
> Right now shuffle send goes through the block manager. This is inefficient
> because it requires loading a block from disk into a kernel buffer, then into
> a user-space buffer, and then back into a kernel send buffer before it
> reaches the NIC. It makes multiple copies of the data and context-switches
> between kernel and user space. It also creates unnecessary buffers in the
> JVM that increase GC pressure.
> Instead, we should use FileChannel.transferTo, which handles this in kernel
> space with zero copy. See
> http://www.ibm.com/developerworks/library/j-zerocopy/
> One potential solution is to use Netty. Spark already has a Netty-based
> network module implemented (org.apache.spark.network.netty). However, it
> lacks some functionality and is turned off by default.
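The zero-copy transfer described in the issue can be sketched with a plain FileChannel.transferTo call. This is a minimal standalone illustration of the kernel-space copy path, not Spark's actual shuffle transfer code; the file names, sizes, and class name here are made up for the example. A real network sender would transfer to a SocketChannel instead of another FileChannel, but the API shape is the same:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    public static void main(String[] args) throws IOException {
        // Create a sample "shuffle block" file to send (1 MiB of zeros).
        Path src = Files.createTempFile("block", ".dat");
        Files.write(src, new byte[1024 * 1024]);

        // Destination channel; in the shuffle case this would be a SocketChannel.
        Path dst = Files.createTempFile("received", ".dat");

        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
            long pos = 0;
            long size = in.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            // The copy happens in kernel space: no user-space buffer, no extra
            // JVM heap allocation, fewer context switches.
            while (pos < size) {
                pos += in.transferTo(pos, size - pos, out);
            }
        }
        System.out.println(Files.size(dst)); // prints 1048576
    }
}
```

Netty exposes the same mechanism through its FileRegion abstraction, which is what makes a Netty-based block server attractive here.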
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)