[ https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064965#comment-14064965 ]
DjvuLee commented on SPARK-2156:
--------------------------------

I see this fixed in the Spark branch-0.9 on GitHub, but has it been updated in Spark v0.9.1 on the http://spark.apache.org/ site?

> When the size of serialized results for one partition is slightly smaller
> than 10MB (the default akka.frameSize), the execution blocks
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-2156
>                 URL: https://issues.apache.org/jira/browse/SPARK-2156
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.1, 1.0.0
>         Environment: AWS EC2: 1 master, 2 slaves, instance type r3.2xlarge
>            Reporter: Chen Jin
>            Assignee: Xiangrui Meng
>            Priority: Blocker
>             Fix For: 0.9.2, 1.0.1, 1.1.0
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> I have done some experiments with frameSize around 10MB.
> 1) spark.akka.frameSize = 10
> If one partition's serialized size is very close to 10MB, say 9.97MB, the
> execution blocks without any exception or warning. The worker finishes the
> task and sends the serialized result, then throws an exception saying the
> Hadoop IPC client connection stops (visible with logging set to debug
> level). However, the master never receives the results and the program just
> hangs.
> But if the sizes of all partitions are less than some number between 9.96MB
> and 9.97MB, the program works fine.
> 2) spark.akka.frameSize = 9
> When the partition size is just a little bit smaller than 9MB, it fails as
> well.
> This bug's behavior is not exactly what SPARK-1112 is about.

-- This message was sent by Atlassian JIRA (v6.2#6252)
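The reported hang comes from a result that is under the configured frame size but, once Akka message overhead is added, no longer fits in one frame. A minimal sketch of the kind of guard the fix introduces (plain Python with hypothetical names and a hypothetical headroom constant; not Spark's actual implementation): compare the serialized result against the frame size minus a reserved margin, rather than against the raw frame size.

```python
import pickle

AKKA_FRAME_SIZE = 10 * 1024 * 1024  # spark.akka.frameSize = 10 (MB)
RESERVED_OVERHEAD = 200 * 1024      # hypothetical headroom for Akka message overhead

def fits_in_frame(result) -> bool:
    """Return True only if the serialized result fits in one Akka frame
    with room left for message overhead."""
    serialized = pickle.dumps(result)
    return len(serialized) <= AKKA_FRAME_SIZE - RESERVED_OVERHEAD

# A ~9.97MB payload is below frameSize but falls inside the overhead
# margin, so a safe sender must not push it through Akka directly.
print(fits_in_frame(b"x" * int(9.97 * 1024 * 1024)))  # False
print(fits_in_frame(b"x" * (5 * 1024 * 1024)))        # True
```

With a check like this, a result in the danger zone just under frameSize can be routed through an alternative channel (or rejected with a clear error) instead of silently exceeding the frame and hanging the driver.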