[ 
https://issues.apache.org/jira/browse/SPARK-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guoqiang Li updated SPARK-2156:
-------------------------------

    Comment: was deleted

(was: [~pwend...@gmail.com])

> When the size of serialized results for one partition is slightly smaller 
> than 10MB (the default akka.frameSize), the execution blocks
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-2156
>                 URL: https://issues.apache.org/jira/browse/SPARK-2156
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.1, 1.0.0
>         Environment: AWS EC2, 1 master and 2 slaves of instance type 
> r3.2xlarge
>            Reporter: Chen Jin
>            Priority: Critical
>             Fix For: 1.0.1
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
>  I have run some experiments with frameSize around 10MB.
> 1) spark.akka.frameSize = 10
> If one of the partition sizes is very close to 10MB, say 9.97MB, execution 
> blocks without any exception or warning. The worker finishes the task of 
> sending the serialized result and then throws an exception saying the Hadoop 
> IPC client connection has stopped (visible after changing the logging to 
> debug level). However, the master never receives the results and the program 
> just hangs.
> But if the sizes of all partitions are below some threshold between 9.96MB 
> and 9.97MB, the program works fine.
> 2) spark.akka.frameSize = 9
> When the partition size is just a little bit smaller than 9MB, it fails as 
> well.
> This behavior is not exactly what SPARK-1112 is about.
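>
> A minimal reproduction sketch (the object name and exact payload size below 
> are illustrative; the precise byte count that triggers the hang depends on 
> serialization overhead):
> {code:scala}
> import org.apache.spark.{SparkConf, SparkContext}
>
> object FrameSizeRepro {
>   def main(args: Array[String]): Unit = {
>     val conf = new SparkConf()
>       .setAppName("SPARK-2156-repro")
>       .set("spark.akka.frameSize", "10") // frame limit in MB
>     val sc = new SparkContext(conf)
>
>     // One partition whose serialized result lands just under the 10MB
>     // frame size (~9.97MB payload).
>     val payloadBytes = 10 * 1024 * 1024 - 30 * 1024
>     val rdd = sc.parallelize(Seq(0), 1).map(_ => new Array[Byte](payloadBytes))
>
>     // collect() ships the serialized partition back over Akka; with the
>     // result this close to the frame size, the driver hangs instead of
>     // failing with a clear oversized-result error.
>     val result = rdd.collect()
>     println(s"collected ${result.head.length} bytes")
>     sc.stop()
>   }
> }
> {code}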



--
This message was sent by Atlassian JIRA
(v6.2#6252)
