[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585641#comment-14585641 ]
Roi Reshef edited comment on SPARK-5081 at 6/15/15 8:41 AM:
------------------------------------------------------------

Hi guys,

Was this issue ever solved, by any chance? I'm using Spark 1.3.1 to train an algorithm in an iterative fashion. Since implementing a ranking measure (which ultimately uses sortBy), I have been experiencing similar problems. My cache explodes after ~100 iterations and crashes the server with a "There is insufficient memory for the Java Runtime Environment to continue" message. Note that the job is not supposed to persist the sorted vectors, nor to reuse them in later iterations, so I wonder why memory consumption keeps growing with each iteration. (A minimal sketch of the kind of loop I mean appears at the end of this message.)

> Shuffle write increases
> -----------------------
>
>                 Key: SPARK-5081
>                 URL: https://issues.apache.org/jira/browse/SPARK-5081
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 1.2.0
>            Reporter: Kevin Jung
>            Priority: Critical
>         Attachments: Spark_Debug.pdf, diff.txt
>
>
> The shuffle write size shown in the Spark web UI differs greatly when I execute the same Spark job on the same input data in Spark 1.1 and Spark 1.2.
> At the sortBy stage, the shuffle write is 98.1MB in Spark 1.1 but 146.9MB in Spark 1.2.
> I set the spark.shuffle.manager option to hash because its default value changed in 1.2, but Spark 1.2 still writes more shuffle output than Spark 1.1.
> The extra disk I/O overhead grows with the size of the input and makes jobs take longer to complete.
> With about 100GB of input, for example, the shuffle write is 39.7GB in Spark 1.1 but 91.0GB in Spark 1.2.
>
> spark 1.1
> ||Stage Id||Description||Input||Shuffle Read||Shuffle Write||
> |9|saveAsTextFile| |1169.4KB| |
> |12|combineByKey| |1265.4KB|1275.0KB|
> |6|sortByKey| |1276.5KB| |
> |8|mapPartitions| |91.0MB|1383.1KB|
> |4|apply| |89.4MB| |
> |5|sortBy|155.6MB| |98.1MB|
> |3|sortBy|155.6MB| | |
> |1|collect| |2.1MB| |
> |2|mapValues|155.6MB| |2.2MB|
> |0|first|184.4KB| | |
>
> spark 1.2
> ||Stage Id||Description||Input||Shuffle Read||Shuffle Write||
> |12|saveAsTextFile| |1170.2KB| |
> |11|combineByKey| |1264.5KB|1275.0KB|
> |8|sortByKey| |1273.6KB| |
> |7|mapPartitions| |134.5MB|1383.1KB|
> |5|zipWithIndex| |132.5MB| |
> |4|sortBy|155.6MB| |146.9MB|
> |3|sortBy|155.6MB| | |
> |2|collect| |2.0MB| |
> |1|mapValues|155.6MB| |2.2MB|
> |0|first|184.4KB| | |
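The quoted description mentions forcing the hash shuffle manager for the comparison. As a hedged illustration (not the reporter's actual setup), this is how that option is typically set; the app name is made up, and the equivalent spark-submit flag is --conf spark.shuffle.manager=hash:

{code}
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative only: force the pre-1.2 hash-based shuffle so the
// "Shuffle Write" column can be compared against the sort-based default.
val conf = new SparkConf()
  .setAppName("shuffle-write-comparison")  // hypothetical app name
  .set("spark.shuffle.manager", "hash")    // default was "hash" in 1.1, "sort" from 1.2 on
val sc = new SparkContext(conf)
{code}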
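And regarding the comment at the top: below is a minimal sketch of the iterative pattern it describes, assuming a loop whose per-iteration ranking step uses sortBy. None of this is the commenter's actual code; every name and constant is hypothetical. Only the input is cached, and the sorted RDD is created and discarded each iteration, which is why the reported memory growth is surprising:

{code}
import org.apache.spark.{SparkConf, SparkContext}

object IterativeSortBySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("iterative-sortBy-sketch"))

    // Hypothetical input: (key, value) pairs; only this RDD is meant to stay cached.
    val data = sc.parallelize(1 to 1000000).map(i => (i % 997, i.toDouble)).cache()

    var scale = 1.0
    for (iter <- 1 to 100) {
      // Transient per-iteration RDD: scored, sorted, used once, never persisted.
      val ranked = data.map { case (k, v) => (k, v * scale) }
                       .sortBy(_._2, ascending = false)
      // Use the top-ranked value to update the (toy) model parameter.
      scale = ranked.take(1).head._2 / 1e6
      // In the scenario described in the comment, memory consumption
      // nevertheless keeps growing across iterations of a loop like this one.
    }
    sc.stop()
  }
}
{code}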