[ https://issues.apache.org/jira/browse/SPARK-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia resolved SPARK-1912.
----------------------------------

    Resolution: Fixed

> Compression memory issue during reduce
> --------------------------------------
>
>                 Key: SPARK-1912
>                 URL: https://issues.apache.org/jira/browse/SPARK-1912
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Wenchen Fan
>            Assignee: Wenchen Fan
>             Fix For: 1.1.0
>
>
> When we need to read a compressed block, we first create a compression 
> stream instance (LZF or Snappy) and use it to wrap that block.
> Say a reducer task needs to read 1000 local shuffle blocks: it first 
> prepares to read all 1000 blocks, which means creating 1000 compression 
> stream instances to wrap them. But initializing a compression stream 
> allocates some memory for its internal buffer, so holding many instances 
> alive at the same time is a problem.
> Since the reducer actually reads the shuffle blocks one by one, why 
> create all the compression instances up front? We can do this lazily: 
> create the compression instance for a block only when that block is 
> first read.
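
A minimal, self-contained Scala sketch of the lazy-wrapping idea described above (GZIP stands in for Spark's LZF/Snappy codecs, and none of the names below come from Spark's internals): the eager version allocates one codec buffer per block up front, while the lazy iterator wraps a block only when the reducer actually reaches it.

    import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
    import java.util.zip.{GZIPInputStream, GZIPOutputStream}

    object LazyDecompressDemo {
      // Build a few compressed "blocks" so the demo is self-contained.
      def compress(s: String): Array[Byte] = {
        val bos = new ByteArrayOutputStream()
        val gz = new GZIPOutputStream(bos)
        gz.write(s.getBytes("UTF-8"))
        gz.close()
        bos.toByteArray
      }

      def main(args: Array[String]): Unit = {
        val blocks: Seq[Array[Byte]] = (1 to 3).map(i => compress(s"block-$i"))

        // Eager: every block gets its decompression stream immediately, so
        // all codec buffers are allocated at once, which is the memory
        // issue this report describes.
        val eager: Seq[InputStream] =
          blocks.map(b => new GZIPInputStream(new ByteArrayInputStream(b)))
        eager.foreach(_.close())

        // Lazy: wrapping happens only when the iterator advances to a
        // block, so at most one codec buffer is live at a time.
        val lazyStreams: Iterator[InputStream] =
          blocks.iterator.map(b => new GZIPInputStream(new ByteArrayInputStream(b)))

        lazyStreams.foreach { in =>
          println(scala.io.Source.fromInputStream(in, "UTF-8").mkString)
          in.close()
        }
      }
    }

Only the principle is shown here; the actual shuffle fetch code paths in Spark are more involved.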



--
This message was sent by Atlassian JIRA
(v6.2#6252)