[jira] [Commented] (SPARK-19659) Fetch big blocks to disk when shuffle-read

jin xing (JIRA) Tue, 11 Apr 2017 08:21:00 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-19659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964508#comment-15964508
 ]


jin xing commented on SPARK-19659:
----------------------------------

[~irashid]
Tracking the memory used by Netty by swapping in our own PooledByteBufAllocator 
is a really good idea. The tracked usage would increase when a byte buffer is 
allocated from the PooledByteBufAllocator and decrease when the ByteBuf's 
reference count drops to zero.
Checking the Netty source code, I found that there is a cache inside 
PooledByteBufAllocator. When memory is released, it is returned to the cache or 
the chunk list, and is not necessarily destroyed. In my understanding, we can 
get the memory usage by tracking PooledByteBufAllocator, but that value is not 
the real footprint (i.e. released memory may still sit in the cache rather than 
being destroyed).
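
To make the gap concrete, here is a minimal toy model in plain Java. It is not Netty's PooledByteBufAllocator and uses none of Netty's API; the class name TrackingPoolSketch, the fixed buffer size and the use of heap arrays in place of off-heap memory are all assumptions made for illustration. It only mimics the one behavior described above: a released buffer goes back to a cache instead of being freed, so the "in use" counter and the memory the pool still holds can diverge.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: NOT Netty's PooledByteBufAllocator. Heap arrays stand in
// for off-heap buffers, and every buffer has the same size (assumption).
class TrackingPoolSketch {
    private final int bufferSize;
    private final Deque<byte[]> cache = new ArrayDeque<>();
    private long inUseBytes;     // what allocate/release tracking would report
    private long reservedBytes;  // memory the pool still holds (the footprint)

    TrackingPoolSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Hand out a buffer, reusing a cached one when available.
    byte[] allocate() {
        byte[] buf = cache.pollFirst();
        if (buf == null) {
            buf = new byte[bufferSize];   // new memory is reserved only on a cache miss
            reservedBytes += bufferSize;
        }
        inUseBytes += bufferSize;
        return buf;
    }

    // Models the reference count reaching zero: the buffer is returned to
    // the cache, not destroyed, so reservedBytes does not shrink.
    void release(byte[] buf) {
        inUseBytes -= bufferSize;
        cache.addFirst(buf);
    }

    long inUseBytes() { return inUseBytes; }

    long reservedBytes() { return reservedBytes; }

    public static void main(String[] args) {
        TrackingPoolSketch pool = new TrackingPoolSketch(1024);
        byte[] a = pool.allocate();
        byte[] b = pool.allocate();
        pool.release(a);
        pool.release(b);
        // Tracked usage is back to zero, yet the pool still holds both buffers.
        System.out.println("in use: " + pool.inUseBytes()
                + ", reserved: " + pool.reservedBytes());
    }
}
```

Running main prints "in use: 0, reserved: 2048": the tracked value says nothing is in use, while the pool still holds everything it ever reserved. How closely a real pooled allocator follows this pattern depends on its caching and trimming policy, which this sketch does not model.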

> Fetch big blocks to disk when shuffle-read
> ------------------------------------------
>
>                 Key: SPARK-19659
>                 URL: https://issues.apache.org/jira/browse/SPARK-19659
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 2.1.0
>            Reporter: jin xing
>         Attachments: SPARK-19659-design-v1.pdf, SPARK-19659-design-v2.pdf
>
>
> Currently the whole block is fetched into memory (off-heap by default) during 
> shuffle read. A block is defined by (shuffleId, mapId, reduceId), so it can 
> be large when the data is skewed. If an OOM happens during shuffle read, the 
> job is killed and users are notified to "Consider boosting 
> spark.yarn.executor.memoryOverhead". Adjusting the parameter and allocating 
> more memory can resolve the OOM. However, this approach is not well suited 
> to production environments, especially data warehouses.
> When using Spark SQL as the data engine in a warehouse, users hope to have a 
> unified parameter (e.g. memory) with less wasted resource (resource that is 
> allocated but not used).
> It is not always easy to predict skew. When it happens, it makes sense 
> to fetch remote blocks to disk for shuffle read, rather than 
> kill the job because of OOM. This approach was mentioned during the discussion 
> in SPARK-3019, by [~sandyr] and [~mridulm80].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

