[jira] [Assigned] (SPARK-24917) Sending a partition over netty results in 2x memory usage
[ https://issues.apache.org/jira/browse/SPARK-24917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-24917:

    Assignee: (was: Apache Spark)

> Sending a partition over netty results in 2x memory usage
> ---------------------------------------------------------
>
>                 Key: SPARK-24917
>                 URL: https://issues.apache.org/jira/browse/SPARK-24917
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.2
>            Reporter: Vincent
>            Priority: Major
>
> Hello,
> While investigating some OOM errors in Spark 2.2 [(here's my call stack)|https://image.ibb.co/hHa2R8/sparkOOM.png], I found the following behavior, which seems odd:
> * a request is made to send a partition over the network
> * the partition is 1.9 GB and is persisted in memory
> * the partition is stored in a ByteBufferBlockData, which is backed by a ChunkedByteBuffer: a list of many 4 MB ByteBuffers
> * the call to toNetty() is supposed to only wrap all the arrays, not allocate any memory
> * yet the call stack shows that netty is allocating memory and trying to consolidate all the chunks into one big 1.9 GB array
> * this means that at this point the memory footprint is 2x the size of the actual partition (which is significant when the partition is 1.9 GB)
> Is this transient allocation expected?
> After digging, it turns out that the copy is caused by [this method|https://github.com/netty/netty/blob/4.0/buffer/src/main/java/io/netty/buffer/Unpooled.java#L260] in netty: if the initial buffer is made of more than DEFAULT_MAX_COMPONENTS (16) components, it triggers a reallocation of the whole buffer. This netty issue was fixed in this recent change: [https://github.com/netty/netty/commit/9b95b8ee628983e3e4434da93fffb893edff4aa2]
>
> As a result, is it possible to benefit from this change somehow in Spark 2.2 and above? I don't know how the netty dependencies are handled for Spark.
>
> NB: it seems this ticket: [https://jira.apache.org/jira/browse/SPARK-24307] changed the approach for Spark 2.4 by bypassing the netty buffer altogether. However, as written in that ticket, this approach *still* needs the *entire* block serialized in memory, so it would be a downgrade from fixing the netty issue when the buffer is < 2 GB.
>
> Thanks!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
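The consolidation behavior described in the report can be sketched as follows. This is a simplified illustration of the pre-fix logic, not netty's actual implementation: when the number of chunks passed to `Unpooled.wrappedBuffer` stays at or below `maxNumComponents` (16 by default), the chunks are merely referenced; beyond that threshold, older netty versions copied everything into one freshly allocated buffer, which is the transient 2x footprint seen in the call stack. A 1.9 GB partition in 4 MB chunks has roughly 486 components, far above 16, so it always hits the copy path. The class and method names here are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (NOT netty's real code) of the consolidation decision
// in old Unpooled.wrappedBuffer(ByteBuffer...).
public class ConsolidationSketch {

    static final int DEFAULT_MAX_COMPONENTS = 16; // netty's default threshold

    // Returns the buffers that would back the composite: either the original
    // chunks unchanged (zero-copy) or a single merged copy (transient 2x memory).
    static List<ByteBuffer> wrap(List<ByteBuffer> chunks, int maxComponents) {
        if (chunks.size() <= maxComponents) {
            return chunks; // zero-copy: just reference the existing arrays
        }
        // Consolidation path: allocate one big buffer and copy every chunk into it.
        int total = 0;
        for (ByteBuffer b : chunks) total += b.remaining();
        ByteBuffer merged = ByteBuffer.allocate(total); // the extra allocation
        for (ByteBuffer b : chunks) merged.put(b.duplicate());
        merged.flip();
        List<ByteBuffer> out = new ArrayList<>();
        out.add(merged);
        return out;
    }

    // Helper: build `count` chunks of `chunkBytes` each, mimicking a
    // ChunkedByteBuffer's list of fixed-size ByteBuffers.
    static List<ByteBuffer> chunksOf(int count, int chunkBytes) {
        List<ByteBuffer> chunks = new ArrayList<>();
        for (int i = 0; i < count; i++) chunks.add(ByteBuffer.allocate(chunkBytes));
        return chunks;
    }

    public static void main(String[] args) {
        // Scaled-down demo: 10 chunks stay zero-copy; 486 chunks (the rough
        // component count of a 1.9 GB partition in 4 MB pieces) get merged.
        List<ByteBuffer> few = wrap(chunksOf(10, 4), DEFAULT_MAX_COMPONENTS);
        List<ByteBuffer> many = wrap(chunksOf(486, 4), DEFAULT_MAX_COMPONENTS);
        System.out.println(few.size() + " backing buffers vs " + many.size());
    }
}
```

The netty fix referenced above changed this so that exceeding the component count no longer forces a full copy of the underlying data.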
[jira] [Assigned] (SPARK-24917) Sending a partition over netty results in 2x memory usage
[ https://issues.apache.org/jira/browse/SPARK-24917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-24917:

    Assignee: Apache Spark