[ https://issues.apache.org/jira/browse/SPARK-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715684#comment-16715684 ]

ASF GitHub Bot commented on SPARK-24920:
----------------------------------------

vanzin commented on a change in pull request #23278: [SPARK-24920][Core] Allow 
sharing Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#discussion_r240405360
 
 

 ##########
 File path: 
common/network-common/src/main/java/org/apache/spark/network/util/NettyUtils.java
 ##########
 @@ -95,6 +99,21 @@ public static String getRemoteAddress(Channel channel) {
     return "<unknown remote>";
   }
 
 +  /**
 +   * Returns the lazily created shared pooled ByteBuf allocator for the specified
 +   * allowCache parameter value.
 +   */
 +  public static synchronized PooledByteBufAllocator getSharedPooledByteBufAllocator(
 +      boolean allowDirectBufs,
 +      boolean allowCache) {
 +    final int index = allowCache ? 0 : 1;
 +    if (_sharedPooledByteBufAllocator[index] == null) {
 +      _sharedPooledByteBufAllocator[index] =
 +        createPooledByteBufAllocator(allowDirectBufs, allowCache, 0 /* numCores */);
 
 Review comment:
   Hmm... it may be good to think about a better way to define the number of cores here. The issue is that by using the default you may be wasting resources.
   
   E.g., if your container is only requesting 1 CPU but the host actually has 32 CPUs, this will create 64 allocation arenas.
   
   (For example, `SparkTransportConf.fromSparkConf` tries to limit thread pool sizes, and thus the size of the allocators, by using the configured number of CPUs.)
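   A minimal sketch of one way to act on this, under the assumption that callers can pass the configured core count through instead of the hard-coded 0. The class name and the extra numCores parameter are illustrative, not taken from the PR; the PooledByteBufAllocator constructor and its default*() helpers are Netty 4.1 API:
   
       import io.netty.buffer.PooledByteBufAllocator;
       import io.netty.util.internal.PlatformDependent;
       
       public class NettyUtilsSketch {
         // One allocator slot per allowCache value, as in the quoted diff.
         private static final PooledByteBufAllocator[] sharedPooledByteBufAllocator =
             new PooledByteBufAllocator[2];
       
         public static synchronized PooledByteBufAllocator getSharedPooledByteBufAllocator(
             boolean allowDirectBufs,
             boolean allowCache,
             int numCores) {  // configured core count; 0 means "not configured"
           final int index = allowCache ? 0 : 1;
           if (sharedPooledByteBufAllocator[index] == null) {
             if (numCores == 0) {
               // Fall back to the machine size only when nothing was configured.
               numCores = Runtime.getRuntime().availableProcessors();
             }
             sharedPooledByteBufAllocator[index] = new PooledByteBufAllocator(
                 allowDirectBufs && PlatformDependent.directBufferPreferred(),
                 // Cap arena counts by the configured core count, not the host's.
                 Math.min(PooledByteBufAllocator.defaultNumHeapArena(), numCores),
                 Math.min(PooledByteBufAllocator.defaultNumDirectArena(),
                     allowDirectBufs ? numCores : 0),
                 PooledByteBufAllocator.defaultPageSize(),
                 PooledByteBufAllocator.defaultMaxOrder(),
                 allowCache ? PooledByteBufAllocator.defaultTinyCacheSize() : 0,
                 allowCache ? PooledByteBufAllocator.defaultSmallCacheSize() : 0,
                 allowCache ? PooledByteBufAllocator.defaultNormalCacheSize() : 0,
                 allowCache);
           }
           return sharedPooledByteBufAllocator[index];
         }
       }
   
   With a configured count of 1 core, this caps the allocator at one heap arena and one direct arena, instead of the 2x-host-CPUs default that produces 64 arenas on a 32-CPU machine.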
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Spark should allow sharing netty's memory pools across all uses
> ---------------------------------------------------------------
>
>                 Key: SPARK-24920
>                 URL: https://issues.apache.org/jira/browse/SPARK-24920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Priority: Major
>              Labels: memory-analysis
>
> Spark currently creates separate netty memory pools for each of the following 
> "services":
> 1) RPC Client
> 2) RPC Server
> 3) BlockTransfer Client
> 4) BlockTransfer Server
> 5) ExternalShuffle Client
> Depending on the configuration and whether it's an executor or driver JVM, a 
> different subset of these is active, but it's always either 3 or 4.
> Having them independent somewhat defeats the purpose of using pools at all.  
> In my experiments I've found each pool will grow due to a burst of activity 
> in the related service (e.g. task start / end messages), followed by another 
> burst in a different service (e.g. sending torrent broadcast blocks).  Because 
> of the way these pools work, they allocate memory in large chunks (16 MB by 
> default) for each netty thread, so there is often a surge of 128 MB of 
> allocated memory, even for really tiny messages (a rough sketch of this 
> arithmetic follows the quoted description).  Also, a lot of this memory is 
> off-heap by default, which makes it even tougher for users to manage.
> I think it would make more sense to combine all of these into a single pool.  
> In some experiments I tried, this noticeably decreased memory usage, both 
> on-heap and off-heap (no significant performance effect in my small 
> experiments).
> As this is a pretty core change, as a first step I'd propose just exposing 
> this as a conf, to let users experiment more broadly across a wider range of 
> workloads.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
