[ 
https://issues.apache.org/jira/browse/SPARK-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-7183:
-----------------------------------

    Assignee:     (was: Apache Spark)

> Memory leak in netty shuffle with spark standalone cluster
> ----------------------------------------------------------
>
>                 Key: SPARK-7183
>                 URL: https://issues.apache.org/jira/browse/SPARK-7183
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 1.3.0
>            Reporter: Jack Hu
>              Labels: memory-leak, netty, shuffle
>
> There is a slow memory leak in the netty shuffle on a Spark standalone 
> cluster, in {{TransportRequestHandler.streamIds}}.
> In a Spark cluster, reusable netty connections between two block managers 
> are used to fetch and send blocks between workers and drivers. On the server 
> side, these connections are handled by 
> {{org.apache.spark.network.server.TransportRequestHandler}}. This handler 
> keeps track of every stream ID negotiated over RPC when shuffle data needs to 
> be transferred between the two block managers. The set of stream IDs keeps 
> growing, and entries are never removed unless the connection is dropped 
> (which seems never to happen during normal operation).
> Here are some detailed logs from this {{TransportRequestHandler}}. (Note: we 
> added a log statement that prints the total size of 
> {{TransportRequestHandler.streamIds}}; the message is "Current set size is N 
> of org.apache.spark.network.server.TransportRequestHandler@ADDRESS". This set 
> size kept increasing in our test.)
> {quote}
> 15/04/22 21:00:16 DEBUG TransportServer: Shuffle server started on port :46288
> 15/04/22 21:00:16 INFO NettyBlockTransferService: Server created on 46288
> 15/04/22 21:00:31 INFO TransportRequestHandler: Created 
> TransportRequestHandler 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 21:00:32 TRACE MessageDecoder: Received message RpcRequest: 
> RpcRequest\{requestId=6655045571437304938, message=[B@59778678\}
> 15/04/22 21:00:32 TRACE NettyBlockRpcServer: Received request: 
> OpenBlocks\{appId=app-20150422210016-0000, execId=<driver>, 
> blockIds=[broadcast_1_piece0]}
> 15/04/22 21:00:32 TRACE NettyBlockRpcServer: Registered streamId 
> 1387459488000 with 1 buffers
> 15/04/22 21:00:33 TRACE TransportRequestHandler: Sent result 
> RpcResponse\{requestId=6655045571437304938, response=[B@d2840b\} to client 
> /10.111.7.150:33802
> 15/04/22 21:00:33 TRACE MessageDecoder: Received message ChunkFetchRequest: 
> ChunkFetchRequest\{streamChunkId=StreamChunkId\{streamId=1387459488000, 
> chunkIndex=0}}
> 15/04/22 21:00:33 TRACE TransportRequestHandler: Received req from 
> /10.111.7.150:33802 to fetch block StreamChunkId\{streamId=1387459488000, 
> chunkIndex=0\}
> 15/04/22 21:00:33 INFO TransportRequestHandler: Current set size is 1 of 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 21:00:33 TRACE OneForOneStreamManager: Removing stream id 
> 1387459488000
> 15/04/22 21:00:33 TRACE TransportRequestHandler: Sent result 
> ChunkFetchSuccess\{streamChunkId=StreamChunkId\{streamId=1387459488000, 
> chunkIndex=0}, buffer=NioManagedBuffer\{buf=java.nio.HeapByteBuffer[pos=0 
> lim=3839 cap=3839]}} to client /10.111.7.150:33802
> 15/04/22 21:00:34 TRACE MessageDecoder: Received message RpcRequest: 
> RpcRequest\{requestId=6660601528868866371, message=[B@42bed1b8\}
> 15/04/22 21:00:34 TRACE NettyBlockRpcServer: Received request: 
> OpenBlocks\{appId=app-20150422210016-0000, execId=<driver>, 
> blockIds=[broadcast_3_piece0]}
> 15/04/22 21:00:34 TRACE NettyBlockRpcServer: Registered streamId 
> 1387459488001 with 1 buffers
> 15/04/22 21:00:34 TRACE TransportRequestHandler: Sent result 
> RpcResponse\{requestId=6660601528868866371, response=[B@7fa3fb60\} to client 
> /10.111.7.150:33802
> 15/04/22 21:00:34 TRACE MessageDecoder: Received message ChunkFetchRequest: 
> ChunkFetchRequest\{streamChunkId=StreamChunkId\{streamId=1387459488001, 
> chunkIndex=0}}
> 15/04/22 21:00:34 TRACE TransportRequestHandler: Received req from 
> /10.111.7.150:33802 to fetch block StreamChunkId\{streamId=1387459488001, 
> chunkIndex=0\}
> 15/04/22 21:00:34 INFO TransportRequestHandler: Current set size is 2 of 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 21:00:34 TRACE OneForOneStreamManager: Removing stream id 
> 1387459488001
> 15/04/22 21:00:34 TRACE TransportRequestHandler: Sent result 
> ChunkFetchSuccess\{streamChunkId=StreamChunkId\{streamId=1387459488001, 
> chunkIndex=0}, buffer=NioManagedBuffer\{buf=java.nio.HeapByteBuffer[pos=0 
> lim=4277 cap=4277]}} to client /10.111.7.150:33802
> 15/04/22 21:00:34 TRACE MessageDecoder: Received message RpcRequest: 
> RpcRequest\{requestId=8454597410163901330, message=[B@19c673d1\}
> 15/04/22 21:00:34 TRACE NettyBlockRpcServer: Received request: 
> OpenBlocks\{appId=app-20150422210016-0000, execId=<driver>, 
> blockIds=[broadcast_2_piece0]}
> 15/04/22 21:00:34 TRACE NettyBlockRpcServer: Registered streamId 
> 1387459488002 with 1 buffers
> 15/04/22 21:00:34 TRACE TransportRequestHandler: Sent result 
> RpcResponse\{requestId=8454597410163901330, response=[B@35dbdac2\} to client 
> /10.111.7.150:33802
> 15/04/22 21:00:34 TRACE MessageDecoder: Received message ChunkFetchRequest: 
> ChunkFetchRequest\{streamChunkId=StreamChunkId\{streamId=1387459488002, 
> chunkIndex=0}}
> 15/04/22 21:00:34 TRACE TransportRequestHandler: Received req from 
> /10.111.7.150:33802 to fetch block StreamChunkId\{streamId=1387459488002, 
> chunkIndex=0\}
> 15/04/22 21:00:34 INFO TransportRequestHandler: Current set size is 3 of 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 21:00:34 TRACE OneForOneStreamManager: Removing stream id 
> 1387459488002
> ......
> 15/04/22 23:59:50 TRACE MessageDecoder: Received message RpcRequest: 
> RpcRequest\{requestId=5718124278216696100, message=[B@7ade3ea3\}
> 15/04/22 23:59:50 TRACE NettyBlockRpcServer: Received request: 
> OpenBlocks\{appId=app-20150422210016-0000, execId=<driver>, 
> blockIds=[broadcast_14679_piece0]}
> 15/04/22 23:59:50 TRACE NettyBlockRpcServer: Registered streamId 
> 1387459501252 with 1 buffers
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Sent result 
> RpcResponse\{requestId=5718124278216696100, response=[B@40c07a63\} to client 
> /10.111.7.150:33802
> 15/04/22 23:59:50 TRACE MessageDecoder: Received message ChunkFetchRequest: 
> ChunkFetchRequest\{streamChunkId=StreamChunkId\{streamId=1387459501252, 
> chunkIndex=0}}
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Received req from 
> /10.111.7.150:33802 to fetch block StreamChunkId\{streamId=1387459501252, 
> chunkIndex=0\}
> 15/04/22 23:59:50 INFO TransportRequestHandler: Current set size is 13253 of 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 23:59:50 TRACE OneForOneStreamManager: Removing stream id 
> 1387459501252
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Sent result 
> ChunkFetchSuccess\{streamChunkId=StreamChunkId\{streamId=1387459501252, 
> chunkIndex=0}, buffer=NioManagedBuffer\{buf=java.nio.HeapByteBuffer[pos=0 
> lim=31474 cap=31474]}} to client /10.111.7.150:33802
> 15/04/22 23:59:50 TRACE MessageDecoder: Received message RpcRequest: 
> RpcRequest\{requestId=8663805364150028136, message=[B@5974f9b4\}
> 15/04/22 23:59:50 TRACE NettyBlockRpcServer: Received request: 
> OpenBlocks\{appId=app-20150422210016-0000, execId=<driver>, 
> blockIds=[broadcast_14688_piece0]}
> 15/04/22 23:59:50 TRACE NettyBlockRpcServer: Registered streamId 
> 1387459501253 with 1 buffers
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Sent result 
> RpcResponse\{requestId=8663805364150028136, response=[B@122023c6\} to client 
> /10.111.7.150:33802
> 15/04/22 23:59:50 TRACE MessageDecoder: Received message ChunkFetchRequest: 
> ChunkFetchRequest\{streamChunkId=StreamChunkId\{streamId=1387459501253, 
> chunkIndex=0}}
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Received req from 
> /10.111.7.150:33802 to fetch block StreamChunkId\{streamId=1387459501253, 
> chunkIndex=0\}
> 15/04/22 23:59:50 INFO TransportRequestHandler: Current set size is 13254 of 
> org.apache.spark.network.server.TransportRequestHandler@29a4f3e7
> 15/04/22 23:59:50 TRACE OneForOneStreamManager: Removing stream id 
> 1387459501253
> 15/04/22 23:59:50 TRACE TransportRequestHandler: Sent result 
> ChunkFetchSuccess\{streamChunkId=StreamChunkId\{streamId=1387459501253, 
> chunkIndex=0}, buffer=NioManagedBuffer\{buf=java.nio.HeapByteBuffer[pos=0 
> lim=4047 cap=4047]}} to client /10.111.7.150:33802
> {quote}
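
The pattern described in the report can be sketched in a few lines of Java. This is a simplified illustration, not Spark's actual code: the class names {{LeakyHandler}} and {{BoundedHandler}} and all method names are hypothetical, and the per-stream cleanup in {{BoundedHandler}} is only one possible remedy, not necessarily the fix adopted for this issue. The sketch assumes one chunk per stream, as in the logs above.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch only; all names here are hypothetical, not Spark APIs. */
class StreamIdLeakSketch {

    /** Per-connection handler that remembers every stream ID it has served. */
    static class LeakyHandler {
        protected final Set<Long> streamIds = new HashSet<>();

        // Called for each chunk fetch on this (long-lived) connection.
        void onChunkFetch(long streamId) {
            streamIds.add(streamId);
        }

        // Called once a stream has been fully served. The leaky variant
        // keeps the ID, so the set only ever grows.
        void onStreamFinished(long streamId) {
            // intentionally empty
        }

        // The only cleanup point: the connection going away.
        void onChannelClosed() {
            streamIds.clear();
        }

        int trackedCount() {
            return streamIds.size();
        }
    }

    /** Variant that forgets a stream ID as soon as the stream is finished. */
    static class BoundedHandler extends LeakyHandler {
        @Override
        void onStreamFinished(long streamId) {
            streamIds.remove(streamId);
        }
    }

    /** Serves the given number of single-chunk streams over one connection. */
    static int simulate(LeakyHandler handler, int streams) {
        long nextStreamId = 1387459488000L; // first stream ID seen in the logs
        for (int i = 0; i < streams; i++) {
            long id = nextStreamId++;
            handler.onChunkFetch(id);
            handler.onStreamFinished(id);
        }
        return handler.trackedCount();
    }

    static int simulateLeaky(int streams) {
        return simulate(new LeakyHandler(), streams);
    }

    static int simulateBounded(int streams) {
        return simulate(new BoundedHandler(), streams);
    }

    public static void main(String[] args) {
        System.out.println("leaky:   " + simulateLeaky(13254));   // prints 13254
        System.out.println("bounded: " + simulateBounded(13254)); // prints 0
    }
}
```

As long as the connection stays open, the leaky variant retains one entry per stream ever served, which matches the steadily increasing "Current set size" values in the logs. The bounded variant stays at zero between fetches.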



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

