Huaxiang Sun created HBASE-26092:
------------------------------------

             Summary: JVM core dump in the replication path
                 Key: HBASE-26092
                 URL: https://issues.apache.org/jira/browse/HBASE-26092
             Project: HBase
          Issue Type: Bug
          Components: Replication
    Affects Versions: 2.3.5
            Reporter: Huaxiang Sun


When replication is turned on, we found the following core dump in the region 
server. 

I checked the core dump on the replication path and I think I have an idea of 
the cause. When an RS receives walEdits from the remote cluster, it needs to 
send them on to the final RS. NettyRpcConnection is used for this, and the 
calls are queued while they still refer to a ByteBuffer that belongs to the 
context of the replicationHandler; that buffer is returned to the pool once 
the handler returns. The core dump happens because the ByteBuffer has already 
been reused by the time the queued call is written out. This asynchronous 
processing needs reference counting.
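
To illustrate the kind of reference counting meant here, below is a minimal, 
self-contained sketch. It is not HBase or Netty code: RefCountedBuffer, 
simulate() and the in-memory "event loop" queue are hypothetical names used 
only for illustration, and the pool is reduced to a callback. The idea is the 
same as in Netty's ReferenceCounted (retain/release): the handler takes an 
extra reference before queuing the asynchronous write, so the buffer goes back 
to the pool only after the queued write has finished with it.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {

  /** Hypothetical pooled buffer; its memory is recycled only when the count hits zero. */
  static final class RefCountedBuffer {
    private final ByteBuffer buf;
    private final Runnable recycler; // stands in for "return to the pool"
    private final AtomicInteger refCnt = new AtomicInteger(1); // creator holds one reference

    RefCountedBuffer(ByteBuffer buf, Runnable recycler) {
      this.buf = buf;
      this.recycler = recycler;
    }

    /** Take one more reference; fails if the buffer was already recycled. */
    RefCountedBuffer retain() {
      while (true) {
        int c = refCnt.get();
        if (c <= 0) {
          throw new IllegalStateException("buffer already released");
        }
        if (refCnt.compareAndSet(c, c + 1)) {
          return this;
        }
      }
    }

    /** Drop one reference; the last release recycles the buffer. */
    void release() {
      int c = refCnt.decrementAndGet();
      if (c < 0) {
        throw new IllegalStateException("released too many times");
      }
      if (c == 0) {
        recycler.run();
      }
    }

    int readInt() {
      if (refCnt.get() <= 0) {
        throw new IllegalStateException("use after release");
      }
      return buf.getInt(0);
    }
  }

  /** Single-threaded simulation of "handler queues a write, then returns". */
  static List<String> simulate() {
    List<String> log = new ArrayList<>();
    Queue<Runnable> eventLoop = new ArrayDeque<>();
    boolean[] recycled = {false};

    RefCountedBuffer cell = new RefCountedBuffer(
        ByteBuffer.allocate(4).putInt(0, 42), () -> recycled[0] = true);

    // Handler: take an extra reference before queuing the asynchronous write.
    cell.retain();
    eventLoop.add(() -> {
      try {
        log.add("write saw " + cell.readInt());
      } finally {
        cell.release(); // the write is done with the buffer
      }
    });

    // Handler returns and drops its own reference; the buffer must survive.
    cell.release();
    log.add("after handler: recycled=" + recycled[0]);

    // Later, the event loop drains the queued write.
    Runnable task;
    while ((task = eventLoop.poll()) != null) {
      task.run();
    }
    log.add("after write: recycled=" + recycled[0]);
    return log;
  }

  public static void main(String[] args) {
    simulate().forEach(System.out::println);
  }
}
```

Without the retain() call before queuing, the handler's release() would 
recycle the buffer immediately and the queued write would touch memory that 
may already belong to someone else, which is the situation described above.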

 

Feel free to take it; otherwise, I will try to work on a patch later.

{code:java}
Stack: [0x00007fb1bf039000,0x00007fb1bf13a000],  sp=0x00007fb1bf138560,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 28175 C2 org.apache.hadoop.hbase.ByteBufferKeyValue.write(Ljava/io/OutputStream;Z)I (21 bytes) @ 0x00007fdbbbb2663c [0x00007fdbbbb263c0+0x27c]
J 14912 C2 org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.writeRequest(Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelHandlerContext;Lorg/apache/hadoop/hbase/ipc/Call;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V (370 bytes) @ 0x00007fdbbb94b590 [0x00007fdbbb949c00+0x1990]
J 14911 C2 org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.write(Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelHandlerContext;Ljava/lang/Object;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V (30 bytes) @ 0x00007fdbb972d1d4 [0x00007fdbb972d1a0+0x34]
J 30476 C2 org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.write(Ljava/lang/Object;ZLorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V (149 bytes) @ 0x00007fdbbd4e7084 [0x00007fdbbd4e6900+0x784]
J 14914 C2 org.apache.hadoop.hbase.ipc.NettyRpcConnection$6$1.run()V (22 bytes) @ 0x00007fdbbb9344ec [0x00007fdbbb934280+0x26c]
J 23528 C2 org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(J)Z (106 bytes) @ 0x00007fdbbcbb0efc [0x00007fdbbcbb0c40+0x2bc]
J 15987% C2 org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run()V (461 bytes) @ 0x00007fdbbbaf1580 [0x00007fdbbbaf1360+0x220]
j  org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run()V+44
j  org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run()V+11
j  org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
{code}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
