[ 
https://issues.apache.org/jira/browse/HBASE-26460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445854#comment-17445854
 ] 

Lijin Bin commented on HBASE-26460:
-----------------------------------

I added two configs to switch to SimpleRpcServer and disable rpc buffer reuse, and 
still encountered the Invalid key length/Invalid value length errors, so I guess the 
regionserver receives a corrupt request and the problem is in the client code.
{code}
<property>
        <name>hbase.rpc.server.impl</name>
        <value>org.apache.hadoop.hbase.ipc.SimpleRpcServer</value>
</property>
<property>
        <name>hbase.server.allocator.pool.enabled</name>
        <value>false</value>
</property>
{code}

> Close netty channel causes regionserver crash in handleTooBigRequest
> --------------------------------------------------------------------
>
>                 Key: HBASE-26460
>                 URL: https://issues.apache.org/jira/browse/HBASE-26460
>             Project: HBase
>          Issue Type: Bug
>          Components: rpc
>    Affects Versions: 3.0.0-alpha-1, 2.0.0
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Critical
>
> In HBASE-26170, I reported the coredump problem after calling 
> handleTooBigRequest, but that issue did not resolve the regionserver crash 
> problem, which occurs before the WAL corruption in HBASE-24984.
> After looking through the code, I think the problem is in closing the channel. 
> The direct byte buffer used by an RPC call request is allocated by Netty; although 
> we add a reference count to record when to release the direct byte buffer, the 
> byte buffer is actually managed by Netty: it is allocated from a Netty PoolArena 
> and released back there. 
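> Just to make the ownership concrete, here is a minimal sketch (using only the 
> public Netty allocator API, not HBase code): once the reference count drops to 
> zero, the memory goes back to the arena and may be handed out again to a 
> completely different request.
> {code:java}
> import io.netty.buffer.ByteBuf;
> import io.netty.buffer.PooledByteBufAllocator;
> 
> public class PooledBufferOwnership {
>   public static void main(String[] args) {
>     PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;
>     // The memory backing this buffer belongs to a Netty PoolArena, not to us.
>     ByteBuf buf = alloc.directBuffer(1024);
>     buf.writeBytes(new byte[] { 1, 2, 3, 4 });
>     // retain()/release() only move the reference count; when it reaches 0
>     // the memory is returned to the arena and can be reused by any caller.
>     buf.retain();   // refCnt == 2
>     buf.release();  // refCnt == 1
>     buf.release();  // refCnt == 0, memory goes back to the pool
>     // Any further access to buf after this point is a use-after-release bug.
>   }
> }
> {code}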
> While an HBase ipc handler is processing a request, the Netty channel handlers 
> can keep processing channel events and incoming messages in succession. When 
> NettyRpcFrameDecoder detects a too big request, the channel is closed and all of 
> its resources are released, even though HBase ipc handlers may still be using the 
> direct byte buffers to process previous requests.
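> A minimal two-thread sketch of the hazard (the thread roles below are mine, this 
> is not the actual HBase code path): one thread releases a pooled buffer back to 
> the arena while another thread is still reading it, which can surface as an 
> IllegalReferenceCountException or as silently corrupted request bytes.
> {code:java}
> import io.netty.buffer.ByteBuf;
> import io.netty.buffer.PooledByteBufAllocator;
> 
> public class ReleaseWhileInUse {
>   public static void main(String[] args) throws Exception {
>     ByteBuf request = PooledByteBufAllocator.DEFAULT.directBuffer(64);
>     request.writeLong(42L);
> 
>     // "ipc handler" thread: still processing the previous request.
>     Thread handler = new Thread(() -> {
>       try {
>         Thread.sleep(100);
>         // May throw IllegalReferenceCountException, or read memory the
>         // arena has already handed to a newer request (corrupt data).
>         System.out.println(request.getLong(0));
>       } catch (Exception e) {
>         e.printStackTrace();
>       }
>     });
>     handler.start();
> 
>     // "channel close" path: releases the buffer back to the arena while
>     // the handler thread above is still using it.
>     request.release();
>     handler.join();
>   }
> }
> {code}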
> Netty provides two ways to request a pooled byte buffer: one is through the 
> PoolThreadCache, of which each handler thread owns a private one; the other is 
> through PoolArena#allocateNormal. Each ChannelHandler has a local 
> PoolThreadCache.
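> A small illustration of the two paths from the caller's side (the internal names 
> are from my reading of Netty 4.1; whether a given allocation actually hits the 
> cache depends on the buffer size and on prior releases):
> {code:java}
> import io.netty.buffer.ByteBuf;
> import io.netty.buffer.PooledByteBufAllocator;
> 
> public class TwoAllocationPaths {
>   public static void main(String[] args) {
>     PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;
> 
>     // Path 1: the allocator first asks the calling thread's PoolThreadCache.
>     ByteBuf small = alloc.directBuffer(256);
>     small.release();                          // likely parked in this thread's cache
>     ByteBuf reused = alloc.directBuffer(256); // likely served from that cache
>     reused.release();
> 
>     // Path 2: on a cache miss (e.g. the first allocation of a size class),
>     // the request falls through to the arena, i.e. PoolArena#allocateNormal
>     // for normal-sized buffers.
>     ByteBuf big = alloc.directBuffer(64 * 1024);
>     big.release();
>   }
> }
> {code}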
> When a new Netty channel is created, a new ChannelHandler instance is created, 
> and when the channel is closed, the relevant channel handler is removed from the 
> pipeline. I found this note in the javadoc of Netty's Channel class:
> {code:java}
> It is important to call close() or close(ChannelPromise) to release all 
> resources once you are done with the Channel. This ensures all resources are 
> released in a proper way, i.e. filehandles. {code}
> And when the channel handler is removed, ByteToMessageDecoder#handlerRemoved 
> releases the cumulation byte buffer:
> {code:java}
> @Override
> public final void handlerRemoved(ChannelHandlerContext ctx) throws Exception {
>     if (decodeState == STATE_CALLING_CHILD_DECODE) {
>         decodeState = STATE_HANDLER_REMOVED_PENDING;
>         return;
>     }
>     ByteBuf buf = cumulation;
>     if (buf != null) {
>         // Directly set this to null so we are sure we not access it in any
>         // other method here anymore.
>         cumulation = null;
>         int readable = buf.readableBytes();
>         if (readable > 0) {
>             ByteBuf bytes = buf.readBytes(readable);
>             buf.release();
>             ctx.fireChannelRead(bytes);
>         } else {
>             buf.release();
>         }
> ... {code}
> We should not close the channel when encountering a too big request; I think we 
> should just skip the bytes, as LengthFieldBasedFrameDecoder does.
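> To make the proposal concrete, here is a hypothetical sketch (not the actual 
> patch, and much simpler than NettyRpcFrameDecoder) of a decoder that skips the 
> body of an oversized frame instead of closing the channel, similar to 
> LengthFieldBasedFrameDecoder's discarding mode:
> {code:java}
> import java.util.List;
> 
> import io.netty.buffer.ByteBuf;
> import io.netty.channel.ChannelHandlerContext;
> import io.netty.handler.codec.ByteToMessageDecoder;
> import io.netty.handler.codec.TooLongFrameException;
> 
> public class SkipTooBigRequestDecoder extends ByteToMessageDecoder {
>   private final int maxFrameLength;
>   private long bytesToDiscard;
> 
>   public SkipTooBigRequestDecoder(int maxFrameLength) {
>     this.maxFrameLength = maxFrameLength;
>   }
> 
>   @Override
>   protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
>     if (bytesToDiscard > 0) {
>       // Still inside an oversized frame: drop the bytes as they arrive and
>       // keep the channel (and the buffers of earlier requests) alive.
>       int discard = (int) Math.min(bytesToDiscard, in.readableBytes());
>       in.skipBytes(discard);
>       bytesToDiscard -= discard;
>       return;
>     }
>     if (in.readableBytes() < 4) {
>       return; // wait for the 4-byte length prefix
>     }
>     int frameLength = in.getInt(in.readerIndex());
>     if (frameLength > maxFrameLength) {
>       in.skipBytes(4);
>       bytesToDiscard = frameLength;
>       // Report the error without tearing the whole channel down.
>       ctx.fireExceptionCaught(new TooLongFrameException("request too big: " + frameLength));
>       return;
>     }
>     if (in.readableBytes() - 4 < frameLength) {
>       return; // wait until the whole frame has arrived
>     }
>     in.skipBytes(4);
>     out.add(in.readRetainedSlice(frameLength));
>   }
> }
> {code}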



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
