[
https://issues.apache.org/jira/browse/HBASE-22539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895272#comment-16895272
]
Wellington Chevreuil edited comment on HBASE-22539 at 7/29/19 1:53 PM:
-----------------------------------------------------------------------
Thanks for jumping in, [~Apache9]!
{quote}Did the crashed region server timeout on some write requests?{quote}
The RS is not crashing at all when we see these corruptions (and the message
mentioned above is never seen in the RS logs either). It may eventually crash
later due to other problems, such as long GC pauses, in which case the corrupt
WAL would cause any RS that then tries to split it to crash.
> Potential WAL corruption due to Unsafe.copyMemory usage when DBB are in place
> -----------------------------------------------------------------------------
>
> Key: HBASE-22539
> URL: https://issues.apache.org/jira/browse/HBASE-22539
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.1
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Blocker
>
> Summary
> We had been chasing a WAL corruption issue reported on one of our customers'
> deployments running release 2.1.1 (CDH 6.1.0). After providing a custom
> modified jar with the extra sanity checks implemented by HBASE-21401 applied
> at some code points, plus additional debugging messages, we believe it is
> related to DirectByteBuffer usage and the Unsafe copy from off-heap memory to
> an on-heap array triggered
> [here|https://github.com/apache/hbase/blob/branch-2.1/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java#L1157],
> such as when writing into a non ByteBufferWriter type, as done
> [here|https://github.com/apache/hbase/blob/branch-2.1/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBufferWriterOutputStream.java#L84].
> More details in the following comment.
>
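For illustration only, the kind of copy discussed in the summary (off-heap DirectByteBuffer into an on-heap array) can be sketched with an explicit bounds check in front of it. This is a minimal sketch and not HBase code: the class name CheckedBufferCopy and the method copyToArray are hypothetical, and it uses the plain java.nio API instead of Unsafe.copyMemory, so it only shows the shape of a sanity check, not the actual fix.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: copies bytes from a (possibly
// direct) ByteBuffer into an on-heap array after validating all ranges.
public class CheckedBufferCopy {

    public static void copyToArray(ByteBuffer src, int srcOffset,
                                   byte[] dest, int destOffset, int length) {
        // Reject any range that falls outside the source buffer or the
        // destination array before touching memory.
        if (length < 0 || srcOffset < 0 || destOffset < 0
                || srcOffset > src.limit() - length
                || destOffset > dest.length - length) {
            throw new IndexOutOfBoundsException(
                "srcOffset=" + srcOffset + ", destOffset=" + destOffset
                + ", length=" + length + ", src.limit=" + src.limit()
                + ", dest.length=" + dest.length);
        }
        // Work on a duplicate so the caller's position and limit are untouched.
        ByteBuffer view = src.duplicate();
        view.position(srcOffset);
        view.limit(srcOffset + length);
        view.get(dest, destOffset, length);
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        for (int i = 0; i < 16; i++) {
            direct.put(i, (byte) i);
        }
        byte[] onHeap = new byte[8];

        // In-range copy succeeds.
        copyToArray(direct, 4, onHeap, 0, 8);
        System.out.println(Arrays.toString(onHeap));

        // Out-of-range copy is rejected instead of reading past the buffer.
        try {
            copyToArray(direct, 12, onHeap, 0, 8);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check runs before any bytes are moved, so a bad offset or length surfaces as an exception at the copy site instead of as corrupt data discovered later, for example during a WAL split.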
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)