[ https://issues.apache.org/jira/browse/HBASE-18152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590273#comment-16590273 ]
stack commented on HBASE-18152:
-------------------------------

Thank you [~allan163]

It might be a different corruption type. Hopefully there'll be a clue when we dig in on the WALs.

I've not seen it since the fix for the race condition. Cross-fingers.

> [AMv2] Corrupt Procedure WAL file; procedure data stored out of order
> ---------------------------------------------------------------------
>
>                 Key: HBASE-18152
>                 URL: https://issues.apache.org/jira/browse/HBASE-18152
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 2.0.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: 0001-TestWALProcedureExecutore-order-checking-test-that-d.patch,
> HBASE-18152.master.001.patch,
> hbase-hbase-master-ctr-e138-1518143905142-221855-01-000002.hwx.site.log.gz,
> pv2-00000000000000000036.log, pv2-00000000000000000047.log,
> reading_bad_wal.patch
>
>
> I've seen corruption from time to time in testing. It's rare enough. Often we
> can get over it but sometimes we can't. It took me a while to capture an
> instance of corruption. Turns out we are writing to the WAL out of order, which
> undoes a basic tenet: that WAL content is ordered in line w/ execution.
> Below I'll post a corrupt WAL.
> Looking at the write-side, there is a lot going on. I'm not clear on how we
> could write out of order. Will try and get more insight. Meantime parking
> this issue here to fill data into.