[ https://issues.apache.org/jira/browse/HBASE-18152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553919#comment-16553919 ]
stack commented on HBASE-18152:
-------------------------------

ProcedureStore depends on updates being ordered. For the above case, the two steps -- "Starting" and "Dispatch" -- were run by different Worker threads (PEWorker-8 and then PEWorker-10). The two steps are separated in time by ~70ms, and to run, each needs to take the lock on the region. But the updates went into the WAL flipped relative to how they ran chronologically and how they appeared in the log above.

Looking at the code, when we write, we first fill one of a fixed number of "slots" with the serialized update of the Procedure step (see WALProcedureStore#insert/#update). We then try to 'push' the data to the WAL (see WPS#pushData). Inside pushData is a reentrant lock that ensures single-threaded access when updating the WAL.

During startup, we can burst to 10x the Workers we usually have, while the number of slots remains constant. So there is contention first for the slots and then for the reentrant lock. Reentrant locks make no guarantees about the order in which threads get scheduled (this is not a 'fair' reentrant lock, and even a fair one would still give no guarantee here), so the code here is suspect.

Trying to write a test to prove flawed ordering/corruption.

> [AMv2] Corrupt Procedure WAL file; procedure data stored out of order
> ---------------------------------------------------------------------
>
> Key: HBASE-18152
> URL: https://issues.apache.org/jira/browse/HBASE-18152
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: 2.0.0
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-18152.master.001.patch,
> hbase-hbase-master-ctr-e138-1518143905142-221855-01-000002.hwx.site.log.gz,
> pv2-00000000000000000036.log, pv2-00000000000000000047.log,
> reading_bad_wal.patch
>
>
> I've seen corruption from time-to-time testing. It's rare enough. Often we
> can get over it but sometimes we can't. It took me a while to capture an
> instance of corruption.
> Turns out we are writing to the WAL out-of-order, which
> undoes a basic tenet: that WAL content is ordered in line w/ execution.
> Below I'll post a corrupt WAL.
> Looking at the write-side, there is a lot going on. I'm not clear on how we
> could write out of order. Will try and get more insight. Meantime parking
> this issue here to fill data into.
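
To make the suspected interleaving in the comment concrete, here is a minimal sketch. It is not HBase code: SlotOrderSketch, walLock, push and runFlipped are hypothetical stand-ins for the slot-then-push write path described above, and the latches only force the unlucky schedule deterministically (in the real store it would come from lock contention). It assumes that the order of updates is fixed by when the step runs and fills its slot, while the order in the WAL is fixed by who gets the lock first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the store; not the WALProcedureStore implementation.
class SlotOrderSketch {
  // Non-fair lock guarding the append, as described for pushData.
  private final ReentrantLock walLock = new ReentrantLock();
  private final List<String> wal = new ArrayList<>();

  // Append a filled slot to the "WAL" under the lock.
  private void push(String slot) {
    walLock.lock();
    try {
      wal.add(slot);
    } finally {
      walLock.unlock();
    }
  }

  private static void await(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  // Forces the suspected interleaving with latches and returns the WAL order.
  static List<String> runFlipped() {
    SlotOrderSketch store = new SlotOrderSketch();
    CountDownLatch startingInSlot = new CountDownLatch(1);
    CountDownLatch dispatchPushed = new CountDownLatch(1);

    Thread first = new Thread(() -> {
      String slot = "Starting";   // step runs first and fills its slot
      startingInSlot.countDown();
      await(dispatchPushed);      // stands in for losing the race for walLock
      store.push(slot);
    }, "PEWorker-8");

    Thread second = new Thread(() -> {
      await(startingInSlot);      // this step runs later in time
      String slot = "Dispatch";
      store.push(slot);           // but reaches the WAL first
      dispatchPushed.countDown();
    }, "PEWorker-10");

    first.start();
    second.start();
    try {
      first.join();
      second.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return new ArrayList<>(store.wal);
  }

  public static void main(String[] args) {
    System.out.println(runFlipped()); // prints [Dispatch, Starting]
  }
}
```

The lock still gives single-threaded access to the list, yet the result is flipped, because nothing ties the moment a slot is filled to the moment its owner wins the lock.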