[ 
https://issues.apache.org/jira/browse/HBASE-18152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554529#comment-16554529
 ] 

Josh Elser commented on HBASE-18152:
------------------------------------

{quote}Notice how they are on different worker threads. The procedures 
"execute" in the correct order but the edits land in the store in another 
order, something quite possible given the distance between execute, 
post-execution cleanup, and store.
{quote}
Good find, sir!
{quote}Looking at code, when we write we first fill one of a fixed number of 
"slots" with the serialized update of Procedure STEP (See 
WALProcedureStore#insert/#update). We then try to 'push' the data to the WAL 
(See WPS#pushData). Inside pushData is a reentrant lock to ensure 
single-threaded-access updating WAL. During a startup, we can burst to have 10x 
the Workers we usually have. The "slot" number remains constant. So contention 
for slots and then to obtain the reentrant lock. Reentrant locks make no 
guarantees around the order in which threads get scheduled (This is not a 
'fair' reentrant lock and even then, still no guarantees) so the code here is 
suspect.
{quote}
Totally makes sense. But I'm not sure what a "fix" would look like (not sure if 
you've started thinking about that yet, sorry to derail if you haven't)

Right now, each slot holds the serialized Procedure and then the syncLoop will 
take all slots and write them to the file. What if we make the {{ByteSlot[]}} 
instead a {{TreeMap<Long,ByteSlot>}}, locally sorting each batch of slots 
(really, Procedures), that we're going to group commit to the WAL?

I think that would work since our Slot never holds more than one Procedure 
at a time (multiple subProcs, but I'm not sure if that matters?).
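
To make the idea concrete, here is a rough sketch of what I mean, not a patch. Everything in it is made up for illustration: {{SlotSketch}} stands in for {{ByteSlot}}, {{SortedSlotBatch}} is a hypothetical holder, and the {{Long}} key is assumed to be some sequence id handed out in execution order (I haven't checked what the real key should be). It also only sorts within one batch, so an edit that lands in a later batch than its successor would still be out of order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for a slot holding one serialized procedure update. */
final class SlotSketch {
    final long seqId;      // assumption: assigned in execution order
    final byte[] payload;  // serialized procedure step

    SlotSketch(long seqId, byte[] payload) {
        this.seqId = seqId;
        this.payload = payload;
    }
}

/** Collects the slots for one group commit and drains them in key order. */
final class SortedSlotBatch {
    private final TreeMap<Long, SlotSketch> slots = new TreeMap<>();

    /** Called by worker threads, in whatever order they happen to arrive. */
    synchronized void add(SlotSketch slot) {
        slots.put(slot.seqId, slot);
    }

    /** Called by the sync loop: returns the batch sorted by seqId, then clears it. */
    synchronized List<SlotSketch> drain() {
        List<SlotSketch> ordered = new ArrayList<>(slots.values());
        slots.clear();
        return ordered;
    }
}
```

The sync loop would then write whatever {{drain()}} returns, in that order, instead of walking the array by slot index.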

Tell me to pump the brakes if I'm not being helpful :)

> [AMv2] Corrupt Procedure WAL file; procedure data stored out of order
> ---------------------------------------------------------------------
>
>                 Key: HBASE-18152
>                 URL: https://issues.apache.org/jira/browse/HBASE-18152
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 2.0.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: HBASE-18152.master.001.patch, 
> hbase-hbase-master-ctr-e138-1518143905142-221855-01-000002.hwx.site.log.gz, 
> pv2-00000000000000000036.log, pv2-00000000000000000047.log, 
> reading_bad_wal.patch
>
>
> I've seen corruption from time-to-time testing.  It's rare enough. Often we 
> can get over it but sometimes we can't. It took me a while to capture an 
> instance of corruption. Turns out we are writing to the WAL out-of-order which 
> undoes a basic tenet: that WAL content is ordered in line w/ execution.
> Below I'll post a corrupt WAL.
> Looking at the write-side, there is a lot going on. I'm not clear on how we 
> could write out of order. Will try and get more insight. Meantime parking 
> this issue here to fill data into.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
