[ 
https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464280#comment-13464280
 ] 

Devaraj Das commented on HBASE-6679:
------------------------------------

bq. I suppose we could make the reference volatile so all threads catch the 
update.

Yeah, [~stack], I have the same opinion - we close the issue with a fix that 
makes the reference volatile (and that'd justify my hours of debugging <smile>).
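
For illustration, here is a minimal sketch of what making a reference volatile buys us. The names (Holder, storeFile, awaitUpdate) are hypothetical stand-ins and not the actual HBase fields; the only point is that a write to a volatile field is guaranteed to become visible to a reader thread that polls it.

```java
public class VolatileReferenceSketch {

    /** Hypothetical holder; storeFile stands in for the shared reference. */
    public static final class Holder {
        // volatile: a write by one thread is visible to later reads by others.
        private volatile String storeFile = "parent-file";

        public void swap(String replacement) {
            storeFile = replacement;
        }

        public String get() {
            return storeFile;
        }
    }

    /** Starts a reader that polls the reference, then updates it from the caller. */
    public static String awaitUpdate(Holder holder) {
        final String[] seen = new String[1];
        Thread reader = new Thread(() -> {
            String value = holder.get();
            // Without volatile, this loop is not guaranteed to ever see the update.
            while (value.equals("parent-file")) {
                Thread.yield();
                value = holder.get();
            }
            seen[0] = value;
        });
        reader.start();
        holder.swap("compacted-file");
        try {
            reader.join(); // join() also makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(awaitUpdate(new Holder()));
    }
}
```

Note that volatile only addresses visibility of the reference itself; it does not make compound operations on the referenced object atomic.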

bq. But you can't see how the two threads can run concurrently? (not to say it 
not possible)

At least from the regionserver logs, it is evident that this didn't happen. From 
the code, compactions and splits run in executors, and the split executor has a 
thread pool of at most one thread. Once a compaction completes, the executor 
fires off a request for a split (which may or may not happen, depending on 
checks done within the request handler). The compaction executor doesn't wait 
for the split to complete, so technically it's possible for a split and a 
compaction to run in parallel. But at a finer granularity, locks are taken at 
different points in split/compact (and the important places are protected with 
HRegion.lock). There is also state such as HRegion.writeState that is 
checked/set at various places in compaction/split.
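
The wiring described above can be sketched roughly as follows. This is a simplified model and not the HBase code: the class, the pool sizes for compactions, and the plain ReentrantLock standing in for HRegion.lock are all assumptions made for illustration. It shows a compaction task handing a split request to a single-threaded executor without waiting for it, while a shared lock keeps the critical sections from overlapping.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class CompactSplitSketch {
    // Pool size of 2 for compactions is an arbitrary choice for this sketch.
    private final ExecutorService compactions = Executors.newFixedThreadPool(2);
    // Splits run in an executor with at most one thread.
    private final ExecutorService splits = Executors.newSingleThreadExecutor();
    // Hypothetical stand-in for HRegion.lock.
    private final ReentrantLock regionLock = new ReentrantLock();

    private final AtomicInteger inCritical = new AtomicInteger();
    private final AtomicBoolean overlapSeen = new AtomicBoolean(false);
    private final AtomicInteger splitsRun = new AtomicInteger();

    /** Critical section guarded by the shared lock; records any overlap. */
    private void critical() {
        regionLock.lock();
        try {
            if (inCritical.incrementAndGet() > 1) {
                overlapSeen.set(true);
            }
            inCritical.decrementAndGet();
        } finally {
            regionLock.unlock();
        }
    }

    private void compact() {
        critical();
        // Fire off the split request; the compaction does not wait for it.
        splits.execute(this::split);
    }

    private void split() {
        critical();
        splitsRun.incrementAndGet();
    }

    /** Runs n compactions; true if every split ran and no sections overlapped. */
    public static boolean run(int n) {
        CompactSplitSketch s = new CompactSplitSketch();
        for (int i = 0; i < n; i++) {
            s.compactions.execute(s::compact);
        }
        try {
            s.compactions.shutdown();
            s.compactions.awaitTermination(10, TimeUnit.SECONDS);
            // All split requests have been submitted by now.
            s.splits.shutdown();
            s.splits.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !s.overlapSeen.get() && s.splitsRun.get() == n;
    }

    public static void main(String[] args) {
        System.out.println(run(20));
    }
}
```

In this model a split task and a later compaction task can be scheduled at the same time, but the locked sections themselves are serialized.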

So IMHO things are wired together okay (but yeah, usual disclaimer - may have 
missed something :-) )

bq. Good on you Deva.

You too :-)
                
> RegionServer aborts due to race between compaction and split
> ------------------------------------------------------------
>
>                 Key: HBASE-6679
>                 URL: https://issues.apache.org/jira/browse/HBASE-6679
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.92.3
>
>         Attachments: rs-crash-parallel-compact-split.log
>
>
> In our nightlies, we have seen RS aborts due to compaction and split racing. 
> The original parent file gets deleted after the compaction, and hence the 
> daughters don't find the parent data file. The RS kills itself when this 
> happens. Will attach a snippet of the relevant RS logs.

