[ https://issues.apache.org/jira/browse/HBASE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-9645:
-------------------------

    Attachment: 9645v2.txt

Patch for trunk.

> Regionserver halt because of HLog's "Logic Error Snapshot seq id from earlier
> flush still present!"
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9645
>                 URL: https://issues.apache.org/jira/browse/HBASE-9645
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.94.10
>         Environment: Linux 2.6.32-el5.x86_64
>            Reporter: Victor Xu
>            Priority: Critical
>             Fix For: 0.98.0, 0.94.13, 0.96.1
>
>         Attachments: 9645v2.txt, HBASE_9645-0.94.10.patch
>
>
> I upgraded my HBase cluster to 0.94.10 three weeks ago, and this problem first
> occurred several days after that. I changed the bug's priority to 'Critical'
> because every time it happens, a regionserver halts. All of the affected
> regionservers have the same log:
> {noformat}
> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Logic Error Snapshot seq
> id from earlier flush still present! for region
> c0d88db4ce3606842fbec9d34c38f707 overwritten oldseq=80114270537with new
> seq=80115066829
> {noformat}
> I checked the code and found that the error is raised in the
> HLog.startCacheFlush method. Access to 'lastSeqWritten' is locked there, so
> maybe something outside HLog changes it by mistake.

--
This message was sent by Atlassian JIRA
(v6.1#6144)