[ https://issues.apache.org/jira/browse/HBASE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073391#comment-15073391 ]
Duo Zhang commented on HBASE-14949:
-----------------------------------

Talked with [~chenheng] offline. The actual difficulty here is how to deal with rolling upgrades. An RS running the old split logic may delete the wrong file when there is a name conflict.

So I think we should do it in two stages.

1. Change the logic for handling name conflicts when splitting. Add a flag (default false) that controls whether we use the new WAL logging logic, which may result in duplicate WALs.
2. Remove the flag and make the new WAL logging logic the default (and the only logic).

At stage 1, we should document that you must not set the flag to true when upgrading from a version without the flag. Then the rolling upgrade is safe, since it is safe to use the old logic when splitting.
Stage 2 should be done in a major version upgrade (maybe 2.0?), and we should document that you can only upgrade from a version which has applied the stage 1 patch.

[~stack] [~chenheng] Any thoughts? Thanks.

> Skip duplicate entries when replay WAL.
> ---------------------------------------
>
>                 Key: HBASE-14949
>                 URL: https://issues.apache.org/jira/browse/HBASE-14949
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Heng Chen
>         Attachments: HBASE-14949.patch, HBASE-14949_v1.patch, HBASE-14949_v2.patch
>
>
> Under the HBASE-14004 design, there will be duplicate entries in different WALs. This
> happens when an hflush fails: we close the old WAL at the 'acked hflushed'
> length, then open a new WAL and write the unacked hflushed entries into it.
> So there may be some overlap between the old WAL and the new WAL.
> We should skip the duplicate entries on replay. I think this does no harm to
> the current logic, so maybe we do it first.
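
For illustration only, here is a minimal Python sketch of the "skip duplicate entries on replay" idea from the issue description. It is not HBase code: `WalEntry` and `replay_skipping_duplicates` are hypothetical names, and the sketch assumes that every entry carries a region name and a sequence id, that the copy of an entry rewritten into the new WAL keeps the same sequence id, and that sequence ids increase per region in replay order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class WalEntry:
    """Hypothetical stand-in for a WAL entry (not the HBase class)."""
    region: str
    seq_id: int   # assumed to be identical for both copies of a duplicated entry
    payload: str


def replay_skipping_duplicates(
    wals: Iterable[Iterable[WalEntry]],
) -> Iterator[WalEntry]:
    """Yield entries from the WALs in order, skipping entries already replayed.

    An entry counts as a duplicate when its sequence id is not greater than
    the highest sequence id already replayed for the same region.
    """
    max_replayed: Dict[str, int] = {}
    for wal in wals:
        for entry in wal:
            if entry.seq_id <= max_replayed.get(entry.region, -1):
                continue  # already replayed from an earlier WAL
            max_replayed[entry.region] = entry.seq_id
            yield entry


# The old WAL still holds entry 3, which was also rewritten into the new WAL.
old_wal = [WalEntry("r1", 1, "a"), WalEntry("r1", 2, "b"), WalEntry("r1", 3, "c")]
new_wal = [WalEntry("r1", 3, "c"), WalEntry("r1", 4, "d")]
replayed = list(replay_skipping_duplicates([old_wal, new_wal]))
```

With the two lists above, entry 3 is replayed once, from the old WAL, and the copy in the new WAL is skipped. Tracking only the highest sequence id per region keeps the state small, but it depends on the ordering assumption stated above.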