[ https://issues.apache.org/jira/browse/HBASE-15425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188511#comment-15188511 ]
Enis Soztutar commented on HBASE-15425:
---------------------------------------

Flush and bulk load markers were added for region replicas so that they can replay these events. The regular log split / replay normally ignores these markers. For region replicas, missing a flush file or bulk load file is not a critical condition (the files are eventually picked up through compactions), so we were following the safe route there.

Now, returning failure will cause the bulk load RPC to be retried. The regionserver will already have bulk loaded those files, so they will be bulk loaded again. One cluster will see 2 sets of bulk load files, while the other cluster, which receives replication, will see only one set. There is no atomic transaction to make sure that the bulk load and the WAL event happen atomically, so it is best effort in that case. Semantically it should still be correct, though.

Patch looks fine to me.

> Failing to write bulk load event marker in the WAL is ignored
> -------------------------------------------------------------
>
>                 Key: HBASE-15425
>                 URL: https://issues.apache.org/jira/browse/HBASE-15425
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 1.3.0
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>         Attachments: HBASE-15425.patch, HBASE-15425.v1.patch
>
>
> During the LoadIncrementalHFiles process, if we fail to write the bulk load event
> marker in the WAL, the failure is ignored. This leads to a data mismatch between the
> source and peer clusters when bulk loaded data is replicated.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
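
The retry behaviour described in the comment can be sketched with a toy model. This is not HBase code: the class and method names below (BulkLoadRetrySketch, Region, bulkLoadWithRetry, MarkerWriteException) are hypothetical, and the model only assumes what the comment states, namely that the file load and the WAL marker write are two separate, non-atomic steps, that a failed marker write makes the RPC fail, and that the client retries the whole RPC.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the retry behaviour; hypothetical names, not the HBase API. */
public class BulkLoadRetrySketch {

    /** Raised when the simulated WAL marker write fails. */
    static class MarkerWriteException extends RuntimeException {
        MarkerWriteException(String msg) { super(msg); }
    }

    /** Simulated region on the source cluster. */
    static class Region {
        final List<String> storeFiles = new ArrayList<>();  // files visible on the source
        final List<String> walMarkers = new ArrayList<>();  // bulk load markers in the WAL
        int markerFailuresLeft;                              // how many marker writes will fail

        Region(int markerFailures) { this.markerFailuresLeft = markerFailures; }

        void bulkLoad(String hfile) {
            // Step 1: the file is loaded into the store. Nothing rolls this back.
            storeFiles.add(hfile + "#" + storeFiles.size());
            // Step 2: the marker is written as a separate, non-atomic step.
            if (markerFailuresLeft > 0) {
                markerFailuresLeft--;
                throw new MarkerWriteException("could not append bulk load marker");
            }
            walMarkers.add(hfile);
        }
    }

    /** Client side: retry the whole RPC when the server reports failure. */
    static void bulkLoadWithRetry(Region region, String hfile, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        MarkerWriteException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                region.bulkLoad(hfile);
                return;
            } catch (MarkerWriteException e) {
                last = e;  // the file from this attempt stays loaded
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        Region source = new Region(1);  // first marker write fails, second succeeds
        bulkLoadWithRetry(source, "hfile-1", 3);

        // In this model the peer cluster loads one file per marker it receives.
        List<String> peerFiles = new ArrayList<>(source.walMarkers);

        System.out.println("source copies: " + source.storeFiles.size()); // 2
        System.out.println("peer copies: " + peerFiles.size());           // 1
    }
}
```

In this model the source ends up with two copies of the same data and the peer with one, which matches the "2 sets" versus "one set" outcome in the comment. Reads return the same cells either way, which is why the result is described as semantically correct.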