[ https://issues.apache.org/jira/browse/HBASE-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617904#comment-13617904 ]
Lars Hofhansl commented on HBASE-8208:
--------------------------------------

Will we get a "sync storm" when we have a lot of column families?

> Data could not be replicated to slaves when deferredLogSync is enabled
> ----------------------------------------------------------------------
>
>                 Key: HBASE-8208
>                 URL: https://issues.apache.org/jira/browse/HBASE-8208
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.95.0, 0.98.0, 0.94.6
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.95.0, 0.98.0, 0.94.7
>
>         Attachments: hbase-8208.patch, hbase-8208-v1.patch,
> hbase-8208_v2.patch
>
>
> This is a subtle issue. When deferredLogSync is enabled, there is a chance
> that we flush data before syncing all HLog entries. Suppose we have just
> flushed the internal cache and the server dies with some unsynced HLog entries.
> The data is not lost at the source cluster, but because replication is based
> on WAL files, some changes that we flushed at the source won't be replicated
> to the slave clusters.
> Although enabling deferredLogSync implies a tolerance for data loss, this
> breaks the replication assumption that whatever is persisted in the source
> should be replicated to its slave clusters.
> In short, the slave cluster could end up with a double loss: the data lost at
> the source, plus data that is stored in the source cluster but never
> replicated to the slaves.
> The fix for the issue isn't hard. Basically, we can invoke sync during each
> flush when replication is enabled for a region server. Since sync returns
> immediately when there is nothing to sync, there should be no performance
> impact.
> Please let me know what you think!
> Thanks,
> -Jeffrey
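
For readers following the thread, the fix proposed in the description (sync the WAL during each flush when replication is enabled) can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Wal`, `Region`, and all of their methods are hypothetical names invented for illustration. They are not HBase classes and do not reflect what the attached patches actually change.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: Wal and Region are hypothetical, not HBase APIs.
final class FlushSyncSketch {

    /** Appended edits stay "unsynced" until sync() makes them durable. */
    static final class Wal {
        private final List<String> unsynced = new ArrayList<>();
        private final List<String> durable = new ArrayList<>();

        void append(String edit) {
            unsynced.add(edit);
        }

        /** Returns immediately when there is nothing to sync. */
        void sync() {
            if (unsynced.isEmpty()) {
                return;
            }
            durable.addAll(unsynced);
            unsynced.clear();
        }

        /** Edits that a WAL-based replication reader could see. */
        List<String> durableEdits() {
            return durable;
        }
    }

    /** Region with deferred log sync: put() appends to the WAL but never syncs. */
    static final class Region {
        private final Wal wal;
        private final boolean replicationEnabled;
        private final List<String> memstore = new ArrayList<>();
        private final List<String> storeFiles = new ArrayList<>();

        Region(Wal wal, boolean replicationEnabled) {
            this.wal = wal;
            this.replicationEnabled = replicationEnabled;
        }

        void put(String edit) {
            wal.append(edit);   // deferred: no sync on the write path
            memstore.add(edit);
        }

        void flush() {
            // Proposed fix: sync before persisting the memstore, so that every
            // flushed edit is also present in the durable part of the WAL.
            if (replicationEnabled) {
                wal.sync();
            }
            storeFiles.addAll(memstore);
            memstore.clear();
        }

        List<String> storeFiles() {
            return storeFiles;
        }
    }

    public static void main(String[] args) {
        Wal wal = new Wal();
        Region region = new Region(wal, true);
        region.put("row1");
        region.put("row2");
        region.flush();
        // With the sync-on-flush step, everything flushed is also durable in the WAL.
        System.out.println(wal.durableEdits().containsAll(region.storeFiles()));
    }
}
```

In this model, constructing the `Region` with `replicationEnabled` set to `false` reproduces the reported problem: after `flush()` the edits are in `storeFiles` but not in `durableEdits()`, so a crash at that point would leave nothing for replication to ship. The model has a single memstore, so it says nothing about the "sync storm" question for many column families.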