[ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114763#comment-13114763 ]
Amitanand Aiyer commented on HBASE-4485:
----------------------------------------

One way to fix this issue would be to swap the order in which we clear the snapshot in Store.updateStoreFiles. We might want to call notifyReaders() before doing the clearSnapshot().

While this seems like it might fix the issue, here are some concerns that it may raise:

(a) Say Scanner 1 has been updated, but Scanner 2 is yet to be updated (it is waiting for the lock). Scanner 1 may now see duplicate data, because some KVs exist both in the new file and in the snapshot. I am not sure whether our current KVHeap mechanism is able to handle this, especially in terms of the number of updates we have.

(b) There are performance concerns about holding on to the snapshot until all the scanners/readers finish scanning the columnFamily, so that StoreScanner.updateReaders() can get the lock.

> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>
> After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.
> This time, however, the problem is not about including "newer" updates, but about missing older updates that should be included.
> Here is what seems to be happening:
> 0 - Scanner starts scanning.
> 0 - MemStore.snapshot is called.
>     Scanner has access to kvHeap and snapshot.
> 1 - Flush takes place.
>   1.1 KVs in the snapshot are written to the disk.
>   1.2 HFile is ready.
> 2 - Store.updateStoreFiles() deletes the old snapshot.
>   2.1 updateReaders will not be called until the end of the columnFamily seek.
> 3 - For a brief window of time, the scanner does not have access to certain KeyValues.
>   a) The scanner no longer has access to the snapshot, because it has been flushed to the disk.
>   b) It does not yet have access to the HFile, because updateReaders has not been called yet.
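
To make the two orderings concrete, here is a minimal single-threaded sketch. It is a toy model, not HBase code: ToyStore, ToyScanner, visibleKeys() and observe() are hypothetical names invented for illustration, and the real Store, MemStore and StoreScanner classes involve locking and heap merging that are not modelled here. It only shows what a scanner could see between the two steps, depending on which one runs first.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the flush ordering discussed in this issue. Not HBase code. */
public class FlushOrderingSketch {

    /** Hypothetical stand-in for a store: one memstore snapshot plus flushed files. */
    static class ToyStore {
        List<String> snapshot = new ArrayList<>();
        final List<List<String>> storeFiles = new ArrayList<>();
    }

    /** Hypothetical stand-in for a scanner that caches the list of files it reads. */
    static class ToyScanner {
        private final ToyStore store;
        private List<List<String>> knownFiles;

        ToyScanner(ToyStore store) {
            this.store = store;
            this.knownFiles = new ArrayList<>(store.storeFiles);
        }

        /** Refresh the cached file list, loosely mirroring updateReaders(). */
        void updateReaders() {
            knownFiles = new ArrayList<>(store.storeFiles);
        }

        /** Everything reachable right now: the live snapshot plus the known files. */
        List<String> visibleKeys() {
            List<String> out = new ArrayList<>(store.snapshot);
            for (List<String> file : knownFiles) {
                out.addAll(file);
            }
            return out;
        }
    }

    /** Runs one flush and returns what the scanner sees between the two steps. */
    static List<String> observe(boolean notifyBeforeClear) {
        ToyStore store = new ToyStore();
        store.snapshot.add("row1");
        store.snapshot.add("row2");
        ToyScanner scanner = new ToyScanner(store);

        // Steps 1.1 and 1.2: the snapshot is written out and the new file is ready.
        store.storeFiles.add(new ArrayList<>(store.snapshot));

        List<String> seenInWindow;
        if (notifyBeforeClear) {
            // Proposed order: readers are told about the file, then the snapshot goes away.
            scanner.updateReaders();
            seenInWindow = scanner.visibleKeys();
            store.snapshot = new ArrayList<>();
        } else {
            // Order described in the issue: snapshot cleared before readers are updated.
            store.snapshot = new ArrayList<>();
            seenInWindow = scanner.visibleKeys();
            scanner.updateReaders();
        }
        return seenInWindow;
    }

    public static void main(String[] args) {
        System.out.println("clear first:  " + observe(false));
        System.out.println("notify first: " + observe(true));
    }
}
```

In this toy model, clearing first leaves the scanner with nothing to read during the window (step 3 of the description), while notifying first leaves it with every key twice, which is concern (a). Whether the real KVHeap would collapse those duplicates is exactly the open question; the sketch does not answer it.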