[ https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981140#action_12981140 ]
ryan rawson commented on HBASE-2856:
------------------------------------

Unfortunately, the paranoia re: performance is borne out by direct experience. It will be an issue, and it will be a blocker, and we should deal with it right now, since fixing it might require architectural-level changes in how we manage these things internally, up to and including using a different ID stream for atomic consistency.

Backing up a bit, the basic issue is that a handler thread cannot complete and return to the client until the row transaction it was working on is visible to other clients. To do otherwise risks data loss for ICV and inconsistent read-your-own-write scenarios for clients. But while waiting we are tying up a handler thread, and we have to wait on the long-pole HLog append (which can take seconds at worst!). You end up with an RS-level stall, which is pretty ugly.

I don't want 2 sets of sequence numbers, but I am concerned that we might need them. Perhaps we can find a more elegant mechanism for cheaply keeping track of which seqids are 'committed' and visible and which are not. Right now we use a simple 'read point', which acts like a line in the sand. Previous proposals called for a bitmask of the last N numbers. The problem with this is that deferred flushing combined with non-deferred flushing would cause major problems, as the last N we need to keep track of keeps on expanding. Perhaps a reverse bitmask, where we keep track of the PREVIOUS N tx that are NOT committed, might make more sense. Implementing it efficiently is another question.
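To make the difference between a single 'read point' and tracking the NOT-committed transactions concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class and method names (UncommittedTracker, begin, commit, isVisible, readPoint) are hypothetical and are not HBase's actual API. It uses a sorted set where the comment suggests a bitmask, and it makes no attempt at the efficiency the comment worries about.

```java
import java.util.TreeSet;

// Hypothetical sketch, not HBase code: tracks the seqids that are NOT yet
// committed, instead of relying only on a single read point.
final class UncommittedTracker {
    private long highestAssigned = 0;                       // last seqid handed out
    private final TreeSet<Long> uncommitted = new TreeSet<>();

    // Start a row transaction: assign the next seqid and mark it in flight.
    synchronized long begin() {
        long id = ++highestAssigned;
        uncommitted.add(id);
        return id;
    }

    // Mark a transaction committed (e.g. once its log append has completed).
    synchronized void commit(long id) {
        uncommitted.remove(id);
    }

    // A seqid is visible if it has been assigned and is no longer in flight.
    synchronized boolean isVisible(long id) {
        return id >= 1 && id <= highestAssigned && !uncommitted.contains(id);
    }

    // The classic "line in the sand": everything at or below this value is
    // committed. It cannot advance past the oldest in-flight transaction.
    synchronized long readPoint() {
        return uncommitted.isEmpty() ? highestAssigned : uncommitted.first() - 1;
    }
}
```

With three transactions started and only the first still waiting on a slow append, readPoint() stays at 0, while isVisible() already reports the second and third as committed. That is the stall the read point causes and the flexibility the reverse tracking would buy, at the cost of a structure that grows with the number of in-flight transactions.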
> TestAcidGuarantee broken on trunk
> ----------------------------------
>
>                 Key: HBASE-2856
>                 URL: https://issues.apache.org/jira/browse/HBASE-2856
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89.20100621
>            Reporter: ryan rawson
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: 2856-v2.txt, 2856-v3.txt, acid.txt
>
>
> TestAcidGuarantee has a test whereby it attempts to read a number of columns
> from a row, and every so often the first column of N is different, when it
> should be the same. This is a bug deep inside the scanner whereby the first
> peek() of a row is done at time T then the rest of the read is done at T+1
> after a flush, thus the memstoreTS data is lost, and previously 'uncommitted'
> data becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or similarly equivalent
> value) to the HFile thus allowing us to preserve read consistency past
> flushes. Another solution involves fixing the scanners so that peek() is not
> destructive (and thus might return different things at different times alas).