[ 
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981140#action_12981140
 ] 

ryan rawson commented on HBASE-2856:
------------------------------------

Unfortunately the paranoia re: performance is borne out by direct
experience.  It will be an issue, and it will be a blocker, and we
should deal with it right now, since fixing it might require
architectural-level changes in how we manage these things internally,
up to and including using a different ID stream for atomic
consistency.

Backing up a bit, the basic issue is that a handler thread cannot
complete and return to the client until the row transaction it was
working on is visible to other clients. To do otherwise risks data
loss for ICV and inconsistent read-your-own-write scenarios for
clients.  But while waiting we are tying up a handler thread, and we
have to wait on the long pole, the slowest HLog append (which can take
seconds at worst!).  You end up with an RS-level stall, which is
pretty ugly.

I don't want 2 sets of sequence numbers, but I am concerned that we
might need it.  Perhaps we can find a more elegant mechanism of
cheaply keeping track of which seqids are 'committed' and visible and
which are not.  Right now we use a simple 'read point' which acts like
a line in the sand.  Previous proposals called for a bitmask of the
last N numbers.  The trouble is that mixing deferred flushing with
non-deferred flushing would break this, because the N we need to keep
track of keeps on expanding.
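
To make the 'read point' idea concrete, here is a minimal Java sketch.
It is not HBase's actual code: the class and method names
(ReadPointSketch, begin, complete, isVisible) are hypothetical, and it
assumes seqids are handed out densely starting at 1.  The point it
illustrates is that the read point can only advance over a contiguous
run of completed seqids, so one slow transaction holds back the
visibility of every later one.

```java
// Hypothetical illustration only; names are not taken from HBase.
class ReadPointSketch {
    private long nextSeqId = 1;   // next seqid to hand out (assumed dense)
    private long readPoint = 0;   // every seqid <= readPoint is visible
    // Seqids that have completed but still sit above the read point.
    private final java.util.TreeSet<Long> completed = new java.util.TreeSet<>();

    /** Start a row transaction and return its seqid. */
    synchronized long begin() {
        return nextSeqId++;
    }

    /** Mark a transaction complete and advance the read point if possible. */
    synchronized void complete(long seqId) {
        completed.add(seqId);
        // The line in the sand moves only across a gap-free run.
        while (completed.remove(readPoint + 1)) {
            readPoint++;
        }
    }

    synchronized boolean isVisible(long seqId) {
        return seqId <= readPoint;
    }

    synchronized long readPoint() {
        return readPoint;
    }
}
```

With seqids 1, 2 and 3 in flight, completing 2 and 3 leaves the read
point at 0; only when 1 completes does it jump to 3.  A handler that
must wait for its own seqid to become visible is therefore stuck
behind the slowest earlier transaction.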

Perhaps a reverse bitmask where we keep track of the PREVIOUS N tx
that are NOT committed might make more sense.  Implementing it
efficiently is another question.
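
One rough way to picture the reverse approach, as a sketch under
stated assumptions rather than a proposed implementation: remember
the highest seqid handed out plus the set of seqids that are still in
flight, and treat anything assigned and not in that set as visible.
The names below (UncommittedSetSketch, begin, commit, isVisible) are
hypothetical, and a sorted set stands in for the bitmask, so this says
nothing about how to do it efficiently.

```java
// Hypothetical illustration only; names are not taken from HBase.
class UncommittedSetSketch {
    private long highestAssigned = 0;
    // Seqids that have been handed out but are NOT yet committed.
    private final java.util.TreeSet<Long> inFlight = new java.util.TreeSet<>();

    /** Start a row transaction and return its seqid. */
    synchronized long begin() {
        long seqId = ++highestAssigned;
        inFlight.add(seqId);
        return seqId;
    }

    /** Commit one transaction; it does not wait for earlier ones. */
    synchronized void commit(long seqId) {
        inFlight.remove(seqId);
    }

    /** Visible means: already assigned and no longer in flight. */
    synchronized boolean isVisible(long seqId) {
        return seqId <= highestAssigned && !inFlight.contains(seqId);
    }

    /** What has to be tracked: only the uncommitted transactions. */
    synchronized int inFlightCount() {
        return inFlight.size();
    }
}
```

With seqids 1, 2 and 3 in flight, committing 3 makes it visible
immediately even though 1 and 2 are still pending, and the state to
track is bounded by the number of uncommitted transactions rather
than by how far the seqids have run ahead.  A real reader would
presumably also need a consistent snapshot of this state for the
duration of a read, which this sketch does not attempt.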


> TestAcidGuarantee broken on trunk 
> ----------------------------------
>
>                 Key: HBASE-2856
>                 URL: https://issues.apache.org/jira/browse/HBASE-2856
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89.20100621
>            Reporter: ryan rawson
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: 2856-v2.txt, 2856-v3.txt, acid.txt
>
>
> TestAcidGuarantee has a test whereby it attempts to read a number of columns 
> from a row, and every so often the first column of N is different, when it 
> should be the same.  This is a bug deep inside the scanner whereby the first 
> peek() of a row is done at time T then the rest of the read is done at T+1 
> after a flush, thus the memstoreTS data is lost, and previously 'uncommitted' 
> data becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or similarly equivalent 
> value) to the HFile thus allowing us to preserve read consistency past 
> flushes.  Another solution involves fixing the scanners so that peek() is not 
> destructive (and thus might return different things at different times alas).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
