[ https://issues.apache.org/jira/browse/HBASE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845575#action_12845575 ]

ryan rawson commented on HBASE-2294:
------------------------------------

So previously, without durability, some of the things in here were just not 
applicable.  When sync() was a no-op, these questions really didn't have 
meaningful answers.

I postponed commenting until I had fixed HBASE-2248, but now that I have, here 
are my suggestions on how we should do things:

- Row mutate operations should be atomic. Concurrent gets/scans do not see the 
results of a row mutation until it is "finished"; in the code, this means 
"when rwcc.completeMemstoreInsert() is called".  This has to happen _after_ all 
KVs have been put in memstore.  We have to call HLog.sync() _before_ we start 
modifying the memstore, so that if there is any HLog issue we don't mutate 
memstore.  Thus rows become visible _very shortly_ after an HLog.sync occurs: 
the only remaining delay is the time it takes to modify the in-memory 
structures and call rwcc.completeMemstoreInsert().
- Row mutates across multiple families should be atomic. This was not too hard 
to implement in HBASE-2248 and represents a good level of service I think.
- Reads cannot see rows that have not been sync()ed to HLog.
- Scanners have weak isolation - they continuously see an updated view 
of the table as they run across rows.  That means a scanner can see rows 
inserted _after_ its creation.  Providing stronger isolation doesn't make 
sense since there are no cross-row atomicity guarantees. 
- Once a client gets a success after a mutation operation, all other clients, 
including itself, will be able to see the new data. 
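The ordering in the first bullet can be sketched roughly as follows. This is an illustrative model only, not HBase's actual internals: the class, the list-backed "HLog" and "memstore", and the AtomicLong standing in for the rwcc read point are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class WriteOrdering {
    static final List<String> hlog = new ArrayList<>();     // stands in for the HLog
    static final List<String> memstore = new ArrayList<>(); // stands in for the memstore
    static final AtomicLong readPoint = new AtomicLong(0);  // stands in for the rwcc read point

    // A row mutation, possibly spanning several column families.
    static void put(List<String> kvs) {
        // 1. Append to the log and sync _before_ touching the memstore;
        //    if sync() throws, the memstore is left unmodified.
        hlog.addAll(kvs);
        sync();
        // 2. Insert all KVs into the memstore; readers at the old read
        //    point still cannot see them.
        long writeNumber = readPoint.get() + 1;
        memstore.addAll(kvs);
        // 3. Only now publish the whole mutation at once, i.e. the
        //    rwcc.completeMemstoreInsert() step.
        readPoint.set(writeNumber);
    }

    static void sync() { /* hflush to HDFS in real HBase */ }

    public static void main(String[] args) {
        // One row mutation across two families: visible all-or-nothing.
        put(List.of("row1/cf1:a=1", "row1/cf2:b=2"));
        System.out.println("readPoint=" + readPoint.get()
                           + " cells=" + memstore.size());
    }
}
```

Readers compare KV write numbers against the read point, so they observe either the whole row mutation or none of it, never a partial row.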

In my work on HDFS 0.21, it was pretty obvious that hflush was fairly slow.  
For high-volume updates with lower-value data (eg: calling ICV on a row many 
thousands of times a second) it seemed to make sense to use a time-based 
flush.  That is, the durability promise is relaxed slightly to say that a row 
is only durable after X milliseconds (configurable) at the most.  This is a 
per-table setting (see: HBASE-1944).
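The time-based flush could look something like the following sketch: edits are acknowledged without waiting for hflush, and a background task syncs the log every X milliseconds, bounding the window of possible data loss. The class and its members are hypothetical illustrations, not HBase's real deferred-flush code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TimedWalSync {
    private final AtomicInteger unsynced = new AtomicInteger();
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();

    // Durability promise: an edit is on disk within flushIntervalMs at most.
    TimedWalSync(long flushIntervalMs) {
        timer.scheduleAtFixedRate(this::sync, flushIntervalMs,
                                  flushIntervalMs, TimeUnit.MILLISECONDS);
    }

    // Returns to the caller immediately; the edit is buffered, not durable.
    void append(String edit) {
        unsynced.incrementAndGet();
    }

    // In real HBase this would hflush() the log; here we just drain the count.
    void sync() {
        unsynced.set(0);
    }

    void shutdown() { timer.shutdownNow(); }

    public static void main(String[] args) throws InterruptedException {
        TimedWalSync wal = new TimedWalSync(50);
        // High-volume, low-value updates, e.g. many ICVs on one row.
        for (int i = 0; i < 1000; i++) wal.append("icv row" + i);
        Thread.sleep(120);  // at least one timed sync has run by now
        System.out.println("unsynced=" + wal.unsynced.get());
        wal.shutdown();
    }
}
```

The trade-off is exactly the one described above: clients stop paying the hflush latency on every call, in exchange for up to X milliseconds of acknowledged edits being lost on a crash.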

Right now we have the in-memory atomic reads.  The durability story is being 
improved in HBASE-2283 with restructuring hlog appends/syncs and memstore 
mutations.  The performance and locking of in-memory atomic reads is being 
improved in HBASE-2248.

> Enumerate ACID properties of HBase in a well defined spec
> ---------------------------------------------------------
>
>                 Key: HBASE-2294
>                 URL: https://issues.apache.org/jira/browse/HBASE-2294
>             Project: Hadoop HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.20.4, 0.21.0
>
>
> It's not written down anywhere what the guarantees are for each operation in 
> HBase with regard to the various ACID properties. I think the developers know 
> the answers to these questions, but we need a clear spec for people building 
> systems on top of HBase. Here are a few sample questions we should endeavor 
> to answer:
> - For a multicell put within a CF, is the update made durable atomically?
> - For a put across CFs, is the update made durable atomically?
> - Can a read see a row that hasn't been sync()ed to the HLog?
> - What isolation do scanners have? Somewhere between snapshot isolation and 
> no isolation?
> - After a client receives a "success" for a write operation, is that 
> operation guaranteed to be visible to all other clients?
> etc
> I see this JIRA as having several points of discussion:
> - Evaluation of what the current state of affairs is
> - Evaluate whether we currently provide any guarantees that aren't useful to 
> users of the system (perhaps we can drop in exchange for performance)
> - Evaluate whether we are missing any guarantees that would be useful to 
> users of the system

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
