[ 
https://issues.apache.org/jira/browse/HBASE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710162#action_12710162
 ] 

Jonathan Gray commented on HBASE-1304:
--------------------------------------

@ryan, good stuff...

first, remember there are three different types of deletes: Delete (a single 
version at the exact specified stamp), DeleteColumn (all versions of a column 
with ts <= the specified stamp), and DeleteFamily (all versions of all columns 
in the family with ts <= the specified stamp).
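
To make the three semantics concrete, here's a minimal sketch (illustrative names, not the actual HBase classes) of when each marker type masks a put, assuming the column/family already match:

```java
// Illustrative sketch of the three delete-type semantics; names are
// hypothetical, not the real HBase KeyValue/delete classes.
public class DeleteSemantics {
    public enum DeleteType { DELETE, DELETE_COLUMN, DELETE_FAMILY }

    /**
     * Returns true if a delete marker of the given type and timestamp
     * masks a put at putTs (column/family assumed to already match).
     */
    public static boolean masks(DeleteType type, long deleteTs, long putTs) {
        switch (type) {
            case DELETE:        return putTs == deleteTs; // exact version only
            case DELETE_COLUMN: // fall through: both cover all versions <= deleteTs
            case DELETE_FAMILY: return putTs <= deleteTs;
            default:            return false;
        }
    }
}
```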

So, if the case above is talking about a DeleteColumn of col1 @ timestamp 9, 
then the put @ ts=8 should be deleted, as well as any other col1s in later 
storefiles with ts <= 9.

When we actually do the compaction, we will process it with something like the 
ScanDeleteTracker, which applies the deletes as you go (merging multiple 
storefiles).  So the DT would actually prevent the put @ ts=8 from being 
output to the compacted file.  However, since in this case it is a DeleteColumn, 
we have to retain the delete marker in the outputted, compacted file, since it 
may still apply to puts in storefiles not included in this compaction.
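
Roughly, the pass over one column during compaction might look like this sketch (illustrative only, not the real ScanDeleteTracker API): puts masked by a pending DeleteColumn are dropped, while the marker itself is still written out:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of compacting one column's cells, newest
// timestamp first, while tracking a pending DeleteColumn marker.
// Masked puts are dropped; the DeleteColumn marker is retained in the
// output, since it may still mask puts in storefiles outside this
// compaction.
public class CompactionSketch {
    /** Each cell is {timestamp, isDeleteColumnMarker ? 1 : 0}, sorted ts-descending. */
    public static List<String> compactColumn(List<long[]> cells) {
        long pendingDeleteTs = Long.MIN_VALUE;
        List<String> out = new ArrayList<>();
        for (long[] c : cells) {
            long ts = c[0];
            if (c[1] == 1) {                         // DeleteColumn marker
                pendingDeleteTs = Math.max(pendingDeleteTs, ts);
                out.add("delete@" + ts);             // retained in output
            } else if (ts > pendingDeleteTs) {
                out.add("put@" + ts);                // newer than the delete: survives
            }                                        // else: masked, dropped
        }
        return out;
    }
}
```

With the example above (DeleteColumn @ ts=9, put @ ts=8), the put is dropped but the marker survives.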

Like I mentioned before, the difference between ScanDT and CompactDT is that 
the compaction one will need to make a decision about which deletes to output 
and which to clear.  The two cases where you will clear deletes from the 
compacted file are: if it is an explicit Delete of a single version and that 
version exists in the merged files, or if you have a delete that overrides 
another delete (like a DeleteColumn of col1 @ ts=9 and @ ts=20; the ts=9 marker 
would not be included in the compacted file).
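
Those two clearing rules could be sketched like so (hypothetical helper names, not actual CompactDT code):

```java
import java.util.Set;

// Hypothetical sketch of the two cases where a compaction can clear a
// delete from its output, as described above.
public class DeleteRetention {
    // Rule 1: an explicit single-version Delete whose target put was seen
    // in the merged files has done its job (the put is dropped too), so
    // the marker need not be re-written.
    public static boolean keepSingleVersionDelete(long deleteTs,
                                                  Set<Long> mergedPutTimestamps) {
        return !mergedPutTimestamps.contains(deleteTs);
    }

    // Rule 2: a DeleteColumn marker overridden by a newer DeleteColumn on
    // the same column is redundant; only the newest marker is written out.
    public static boolean keepDeleteColumn(long deleteTs, long newestDeleteColumnTs) {
        return deleteTs >= newestDeleteColumnTs;
    }
}
```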

Alternatively, we can make Gets like Scans, and then we don't need the property 
of deletes only applying to older entries.  What we lose is the ability to 
early-out of a get w/o having to open and read all the storefiles.  You'll 
never be able to just touch memcache; you'll always need to open every 
storefile.

> New client server implementation of how gets and puts are handled. 
> -------------------------------------------------------------------
>
>                 Key: HBASE-1304
>                 URL: https://issues.apache.org/jira/browse/HBASE-1304
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Erik Holstad
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: hbase-1304-v1.patch, HBASE-1304-v2.patch, 
> HBASE-1304-v3.patch, HBASE-1304-v4.patch, HBASE-1304-v5.patch, 
> HBASE-1304-v6.patch, HBASE-1304-v7.patch
>
>
> Creating an issue where the implementation of the new client and server will 
> go. Leaving HBASE-1249 as a discussion forum and will put code and patches 
> here.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
