[
https://issues.apache.org/jira/browse/HBASE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710167#action_12710167
]
ryan rawson commented on HBASE-1304:
------------------------------------
A few thoughts:
- It never makes sense to 'delete into the future' - imagine if you inserted
data that was masked by a delete placed there X days/months/years before. It
would be confusing.
- Given the previous point, deletes are scoped to 'now' or before. Add to that
the following two facts:
-- Deletes remove puts from the memcache immediately
-- No compactions
- Given these 3 facts, a delete would apply only to previous files. This
fulfills Jon's previous needs for get early out.
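To make the 'now or before' scoping concrete, here is a minimal sketch in
Java. It is a toy model, not HBase code: the class and method names are
hypothetical, and rejecting future-stamped deletes outright is only one
possible way to enforce the rule, assumed here for illustration.

```java
final class DeleteScopeSketch {
    private DeleteScopeSketch() {}

    /**
     * Toy masking rule: a delete marker stamped deleteTs hides a put only
     * if the put is stamped at or before the delete.
     */
    static boolean isMasked(long putTs, long deleteTs) {
        return putTs <= deleteTs;
    }

    /**
     * Hypothetical guard: refuse a delete stamped later than 'now', so a
     * delete can never mask data inserted afterwards.
     */
    static long checkDeleteTimestamp(long deleteTs, long now) {
        if (deleteTs > now) {
            throw new IllegalArgumentException(
                "delete timestamp " + deleteTs + " is in the future (now=" + now + ")");
        }
        return deleteTs;
    }
}
```

With this rule a put stamped after the delete is never masked, which is the
confusing case the first point above is trying to avoid.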
Now, under the simple minor compaction case, where we just merge all the keys
without any delete processing, we run into the scenario whereby:
- Deletes now potentially apply to the current file
Now, a scan must process every file because it needs to get to the next row.
But a get is a specialized scan, whereby once we have fulfilled the needs of
the current query we don't have to next() the rest of the store files/hfiles
(i.e. Jon's early out). For example, let's say we are looking for the columns:
- A,B - 1 version
Once we have fulfilled those requirements we can stop looking in files. Right
now scanners will seek all hfiles at the same time which is something that
isn't strictly necessary in a 'get()'. Early outs prevent us from scanning the
rest of the files, gaining speed.
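A rough sketch of the early out for the A,B / 1 version example, in Java. This
is a simplified model and not the actual HBase code: each store file is
assumed to be a plain column-to-value map, files are visited newest first, and
deletes are ignored entirely. All names are hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class EarlyOutGetSketch {
    private EarlyOutGetSketch() {}

    /** Values found by the get, plus how many files had to be opened. */
    static final class Result {
        final Map<String, String> values = new LinkedHashMap<>();
        int filesRead = 0;
    }

    /**
     * Looks up one version of each wanted column, walking the files from
     * newest to oldest and stopping once every column has been found.
     */
    static Result get(List<Map<String, String>> filesNewestFirst, Set<String> wanted) {
        Result result = new Result();
        Set<String> missing = new LinkedHashSet<>(wanted);
        for (Map<String, String> file : filesNewestFirst) {
            if (missing.isEmpty()) {
                break; // early out: the remaining (older) files are never opened
            }
            result.filesRead++;
            for (Iterator<String> it = missing.iterator(); it.hasNext();) {
                String column = it.next();
                String value = file.get(column);
                if (value != null) {
                    result.values.put(column, value); // newest version wins
                    it.remove();
                }
            }
        }
        return result;
    }
}
```

If the newest file holds A and the second newest holds B, the get finishes
after two files no matter how many older files exist. A scan cannot stop like
this because it still has to find the next row in every file.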
Now, I'm not sure exactly how the 'delete applies only to earlier files' rule
helps the early outs, since no matter what, deletes always come before the
puts they apply to, even in the same file.
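As an illustration of 'deletes come before the puts they apply to', here is a
small Java sketch. The ordering used (column ascending, newest timestamp
first, delete before put on a tie) and the Kv class are assumptions made for
the example, not the real HBase key comparator or KeyValue type.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DeleteOrderSketch {
    private DeleteOrderSketch() {}

    /** Hypothetical, stripped-down key: column, timestamp, and a delete flag. */
    static final class Kv {
        final String column;
        final long ts;
        final boolean isDelete;

        Kv(String column, long ts, boolean isDelete) {
            this.column = column;
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    /** Column ascending, newest timestamp first, delete before put on a tie. */
    static final Comparator<Kv> ORDER = (a, b) -> {
        int c = a.column.compareTo(b.column);
        if (c != 0) return c;
        c = Long.compare(b.ts, a.ts);
        if (c != 0) return c;
        return Boolean.compare(b.isDelete, a.isDelete);
    };

    /** Single pass over the sorted keys, dropping puts hidden by a delete. */
    static List<Kv> visiblePuts(List<Kv> keys) {
        List<Kv> sorted = new ArrayList<>(keys);
        sorted.sort(ORDER);
        List<Kv> visible = new ArrayList<>();
        String deleteColumn = null;
        long deleteTs = Long.MIN_VALUE;
        for (Kv kv : sorted) {
            if (kv.isDelete) {
                // Newest-first order: the first delete seen for a column
                // carries the highest timestamp, so later ones add nothing.
                if (!kv.column.equals(deleteColumn)) {
                    deleteColumn = kv.column;
                    deleteTs = kv.ts;
                }
            } else if (kv.column.equals(deleteColumn) && kv.ts <= deleteTs) {
                // Masked: the delete was already seen earlier in the order.
            } else {
                visible.add(kv);
            }
        }
        return visible;
    }
}
```

Under this ordering a reader never needs to look back: by the time it reaches
a put, any delete that could mask it has already gone past, whether both came
from one file or from a merge of several.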
In the end, I'm concerned about minor compactions removing deleted key/values -
are we sure we want this behaviour?
> New client server implementation of how gets and puts are handled.
> -------------------------------------------------------------------
>
> Key: HBASE-1304
> URL: https://issues.apache.org/jira/browse/HBASE-1304
> Project: Hadoop HBase
> Issue Type: Improvement
> Affects Versions: 0.20.0
> Reporter: Erik Holstad
> Assignee: Jonathan Gray
> Priority: Blocker
> Fix For: 0.20.0
>
> Attachments: hbase-1304-v1.patch, HBASE-1304-v2.patch,
> HBASE-1304-v3.patch, HBASE-1304-v4.patch, HBASE-1304-v5.patch,
> HBASE-1304-v6.patch, HBASE-1304-v7.patch
>
>
> Creating an issue where the implementation of the new client and server will
> go. Leaving HBASE-1249 as a discussion forum and will put code and patches
> here.