[ https://issues.apache.org/jira/browse/HBASE-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857548#action_12857548 ]
Jonathan Gray commented on HBASE-2453:
--------------------------------------
I think we should remove all delete processing from minor compactions in order
to make minor compactions as fast as possible. For me, that would suffice for
closing this jira.
Major compactions are another consideration. As I described in HBASE-2450, the
fact that delete markers actually get removed during major compactions means
that a background process impacts user-facing behavior. This is because old
delete records can impact new puts (if I put a value with an older timestamp
than a row delete, for example). Before the major compaction such a put would
not show up; after the major compaction it would be valid.
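
To make the visibility change concrete, here is a minimal toy model (not HBase
code). The names Cell, visible and major_compact are made up for illustration,
and the model assumes a row delete at timestamp T masks every put on that row
with a timestamp at or below T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    # Hypothetical stand-in for a KeyValue; not the real HBase class.
    row: str
    ts: int
    value: str = ""
    is_row_delete: bool = False


def delete_cutoff(cells, row):
    """Newest row-delete timestamp for `row`, or None if there is no marker."""
    return max((c.ts for c in cells if c.row == row and c.is_row_delete),
               default=None)


def visible(cells, row):
    """Values of puts on `row` that are not masked by a row delete marker."""
    cutoff = delete_cutoff(cells, row)
    return [c.value for c in cells
            if c.row == row and not c.is_row_delete
            and (cutoff is None or c.ts > cutoff)]


def major_compact(cells):
    """Drop masked puts and also drop the delete markers themselves."""
    return [c for c in cells
            if not c.is_row_delete
            and (delete_cutoff(cells, c.row) is None
                 or c.ts > delete_cutoff(cells, c.row))]


old_put = Cell("r1", ts=5, value="old")

# A row delete at ts=10 is already in the store.
store = [Cell("r1", ts=10, is_row_delete=True)]

# Put with an older timestamp issued before the major compaction: masked.
before = visible(store + [old_put], "r1")

# The major compaction removes the marker; the same put issued now is visible.
store = major_compact(store)
after = visible(store + [old_put], "r1")
```

In this sketch `before` is empty and `after` contains the put, even though the
client did the same thing both times; only the background compaction differs.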
One possibility is we change it so minors don't do any delete processing, then
majors do what minors do now (doing the actual deleting, but retaining the
deletes themselves). The only downside of that is that once you delete a row at a
timestamp, you can never re-insert values older than that delete. Today, this
is the case _until_ there is a major compaction. The way to fix this is by
taking storefile age into account so that deletes in previous storefiles don't
apply to newer storefiles. If we did that, we would have to process deletes
during regular compactions because you'd need to look at the relative ages of
the storefiles to determine if a particular delete applied or not.
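
A rough sketch of what taking storefile age into account could look like, again
as a toy model rather than HBase code. Entry, is_masked, read and the seq field
(a per-storefile age, higher meaning newer) are all hypothetical names, and the
rule that a delete still applies within its own file is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical record; `seq` stands in for the age of its storefile.
    row: str
    ts: int
    seq: int
    value: str = ""
    is_row_delete: bool = False


def is_masked(put, deletes):
    """A delete masks a put only if the put's timestamp is not newer than the
    delete AND the put does not live in a newer storefile than the delete."""
    return any(d.row == put.row and put.ts <= d.ts and put.seq <= d.seq
               for d in deletes)


def read(entries, row):
    """Values of puts on `row` that survive age-aware delete processing."""
    deletes = [e for e in entries if e.is_row_delete]
    return [e.value for e in entries
            if e.row == row and not e.is_row_delete
            and not is_masked(e, deletes)]


entries = [
    Entry("r1", ts=3, seq=1, value="a"),        # older file, older timestamp
    Entry("r1", ts=10, seq=2, is_row_delete=True),
    Entry("r1", ts=5, seq=3, value="b"),        # re-insert in a newer file
]
result = read(entries, "r1")
```

Here "a" is masked but "b" is not, because "b" sits in a newer storefile than
the delete. Note that the check needs seq for both the put and the delete, which
is why a compaction merging files of different ages would have to process
deletes itself.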
For now, I'd be happy just removing delete tracking in minors and worrying
about the rest of these issues for 0.21.
> Revisit compaction policies after HBASE-2248 commit
> ---------------------------------------------------
>
> Key: HBASE-2453
> URL: https://issues.apache.org/jira/browse/HBASE-2453
> Project: Hadoop HBase
> Issue Type: Improvement
> Reporter: Jonathan Gray
> Priority: Critical
> Fix For: 0.20.4, 0.20.5, 0.21.0
>
>
> HBASE-2248 turned Gets into Scans server-side. It also removed the invariant
> that deletes in a file only apply to other files and not itself (no longer
> processes MemStore deletes when the delete happens). This has implications
> for our minor compaction policy.
> We are currently processing deletes during minor compactions in a way that
> makes it so we do the actual deleting as we compact, but we retain the delete
> records themselves. This makes it so we retain the invariant of deletes only
> applying to other files.
> Since this is now gone post HBASE-2248, we should revisit our compaction
> policies.