[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

Enis Soztutar (JIRA) Mon, 06 Apr 2015 16:14:20 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482163#comment-14482163
 ]


Enis Soztutar commented on HBASE-13389:
---------------------------------------

bq. when we replay data due to recovery we want it to fall into the right place 
w.r.t to existing data. Why do we need more than the maximum time to roll a log 
(1h)?
I think the min time to keep is max time an edit can live in the memstore 
without being flushed. This is not related to log roll (since we still replay 
edits from a previous log roll) but how much further an edit can be replayed 
through dist log replay I think. 
Case 3 as Jeff puts it is an issue with the comparison order. We compare 
entries with {{row > column -> ts -> type -> seqId}} order, however, we should 
compare entries in {{row > column -> ts -> seqId -> type}} order so that Put, 
Delete, Put with the same TS works. If we do better resolution for ts's, this 
is not needed though. 

> [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
> -------------------------------------------------------------
>
>                 Key: HBASE-13389
>                 URL: https://issues.apache.org/jira/browse/HBASE-13389
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>         Attachments: 13389.txt
>
>
> HBASE-12600 moved the edit sequenceid from tags to instead exploit the 
> mvcc/sequenceid slot in a key. Now Cells near-always have an associated 
> mvcc/sequenceid where previous it was rare or the mvcc was kept up at the 
> file level. This is sort of how it should be many of us would argue but as a 
> side-effect of this change, read-time optimizations that helped speed scans 
> were undone by this change.
> In this issue, lets see if we can get the optimizations back -- or just 
> remove the optimizations altogether.
> The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
> The optimizations undone by this changes are (to quote the optimizer himself, 
> Mr [~lhofhansl]):
> {quote}
> Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
> We're always storing the mvcc readpoints, and we never compare them against 
> the actual smallestReadpoint, and hence we're always performing all the 
> checks, tests, and comparisons that these jiras removed in addition to 
> actually storing the data - which with up to 8 bytes per Cell is not trivial.
> {quote}
> This is the 'breaking' change: 
> https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

Reply via email to