[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395844#comment-14395844 ]
Lars Hofhansl commented on HBASE-13389:
---------------------------------------

It turns out that the optimization from HBASE-8151 and HBASE-9751 still works, but only after 6 days, once compactions are allowed to set the mvcc readpoints to 0. I think we can get the HBASE-8166 optimization back and still keep HBASE-12600 correct, if we replace this:

{code}
- final boolean needMvcc = fd.maxMVCCReadpoint >= smallestReadPoint;
+ final Compression.Algorithm compression = store.getFamily().getCompactionCompression();
  StripeMultiFileWriter.WriterFactory factory = new StripeMultiFileWriter.WriterFactory() {
    @Override
    public Writer createWriter() throws IOException {
      return store.createWriterInTmp(
-       fd.maxKeyCount, compression, true, needMvcc, fd.maxTagsLength > 0);
+       fd.maxKeyCount, compression, true, true, fd.maxTagsLength > 0);
    }
{code}

With this:

{code}
- final boolean needMvcc = fd.maxMVCCReadpoint >= smallestReadPoint;
+ final boolean needMvcc = fd.maxMVCCReadpoint > 0;
  final Compression.Algorithm compression = store.getFamily().getCompactionCompression();
  StripeMultiFileWriter.WriterFactory factory = new StripeMultiFileWriter.WriterFactory() {
    @Override
    public Writer createWriter() throws IOException {
      return store.createWriterInTmp(
        fd.maxKeyCount, compression, true, needMvcc, fd.maxTagsLength > 0);
    }
{code}

So when all the mvcc readpoints are 0, the next compaction can still apply the HBASE-8166 optimization and not write the mvcc information at all. It just happens later... Before, we already skipped writing it when no open scanner had a readpoint older than any of the readpoints in the HFile; now we have to wait until compactions set them all to 0. It's not all that bad. [~stack], if the data is older than 6 days I'd expect this to no longer show up in the profiler. Maybe we need to write some unit tests for this, although I assume that won't be easy.
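The decision above can be sketched in a minimal, self-contained form. This is not HBase code: `shouldWriteMvcc` and the plain `long` parameter are hypothetical stand-ins for the `fd.maxMVCCReadpoint > 0` test in the proposal.

```java
// Sketch: write per-cell mvcc readpoints into the new HFile only when at
// least one cell still carries a non-zero readpoint. Once compactions have
// zeroed every readpoint, the mvcc slot can be skipped entirely again
// (the HBASE-8166 optimization).
public class MvccWriteDecision {

    // Hypothetical stand-in for "fd.maxMVCCReadpoint > 0" in the proposal.
    static boolean shouldWriteMvcc(long maxMVCCReadpoint) {
        return maxMVCCReadpoint > 0;
    }

    public static void main(String[] args) {
        // Freshly flushed/compacted file: cells still carry live readpoints.
        System.out.println(shouldWriteMvcc(12345L)); // true
        // After later compactions have reset all readpoints to 0.
        System.out.println(shouldWriteMvcc(0L));     // false
    }
}
```

With the original `>= smallestReadPoint` comparison gone, this is the only gate left, which is why the prose above stresses that the optimization now only kicks in once compactions have zeroed everything.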
> [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
> -------------------------------------------------------------
>
>                 Key: HBASE-13389
>                 URL: https://issues.apache.org/jira/browse/HBASE-13389
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>
> HBASE-12600 moved the edit sequenceid from tags to instead exploit the
> mvcc/sequenceid slot in a key. Now Cells near-always have an associated
> mvcc/sequenceid, where previously it was rare or the mvcc was kept up at the
> file level. This is sort of how it should be, many of us would argue, but as a
> side-effect of this change, read-time optimizations that helped speed scans
> were undone.
>
> In this issue, let's see if we can get the optimizations back -- or just
> remove the optimizations altogether.
>
> The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
>
> The optimizations undone by this change are (to quote the optimizer himself,
> Mr [~lhofhansl]):
>
> {quote}
> Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
> We're always storing the mvcc readpoints, and we never compare them against
> the actual smallestReadpoint, and hence we're always performing all the
> checks, tests, and comparisons that these jiras removed, in addition to
> actually storing the data -- which with up to 8 bytes per Cell is not trivial.
> {quote}
>
> This is the 'breaking' change:
> https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96
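To put the quoted "up to 8 bytes per Cell" in perspective, here is a rough, self-contained sketch of how a variable-length long encoding (of the kind used for per-cell sequenceids) grows with the readpoint value. `vLongSize` is a simplified model, not the actual Hadoop `WritableUtils` code.

```java
// Sketch: approximate on-disk cost of a non-negative vlong, modeled loosely
// on Hadoop's WritableUtils encoding. Small values fit in one byte; larger
// values need a length byte plus up to 8 payload bytes.
public class MvccOverhead {

    static int vLongSize(long v) {
        if (v <= 127) return 1;            // single-byte fast path
        int dataBytes = 0;
        for (long t = v; t != 0; t >>>= 8) {
            dataBytes++;                   // count payload bytes
        }
        return 1 + dataBytes;              // length byte + payload
    }

    public static void main(String[] args) {
        System.out.println(vLongSize(0L));             // 1: the cheap all-zero case
        System.out.println(vLongSize(1L << 40));       // 7: a large readpoint
        System.out.println(vLongSize(Long.MAX_VALUE)); // 9: worst case
    }
}
```

The point of the regression is that this per-cell cost, plus the parse on every read, is paid even when every readpoint could have been dropped at the file level.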