[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16191023#comment-16191023 ]
Chia-Ping Tsai commented on HBASE-18752:
----------------------------------------

bq. after this change the min and max timeRange both will be same?

No. What this patch fixes is the {{TimeRange}} recorded in the hfile. See {{TestHStore#testTimeRangeIfSomeCellsAreDroppedInFlush}}:
{code}
+  @Test
+  public void testTimeRangeIfSomeCellsAreDroppedInFlush() throws IOException {
+    init(this.name.getMethodName(), TEST_UTIL.getConfiguration(),
+      ColumnFamilyDescriptorBuilder.newBuilder(family).setMaxVersions(1).build());
+    long currentTs = 100;
+    final long minTs = currentTs;
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    flushStore(store, id++);
+
+    Collection<HStoreFile> files = store.getStorefiles();
+    assertEquals(1, files.size());
+    HStoreFile f = files.iterator().next();
+    f.initReader();
+    StoreFileReader reader = f.getReader();
+    assertEquals(currentTs - 1, reader.timeRange.getMin());
+    assertEquals(currentTs - 1, reader.timeRange.getMax());
+  }
{code}
Before this change, the min of the time range was {{minTs}} (the initial value of {{currentTs}}), but the cell with that timestamp is not stored in the hfile because it is dropped during the flush. That is a bug: it prevents us from filtering out unnecessary files before we start reading the data blocks. After this patch, we get the correct min of the time range.
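To make the difference concrete outside of HBase, here is a minimal, self-contained sketch of the idea. It is not the patch itself: {{TimeRangeSketch}}, {{Cell}}, {{Tracker}} and {{flush}} are hypothetical stand-ins, and the flush logic simply assumes a single column where only the newest {{maxVersions}} cells survive. It compares a time range taken from the whole snapshot with one recalculated from the cells that are actually written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are HBase classes.
class TimeRangeSketch {

    /** Stand-in for a cell of one column: only the timestamp matters here. */
    static final class Cell {
        final long ts;
        Cell(long ts) { this.ts = ts; }
    }

    /** Stand-in for a min/max timestamp tracker. */
    static final class Tracker {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        void include(long ts) {
            min = Math.min(min, ts);
            max = Math.max(max, ts);
        }
    }

    /**
     * Assumed flush behaviour: keep only the newest maxVersions cells and
     * track the time range of the cells that are really written.
     */
    static List<Cell> flush(List<Cell> snapshot, int maxVersions, Tracker fileTracker) {
        List<Cell> sorted = new ArrayList<>(snapshot);
        sorted.sort((a, b) -> Long.compare(b.ts, a.ts)); // newest first
        List<Cell> written = new ArrayList<>();
        for (Cell c : sorted) {
            if (written.size() >= maxVersions) {
                break; // older versions are dropped, so they must not affect the range
            }
            written.add(c);
            fileTracker.include(c.ts);
        }
        return written;
    }

    /** Returns {snapshotMin, snapshotMax, fileMin, fileMax} for timestamps 100..102. */
    static long[] demo() {
        List<Cell> snapshot = new ArrayList<>();
        Tracker snapshotTracker = new Tracker();
        for (long ts = 100; ts < 103; ts++) {
            snapshot.add(new Cell(ts));
            snapshotTracker.include(ts); // range as seen by the snapshot
        }
        Tracker fileTracker = new Tracker();
        flush(snapshot, 1, fileTracker);
        return new long[] {snapshotTracker.min, snapshotTracker.max,
                           fileTracker.min, fileTracker.max};
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println("snapshot range: [" + r[0] + ", " + r[1] + "]");
        System.out.println("file range:     [" + r[2] + ", " + r[3] + "]");
    }
}
```

In this sketch the snapshot range starts at 100 even though only the cell with timestamp 102 is written, which mirrors the situation the test above checks. The price of recalculating is one extra timestamp read per written cell.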
> Recalculate the TimeRange in flushing snapshot to store file
> ------------------------------------------------------------
>
> Key: HBASE-18752
> URL: https://issues.apache.org/jira/browse/HBASE-18752
> Project: HBase
> Issue Type: Sub-task
> Reporter: Chia-Ping Tsai
> Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18752.v0.patch
>
>
> We drop superfluous cells in flushing, hence the TimeRange from snapshot is
> inaccurate for the storefile. We should recalculate the TimeRange for the
> storefile, but the side-effect is the extra cost - we need to extract the
> timestamp from cell (ByteBufferCell).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)