[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197076#comment-16197076 ]

Hudson commented on HBASE-18752:

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3855 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3855/])
HBASE-18752 Recalculate the TimeRange in flushing snapshot to store file (chia7712: rev e2cef8aa805478feb7752fab738ee997e2bf374f)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreFlusher.java

> Recalculate the TimeRange in flushing snapshot to store file
>
> Key: HBASE-18752
> URL: https://issues.apache.org/jira/browse/HBASE-18752
> Project: HBase
> Issue Type: Sub-task
> Reporter: Chia-Ping Tsai
> Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
> Attachments: HBASE-18752.v0.patch, HBASE-18752.v1.patch
>
> We drop superfluous cells in flushing, hence the TimeRange from snapshot is
> inaccurate for the storefile. We should recalculate the TimeRange for the
> storefile, but the side-effect is the extra cost - we need to extract the
> timestamp from cell (ByteBufferCell).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196897#comment-16196897 ]

Hudson commented on HBASE-18752:

FAILURE: Integrated in Jenkins build HBase-2.0 #654 (See [https://builds.apache.org/job/HBase-2.0/654/])
HBASE-18752 Recalculate the TimeRange in flushing snapshot to store file (chia7712: rev 13a53811de2ced9c6d599e2f91a777d2ad1a9589)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreFlusher.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196592#comment-16196592 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

+1.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196589#comment-16196589 ]

Anoop Sam John commented on HBASE-18752:

+1
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196368#comment-16196368 ]

Ted Yu commented on HBASE-18752:

lgtm
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196353#comment-16196353 ]

Chia-Ping Tsai commented on HBASE-18752:

bq. Still would be nice to run PE random write tests for a bit longer duration of say 10 mins to see the impact of the overhead in flush.

I ran a test that creates 100 GB of hfiles with the following config:
# snappy
# no compaction
# no WAL
# 10 times
# DefaultMemStore

|| ||master||patch||
|min(s)|7255|7045|
|avg(s)|7491|7552|
|max(s)|7950|8030|

The impact of the overhead seems trivial.
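To put the timings from the comment above in relative terms, here is a small sketch in plain Java with no HBase dependency. The class and method names are hypothetical and only serve the illustration; the numbers are copied from the table.

```java
// Hypothetical helper, not part of HBase: relative change of the patched timings vs. master.
final class FlushOverhead {
  /** Returns (patch - master) / master as a percentage; positive means the patch was slower. */
  static double relativeChangePercent(long masterSeconds, long patchSeconds) {
    return 100.0 * (patchSeconds - masterSeconds) / masterSeconds;
  }

  public static void main(String[] args) {
    // Timings in seconds, copied from the table in the comment above.
    String[] labels = {"min", "avg", "max"};
    long[] master = {7255, 7491, 7950};
    long[] patch = {7045, 7552, 8030};
    for (int i = 0; i < labels.length; i++) {
      System.out.printf("%s: %+.2f%%%n", labels[i], relativeChangePercent(master[i], patch[i]));
    }
  }
}
```

On these numbers the average and the maximum are about one percent higher with the patch, while the minimum is lower with the patch, which fits the reading that the overhead is small.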
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195773#comment-16195773 ]

Chia-Ping Tsai commented on HBASE-18752:

bq. Also in Compacting MemStore when Policy is EAGER, for each of the ImmutableSegment creation, we will recalculate this TR? There also dropping of dup cells etc happens. Pls double check once. May be a test case also for that would be nice to have.

See HBASE-18966.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194177#comment-16194177 ]

Anoop Sam John commented on HBASE-18752:

Thanks. I am generally +1 on this. It would still be nice to run the PE random write tests for a longer duration, say 10 minutes, to see the impact of the overhead in flush.
Also, in CompactingMemStore when the policy is EAGER, will we recalculate this TimeRange for each ImmutableSegment creation? Duplicate cells are dropped there as well. Please double-check. A test case for that would also be nice to have.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194148#comment-16194148 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

Thanks [~chia7712]. The new test case covers the multi version case also.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194100#comment-16194100 ]

Chia-Ping Tsai commented on HBASE-18752:

bq. Any chance for a perf test ?

Sure. I will run the perf test over the weekend.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193433#comment-16193433 ]

Anoop Sam John commented on HBASE-18752:

Since flush is not in the hot write path, some extra operations are OK. Any chance of a perf test?
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193365#comment-16193365 ]

Hadoop QA commented on HBASE-18752:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 4m 48s | master passed |
| +1 | compile | 0m 45s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| +1 | mvneclipse | 0m 16s | master passed |
| +1 | shadedjars | 5m 13s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 44s | master passed |
| +1 | javadoc | 0m 35s | master passed |
| +1 | mvninstall | 0m 57s | the patch passed |
| +1 | compile | 0m 46s | the patch passed |
| +1 | javac | 0m 46s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvneclipse | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 23s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 41m 37s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | findbugs | 3m 9s | the patch passed |
| +1 | javadoc | 0m 37s | the patch passed |
| +1 | unit | 131m 43s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 195m 26s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 |
| JIRA Issue | HBASE-18752 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12890532/HBASE-18752.v1.patch |
| Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2b190ba6add8 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 98d1637 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8953/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8953/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192948#comment-16192948 ]

Chia-Ping Tsai commented on HBASE-18752:

bq. So if we have max versions set to 2, then also we don't have any issue right? Still the time range tracker will be able to mark 101 and 102 in this case correct?

Yes, the test will pass if max versions is set to 2. However, it still fails if we put three (> 2) cells that have the same row/fam/qual and different ts: the cell with the lowest ts will be dropped in flush. I added more tests in the v1 patch.

bq. Would there be any impact on performance of flushing ?

Yes, fixing this bug will impact the performance of flushing:
# we have to retrieve the ts from the cell (ByteBufferedCell)
# we have to recalculate the min/max of the TimeRange (the cost is trivial now because we introduced the non-sync TimeRangeTracker in HBASE-18753)

bq. So in your case there are lot of duplicate records but with diff ts? Something like a streaming app?

Yep. Our data, which are dumped from the same time window, have many identical fields.
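The max-versions behaviour discussed in the comment above can be sketched with a deliberately simplified model. This is not HBase code: it assumes every timestamp belongs to one row/family/qualifier, ignores deletes and TTL, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified model, not HBase code: all timestamps belong to ONE row/family/qualifier.
final class FlushTimeRangeSketch {
  /** Keeps only the newest maxVersions timestamps, as a flush would for a single column. */
  static List<Long> survivingTimestamps(List<Long> snapshot, int maxVersions) {
    List<Long> sorted = new ArrayList<>(snapshot);
    sorted.sort(Collections.reverseOrder()); // newest first
    return sorted.subList(0, Math.min(maxVersions, sorted.size()));
  }

  /** Returns {min, max} over the given timestamps; the list must not be empty. */
  static long[] timeRange(List<Long> timestamps) {
    return new long[] {Collections.min(timestamps), Collections.max(timestamps)};
  }

  public static void main(String[] args) {
    List<Long> snapshot = Arrays.asList(100L, 101L, 102L);
    long[] stale = timeRange(snapshot); // what the snapshot's tracker would report
    for (int maxVersions = 1; maxVersions <= 3; maxVersions++) {
      long[] exact = timeRange(survivingTimestamps(snapshot, maxVersions));
      System.out.println("maxVersions=" + maxVersions
          + " snapshot=" + Arrays.toString(stale)
          + " recalculated=" + Arrays.toString(exact));
    }
  }
}
```

In this model the snapshot range and the recalculated range only agree when no version is dropped, that is, when the number of versions per column does not exceed max versions.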
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192462#comment-16192462 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

[~chia7712] One question here:

bq. That is a bug causing we can't filter the unnecessary file before starting reading the data block

So in your case there are a lot of duplicate records, but with different ts? Something like a streaming app?
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191508#comment-16191508 ]

Ted Yu commented on HBASE-18752:

Would there be any impact on performance of flushing?
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191498#comment-16191498 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

Nice patch. I got it now. So if we have max versions set to 2, we also don't have any issue, right? The time range tracker will still be able to mark 101 and 102 in this case, correct?
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191029#comment-16191029 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

Thanks for the info. Will check this once again.
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191023#comment-16191023 ]

Chia-Ping Tsai commented on HBASE-18752:

bq. after this change the min and max timeRange both will be same?

No. What this patch tries to do is correct the {{TimeRange}} in the hfile. See {{TestHStore#testTimeRangeIfSomeCellsAreDroppedInFlush}}:
{code}
  @Test
  public void testTimeRangeIfSomeCellsAreDroppedInFlush() throws IOException {
    init(this.name.getMethodName(), TEST_UTIL.getConfiguration(),
      ColumnFamilyDescriptorBuilder.newBuilder(family).setMaxVersions(1).build());
    long currentTs = 100;
    final long minTs = currentTs;
    // this cell won't be flushed to disk
    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
    // this cell won't be flushed to disk
    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
    // only this newest cell survives the flush because max versions is 1
    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
    flushStore(store, id++);

    Collection<HStoreFile> files = store.getStorefiles();
    assertEquals(1, files.size());
    HStoreFile f = files.iterator().next();
    f.initReader();
    StoreFileReader reader = f.getReader();
    assertEquals(currentTs - 1, reader.timeRange.getMin());
    assertEquals(currentTs - 1, reader.timeRange.getMax());
  }
{code}
Before this change, the min of the time range is {{minTs}} (the initial value of {{currentTs}}, 100), but the cell with that timestamp is not stored in the hfile because it is dropped in flush. That is a bug: it prevents us from filtering out unnecessary files before we start reading data blocks. After this patch, we get the correct min of the time range. In this test the min and max happen to be equal only because a single cell survives the flush.
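Why a stale min matters for reads can be sketched as follows. This is a simplified model rather than HBase code: it treats both ranges as closed intervals of timestamps (an assumption made for the illustration), and the class and method names are hypothetical.

```java
// Simplified model, not HBase code: closed intervals [min, max] of timestamps.
final class TimeRangePruningSketch {
  /** True if the range recorded for a file overlaps the range requested by a scan. */
  static boolean mayContain(long fileMin, long fileMax, long scanMin, long scanMax) {
    return fileMin <= scanMax && scanMin <= fileMax;
  }

  public static void main(String[] args) {
    // The flushed file only holds the cell with ts 102 (ts 100 and 101 were dropped).
    long scanMin = 100;
    long scanMax = 101;
    // Range inherited from the snapshot: the file looks relevant and would be opened.
    System.out.println("stale range [100, 102]: " + mayContain(100, 102, scanMin, scanMax));
    // Range recalculated from the cells actually written: the file can be skipped.
    System.out.println("exact range [102, 102]: " + mayContain(102, 102, scanMin, scanMax));
  }
}
```

With the stale range the overlap check passes and the file would be read for nothing; with the recalculated range the same scan can skip it.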
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190845#comment-16190845 ]

ramkrishna.s.vasudevan commented on HBASE-18752:

[~chia7712] Thanks for the patch. So after this change, will the min and max of the TimeRange both be the same?
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190048#comment-16190048 ]

Chia-Ping Tsai commented on HBASE-18752:

Ping for reviews~
[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file
[ https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188582#comment-16188582 ]

Hadoop QA commented on HBASE-18752:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 56s | master passed |
| +1 | compile | 0m 38s | master passed |
| +1 | checkstyle | 0m 45s | master passed |
| +1 | mvneclipse | 0m 16s | master passed |
| +1 | shadedjars | 4m 54s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 26s | master passed |
| +1 | javadoc | 0m 28s | master passed |
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 37s | the patch passed |
| +1 | javac | 0m 37s | the patch passed |
| +1 | checkstyle | 0m 44s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 0s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 37m 26s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | findbugs | 2m 39s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
| +1 | unit | 95m 35s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 151m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 |
| JIRA Issue | HBASE-18752 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12889977/HBASE-18752.v0.patch |
| Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 914f194d2c4d 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / d35d837 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8892/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8892/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.