[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197076#comment-16197076
 ] 

Hudson commented on HBASE-18752:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3855 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/3855/])
HBASE-18752 Recalculate the TimeRange in flushing snapshot to store file 
(chia7712: rev e2cef8aa805478feb7752fab738ee997e2bf374f)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreFlusher.java


> Recalculate the TimeRange in flushing snapshot to store file
> 
>
> Key: HBASE-18752
> URL: https://issues.apache.org/jira/browse/HBASE-18752
> Project: HBase
> Issue Type: Sub-task
> Reporter: Chia-Ping Tsai
> Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18752.v0.patch, HBASE-18752.v1.patch
>
>
> We drop superfluous cells in flushing, hence the TimeRange from the snapshot is
> inaccurate for the storefile. We should recalculate the TimeRange for the
> storefile, but the side effect is extra cost: we need to extract the
> timestamp from each cell (ByteBufferCell).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196897#comment-16196897
 ] 

Hudson commented on HBASE-18752:


FAILURE: Integrated in Jenkins build HBase-2.0 #654 (See 
[https://builds.apache.org/job/HBase-2.0/654/])
HBASE-18752 Recalculate the TimeRange in flushing snapshot to store file 
(chia7712: rev 13a53811de2ced9c6d599e2f91a777d2ad1a9589)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DefaultStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mob/DefaultMobStoreFlusher.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStore.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StripeStoreFlusher.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java




[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-09 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196592#comment-16196592
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


+1.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-09 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196589#comment-16196589
 ] 

Anoop Sam John commented on HBASE-18752:


+1



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196368#comment-16196368
 ] 

Ted Yu commented on HBASE-18752:


lgtm



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-08 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196353#comment-16196353
 ] 

Chia-Ping Tsai commented on HBASE-18752:


bq. Still would be nice to run PE random write tests for a bit longer duration of say 10 minutes to see the impact of the overhead in flush.
I ran a test that creates 100 GB of hfiles with the following configuration:
# snappy compression
# no compaction
# no WAL
# repeated 10 times
# DefaultMemStore

|| ||master||patch||
|min(s)|7255|7045|
|avg(s)|7491|7552|
|max(s)|7950|8030|
The impact of the overhead appears to be trivial.
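
As a rough check on the table above, the relative change between master and patch can be computed directly. The sketch below is only illustrative arithmetic on the reported figures; the class and method names are hypothetical and are not part of HBase or PE.

```java
// Hypothetical helper, not HBase code: compares the run times reported in the table.
final class FlushOverheadSketch {

    /** Relative change of the patched run time versus master, in percent. */
    public static double relativeChangePercent(long masterSeconds, long patchSeconds) {
        return 100.0 * (patchSeconds - masterSeconds) / masterSeconds;
    }

    public static void main(String[] args) {
        // Figures copied from the min/avg/max rows of the table.
        System.out.printf("min: %+.2f%%%n", relativeChangePercent(7255, 7045));
        System.out.printf("avg: %+.2f%%%n", relativeChangePercent(7491, 7552));
        System.out.printf("max: %+.2f%%%n", relativeChangePercent(7950, 8030));
    }
}
```

On these figures the average run time differs by less than 1%, and the fastest run is actually faster with the patch, which suggests the difference is within run-to-run variation.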



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-07 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195773#comment-16195773
 ] 

Chia-Ping Tsai commented on HBASE-18752:


bq. Also, in CompactingMemStore when the policy is EAGER, will we recalculate this TimeRange for each ImmutableSegment creation? Duplicate cells are dropped there too. Please double-check. A test case for that would also be nice to have.
See HBASE-18966.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194177#comment-16194177
 ] 

Anoop Sam John commented on HBASE-18752:


Thanks. I am generally +1 on this. It would still be nice to run PE random write
tests for a bit longer, say 10 minutes, to see the impact of the overhead
in flush.
Also, in CompactingMemStore when the policy is EAGER, will we recalculate this
TimeRange for each ImmutableSegment creation? Duplicate cells are dropped there
too. Please double-check. A test case for that would also be nice to have.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194148#comment-16194148
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


Thanks [~chia7712]. The new test case also covers the multi-version case.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194100#comment-16194100
 ] 

Chia-Ping Tsai commented on HBASE-18752:


bq. Any chance of a perf test?
Sure. I will run the perf test over the weekend.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193433#comment-16193433
 ] 

Anoop Sam John commented on HBASE-18752:


Since flush is not in the hot write path, some extra operations are OK. Any
chance of a perf test?



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193365#comment-16193365
 ] 

Hadoop QA commented on HBASE-18752:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 13s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 23s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 41m 37s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 131m 43s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 195m 26s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 |
| JIRA Issue | HBASE-18752 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12890532/HBASE-18752.v1.patch |
| Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2b190ba6add8 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 98d1637 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8953/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8953/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-05 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192948#comment-16192948
 ] 

Chia-Ping Tsai commented on HBASE-18752:


bq. So if we have max versions set to 2, then also we don't have any issue, right? Still the time range tracker will be able to mark 101 and 102 in this case, correct?
Yes, the test will pass if max versions is set to 2. However, it still fails
if we put three (> 2) cells that have the same row/family/qualifier but different
timestamps: the cell with the lowest timestamp will be dropped in the flush. I
added more tests in the v1 patch.
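
The max-versions case above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the HBase flush code: the class and its methods are hypothetical, it models a single column only, and it treats the time range as inclusive on both ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of one column's versions during a flush; not HBase code.
final class FlushVersionSketch {

    /** Keeps only the newest maxVersions timestamps, as a flush with a version limit would. */
    public static List<Long> survivingTimestamps(List<Long> timestamps, int maxVersions) {
        List<Long> sorted = new ArrayList<>(timestamps);
        sorted.sort(Collections.reverseOrder()); // newest first
        return new ArrayList<>(sorted.subList(0, Math.min(maxVersions, sorted.size())));
    }

    /** Returns {min, max} over the given timestamps. */
    public static long[] timeRangeOf(List<Long> timestamps) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long ts : timestamps) {
            min = Math.min(min, ts);
            max = Math.max(max, ts);
        }
        return new long[] {min, max};
    }

    public static void main(String[] args) {
        List<Long> snapshot = Arrays.asList(100L, 101L, 102L);
        long[] snapshotRange = timeRangeOf(snapshot);
        long[] flushedRange = timeRangeOf(survivingTimestamps(snapshot, 2));
        // The snapshot range still starts at the dropped cell's timestamp;
        // the range recalculated from the surviving cells does not.
        System.out.println("snapshot min = " + snapshotRange[0]);
        System.out.println("flushed min  = " + flushedRange[0]);
    }
}
```

With two cells and max versions 2 nothing is dropped, so both ranges agree; the ranges diverge only once the number of versions exceeds the limit.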

bq. Would there be any impact on performance of flushing?
Yes, fixing this bug will affect flush performance:
# we have to retrieve the timestamp from the cell (ByteBufferCell)
# we have to recalculate the min/max of the TimeRange (the cost is trivial now because we introduced the non-synchronized TimeRangeTracker in HBASE-18753)

bq. So in your case there are a lot of duplicate records but with different timestamps? Something like a streaming app?
Yep. Our data, which is dumped from the same time window, has many identical fields.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192462#comment-16192462
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


[~chia7712]
One question here:
bq. That is a bug causing we can't filter the unnecessary file before starting reading the data block
So in your case there are a lot of duplicate records but with different
timestamps? Something like a streaming app?



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191508#comment-16191508
 ] 

Ted Yu commented on HBASE-18752:


Would there be any impact on the performance of flushing?



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191498#comment-16191498
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


Nice patch. I get it now. So if we have max versions set to 2, we also don't
have any issue, right? The time range tracker will still be able to mark
101 and 102 in this case, correct?



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-04 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191029#comment-16191029
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


thanks for the info. Will check this once again.



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-04 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191023#comment-16191023
 ] 

Chia-Ping Tsai commented on HBASE-18752:


bq. after this change the min and max timeRange both will be same?
No. What this patch tries to fix is the {{TimeRange}} recorded in the hfile.
See {{TestHStore#testTimeRangeIfSomeCellsAreDroppedInFlush}}:
{code}
+  @Test
+  public void testTimeRangeIfSomeCellsAreDroppedInFlush() throws IOException {
+    init(this.name.getMethodName(), TEST_UTIL.getConfiguration(),
+      ColumnFamilyDescriptorBuilder.newBuilder(family).setMaxVersions(1).build());
+    long currentTs = 100;
+    final long minTs = currentTs;
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    // this cell won't be flushed to disk
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    this.store.add(new KeyValue(row, family, qf1, currentTs++, (byte[])null), null);
+    flushStore(store, id++);
+
+    Collection<HStoreFile> files = store.getStorefiles();
+    assertEquals(1, files.size());
+    HStoreFile f = files.iterator().next();
+    f.initReader();
+    StoreFileReader reader = f.getReader();
+    assertEquals(currentTs - 1, reader.timeRange.getMin());
+    assertEquals(currentTs - 1, reader.timeRange.getMax());
+  }
{code}
Before this change, the min of the TimeRange is {{minTs}}, but the cell with
that timestamp is not stored in the hfile because it is dropped during the flush.
This bug means we can't filter out the unnecessary file before starting to read
its data blocks. After this patch, we get the correct min of the TimeRange.
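
To make the filtering consequence concrete, here is a minimal sketch of a time-range overlap check. It is not the HBase implementation: the class and method names are hypothetical, the file range is assumed inclusive on both ends, and the query window is assumed half-open, [queryMin, queryMax).

```java
// Hypothetical overlap check between a file's recorded time range and a query window; not HBase code.
final class TimeRangeFilterSketch {

    /**
     * Returns true if a file whose cells span [fileMin, fileMax] may hold data
     * for the query window [queryMin, queryMax).
     */
    public static boolean mayContain(long fileMin, long fileMax, long queryMin, long queryMax) {
        return fileMin < queryMax && fileMax >= queryMin;
    }

    public static void main(String[] args) {
        // The query asks for timestamps 100 and 101 only.
        long queryMin = 100;
        long queryMax = 102;
        // Range copied from the snapshot: still includes the dropped cells at 100 and 101.
        boolean staleRange = mayContain(100, 102, queryMin, queryMax);
        // Range recalculated from the single cell that was actually written (ts = 102).
        boolean recalculatedRange = mayContain(102, 102, queryMin, queryMax);
        System.out.println("stale range, must open file:        " + staleRange);
        System.out.println("recalculated range, must open file: " + recalculatedRange);
    }
}
```

Under these assumptions the stale range forces the file to be opened even though it holds nothing in the window, while the recalculated range lets the file be skipped.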




[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190845#comment-16190845
 ] 

ramkrishna.s.vasudevan commented on HBASE-18752:


[~chia7712]
Thanks for the patch. So after this change, will the min and max of the
TimeRange both be the same?



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-03 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190048#comment-16190048
 ] 

Chia-Ping Tsai commented on HBASE-18752:


Ping for reviews~



[jira] [Commented] (HBASE-18752) Recalculate the TimeRange in flushing snapshot to store file

2017-10-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188582#comment-16188582
 ] 

Hadoop QA commented on HBASE-18752:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 54s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 0s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 37m 26s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 95m 35s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 151m 49s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 |
| JIRA Issue | HBASE-18752 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12889977/HBASE-18752.v0.patch |
| Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 914f194d2c4d 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / d35d837 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8892/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8892/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |


This message was automatically generated.


