[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304433#comment-15304433
 ] 

Hudson commented on HBASE-15808:


FAILURE: Integrated in HBase-0.98-matrix #348 (See 
[https://builds.apache.org/job/HBase-0.98-matrix/348/])
HBASE-15808 Reduce potential bulk load intermediate space usage and (apurtell: 
rev a38b633a4bdf171d9a12a600a9b1a22b1f09dec9)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java


> Reduce potential bulk load intermediate space usage and waste
> -
>
> Key: HBASE-15808
> URL: https://issues.apache.org/jira/browse/HBASE-15808
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2, 0.98.20
>
> Attachments: HBASE-15808-v2.patch, HBASE-15808-v3.patch, 
> HBASE-15808.patch
>
>
> If the bulk load input files do not match the existing region boundaries, the 
> files will be split.
> In the unfortunate cases where the files need to be split multiple times,
> the process can consume unnecessary space and can even run out of space.
> Here is an over-simplified example.
> Original size of input files:  
>   consumed space: size --> 300GB
> After a round of splits: 
>   consumed space: size + tmpspace1 --> 300GB + 300GB
> After another round of splits: 
>   consumed space: size + tmpspace1 + tmpspace2 --> 300GB + 300GB + 300GB
> ..
> Currently we don't do any cleanup in the process. At least all the 
> intermediate tmpspace (not the last one) can be deleted in the process.
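
The accounting in the description can be sketched as a small model. This is only an illustration under the same over-simplification as the description: every split round is assumed to rewrite all of the data, so each tmpspace is as large as the input. The class and method names are hypothetical and are not part of HBase.

```java
// Over-simplified model of the space accounting in the issue description.
// Assumption: every split round rewrites all of the data once, so each
// tmpspace is as large as the original input. All names are hypothetical.
final class BulkLoadSpaceModel {

    /** Space held after `rounds` split rounds when no tmpspace is ever deleted. */
    static long consumedWithoutCleanup(long inputSizeGb, int rounds) {
        // original input + one tmpspace per round
        return inputSizeGb * (rounds + 1L);
    }

    /** Space held after `rounds` split rounds when only the last tmpspace is kept. */
    static long consumedWithCleanup(long inputSizeGb, int rounds) {
        // original input + the last tmpspace (once at least one round has run)
        return rounds == 0 ? inputSizeGb : inputSizeGb * 2L;
    }

    public static void main(String[] args) {
        for (int rounds = 0; rounds <= 3; rounds++) {
            System.out.printf("rounds=%d  no cleanup: %dGB  with cleanup: %dGB%n",
                rounds,
                consumedWithoutCleanup(300, rounds),
                consumedWithCleanup(300, rounds));
        }
    }
}
```

With a 300GB input and three rounds, the model gives 1200GB held without cleanup versus 600GB when only the last tmpspace is kept. The figures describe the space held once a round has finished; while a round is running, its input and output coexist.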



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304383#comment-15304383
 ] 

Hudson commented on HBASE-15808:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1220 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1220/])
HBASE-15808 Reduce potential bulk load intermediate space usage and (apurtell: 
rev a38b633a4bdf171d9a12a600a9b1a22b1f09dec9)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java




[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-12 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281536#comment-15281536
 ] 

Matteo Bertozzi commented on HBASE-15808:
-

+1

> Reduce potential bulk load intermediate space usage and waste
> -
>
> Key: HBASE-15808
> URL: https://issues.apache.org/jira/browse/HBASE-15808
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2
>
> Attachments: HBASE-15808-v2.patch, HBASE-15808-v3.patch, 
> HBASE-15808.patch


[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281262#comment-15281262
 ] 

Hadoop QA commented on HBASE-15808:
---

| +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 20s | master passed |
| +1 | compile | 0m 47s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 34s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 57s | master passed |
| +1 | mvneclipse | 0m 16s | master passed |
| +1 | findbugs | 1m 54s | master passed |
| +1 | javadoc | 0m 42s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 35s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 44s | the patch passed |
| +1 | compile | 0m 41s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 41s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | checkstyle | 0m 54s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 9m 1s | Patch does not cause any errors with Hadoop 2.4.1 2.5.2 2.6.0. |
| +1 | findbugs | 2m 14s | the patch passed |
| +1 | javadoc | 0m 28s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 103m 47s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 129m 3s | |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803572/HBASE-15808-v3.patch |
| JIRA Issue | HBASE-15808 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/test_framework/yetus-0.2.1/lib/precommit/personality/hbase.sh |
| git revision | master / 1267f76 |
| Default Java | 1.7.0_79 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/1871/testReport/ |
| modules | C: hbase-server U: hbase-server 

[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-10 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279256#comment-15279256
 ] 

Jerry He commented on HBASE-15808:
--

Sure. I will add a test.

> Reduce potential bulk load intermediate space usage and waste
> -
>
> Key: HBASE-15808
> URL: https://issues.apache.org/jira/browse/HBASE-15808
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2
>
> Attachments: HBASE-15808-v2.patch, HBASE-15808.patch


[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279179#comment-15279179
 ] 

Hadoop QA commented on HBASE-15808:
---

| -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 3m 59s | master passed |
| +1 | compile | 1m 3s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 44s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 1m 14s | master passed |
| +1 | mvneclipse | 0m 26s | master passed |
| +1 | findbugs | 3m 6s | master passed |
| +1 | javadoc | 0m 48s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 47s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 2m 9s | the patch passed with JDK v1.8.0 |
| +1 | javac | 2m 9s | the patch passed |
| +1 | compile | 1m 32s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 1m 32s | the patch passed |
| +1 | checkstyle | 1m 55s | the patch passed |
| +1 | mvneclipse | 0m 43s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 16m 21s | Patch does not cause any errors with Hadoop 2.4.1 2.5.2 2.6.0. |
| +1 | findbugs | 2m 54s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 46s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 131m 38s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 172m 39s | |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803303/HBASE-15808-v2.patch |
| JIRA Issue | HBASE-15808 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux pomona.apache.org 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/test_framework/yetus-0.2.1/lib/precommit/personality/hbase.sh |
| git revision | master / a11091c |
| Default Java | 1.7.0_79 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
| findbugs | v3.0.0 |
| Test Results | 

[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-10 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278799#comment-15278799
 ] 

Matteo Bertozzi commented on HBASE-15808:
-

Patch looks OK to me. Checkstyle may complain about the alignment of that 
first try/catch.

Any chance we can have a unit test, so that we will notice if someone removes 
it later?
It looks like we already have some tests that do splits in 
TestLoadIncrementalHFiles, but it may not be trivial to find and check the 
files with what we have today.
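
The kind of check asked for here can be sketched on a local filesystem. This is a hypothetical stand-in, not the test that was added to HBase: it does not use LoadIncrementalHFiles or any HBase API, and the directory names, the single "part" file per round, and the helper names are all assumptions made for illustration. Each simulated split round writes a new directory and then deletes the previous round's output, so only the last intermediate directory should remain.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-filesystem sketch; no HBase classes are involved.
final class IntermediateDirCheck {

    /** Recursively deletes a directory tree, children before parents. */
    static void deleteTree(Path dir) throws IOException {
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse lexicographic order puts every child before its parent.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    /**
     * Simulates `rounds` split rounds under `root`. Each round writes its output
     * to a new directory and then removes the previous round's directory.
     * Returns the last directory written, or null if no round ran.
     */
    static Path runSplitRounds(Path root, int rounds) throws IOException {
        Path previous = null;
        for (int i = 1; i <= rounds; i++) {
            Path current = Files.createDirectory(root.resolve("tmp-round-" + i));
            Files.write(current.resolve("part"), new byte[] {1, 2, 3});
            if (previous != null) {
                deleteTree(previous); // earlier intermediate output is no longer needed
            }
            previous = current;
        }
        return previous;
    }

    /** Counts the direct children of a directory. */
    static long countChildren(Path dir) throws IOException {
        try (Stream<Path> children = Files.list(dir)) {
            return children.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("bulkload-sketch");
        Path last = runSplitRounds(root, 3);
        System.out.println("remaining directories: " + countChildren(root));
        System.out.println("last intermediate directory: " + last.getFileName());
        deleteTree(root);
    }
}
```

A regression test built along these lines would fail as soon as the deletion step is removed, because more than one intermediate directory would be left under the root.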

> Reduce potential bulk load intermediate space usage and waste
> -
>
> Key: HBASE-15808
> URL: https://issues.apache.org/jira/browse/HBASE-15808
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2
>
> Attachments: HBASE-15808-v2.patch, HBASE-15808.patch


[jira] [Commented] (HBASE-15808) Reduce potential bulk load intermediate space usage and waste

2016-05-10 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278782#comment-15278782
 ] 

Jerry He commented on HBASE-15808:
--

After HBASE-12375,  _tmp is not safe anymore.
Attached v2.

[~mbertozzi]: Would you mind taking a look? Thanks.

> Reduce potential bulk load intermediate space usage and waste
> -
>
> Key: HBASE-15808
> URL: https://issues.apache.org/jira/browse/HBASE-15808
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.2.0
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Minor
> Attachments: HBASE-15808.patch