[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967365#comment-13967365
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in hbase-0.96-hadoop2 #261 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/261/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.96 (Kashif) (tedyu: rev 1586704)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java


> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
> HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch
>
>
> Currently, one has to specify the data block encoding of the table explicitly
> using the config parameter
> "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk
> load. This option is easily missed, undocumented, and works differently
> from compression, block size and bloom filter type, which are auto detected.
> The solution would be to add support for auto detecting the datablock
> encoding, as is done for the other parameters.
> The current patch does the following:
> 1. Automatically detects datablock encoding in HFileOutputFormat.
> 2. Keeps the legacy option of manually specifying the datablock encoding
> around as a way to override auto detection.
> 3. Moves string conf parsing to the start of the program so that it fails
> fast at startup instead of failing during record writes. It also
> makes the internals of the program type safe.
> 4. Adds missing doc strings and unit tests for the code that serializes and
> deserializes config parameters for bloom filter type, block size and
> datablock encoding.
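
The behaviour described in the four points above can be sketched roughly as
follows. This is a minimal illustration, not the code from the patch: plain
maps stand in for the Hadoop Configuration object, and the Encoding enum, the
FAMILY_KEY name and the "family=ENCODING&family=ENCODING" string format are
all assumptions made for the example. Only the override key is taken from the
issue description.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: plain maps stand in for the job configuration. */
final class EncodingConfigSketch {

    /** Hypothetical stand-in for HBase's data block encoding enum. */
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    /** Override key quoted in the issue description. */
    static final String OVERRIDE_KEY =
        "hbase.mapreduce.hfileoutputformat.datablock.encoding";

    /** Hypothetical key holding the auto-detected per-family encodings. */
    static final String FAMILY_KEY = "sketch.families.datablock.encoding";

    /** Serializes family -> encoding as "cf1=DIFF&cf2=PREFIX".
     *  Assumes family names contain neither '&' nor '='. */
    static String serialize(Map<String, Encoding> byFamily) {
        StringBuilder sb = new StringBuilder();
        // TreeMap gives a stable, sorted order for the serialized form.
        for (Map.Entry<String, Encoding> e : new TreeMap<>(byFamily).entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue().name());
        }
        return sb.toString();
    }

    /** Parses the string form back into typed values.
     *  Throws IllegalArgumentException on a malformed entry or unknown name. */
    static Map<String, Encoding> deserialize(String serialized) {
        Map<String, Encoding> out = new TreeMap<>();
        if (serialized.isEmpty()) {
            return out;
        }
        for (String pair : serialized.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            out.put(kv[0], Encoding.valueOf(kv[1]));
        }
        return out;
    }

    /** Intended to run once at job setup, so that a bad value fails
     *  before any record is written. The manual override wins if present. */
    static Encoding resolve(Map<String, String> conf, String family) {
        String override = conf.get(OVERRIDE_KEY);
        if (override != null) {
            // valueOf throws IllegalArgumentException for an unknown name.
            return Encoding.valueOf(override);
        }
        String detected = conf.getOrDefault(FAMILY_KEY, "");
        return deserialize(detected).getOrDefault(family, Encoding.NONE);
    }
}
```

In this sketch the job setup code would call serialize() on whatever it reads
from the table's column families and store the result under FAMILY_KEY, and
each writer would call resolve() once per family before writing. Parsing into
the enum up front is what keeps the rest of the code typed rather than
string-based.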



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967045#comment-13967045
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in hbase-0.96 #379 (See 
[https://builds.apache.org/job/hbase-0.96/379/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.96 (Kashif) (tedyu: rev 1586704)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966950#comment-13966950
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in HBase-0.94-JDK7 #105 (See 
[https://builds.apache.org/job/HBase-0.94-JDK7/105/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966923#comment-13966923
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in HBase-0.94 #1339 (See 
[https://builds.apache.org/job/HBase-0.94/1339/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966814#comment-13966814
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in HBase-0.94-security #458 (See 
[https://builds.apache.org/job/HBase-0.94-security/458/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966812#comment-13966812
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in HBase-0.94-on-Hadoop-2 #65 (See 
[https://builds.apache.org/job/HBase-0.94-on-Hadoop-2/65/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in 
HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961878#comment-13961878
 ] 

Ted Yu commented on HBASE-10323:


Created HBASE-10921 for the backport.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-04-07 Thread Kashif J S (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961825#comment-13961825
 ] 

Kashif J S commented on HBASE-10323:


Is there any reason why this has not been integrated into the 0.94.* versions yet?



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877116#comment-13877116
 ] 

Hudson commented on HBASE-10323:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #89 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/89/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 
1559845)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877081#comment-13877081
 ] 

Hudson commented on HBASE-10323:


SUCCESS: Integrated in HBase-0.98 #96 (See 
[https://builds.apache.org/job/HBase-0.98/96/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 
1559845)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877051#comment-13877051
 ] 

Hudson commented on HBASE-10323:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #60 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/60/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 
1559771)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876903#comment-13876903
 ] 

Andrew Purtell commented on HBASE-10323:


+1 for 0.98



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876684#comment-13876684
 ] 

Hudson commented on HBASE-10323:


SUCCESS: Integrated in HBase-TRUNK #4837 (See 
[https://builds.apache.org/job/HBase-TRUNK/4837/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 
1559771)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876564#comment-13876564
 ] 

Ted Yu commented on HBASE-10323:


Integrated to trunk.

Thanks for the patch, Ishan.

Thanks for the review, Nick.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876114#comment-13876114
 ] 

Ted Yu commented on HBASE-10323:


[~apurtell]:
Do you want this in 0.98?

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
> HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876079#comment-13876079
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12623875/HBASE_10323-trunk-v4.patch
  against trunk revision .
  ATTACHMENT ID: 12623875

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//console

This message is automatically generated.

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
> HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876054#comment-13876054
 ] 

Ishan Chhabra commented on HBASE-10323:
---

Added the @VisibleForTesting annotations where needed and fixed the '{' on a 
new line. I didn't make the constants package-private since no other class needs 
them at the moment. When some other class in the package or a test needs them, 
they can be made package-private then. 

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
> HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-16 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873681#comment-13873681
 ] 

Nick Dimiduk commented on HBASE-10323:
--

I reviewed trunk-v3; the patch looks really good. I have a couple of nits that 
can be cleaned up on commit:
 - leave the "hbase.hfileoutputformat.*" constants at the default access 
modifier so that they're available within the package if necessary.
 - consider using the @VisibleForTesting annotation.
 - fix the code formatting where '{' is on a new line in some of the method 
definitions.

Really nice cleanup, [~ishanc]. I like the additional docstrings.

+1

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, 
> HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-15 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873071#comment-13873071
 ] 

Ishan Chhabra commented on HBASE-10323:
---

Can someone else take a look and +1? [~lhofhansl]?

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, 
> HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-15 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872544#comment-13872544
 ] 

Nick Dimiduk commented on HBASE-10323:
--

I'm in favor of autodetecting where possible, so long as we provide override 
options for when the system gets it wrong.
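
The precedence described here can be sketched as follows. This is only an illustration under stated assumptions: the detected encodings are given as a family-to-name map, and a single manually specified value, when present, applies to every family. The class and method names are hypothetical and are not taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of "auto-detect, but let an explicit setting win".
final class EncodingResolutionSketch {
    private EncodingResolutionSketch() {}

    /**
     * Picks the encoding name to use for each column family.
     * A non-null, non-empty override applies to every family; otherwise
     * the detected per-family value is kept as-is.
     */
    static Map<String, String> resolve(Map<String, String> detectedByFamily, String override) {
        boolean hasOverride = override != null && !override.isEmpty();
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : detectedByFamily.entrySet()) {
            resolved.put(e.getKey(), hasOverride ? override : e.getValue());
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> detected = new LinkedHashMap<>();
        detected.put("cf1", "FAST_DIFF");
        detected.put("cf2", "NONE");
        System.out.println(resolve(detected, null));     // detection only
        System.out.println(resolve(detected, "PREFIX")); // manual override
    }
}
```

Keeping the override as a separate input, rather than mutating the detected map, makes it easy to see in one place which source won.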

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, 
> HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871712#comment-13871712
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12623052/HBASE_10323-trunk-v3.patch
  against trunk revision .
  ATTACHMENT ID: 12623052

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8432//console

This message is automatically generated.

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, 
> HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871468#comment-13871468
 ] 

stack commented on HBASE-10323:
---

Looks good on a quick scan.  [~ndimiduk] You like this one?

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0
>
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869325#comment-13869325
 ] 

Ishan Chhabra commented on HBASE-10323:
---

I was able to run the Maven site goal successfully on my box. I can't figure 
out why it is failing from the console output. Can somebody help?

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869311#comment-13869311
 ] 

Ted Yu commented on HBASE-10323:


+1

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869310#comment-13869310
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622576/HBASE_10323-trunk-v2.patch
  against trunk revision .
  ATTACHMENT ID: 12622576

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8399//console

This message is automatically generated.

> Auto detect data block encoding in HFileOutputFormat
> 
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Attachments: HBASE_10323-0.94.15-v1.patch, 
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch
>
>


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869275#comment-13869275
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622575/HBASE_10323-trunk-v2.patch
  against trunk revision .
  ATTACHMENT ID: 12622575

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8398//console

This message is automatically generated.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869215#comment-13869215
 ] 

Ted Yu commented on HBASE-10323:


{code}
+   * @return a map from column family to the name of the configured compression
+   * algorithm
+   */
+  static Map createFamilyCompressionMap(Configuration
{code}
Looks like the description, 'to the name', is out of sync with the value type of the 
Map.
A similar comment holds for the return value javadoc of createFamilyBloomTypeMap().
{code}
+   * @return a map from column family to HfileDataBlockEncoder for the
{code}
nit: the f in HfileDataBlockEncoder should be uppercase



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869209#comment-13869209
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622554/HBASE_10323-trunk-v1.patch
  against trunk revision .
  ATTACHMENT ID: 12622554

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8395//console

This message is automatically generated.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869184#comment-13869184
 ] 

Ishan Chhabra commented on HBASE-10323:
---

Added javadoc for parameters and uploaded patch for trunk.

[~lhofhansl], what else should be auto detected? I can add that as a part of 
this or a separate JIRA.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869148#comment-13869148
 ] 

Lars Hofhansl commented on HBASE-10323:
---

These are related: HBASE-8949, HBASE-3474. We should simply autodetect everything 
that is configurable for a table and/or CF and use that.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869146#comment-13869146
 ] 

Lars Hofhansl commented on HBASE-10323:
---

I faintly remember a discussion around this on another jira... Can't find that 
right now.



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869105#comment-13869105
 ] 

Ted Yu commented on HBASE-10323:


{code}
+   * @throws IOException
+   *   on failure to read column family descriptors
+   */
+  static void configureDataBlockEncoding(HTable table,
+  Configuration conf) throws IOException {
{code}
Please add javadoc for the parameters.
Mind attaching a patch for trunk?
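
As an illustration of the kind of parameter javadoc being requested, a sketch might look like the following. It is not the text that went into the patch. `FamilySource`, `CONF_KEY` and the use of `java.util.Properties` in place of the Hadoop `Configuration` are hypothetical stand-ins so that the example compiles without HBase on the classpath, and family names are written without escaping to keep it short.

```java
import java.io.IOException;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; only the method name and the @throws text echo the quoted snippet.
final class JavadocParamsSketch {

  /** Hypothetical stand-in for the table handle; not an HBase type. */
  interface FamilySource {
    /** @return column family name mapped to its data block encoding name */
    Map<String, String> readFamilyEncodings() throws IOException;
  }

  /** Hypothetical configuration key used only by this sketch. */
  static final String CONF_KEY = "sketch.families.datablock.encoding";

  /**
   * Serializes the data block encoding of every column family into the job
   * configuration so that it can be read back when the files are written.
   *
   * @param table source of the column family descriptors to read
   * @param conf  configuration that receives the serialized family-to-encoding map
   * @throws IOException
   *           on failure to read column family descriptors
   */
  static void configureDataBlockEncoding(FamilySource table, Properties conf)
      throws IOException {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : table.readFamilyEncodings().entrySet()) {
      if (sb.length() > 0) {
        sb.append('&');
      }
      // No escaping here; a real implementation would need to handle '&' and '='.
      sb.append(e.getKey()).append('=').append(e.getValue());
    }
    conf.setProperty(CONF_KEY, sb.toString());
  }
}
```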



[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868965#comment-13868965
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12622533/HBASE_10323-0.94.15-v1.patch
  against trunk revision .
  ATTACHMENT ID: 12622533

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8391//console

This message is automatically generated.
