[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967365#comment-13967365 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in hbase-0.96-hadoop2 #261 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/261/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.96 (Kashif) (tedyu: rev 1586704)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java

> Auto detect data block encoding in HFileOutputFormat
>
> Key: HBASE-10323
> URL: https://issues.apache.org/jira/browse/HBASE-10323
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.98.0, 0.99.0
> Attachments: HBASE_10323-0.94.15-v1.patch,
> HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch,
> HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch,
> HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch,
> HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch
>
> Currently, one has to specify the data block encoding of the table explicitly
> using the config parameter
> "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk
> load. This option is easily missed, is not documented, and works differently
> from compression, block size, and bloom filter type, which are auto-detected.
> The solution would be to add support for auto-detecting the data block
> encoding, as is done for the other parameters.
> The current patch does the following:
> 1. Automatically detects the data block encoding in HFileOutputFormat.
> 2. Keeps the legacy option of manually specifying the data block encoding
> as a way to override auto-detection.
> 3. Moves string conf parsing to the start of the program so that it fails
> fast during startup instead of failing during record writes. It also
> makes the internals of the program type safe.
> 4. Adds missing doc strings and unit tests for the code that serializes and
> deserializes config parameters for bloom filter type, block size, and
> data block encoding.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
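Points 2 to 4 of the issue description above can be illustrated with a small, dependency-free sketch. This is not the code from the attached patches. The class `EncodingConfSketch`, the local `Encoding` enum (a stand-in for HBase's `DataBlockEncoding`), the `family=VALUE&family=VALUE` string format, and all method names are assumptions made purely for illustration. The sketch shows a per-family setting serialized into one config string, parsed eagerly into a typed map so that a bad value fails at job setup, and an explicit override taking precedence over the detected value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class EncodingConfSketch {
    // Hypothetical stand-in for HBase's DataBlockEncoding enum.
    enum Encoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Serializes family -> encoding as "family=ENCODING&family2=ENCODING".
    // The format is an assumption; family names are URL-encoded so that
    // '&' and '=' inside a name cannot break the parsing.
    static String serialize(Map<String, Encoding> byFamily) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Encoding> e : byFamily.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(urlEncode(e.getKey())).append('=').append(e.getValue().name());
        }
        return sb.toString();
    }

    // Parses the whole string up front into a typed map. An unknown encoding
    // name throws IllegalArgumentException here, at setup time, rather than
    // later while records are being written.
    static Map<String, Encoding> parse(String serialized) {
        Map<String, Encoding> result = new LinkedHashMap<>();
        if (serialized == null || serialized.isEmpty()) {
            return result;
        }
        for (String pair : serialized.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            result.put(urlDecode(kv[0]), Encoding.valueOf(kv[1]));
        }
        return result;
    }

    // An explicit override, when present, wins over the detected per-family
    // value; families with no detected value fall back to NONE.
    static Encoding resolve(String override, Map<String, Encoding> detected, String family) {
        if (override != null) {
            return Encoding.valueOf(override);
        }
        Encoding e = detected.get(family);
        return e != null ? e : Encoding.NONE;
    }

    private static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    private static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, Encoding> detected = new LinkedHashMap<>();
        detected.put("cf1", Encoding.FAST_DIFF);
        detected.put("cf2", Encoding.NONE);
        System.out.println(serialize(detected)); // prints "cf1=FAST_DIFF&cf2=NONE"
    }
}
```

Parsing into an enum-valued map once, before any records are written, is what gives both the fail-fast behaviour and the type safety that point 3 mentions: code downstream of `parse` never handles raw strings.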
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967045#comment-13967045 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in hbase-0.96 #379 (See [https://builds.apache.org/job/hbase-0.96/379/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.96 (Kashif) (tedyu: rev 1586704)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966950#comment-13966950 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in HBase-0.94-JDK7 #105 (See [https://builds.apache.org/job/HBase-0.94-JDK7/105/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966923#comment-13966923 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in HBase-0.94 #1339 (See [https://builds.apache.org/job/HBase-0.94/1339/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966814#comment-13966814 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in HBase-0.94-security #458 (See [https://builds.apache.org/job/HBase-0.94-security/458/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966812#comment-13966812 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in HBase-0.94-on-Hadoop-2 #65 (See [https://builds.apache.org/job/HBase-0.94-on-Hadoop-2/65/])
HBASE-10921 Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.94 (Kashif) (tedyu: rev 1586701)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961878#comment-13961878 ]

Ted Yu commented on HBASE-10323:

Created HBASE-10921 for the backport.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961825#comment-13961825 ]

Kashif J S commented on HBASE-10323:

Any reason why this has not been integrated to 0.94.* versions yet ?
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877116#comment-13877116 ]

Hudson commented on HBASE-10323:

FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #89 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/89/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 1559845)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877081#comment-13877081 ]

Hudson commented on HBASE-10323:

SUCCESS: Integrated in HBase-0.98 #96 (See [https://builds.apache.org/job/HBase-0.98/96/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 1559845)
* /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877051#comment-13877051 ]

Hudson commented on HBASE-10323:

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #60 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/60/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 1559771)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876903#comment-13876903 ]

Andrew Purtell commented on HBASE-10323:

+1 for 0.98

> Fix For: 0.99.0
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876684#comment-13876684 ]

Hudson commented on HBASE-10323:

SUCCESS: Integrated in HBase-TRUNK #4837 (See [https://builds.apache.org/job/HBase-TRUNK/4837/])
HBASE-10323 Auto detect data block encoding in HFileOutputFormat (Tedyu: rev 1559771)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java

> Fix For: 0.99.0
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876564#comment-13876564 ]

Ted Yu commented on HBASE-10323:

Integrated to trunk. Thanks for the patch, Ishan. Thanks for the review, Nick.

> Fix For: 0.99.0
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876114#comment-13876114 ]

Ted Yu commented on HBASE-10323:

[~apurtell]:
Do you want this in 0.98 ?
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876079#comment-13876079 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12623875/HBASE_10323-trunk-v4.patch
against trunk revision .
ATTACHMENT ID: 12623875

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//console

This message is automatically generated.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876054#comment-13876054 ]

Ishan Chhabra commented on HBASE-10323:

Added the @VisibleForTesting annotations where needed and fixed the '{' on a new line.

I didn't make the constants package-private since no other class needs them at the moment. When some other class in the package or a test needs them, they can be made package-private then.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873681#comment-13873681 ]

Nick Dimiduk commented on HBASE-10323:

I reviewed trunk-v3; the patch looks really good. I have a couple of nits that can be cleaned up on commit:
- leave the "hbase.hfileoutputformat.*" constants at the default access modifier so that they're available within the package if necessary.
- consider using the @VisibleForTesting annotation
- code formatting style with '{' on a newline in some of the method definitions.

Really nice cleanup, [~ishanc]. I like the additional docstrings.

+1
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873071#comment-13873071 ]

Ishan Chhabra commented on HBASE-10323:

Can someone else take a look and +1? [~lhofhansl]?
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872544#comment-13872544 ]

Nick Dimiduk commented on HBASE-10323:

I'm in favor of autodetecting where possible, so long as we provide override options for when the system gets it wrong.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871712#comment-13871712 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12623052/HBASE_10323-trunk-v3.patch
against trunk revision .
ATTACHMENT ID: 12623052

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8432//console

This message is automatically generated.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871468#comment-13871468 ]

stack commented on HBASE-10323:

Looks good on a quick scan. [~ndimiduk] You like this one?
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869325#comment-13869325 ]

Ishan Chhabra commented on HBASE-10323:

I was able to run the mvn site goal successfully on my box. I can't figure out from the console output why it is failing. Can somebody help?
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869311#comment-13869311 ]

Ted Yu commented on HBASE-10323:

+1
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869310#comment-13869310 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12622576/HBASE_10323-trunk-v2.patch
against trunk revision .
ATTACHMENT ID: 12622576

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8399//console

This message is automatically generated.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869275#comment-13869275 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12622575/HBASE_10323-trunk-v2.patch
against trunk revision .
ATTACHMENT ID: 12622575

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8398//console

This message is automatically generated.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869215#comment-13869215 ]

Ted Yu commented on HBASE-10323:

{code}
+   * @return a map from column family to the name of the configured compression
+   *         algorithm
+   */
+  static Map createFamilyCompressionMap(Configuration
{code}
The description, 'to the name', looks out of sync with the type of the value in the Map.
A similar comment holds for the return-value javadoc of createFamilyBloomTypeMap().
{code}
+   * @return a map from column family to HfileDataBlockEncoder for the
{code}
nit: the 'f' in HfileDataBlockEncoder should be uppercase (HFileDataBlockEncoder).
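The review point above is that the `@return` javadoc should describe what the map actually holds. A minimal sketch of a parser whose javadoc matches a typed map value, assuming a `family=value&family=value` string format and a hypothetical `Algorithm` enum (both are assumptions for illustration, not taken from the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the enum and the string format are assumptions.
class FamilyCompressionSketch {
    /** Hypothetical stand-in for a compression algorithm type. */
    public enum Algorithm { NONE, GZ, SNAPPY }

    /**
     * Parses a serialized per-family compression setting.
     *
     * @param serialized entries of the form family=ALGORITHM joined by '&amp;';
     *                   null or empty yields an empty map
     * @return a map from column family to the configured compression
     *         algorithm (the typed value, not its name)
     * @throws IllegalArgumentException if an entry is malformed or names an
     *                                  unknown algorithm
     */
    public static Map<String, Algorithm> createFamilyCompressionMap(String serialized) {
        Map<String, Algorithm> result = new LinkedHashMap<>();
        if (serialized == null || serialized.isEmpty()) {
            return result;
        }
        for (String pair : serialized.split("&")) {
            String[] familyAndValue = pair.split("=", 2);
            if (familyAndValue.length != 2) {
                throw new IllegalArgumentException("Malformed entry: " + pair);
            }
            // Converting to the enum here keeps callers type safe.
            result.put(familyAndValue[0], Algorithm.valueOf(familyAndValue[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(createFamilyCompressionMap("cf1=GZ&cf2=SNAPPY"));
    }
}
```

Returning the enum rather than its name means the javadoc and the signature describe the same thing, which is the mismatch the review comment flags.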
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869209#comment-13869209 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12622554/HBASE_10323-trunk-v1.patch
against trunk revision .
ATTACHMENT ID: 12622554

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8395//console

This message is automatically generated.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869184#comment-13869184 ]

Ishan Chhabra commented on HBASE-10323:

Added javadoc for parameters and uploaded a patch for trunk.
[~lhofhansl], what else should be auto detected? I can add that as part of this issue or in a separate JIRA.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869148#comment-13869148 ]

Lars Hofhansl commented on HBASE-10323:

These are related: HBASE-8949, HBASE-3474.
We should simply autodetect everything that is configurable for a table and/or CF and use that.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869146#comment-13869146 ]

Lars Hofhansl commented on HBASE-10323:

I faintly remember a discussion around this on another jira... Can't find that right now.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869105#comment-13869105 ]

Ted Yu commented on HBASE-10323:

{code}
+   * @throws IOException
+   *           on failure to read column family descriptors
+   */
+  static void configureDataBlockEncoding(HTable table,
+      Configuration conf) throws IOException {
{code}
Please add javadoc for the parameters.

Mind attaching a patch for trunk?
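The request above is for `@param` javadoc on a method that records per-family encodings in the job configuration. A rough sketch of what that could look like, assuming plain `Map` objects in place of `HTable` and `Configuration`, a URL-encoded `family=value&family=value` string format, and a made-up configuration key (all of these are assumptions for illustration and are not taken from the patch):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; Maps stand in for HTable and Configuration.
class DataBlockEncodingConfSketch {
    /** Made-up key, used only in this sketch. */
    public static final String CONF_KEY = "sketch.datablock.encoding.per.family";

    /**
     * Serializes each column family's data block encoding into the
     * configuration so that writers can look it up later.
     *
     * @param familyEncodings column family name to encoding name; stands in
     *                        for the column family descriptors of a table
     * @param conf            key/value store that receives the serialized
     *                        string; stands in for the job configuration
     */
    public static void configureDataBlockEncoding(Map<String, String> familyEncodings,
                                                  Map<String, String> conf) {
        StringBuilder serialized = new StringBuilder();
        // A sorted copy keeps the serialized string deterministic.
        for (Map.Entry<String, String> e : new TreeMap<>(familyEncodings).entrySet()) {
            if (serialized.length() > 0) {
                serialized.append('&');
            }
            serialized.append(urlEncode(e.getKey()))
                      .append('=')
                      .append(urlEncode(e.getValue()));
        }
        conf.put(CONF_KEY, serialized.toString());
    }

    /** URL-encodes so that '&' and '=' inside names cannot break the format. */
    private static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> families = new HashMap<>();
        families.put("cf1", "FAST_DIFF");
        Map<String, String> conf = new HashMap<>();
        configureDataBlockEncoding(families, conf);
        System.out.println(conf.get(CONF_KEY));
    }
}
```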
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868965#comment-13868965 ]

Hadoop QA commented on HBASE-10323:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12622533/HBASE_10323-0.94.15-v1.patch
against trunk revision .
ATTACHMENT ID: 12622533

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8391//console

This message is automatically generated.