Allow access to Primitive types stored in binary format in HBase
----------------------------------------------------------------
Key: HIVE-1634
URL: https://issues.apache.org/jira/browse/HIVE-1634
Project: Hadoop Hive
Issue Type: Improvement
Components: HBase Handler
Affects Versions: 0.7.0
Reporter: Basab Maulik
Assignee: Basab Maulik
This addresses HIVE-1245 in part, for atomic or primitive types.
The serde property "hbase.columns.storage.types" (for example,
"-,b,b,b,b,b,b,b,b") specifies the storage option for each corresponding column
in the serde property "hbase.columns.mapping". Allowed values are '-' for the
table default, 's' for standard string storage, and 'b' for binary storage as
would be obtained from o.a.h.hbase.util.Bytes. Map types for HBase column
families use a colon-separated pair such as 's:b', which gives the key and value
specifiers respectively. See the test cases and queries for the HBase handler
for additional examples.
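To make the specification format concrete, here is a minimal Python sketch (not
the handler's code) that pairs each entry of "hbase.columns.mapping" with its
specifier from "hbase.columns.storage.types". The function name
parse_storage_types and the returned tuple layout are hypothetical, and the
requirement that both lists have the same length is an assumption.

```python
VALID_SPECIFIERS = {"-", "s", "b"}


def parse_storage_types(columns_mapping, storage_types):
    """Pair each mapped column with its storage specifier(s).

    Hypothetical helper: returns a list of (column, specifiers) tuples, where
    specifiers has one element for a plain column and two (key, value) for a
    map column such as 's:b'.
    """
    columns = columns_mapping.split(",")
    specs = storage_types.split(",")
    # Assumption: one specifier is required per mapped column.
    if len(columns) != len(specs):
        raise ValueError("expected one storage specifier per mapped column")

    pairs = []
    for column, spec in zip(columns, specs):
        parts = tuple(spec.split(":"))
        if len(parts) > 2 or any(p not in VALID_SPECIFIERS for p in parts):
            raise ValueError("invalid storage specifier: %r" % spec)
        pairs.append((column, parts))
    return pairs
```

For instance, parse_storage_types(":key,cf:", "s,s:b") pairs the row key with
('s',) and the column family "cf:" with ('s', 'b').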
There is also a table property, for example
"hbase.table.default.storage.type" = "string", that sets a table-level default
storage type. The other valid value is "binary". A column-level specification
overrides the table-level default.
This control is available for the boolean, tinyint, smallint, int, bigint,
float, and double primitive types. The attached patch also relaxes the mapping
of map types to HBase column families to allow any primitive type to be the map
key.
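The precedence between the column-level specifier and the table-level default
can be sketched as follows. This is an illustrative Python model, not the
patch's implementation: the function effective_storage is hypothetical, and two
behaviours are assumptions rather than statements from this issue: that an
unset table default means "string", and that types outside the listed
primitives are always read as strings.

```python
# Primitive types for which the issue says storage control is available.
BINARY_CAPABLE = {"boolean", "tinyint", "smallint", "int", "bigint",
                  "float", "double"}
LONG_NAMES = {"s": "string", "b": "binary"}


def effective_storage(hive_type, column_spec="-", table_default="string"):
    """Return 'string' or 'binary' for one column (hypothetical helper)."""
    if table_default not in ("string", "binary"):
        raise ValueError("invalid table default: %r" % table_default)
    if column_spec != "-" and column_spec not in LONG_NAMES:
        raise ValueError("invalid column specifier: %r" % column_spec)

    # Assumption: other types (e.g. string) are unaffected by the setting.
    if hive_type not in BINARY_CAPABLE:
        return "string"
    # '-' defers to the table default; 's' or 'b' overrides it.
    if column_spec == "-":
        return table_default
    return LONG_NAMES[column_spec]
```

Under these assumptions, effective_storage("int", "s", "binary") yields
"string" because the column-level specifier wins over the table default.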
Attached is a program for creating a table and populating it in HBase. The
external table in Hive can access the data as shown in the example below.
hive> create external table TestHiveHBaseExternalTable
> (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> c_int int, c_long bigint, c_string string, c_float float, c_double double)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties ("hbase.columns.mapping" =
":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
> tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
OK
Time taken: 0.691 seconds
hive> select * from TestHiveHBaseExternalTable;
OK
key-1 NULL NULL NULL NULL NULL Test-String NULL NULL
Time taken: 0.346 seconds
hive> drop table TestHiveHBaseExternalTable;
OK
Time taken: 0.139 seconds
hive> create external table TestHiveHBaseExternalTable
> (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> c_int int, c_long bigint, c_string string, c_float float, c_double double)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties (
> "hbase.columns.mapping" =
":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
> "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
> tblproperties (
> "hbase.table.name" = "TestHiveHBaseExternalTable",
> "hbase.table.default.storage.type" = "string");
OK
Time taken: 0.139 seconds
hive> select * from TestHiveHBaseExternalTable;
OK
key-1 true -128 -32768 -2147483648 -9223372036854775808 Test-String -2.1793132E-11 2.01345E291
Time taken: 0.151 seconds
hive> drop table TestHiveHBaseExternalTable;
OK
Time taken: 0.154 seconds
hive> create external table TestHiveHBaseExternalTable
> (key string, c_bool boolean, c_byte tinyint, c_short smallint,
> c_int int, c_long bigint, c_string string, c_float float, c_double double)
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> with serdeproperties (
> "hbase.columns.mapping" =
":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
> "hbase.columns.storage.types" = "-,b,b,b,b,b,-,b,b" )
> tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
OK
Time taken: 0.347 seconds
hive> select * from TestHiveHBaseExternalTable;
OK
key-1 true -128 -32768 -2147483648 -9223372036854775808 Test-String -2.1793132E-11 2.01345E291
Time taken: 0.245 seconds
hive>
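The first query in the session above returns NULL for every non-string column,
while the later queries return values once the columns are declared binary. The
following Python sketch suggests why, using the int column as an example. It
assumes the data was written as a 4-byte big-endian value (the layout produced
by Bytes.toBytes(int)) and models a failed text parse as None in place of
Hive's NULL; the function names are hypothetical.

```python
import struct


def to_binary_int(value):
    """Encode an int as 4 big-endian bytes (two's complement)."""
    return struct.pack(">i", value)


def read_as_string_int(raw):
    """Interpret the cell as decimal text; None stands in for NULL."""
    try:
        return int(raw.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None


def read_as_binary_int(raw):
    """Interpret the cell as a 4-byte big-endian int."""
    return struct.unpack(">i", raw)[0]


cell = to_binary_int(-2147483648)
as_string = read_as_string_int(cell)   # string storage: the parse fails
as_binary = read_as_binary_int(cell)   # binary storage: the value is recovered
```

Here as_string is None while as_binary is -2147483648, which is consistent with
the NULL and -2147483648 shown for c_int in the transcript.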