[ https://issues.apache.org/jira/browse/HIVE-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642511#comment-14642511 ]
Wojciech Indyk commented on HIVE-11329:
---------------------------------------

[~swarnim]
Example:
I define a map prefix "loc_" for the geographical locations of a row in HBase. I create a table in Hive:

CREATE EXTERNAL TABLE xyz(id int, locations map<string,string>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,source:loc_.*")

Then I want to query:
select id from xyz where locations['California'] is not null
instead of
select id from xyz where locations['loc_California'] is not null

Moreover, if I query:
select id, locations from xyz where locations['California'] is not null
I would like to have a result like:
1, {California:state, New York:state}
instead of
1, {loc_California:state, loc_New York:state}

In general: I don't want to receive the prefix for each element of the map in Hive. I already know what the prefix for the map is (it is defined in SERDEPROPERTIES). Prefixed data is hard to use with other data sources, e.g. IP-to-geolocation libraries. All in all, it is easier to integrate data without prefixes. IMO prefixes are an artificial structure (like a 'super-column') that exists to optimize queries and to make it possible to store a map in HBase. That's why I want to cut the prefixes.
I know the solution of mapping a column family to a Hive map, but HBase doesn't support more than 2 CFs well, and I need ~10 maps per row.

I think the idea with a flag is very good. IMO it could be a flag defined when creating the table.

> Column prefix in key of hbase column prefix map
> -----------------------------------------------
>
>                 Key: HIVE-11329
>                 URL: https://issues.apache.org/jira/browse/HIVE-11329
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 0.14.0
>            Reporter: Wojciech Indyk
>            Assignee: Wojciech Indyk
>            Priority: Minor
>         Attachments: HIVE-11329.1.patch
>
>
> When I create a table with an hbase column prefix
> (https://issues.apache.org/jira/browse/HIVE-3725), the prefix is kept in the
> keys of the resulting map in Hive.
> E.g.
> record in HBase:
> rowkey: 123
> column: tag_one, value: 0.5
> column: tag_two, value: 0.5
> representation in Hive via column prefix mapping "tag_.*":
> column: tag map<string,string>
> key: tag_one, value: 0.5
> key: tag_two, value: 0.5
> should be:
> key: one, value: 0.5
> key: two, value: 0.5

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
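
For illustration only, the behavior requested in HIVE-11329 (dropping the column prefix from the map keys) can be sketched in plain Java. This is not the code from HIVE-11329.1.patch and it does not use any Hive or HBase API; the class name PrefixMapSketch and the helper stripPrefix are hypothetical names chosen for this example, and the input map simply stands in for the column qualifiers of one HBase row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrefixMapSketch {

    /**
     * Hypothetical helper (not part of Hive): keeps only the entries whose
     * key starts with the given prefix and removes that prefix from the key.
     */
    public static Map<String, String> stripPrefix(Map<String, String> cells, String prefix) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : cells.entrySet()) {
            String qualifier = cell.getKey();
            if (qualifier.startsWith(prefix)) {
                // "tag_one" with prefix "tag_" becomes "one"
                result.put(qualifier.substring(prefix.length()), cell.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the HBase row from the issue description (rowkey 123).
        Map<String, String> row = new LinkedHashMap<>();
        row.put("tag_one", "0.5");
        row.put("tag_two", "0.5");
        row.put("other", "x"); // does not match the prefix, so it is skipped

        System.out.println(stripPrefix(row, "tag_")); // prints {one=0.5, two=0.5}
    }
}
```

With the flag discussed in the comment, a transformation of this kind would presumably be applied only when the table opts in, so that existing tables keep returning prefixed keys.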