Thanks Bejoy. So you mean that in the scenario below we have to use both the
collection and map delimiters together? Do I need to define ARRAY and MAP together for the
same column? As I understand from your mail, this column is not just a MAP but a
collection of maps. Is this assumption right?

Thank You,
Manish.

-----Original Message-----
From: Bejoy KS [mailto:bejoy...@yahoo.com] 
Sent: Friday, September 21, 2012 10:50 AM
To: user@hive.apache.org; user
Subject: Re: Map issue in Hive.



Hi Manish

A couple of things to keep in mind here:

If you have column data like this "key1:value1;key2:value2;key3:value3;" and 
this column has to be handled by a map data type, then the DDL should look like
FIELDS TERMINATED BY '<any char>' 
COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY ':' 

i.e., when you have key-value pairs, the separator between one pair and the next is 
specified using 'COLLECTION ITEMS TERMINATED BY', and the separator between the key and 
the value within each pair is specified using 'MAP KEYS TERMINATED BY'.

If your column is just a collection of elements rather than key-value 
pairs, you can use an Array data type instead. In that case, just specify the delimiter 
between the values using 'COLLECTION ITEMS TERMINATED BY'.
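
To make the two delimiter levels concrete, here is a minimal Python sketch that only mimics the splitting described above; it is not Hive's actual parsing code. The function names parse_map and parse_array are hypothetical, and skipping empty items (such as the one left by a trailing ';') is a simplification of this sketch rather than documented Hive behaviour.

```python
def parse_map(raw, item_sep=";", key_sep=":"):
    """Mimic a MAP column: item_sep plays the role of
    COLLECTION ITEMS TERMINATED BY, key_sep of MAP KEYS TERMINATED BY."""
    result = {}
    for item in raw.split(item_sep):
        if not item:
            continue  # sketch-only simplification: drop empty items
        # Split each pair at the first key separator only.
        key, _, value = item.partition(key_sep)
        result[key] = value
    return result


def parse_array(raw, item_sep=";"):
    """Mimic an ARRAY column: only the collection delimiter is needed."""
    return [item for item in raw.split(item_sep) if item]


print(parse_map("key1:value1;key2:value2;key3:value3;"))
# {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
print(parse_array("a;b;c"))
# ['a', 'b', 'c']
```

If only one of the two separators is supplied for a map, the string is split at one level only, which is why both clauses are needed for a map column.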



Regards,
Bejoy KS


________________________________
From: Manish <manishbh...@rocketmail.com>
To: user <u...@hadoop.apache.org> 
Cc: user <user@hive.apache.org> 
Sent: Friday, September 21, 2012 10:04 AM
Subject: Map issue in Hive.


Hivers,

I have a web log which I need to load into a single table. One column holds a 
complete string of important data, and I want to extract all of the 
information from that one column for further analysis.

The issue is that after giving ';' as the delimiter I was expecting a map entry for every 
occurrence of ';'. But only the first delimiter (;) is considered, and the rest of 
the string ends up in the value.

This is what the data in that column looks like:

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM

It is stored as shown below:

{"ASP.NET_SessionId":"bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM"}

Below is the DDL. 

CREATE external TABLE page_view_tmp_2
(
C_0 STRING,
C_1 MAP<STRING,STRING>,
C_2 STRING,
C_3 STRING,
C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' MAP KEYS TERMINATED BY ';' 
STORED AS TEXTFILE           
