Here is an example, but first – some warnings:

·         You should set textinputformat.record.delimiter not only for the 
populating of the table but also for querying it



·         There seems to be many issues around this area –

o   When I tried to insert multiple values in a single statement (“insert into 
table … values (…),(…),(…)”) only the first set of values was inserted correctly

o   2 new lines were added to the end of each text (‘lyrics’) although there 
should be none.

o   Aggregative queries seems to return null for the last column. Sometimes.

o   The function ‘sentences’ does not work as expected. It treated the whole 
text as a single line.



Dudu





Example


hive> create table songs (id int,name string,lyrics string)
;

hive> set textinputformat.record.delimiter='\0'
;

hive> insert into table songs values
(
1
,'All For Leyna'
,'She stood on the tracks
Waving her arms
Leading me to that third rail shock
Quick as a wink
She changed her mind'
)
;

hive> insert into table songs values
(
2
,'Goodnight Saigon'
,'We met as soul mates
On Parris Island
We left as inmates
From an asylum
And we were sharp
As sharp as knives
And we were so gung ho
To lay down our lives'
)
;

hive> select id,name,length(lyrics) from songs;

1              All For Leyna      114
2              Goodnight Saigon            155


hive> select id,name,hex(lyrics) from songs;

1              All For Leyna                
5368652073746F6F64206F6E2074686520747261636B730A576176696E67206865722061726D730A4C656164696E67206D6520746F2074686174207468697264207261696C2073686F636B0A517569636B20617320612077696E6B0A536865206368616E67656420686572206D696E640A0A
2              Goodnight Saigon                
5765206D657420617320736F756C206D617465730A4F6E205061727269732049736C616E640A5765206C65667420617320696E6D617465730A46726F6D20616E206173796C756D0A416E6420776520776572652073686172700A4173207368617270206173206B6E697665730A416E64207765207765726520736F2067756E6720686F0A546F206C617920646F776E206F7572206C697665730A0A

hive> select id,name,regexp_replace(lyrics,'\n','<<<NEWLINE>>>') from songs;

1              All For Leyna      She stood on the tracks<<<NEWLINE>>>Waving 
her arms<<<NEWLINE>>>Leading me to that third rail shock<<<NEWLINE>>>Quick as a 
wink<<<NEWLINE>>>She changed her mind<<<NEWLINE>>><<<NEWLINE>>>
2              Goodnight Saigon            We met as soul mates<<<NEWLINE>>>On 
Parris Island<<<NEWLINE>>>We left as inmates<<<NEWLINE>>>From an 
asylum<<<NEWLINE>>>And we were sharp<<<NEWLINE>>>As sharp as 
knives<<<NEWLINE>>>And we were so gung ho<<<NEWLINE>>>To lay down our 
lives<<<NEWLINE>>><<<NEWLINE>>>

hive> select id,name,split(lyrics,'\n') from songs;

1              All For Leyna      ["She stood on the tracks","Waving her 
arms","Leading me to that third rail shock","Quick as a wink","She changed her 
mind","",""]
2              Goodnight Saigon            ["We met as soul mates","On Parris 
Island","We left as inmates","From an asylum","And we were sharp","As sharp as 
knives","And we were so gung ho","To lay down our lives","",""]

hive> select id,name,sentences(lyrics) from songs;

1              All For Leyna                
[["She","stood","on","the","tracks","Waving","her","arms","Leading","me","to","that","third","rail","shock","Quick","as","a","wink","She","changed","her","mind"]]
2              Goodnight Saigon                
[["We","met","as","soul","mates","On","Parris","Island","We","left","as","inmates","From","an","asylum","And","we","were","sharp","As","sharp","as","knives","And","we","were","so","gung","ho","To","lay","down","our","lives"]]

hive> select count (*) from songs;

NULL

hive> select count (*),123,456,789 from songs;

2              123         456         NULL

hive> select count (*),'A','B','C' from songs;

2              A             B             C


From: Radha krishna [mailto:grkmc...@gmail.com]
Sent: Thursday, June 02, 2016 12:42 PM
To: user@hive.apache.org
Subject: LINES TERMINATED BY only supports newline '\n' right now

For some of the columns '\n' character is there as part of value, i want to 
create a hive table for this data i tried by creating the hive table with US as 
the line separator but it showing the below message

Ex:
CREATE EXTERNAL TABLE IF NOT EXISTS emp (name String,id int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '28'
LINES TERMINATED BY '31'
LOCATION 'emp.txt file path';

FAILED: SemanticException 4:20 LINES TERMINATED BY only supports newline '\n' 
right now. Error encountered near token ''31''

how can we create hive table with out removing the \n character as part of 
column value ( requirement data need to maintain as it is)

can any one implemented hive tables with line separated other than \n

Thanks & Regards
   Radha krishna

Reply via email to