Hi Vivek
Thanks A few more experiments
The \\ , \\\ and \\\\ all work correctly if used as following
split(reverse(split(reverse(floc),'/')[0]),'[\\.]')[0] (need those square
brackets)
split(reverse(split(reverse(floc),'/')[0]),'\\\.')[0] (does not need square
brackets)
split(reverse(split(reverse(floc),'/')[0]),'[\\\\.]')[0] (need those square
brackets)
From: Vivek Veeramani <[email protected]>
To: [email protected]; Sanjay Subramanian <[email protected]>
Sent: Monday, August 24, 2015 1:57 PM
Subject: Re: Using SPLIT with DOT(.) delimiter demonstrate funny behavior
within a VIEW
Hi Sanjay,
Try replacing the 4 backslashes with just 2. Usually works with 2 backslashes.
Replace this split(reverse(split(reverse(floc),'/')[0]),'\\\\.')[0] as resid
with split(reverse(split(reverse(floc),'/')[0]),'\\.')[0] as resid
Please have a look and let us know if that helps.
Best ,
Vivek Veeramani
cell : + 1-415 996 7853
On Mon, Aug 24, 2015 at 12:15 PM, Sanjay Subramanian
<[email protected]> wrote:
Hi guys
I am using Hive version = 0.13.1-cdh5.3.3
HIVE TABLE =
qnap_resume_file_location---------------------------------------DROP TABLE IF
EXISTS
qnap_resume_file_location;CREATE EXTERNAL TABLE qnap_resume_file_location
( floc STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION
'/data/myfirm/file_location';
SAMPLE DATA = qnap_resume_file_location---------------------------------"select
* from resume.qnap_resume_file_location limit
10"/mnt/myfirm/200901/14008_1_1004942032_0_1005347483.html
/mnt/myfirm/200901/14008_1_1004944724_0_1005334758.html/mnt/myfirm/200901/14000__900719077_10_901309048.doc/mnt/myfirm/200901/14012_6_1004858088_0_1005227990.doc/mnt/myfirm/200901/14008_1_1004811416_0_1005258541.html/mnt/myfirm/200901/14008_1_501597002_6_1005224039.html/mnt/myfirm/200901/14009_5_1004872908_0_1005244348.doc/mnt/myfirm/200901/14001_7_14165_1_66775.docx/mnt/myfirm/200901/14009_5_1004935267_0_1005322091.txt/mnt/myfirm/200901/14012_6_1004904422_0_1005283729.doc
HIVE VIEW =
qnap_resume_resid_file_loc_map_vw---------------------------------------------DROP
VIEW IF EXISTS
qnap_resume_resid_file_loc_map_vw;CREATE VIEW IF NOT EXISTS
qnap_resume_resid_file_loc_map_vw ( resid , yyyymm , floc )ASSELECT
split(reverse(split(reverse(floc),'/')[0]),'\\\\.')[0] as resid,
reverse(split(reverse(floc),'/')[1]) as yyyymm, flocFROM
resume.qnap_resume_file_location;
QUERY ON VIEW"select * from qnap_resume_resid_file_loc_map_vw limit 10"EXPECTED
RESULTS
14012_6_1006686583_0_1102955123 201205
/mnt/myfirm/201205/14012_6_1006686583_0_1102955123.docx14009_5_1008440384_0_1102887209
201205
/mnt/myfirm/201205/14009_5_1008440384_0_1102887209.doc14013_7_700200576_8_1102752594
201205
/mnt/myfirm/201205/14013_7_700200576_8_1102752594.doc14015_8_1008470815_0_1102954525
201205
/mnt/myfirm/201205/14015_8_1008470815_0_1102954525.doc14009_5_1008373102_0_1102709973
201205
/mnt/myfirm/201205/14009_5_1008373102_0_1102709973.doc14011_1_1007981566_0_1102730546
201205
/mnt/myfirm/201205/14011_1_1007981566_0_1102730546.txt14015_8_1008436709_0_1102867682
201205
/mnt/myfirm/201205/14015_8_1008436709_0_1102867682.docx14001_7_1006347358_0_1007662042
201205
/mnt/myfirm/201205/14001_7_1006347358_0_1007662042.docx14009_5_1008382590_0_1102732450
201205
/mnt/myfirm/201205/14009_5_1008382590_0_1102732450.doc14000_6_1008189721_0_1102253740
201205 /mnt/myfirm/201205/14000_6_1008189721_0_1102253740.docx
ACTUAL RESULTS (see the first column the DOT and extension are still there)
14012_6_1006686583_0_1102955123 .docx 201205
/mnt/myfirm/201205/14012_6_1006686583_0_1102955123.docx14009_5_1008440384_0_1102887209
.doc 201205
/mnt/myfirm/201205/14009_5_1008440384_0_1102887209.doc14013_7_700200576_8_1102752594
.doc 201205
/mnt/myfirm/201205/14013_7_700200576_8_1102752594.doc14015_8_1008470815_0_1102954525.doc
201205
/mnt/myfirm/201205/14015_8_1008470815_0_1102954525.doc14009_5_1008373102_0_1102709973
.doc 201205
/mnt/myfirm/201205/14009_5_1008373102_0_1102709973.doc14011_1_1007981566_0_1102730546
.txt 201205
/mnt/myfirm/201205/14011_1_1007981566_0_1102730546.txt14015_8_1008436709_0_1102867682
.docx 201205
/mnt/myfirm/201205/14015_8_1008436709_0_1102867682.docx14001_7_1006347358_0_1007662042
.docx 201205
/mnt/myfirm/201205/14001_7_1006347358_0_1007662042.docx14009_5_1008382590_0_1102732450
.doc 201205
/mnt/myfirm/201205/14009_5_1008382590_0_1102732450.doc14000_6_1008189721_0_1102253740
.docx 201205 /mnt/myfirm/201205/14000_6_1008189721_0_1102253740.docx
EXECUTING THE SELECT HIVE QUERY INSIDE THE VIEW INDEPENDENTLY (gives u correct
results - but when placed inside a view then it does not seem to work)SELECT
split(reverse(split(reverse(floc),'/')[0]),'\\\\.')[0] as resid,
reverse(split(reverse(floc),'/')[1]) as yyyymm, flocFROM
resume.qnap_resume_file_location
14012_6_1006686583_0_1102955123 201205
/mnt/myfirm/201205/14012_6_1006686583_0_1102955123.docx14009_5_1008440384_0_1102887209
201205
/mnt/myfirm/201205/14009_5_1008440384_0_1102887209.doc14013_7_700200576_8_1102752594
201205
/mnt/myfirm/201205/14013_7_700200576_8_1102752594.doc14015_8_1008470815_0_1102954525
201205
/mnt/myfirm/201205/14015_8_1008470815_0_1102954525.doc14009_5_1008373102_0_1102709973
201205
/mnt/myfirm/201205/14009_5_1008373102_0_1102709973.doc14011_1_1007981566_0_1102730546
201205
/mnt/myfirm/201205/14011_1_1007981566_0_1102730546.txt14015_8_1008436709_0_1102867682
201205
/mnt/myfirm/201205/14015_8_1008436709_0_1102867682.docx14001_7_1006347358_0_1007662042
201205
/mnt/myfirm/201205/14001_7_1006347358_0_1007662042.docx14009_5_1008382590_0_1102732450
201205
/mnt/myfirm/201205/14009_5_1008382590_0_1102732450.doc14000_6_1008189721_0_1102253740
201205 /mnt/myfirm/201205/14000_6_1008189721_0_1102253740.docx
Thoughts ? Ideas ?
thanks
sanjay