Hello,

I am having trouble escaping a ":" and a "." in a regular expression passed to 
the REGEX_EXTRACT() function documented at 
http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT. Below is a 
simplified example, although the example in the docs fails for me in the same 
way. I also tried it without the "\" in front of the ":", but that does not 
work either (it returns the whole line). How do I escape the ":"? I also need 
to escape a "." in my actual script.

------INPUT FILE------
hi:1    num1    num2    num3
hi:20   num1    blah    boo
ho:30   num1    blah    foo
bar:30  foo     foo     foo
bar:40  foo     far     away
bar:40  far     far     far

------PIG SCRIPT------
a = LOAD 'fromabs-colons' USING PigStorage AS (f1,f2,f3,f4);
b = FILTER a BY REGEX_EXTRACT(f1,'(.*)\:(.*)',1) == 'hi';
DUMP b;

------WHAT I EXPECT---
(hi:1,num1,num2,num3)
(hi:20,num1,blah,boo)

------ERROR I GET-----
2011-04-19 22:55:43,844 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
1000: Error during parsing. Lexical error at line 1, column 40.  Encountered: 
":" (58), after : "\'(.*)\\"

------PIG VERSION-----
Apache Pig version 0.8.0..1103222002 (r1084466)
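
In case it helps narrow things down, here is a small sketch of how I would 
expect the patterns to behave in plain Java. It assumes Pig hands the pattern 
to java.util.regex, which I have not confirmed, and it uses a full-string 
match, which may differ from what REGEX_EXTRACT does internally. RegexCheck 
and extract are names made up for this check, not part of Pig.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone check; not part of Pig.
class RegexCheck {
    // Returns the requested capture group, or null when the pattern
    // does not match the whole input string.
    static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // ":" is not a regex metacharacter in Java, so no backslash is needed.
        System.out.println(extract("hi:1", "(.*):(.*)", 1));         // prints hi

        // "." is a metacharacter; the Java source literal "\\." is the regex \.
        // (the input "a.b:1" is an invented sample value)
        System.out.println(extract("a.b:1", "(.*)\\.(.*):(.*)", 2)); // prints b
    }
}
```

If that is right, the regex itself is fine in Java, and the lexical error 
above would come from how the Pig parser reads the backslash inside the 
quoted string rather than from the regex engine.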
