Hey Jonathan,

You need to escape the backslash as well (it has a meaning in string literals in Pig):

b = FILTER a BY REGEX_EXTRACT(f1,'(.*)\\:(.*)',1) == 'hi';

If you wanted to match a single literal backslash, it would become '\\\\'.

Best,
-Sven

On Tue, Apr 19, 2011 at 4:00 PM, Jonathan Hoover <[email protected]> wrote:
> Hello,
>
> I am having a problem escaping a ":" and a "." in a regular expression
> within the REGEX_EXTRACT() function shown at
> http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#REGEX_EXTRACT. Here's
> a simplified example, though the example in the docs gives me the problem as
> well. I've tried it without the "\" in front of the ":", but that doesn't
> work right either (returns the whole line). So, how do I escape the ":", and
> also I need to escape a "." as well in my actual script.
>
> ------INPUT FILE------
> hi:1 num1 num2 num3
> hi:20 num1 blah boo
> ho:30 num1 blah foo
> bar:30 foo foo foo
> bar:40 foo far away
> bar:40 far far far
>
> ------PIG SCRIPT------
> a = LOAD 'fromabs-colons' USING PigStorage AS (f1,f2,f3,f4);
> b = FILTER a BY REGEX_EXTRACT(f1,'(.*)\:(.*)',1) == 'hi';
> DUMP b;
>
> ------WHAT I EXPECT---
> (hi:1,num1,num2,num3)
> (hi:20,num1,blah,foo)
>
> ------ERROR I GET-----
> 2011-04-19 22:55:43,844 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1000: Error during parsing. Lexical error at line 1, column 40.
> Encountered: ":" (58), after : "\'(.*)\\"
>
> ------PIG VERSION-----
> Apache Pig version 0.8.0..1103222002 (r1084466)
>
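
For anyone who wants to check the two layers of escaping outside Pig, here is a minimal Java sketch. It rests on assumptions: Java string literals consume one level of backslashes in the same way the reply describes for Pig, and REGEX_EXTRACT behaves roughly like a java.util.regex search that returns one capture group. The class name EscapeDemo and the helper extract are made up for illustration and are not part of Pig.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeDemo {
    // Hypothetical stand-in for REGEX_EXTRACT: returns the requested capture
    // group of the first match found in the input, or null if nothing matches.
    static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // The literal "(.*)\\:(.*)" reaches the regex engine as (.*)\:(.*),
        // where \: matches a literal colon.
        System.out.println(extract("hi:1", "(.*)\\:(.*)", 1));

        // The same rule applies to a dot: "\\." becomes \. and matches
        // a literal "." rather than any character.
        System.out.println(extract("a.b", "(.*)\\.(.*)", 1));

        // Matching a literal backslash needs four backslashes in the
        // literal: "\\\\" becomes \\ in the regex. The input "a\\b" is
        // the three-character string a, backslash, b.
        System.out.println(extract("a\\b", "(.*)\\\\(.*)", 1));
    }
}
```

The string literal is unescaped first and the regex engine only sees the result, so every backslash meant for the regex has to be written twice in the script.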
