[ https://issues.apache.org/jira/browse/PIG-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-1581:
----------------------------

         Assignee: Xuefu Zhang
    Fix Version/s: 0.9.0

> Parser fails to recognize semicolons in quoted strings
> ------------------------------------------------------
>
>                 Key: PIG-1581
>                 URL: https://issues.apache.org/jira/browse/PIG-1581
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.7.0
>         Environment: CentOS 5.5
>            Reporter: Christopher Hackman
>            Assignee: Xuefu Zhang
>            Priority: Minor
>             Fix For: 0.9.0
>
>
> In some contexts, the parser fails to handle semicolons inside quoted
> strings correctly and treats them as the end of the statement.
> Given an input file:
> /test1.txt (in HDFS)
> 1;a
> 2;b
> 3;c
> 4;d
> 5;e
> And the following Pig script:
> REGISTER /tmp/piggybank.jar ;
> DEFINE REGEXEXTRACTALL
> org.apache.pig.piggybank.evaluation.string.RegexExtractAll();
> lines = LOAD '/test1.txt' AS (line:chararray);
> delimited = FOREACH lines GENERATE FLATTEN (
> REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$')
> ) AS (
> digit:int,
> word:chararray
> );
> DUMP delimited;
> I receive the following error:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing.
> Lexical error at line 5, column 40. Encountered: <EOF> after : "\'^(\\\\d+);"
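
To illustrate the behavior the report expects, here is a minimal Python sketch of a statement splitter that ignores semicolons inside single-quoted strings. It is not Pig's parser or the eventual fix for this issue; the function name split_statements and the escape handling are assumptions made only for this illustration.

```python
def split_statements(script: str) -> list[str]:
    """Split a script on ';' only when the ';' is outside a single-quoted string.

    Hypothetical helper for illustration; not part of Pig or Grunt.
    """
    statements = []
    current = []
    in_quote = False
    i = 0
    while i < len(script):
        ch = script[i]
        if in_quote and ch == "\\" and i + 1 < len(script):
            # Keep an escaped character (such as \' or \\) inside the literal.
            current.append(ch)
            current.append(script[i + 1])
            i += 2
            continue
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            # A semicolon outside quotes terminates the statement.
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


if __name__ == "__main__":
    script = (
        r"lines = LOAD '/test1.txt' AS (line:chararray);"
        "\n"
        r"delimited = FOREACH lines GENERATE REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$');"
        "\n"
        r"DUMP delimited;"
    )
    for stmt in split_statements(script):
        print(stmt)
```

A naive split on every semicolon would cut the regular expression literal in half, which matches the symptom in the error message above: the lexer reaches the end of its input immediately after '^(\\d+);.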