[ https://issues.apache.org/jira/browse/PIG-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates resolved PIG-3619.
-----------------------------

    Resolution: Fixed

Patch checked in. Thanks Saad.

> Provide XPath function
> ----------------------
>
>                 Key: PIG-3619
>                 URL: https://issues.apache.org/jira/browse/PIG-3619
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>            Reporter: Saad Patel
>            Assignee: Saad Patel
>         Attachments: xpath.patch
>
>
> XML is often loaded using XMLLoader with a record boundary tag as one of the
> parameters. A common use case is to then extract data from those records.
> XPath would allow those extractions to be done very easily. I'm proposing a
> patch that adds simple XPath support as a UDF.
> Example usage of the XPath UDF would be:
> {code}
> extractions = FOREACH xmlrecords GENERATE XPath(record, 'book/author'),
>     XPath(record, 'book/title');
> {code}
> The proposed UDF also caches the last XML document. This improves
> performance when multiple consecutive XPath extractions are run on the same
> XML document, as in the example above.

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
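
The caching behaviour described in the issue can be illustrated with a small sketch. This is not the code from xpath.patch; it is a Python illustration under stated assumptions. The class name CachingXPath and the parse_count counter are hypothetical, and xml.etree.ElementTree supports only a limited subset of XPath. The idea is to keep the last record's text and its parsed form, and to re-parse only when a different record arrives.

```python
import xml.etree.ElementTree as ET


class CachingXPath:
    """Hypothetical helper: evaluates simple paths, remembering the last parsed record."""

    def __init__(self):
        self._last_xml = None   # raw text of the most recently parsed record
        self._last_doc = None   # parsed wrapper element for that record
        self.parse_count = 0    # illustration only: makes the cache effect visible

    def evaluate(self, xml, path):
        # Re-parse only when the record text differs from the previous call.
        if xml != self._last_xml:
            root = ET.fromstring(xml)
            # Wrap the root so a path such as 'book/author' can name the root
            # element itself, as it would when evaluated from a document node.
            wrapper = ET.Element("document")
            wrapper.append(root)
            self._last_xml = xml
            self._last_doc = wrapper
            self.parse_count += 1
        # findtext returns the text of the first match, or None if nothing matches.
        return self._last_doc.findtext(path)


record = "<book><author>Jane Doe</author><title>Pig in Practice</title></book>"
xpath = CachingXPath()
author = xpath.evaluate(record, "book/author")
title = xpath.evaluate(record, "book/title")
```

Both extractions above hit the same record, so it is parsed once. A single-entry cache like this helps only when calls for the same record are consecutive, which matches the FOREACH ... GENERATE usage in the issue.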