[
https://issues.apache.org/jira/browse/PIG-2147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062870#comment-13062870
]
[email protected] commented on PIG-2147:
----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1064/
-----------------------------------------------------------
Review request for pig.
Summary
-------
Currently, XMLLoader does not support nested tags that share the same tag
name. For example, given the content below:
<event>
  <relatedEvents>
    <event>x</event>
    <event>y</event>
    <event>z</event>
  </relatedEvents>
</event>
if I load it using XMLLoader,
events = load 'input' using org.apache.pig.piggybank.storage.XMLLoader('event')
as (doc:chararray);
the output is:
<event>
  <relatedEvents>
    <event>x</event>
whereas the desired output is:
<relatedEvents>
  <event>x</event>
  <event>y</event>
  <event>z</event>
</relatedEvents>
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Modified the behaviour of XMLLoader so that it also handles nested tags with
the same name. This is implemented by counting the nesting depth: the count is
incremented for each nested opening tag and decremented for each closing tag.
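
As an illustration only (this is not the code in the attached patch), the
depth-counting idea can be sketched in plain Java. The class and method names
below (NestedTagSketch, extractTopLevel) are hypothetical, and the sketch
assumes well-formed input held in a String, with no self-closing tags,
comments, or CDATA sections:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of depth counting; not the actual XMLLoader code.
final class NestedTagSketch {

    /** Returns each outermost <tag>...</tag> element, nested same-name tags included. */
    static List<String> extractTopLevel(String xml, String tag) {
        String open = "<" + tag;
        String close = "</" + tag + ">";
        List<String> records = new ArrayList<>();
        int depth = 0;   // how many <tag> elements are currently open
        int start = -1;  // offset of the outermost opening tag
        int i = 0;
        while (i < xml.length()) {
            if (xml.startsWith(close, i)) {
                i += close.length();
                // Emit a record only when the outermost element closes.
                if (depth > 0 && --depth == 0) {
                    records.add(xml.substring(start, i));
                }
            } else if (isOpenTag(xml, i, open)) {
                if (depth++ == 0) {
                    start = i;
                }
                i += open.length();
            } else {
                i++;
            }
        }
        return records;
    }

    // True for "<tag>" or "<tag attr=...>", false for longer names such as "<tags>".
    private static boolean isOpenTag(String xml, int i, String open) {
        if (!xml.startsWith(open, i)) {
            return false;
        }
        int next = i + open.length();
        if (next >= xml.length()) {
            return false;
        }
        char c = xml.charAt(next);
        return c == '>' || Character.isWhitespace(c);
    }

    public static void main(String[] args) {
        String xml = "<event><relatedEvents><event>x</event><event>y</event>"
                + "<event>z</event></relatedEvents></event>";
        for (String record : extractTopLevel(xml, "event")) {
            System.out.println(record);
        }
    }
}
```

Without the counter, the first closing tag encountered would end the record,
which matches the truncated output described above.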
This addresses bug PIG-2147.
https://issues.apache.org/jira/browse/PIG-2147
Diffs
-----
Diff: https://reviews.apache.org/r/1064/diff
Testing
-------
Thanks,
Vivek
> Support nested tags for XMLLoader
> ---------------------------------
>
> Key: PIG-2147
> URL: https://issues.apache.org/jira/browse/PIG-2147
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.1, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Vivek Padmanabhan
> Fix For: 0.8.1, 0.9.0
>
> Attachments: PIG-2147_1.patch
>
>
> Currently, XMLLoader does not support nested tags that share the same tag
> name. For example, given the content below:
> {code}
> <event>
>   <relatedEvents>
>     <event>x</event>
>     <event>y</event>
>     <event>z</event>
>   </relatedEvents>
> </event>
> {code}
> if I load it using XMLLoader,
> events = load 'input' using
> org.apache.pig.piggybank.storage.XMLLoader('event') as (doc:chararray);
> the output is:
> {code}
> <event>
>   <relatedEvents>
>     <event>x</event>
> {code}
> whereas the desired output is:
> {code}
> <relatedEvents>
>   <event>x</event>
>   <event>y</event>
>   <event>z</event>
> </relatedEvents>
> {code}