[ https://issues.apache.org/jira/browse/PIG-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846130#action_12846130 ]

Alan Gates commented on PIG-1284:
---------------------------------

I'll take a look at the patch.

> pig UDF is lacking XMLLoader. Plan to add the XMLLoader
> -------------------------------------------------------
>
>                 Key: PIG-1284
>                 URL: https://issues.apache.org/jira/browse/PIG-1284
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.7.0
>            Reporter: Alok Singh
>             Fix For: 0.7.0
>
>         Attachments: pigudf_xmlLoader.patch, pigudf_xmlLoader.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi All,
>  We are planning to add an XMLLoader UDF to the piggybank repository.
> Here is the proposal, with the user docs:
>  XMLLoader is a load function for XML files.
>  It implements the LoadFunc interface, which is used to parse records
>  from a dataset.
>  It takes an XML tag name (xmlTag) as its argument and uses that tag to
>  split the input dataset into multiple records, one per matching element.
>  For example, suppose the input XML (input.xml) looks like this:
>  <configuration>
>  <property>
>  <name> foobar </name>
>  <value> barfoo </value>
>  </property>
>  <ignoreProperty>
>  <name> foo </name>
>  </ignoreProperty>
>  <property>
>  <name> justname </name>
>  </property>
>  </configuration>
>  And your Pig script looks like this:
>  -- register the jar file that contains the loader
>  register loader.jar;
>  -- load the dataset using XMLLoader
>  -- A is a bag of tuples; each tuple contains one atom, doc (see the output)
>  A = load '/user/aloks/pig/input.xml' using loader.XMLLoader('property') as (doc:chararray);
>  -- dump the result
>  dump A;
>  Then you will get this output:
> (<property>
> <name> foobar </name>
> <value> barfoo </value>
> </property>)
> (<property>
> <name> justname </name>
> </property>)
> Each () indicates one record.
>  
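
The record-splitting behaviour described in the proposal can be mimicked outside Pig with a short Python sketch. This is only an illustration of the expected input/output relationship, not the code in the attached patch: the function name split_records is hypothetical, and the regular-expression approach assumes the chosen tag is never nested inside itself and never appears in self-closing form.

```python
import re


def split_records(xml_text, tag):
    """Return each <tag>...</tag> element of xml_text as one string.

    Hypothetical helper for illustration only; it is not the XMLLoader
    implementation. Assumes elements named `tag` are not nested inside
    each other and are not self-closing.
    """
    name = re.escape(tag)
    # Opening tag (optionally with attributes), then lazily everything
    # up to the first closing tag with the same name.
    pattern = re.compile(
        r"<%s(?:\s[^>]*)?>.*?</%s\s*>" % (name, name), re.DOTALL
    )
    return pattern.findall(xml_text)


# The sample input.xml from the proposal.
INPUT_XML = """<configuration>
<property>
<name> foobar </name>
<value> barfoo </value>
</property>
<ignoreProperty>
<name> foo </name>
</ignoreProperty>
<property>
<name> justname </name>
</property>
</configuration>"""

if __name__ == "__main__":
    # Print each record in parentheses, like the output of `dump A`.
    for record in split_records(INPUT_XML, "property"):
        print("(%s)" % record)
```

Run against the sample input with the tag 'property', the sketch yields the two <property> elements and skips <ignoreProperty>, which matches the records shown in the proposal.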
