[ 
https://issues.apache.org/jira/browse/PIG-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

karan kumar updated PIG-3870:
-----------------------------
    Attachment: helloWorld.patch

I have added the STRSPLITTOBAG functionality.
Requesting code review.


> STRSPLITTOBAG UDF
> -----------------
>
>                 Key: PIG-3870
>                 URL: https://issues.apache.org/jira/browse/PIG-3870
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Praveenesh Kumar
>            Assignee: karan kumar
>             Fix For: 0.12.0
>
>         Attachments: helloWorld.patch
>
>
> I had a scenario that required me to change the STRSPLIT code. The scenario 
> was as follows:
> I have data like:
> 1       A|1|1   some
> 2       B|2|2   data
> 3       C|3|3   hadoop
> I need output like this:
> 1    A    some
> 1    1    some
> 1    1    some
> 2    B    data
> 2    2    data
> 2    2    data
> 3    C    hadoop
> 3    3    hadoop
> 3    3    hadoop
> I was trying to use STRSPLIT($1,'\\\|'), which returns a tuple. If I 
> flatten that tuple, its fields become columns.
> If the UDF returned a bag of tuples instead, we could use flatten() to 
> convert it into rows, and could also convert it into a tuple using the 
> TOTUPLE() UDF (if someone just wants to use it as a tuple).
> Following a suggestion from [~daijy], I am creating this JIRA ticket to add a 
> new UDF, STRSPLITTOBAG, which will return a bag of tuples as described above.
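
The difference between flattening a tuple (columns) and flattening a bag (rows) can be sketched in plain Java. This is only an illustration of the requested behaviour, not the attached patch: it does not use the Pig API, it models a bag as a list of single-field tuples, and the names SplitToBagSketch, splitToBag and flattenRows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitToBagSketch {

    // Hypothetical stand-in for the proposed UDF: splits the input on a regex
    // and wraps every token in its own single-field "tuple", so the result
    // plays the role of a bag of tuples.
    static List<List<String>> splitToBag(String source, String regex) {
        List<List<String>> bag = new ArrayList<>();
        if (source == null) {
            return bag; // assumption: null input yields an empty bag
        }
        // limit -1 keeps trailing empty tokens instead of dropping them
        for (String token : source.split(regex, -1)) {
            List<String> tuple = new ArrayList<>();
            tuple.add(token);
            bag.add(tuple);
        }
        return bag;
    }

    // Models flattening the bag next to the other fields of the record:
    // one output row per tuple in the bag.
    static List<String> flattenRows(String id, String delimited, String label) {
        List<String> rows = new ArrayList<>();
        for (List<String> tuple : splitToBag(delimited, "\\|")) {
            rows.add(id + "\t" + tuple.get(0) + "\t" + label);
        }
        return rows;
    }

    public static void main(String[] args) {
        // First record of the sample data: 1, A|1|1, some
        for (String row : flattenRows("1", "A|1|1", "some")) {
            System.out.println(row);
        }
    }
}
```

For the first sample record this prints three tab-separated rows (1 A some, 1 1 some, 1 1 some), which matches the first three lines of the desired output.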



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
