[ 
https://issues.apache.org/jira/browse/DATAFU-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987685#comment-13987685
 ] 

jian wang commented on DATAFU-34:
---------------------------------

Regarding #1, what about not implementing this UDF?

Regarding #2, it is easy to declare the input field as bytearray when 
testing the UDF through a Pig script. However, I am not sure how to validate 
that the output of the UDF is of a particular type. I also do not think there 
is a helper method to convert an object of another type to bytearray, because 
of the cast rules defined in http://pig.apache.org/docs/r0.10.0/basic.html#cast.

Another question regarding BagToMap: how should we handle entries with 
duplicate keys? In the TOMAP implementation, 
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.pig/pig/0.9.0/org/apache/pig/builtin/TOMAP.java,
 it seems to replace an entry with a duplicate key coming later with the entry 
that comes earlier. I wonder if it is better to hand control to the user, for 
example by providing a UDF constructor parameter that controls whether we 
keep the earlier or the later entry for a duplicate key, or throw an exception?
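
To make the constructor-parameter idea concrete, here is a rough sketch in 
plain Java, using only java.util so it runs without Pig on the classpath. All 
names (BagToMapSketch, DuplicatePolicy, toMap) are hypothetical and are not 
taken from the attached patch; a real UDF would extend EvalFunc and read 
tuples from a DataBag rather than a List of entries.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: these names do not come from the DataFu patch.
class BagToMapSketch {

    // How to resolve two entries that share the same key.
    enum DuplicatePolicy { KEEP_FIRST, KEEP_LAST, FAIL }

    private final DuplicatePolicy policy;

    // Pig passes UDF constructor arguments as strings (via DEFINE),
    // so the policy is parsed from a string here.
    BagToMapSketch(String policyName) {
        this.policy = DuplicatePolicy.valueOf(policyName);
    }

    // Each entry stands in for a (key, value) tuple in the input bag.
    Map<String, Object> toMap(List<Map.Entry<String, Object>> entries) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : entries) {
            String key = entry.getKey();
            if (!result.containsKey(key)) {
                // First time we see this key: always store it.
                result.put(key, entry.getValue());
                continue;
            }
            switch (policy) {
                case KEEP_FIRST:
                    // Ignore the later entry.
                    break;
                case KEEP_LAST:
                    // Overwrite with the later entry.
                    result.put(key, entry.getValue());
                    break;
                case FAIL:
                    throw new IllegalArgumentException("Duplicate key: " + key);
            }
        }
        return result;
    }
}
```

In a Pig script this could presumably be selected with something like 
DEFINE BagToMap ...('KEEP_LAST'), though the exact class name and argument 
values are still open.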

> Add some UDFS to handle map type
> --------------------------------
>
>                 Key: DATAFU-34
>                 URL: https://issues.apache.org/jira/browse/DATAFU-34
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: jian wang
>            Assignee: jian wang
>         Attachments: 0001-add-some-UDFs-to-manipulate-map.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
