[ https://issues.apache.org/jira/browse/DATAFU-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987685#comment-13987685 ]
jian wang commented on DATAFU-34:
---------------------------------

Regarding #1, what about not implementing this UDF?

Regarding #2, it is easy to declare the input field as bytearray when testing the UDF through a Pig script. However, I am not sure how to validate that the output of the UDF is of a particular type. I also do not think there is a helper method to convert an object of another type to bytearray, given the type casts defined in http://pig.apache.org/docs/r0.10.0/basic.html#cast.

Another question regarding BagToMap: how should entries with duplicate keys be handled? In the TOMAP implementation, http://grepcode.com/file/repo1.maven.org/maven2/org.apache.pig/pig/0.9.0/org/apache/pig/builtin/TOMAP.java, it seems that a later entry with a duplicate key replaces the earlier one. I wonder if it would be better to hand control to the user, for example by providing a UDF constructor parameter that controls whether to keep the earlier entry, keep the later entry, or throw an exception.

> Add some UDFS to handle map type
> --------------------------------
>
>                 Key: DATAFU-34
>                 URL: https://issues.apache.org/jira/browse/DATAFU-34
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: jian wang
>            Assignee: jian wang
>         Attachments: 0001-add-some-UDFs-to-manipulate-map.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.2#6252)
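
To make the duplicate-key question about BagToMap concrete, here is a minimal sketch of what a constructor-controlled policy could look like. It is plain Java with no Pig dependency: the class name `BagToMapSketch`, the `OnDuplicate` enum, and the policy names `keep_first`, `keep_last`, and `fail` are all hypothetical, and the bag is modeled as a list of key/value entries rather than a Pig `DataBag`. It is not the attached patch and not an existing DataFu API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not part of DataFu or Pig: models a bag as a list of
// key/value entries and applies a duplicate-key policy chosen at construction.
public class BagToMapSketch {
    // Assumed policy names, introduced only for this illustration.
    public enum OnDuplicate { KEEP_FIRST, KEEP_LAST, FAIL }

    private final OnDuplicate policy;

    // The policy is parsed from a string because a UDF constructor would
    // typically be configured with string arguments.
    public BagToMapSketch(String policyName) {
        this.policy = OnDuplicate.valueOf(policyName.toUpperCase(Locale.ROOT));
    }

    public Map<String, Object> exec(List<Map.Entry<String, Object>> bag) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : bag) {
            boolean seen = out.containsKey(e.getKey());
            if (seen && policy == OnDuplicate.FAIL) {
                throw new IllegalArgumentException("Duplicate key: " + e.getKey());
            }
            // First occurrence is always stored; a repeat overwrites only
            // under KEEP_LAST, so KEEP_FIRST leaves the earlier value alone.
            if (!seen || policy == OnDuplicate.KEEP_LAST) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Object>> bag = Arrays.<Map.Entry<String, Object>>asList(
            new SimpleEntry<String, Object>("a", 1),
            new SimpleEntry<String, Object>("b", 2),
            new SimpleEntry<String, Object>("a", 3));
        System.out.println(new BagToMapSketch("keep_first").exec(bag)); // {a=1, b=2}
        System.out.println(new BagToMapSketch("keep_last").exec(bag));  // {a=3, b=2}
    }
}
```

With this shape, `keep_last` reproduces the overwrite behavior that a plain `Map.put` loop gives, while `fail` surfaces duplicates as an error instead of silently dropping one of the values.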