[
https://issues.apache.org/jira/browse/DATAFU-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987685#comment-13987685
]
jian wang commented on DATAFU-34:
---------------------------------
Regarding #1, what about not implementing this UDF?
Regarding #2, it is easy to declare the input field as bytearray when
testing the UDF through a Pig script. However, I am not sure how to validate
that the output of the UDF is of a particular type. I also do not think there
is a helper method to convert an object of another type to bytearray, because
of the cast rules defined in http://pig.apache.org/docs/r0.10.0/basic.html#cast.
Another question regarding BagToMap: how should entries with duplicate keys be
handled? In the TOMAP implementation,
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.pig/pig/0.9.0/org/apache/pig/builtin/TOMAP.java,
it seems that a later entry with a duplicate key replaces the earlier one.
I wonder if it would be better to hand control to the user, for example by
providing a UDF constructor parameter that controls whether we keep the
earlier or the later entry for a duplicate key, or throw an exception.
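To make the duplicate-key options concrete, here is a minimal sketch in plain
Java of what such a policy switch could look like. It is only an illustration:
the class name BagToMapSketch, the enum OnDuplicate and the method toMap are
hypothetical and are not part of DataFu or Pig, and a real UDF would read its
pairs from a Pig bag rather than a java.util.List.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in DataFu or Pig.
final class BagToMapSketch {

    // How to resolve two entries that share the same key.
    enum OnDuplicate { KEEP_FIRST, KEEP_LAST, FAIL }

    // Builds a map from (key, value) pairs in iteration order,
    // applying the chosen policy whenever a key has been seen before.
    static Map<String, Object> toMap(List<Map.Entry<String, Object>> pairs,
                                     OnDuplicate policy) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> pair : pairs) {
            String key = pair.getKey();
            if (!out.containsKey(key)) {
                out.put(key, pair.getValue());
                continue;
            }
            switch (policy) {
                case KEEP_FIRST:
                    // Leave the earlier value in place.
                    break;
                case KEEP_LAST:
                    // Overwrite with the later value.
                    out.put(key, pair.getValue());
                    break;
                case FAIL:
                    throw new IllegalArgumentException("duplicate key: " + key);
            }
        }
        return out;
    }
}
```

In a UDF, the policy could be chosen from a constructor string argument, with
one of the three values as the default.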
> Add some UDFS to handle map type
> --------------------------------
>
> Key: DATAFU-34
> URL: https://issues.apache.org/jira/browse/DATAFU-34
> Project: DataFu
> Issue Type: New Feature
> Reporter: jian wang
> Assignee: jian wang
> Attachments: 0001-add-some-UDFs-to-manipulate-map.patch
>
>