[ 
https://issues.apache.org/jira/browse/DATAFU-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984970#comment-13984970
 ] 

Matthew Hayes commented on DATAFU-34:
-------------------------------------

Regarding #1, I think this UDF will be hard to use without a well-defined 
schema, because you won't be able to reference the output fields by name.  The 
problem is that we don't know what will be in the map, so the size of the 
output tuple can vary from one map to the next.  We could fix this by having 
the user pass in some information about the map's contents so that we can 
generate the schema.  But this implies that they already know something about 
the map, for example what all the expected keys are.  If that is the case, 
then they don't really need this UDF, because they could construct the tuple 
on the fly like so: ('a', my_map#'a', 'b', my_map#'b').  So I'm not sure how 
we can make MapToTuple really work.
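
To make the schema problem concrete, here is a minimal plain-Java sketch.  It 
uses no Pig classes, and the names MapToTupleSketch, flattenAll, and 
flattenExpected are hypothetical, not taken from the patch.  It only shows 
that flattening a whole map gives an output whose length depends on the map, 
while flattening a known key list always gives the same length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the MapToTuple UDF from the patch.
public class MapToTupleSketch {

    // Flattens every entry into (key, value, key, value, ...).
    // The result length is 2 * map.size(), so it differs from map to map.
    public static List<Object> flattenAll(Map<String, Object> map) {
        List<Object> out = new ArrayList<>();
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            out.add(entry.getKey());
            out.add(entry.getValue());
        }
        return out;
    }

    // Flattens only the keys the caller says to expect; a missing key yields null.
    // The result length is always 2 * expectedKeys.size(), so a schema could be fixed.
    public static List<Object> flattenExpected(Map<String, Object> map,
                                               List<String> expectedKeys) {
        List<Object> out = new ArrayList<>();
        for (String key : expectedKeys) {
            out.add(key);
            out.add(map.get(key));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("a", 1);
        first.put("b", 2);

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("a", 3);

        List<String> expectedKeys = Arrays.asList("a", "b");

        // Sizes differ: no single schema describes both outputs.
        System.out.println(flattenAll(first).size() + " vs " + flattenAll(second).size());

        // Sizes match, but only because the caller already knew the keys.
        System.out.println(flattenExpected(first, expectedKeys));
        System.out.println(flattenExpected(second, expectedKeys));
    }
}
```

The second method is essentially what the inline expression 
('a', my_map#'a', 'b', my_map#'b') already does in Pig, which is why knowing 
the keys up front makes a dedicated UDF much less necessary.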

Regarding #2, I'm not sure how to test the bytearray case.  Maybe Pig's 
DataType class has a helper method to convert a value to a bytearray.  Another 
option is to test the UDF through a Pig script and declare the type of the 
input field as bytearray when you define the input schema.



> Add some UDFS to handle map type
> --------------------------------
>
>                 Key: DATAFU-34
>                 URL: https://issues.apache.org/jira/browse/DATAFU-34
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: jian wang
>            Assignee: jian wang
>         Attachments: 0001-add-some-UDFs-to-manipulate-map.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
