[
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270990#comment-13270990
]
Jonathan Coveney commented on PIG-2600:
---------------------------------------
Man, I feel so nitpicky, though this one isn't so nitpicky: the javadoc
formatting isn't correct. As is, when I compile the javadocs, I get this:
{code}
This UDF accepts a Map as input with values of any primitive data type. UDF
swaps keys with values and returns the new inverse Map. Note in case original
values are non-unique, the resulting Map would contain String Key -> DataBag of
values. Here the bag of values is composed of the original keys having the same
value. Note: 1. UDF accepts Map with Values of primitive data type 2. UDF
returns Map grunt> cat 1data [open#1,1#2,11#2] [apache#2,3#4,12#24] grunt> a =
load 'data' as (M:[int]); grunt> b = foreach a generate INVERSEMAP($0); grunt>
dump b; ([2#{(1),(11)},apache#{(open)}]) ([hadoop#{(apache),(12)},4#{(3)}])
{code}
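For what it's worth, here is a rough sketch of how the comment could be laid out so the rendered javadoc keeps its structure. The text is lifted from the output above; where the line breaks go inside the sample is my guess at what was intended, and the class name is just a placeholder. It assumes the usual javadoc HTML conventions: {{<p>}} for paragraphs, {{<ol>}} for the numbered notes, {{<pre>}} for the grunt session, and {{&gt;}} for any literal ">".

```java
/**
 * This UDF accepts a Map as input with values of any primitive data type. UDF
 * swaps keys with values and returns the new inverse Map.
 * <p>
 * Note in case original values are non-unique, the resulting Map would contain
 * String Key -&gt; DataBag of values. Here the bag of values is composed of the
 * original keys having the same value.
 * <p>
 * Note:
 * <ol>
 * <li>UDF accepts Map with Values of primitive data type</li>
 * <li>UDF returns Map</li>
 * </ol>
 * <pre>
 * grunt&gt; cat 1data
 * [open#1,1#2,11#2]
 * [apache#2,3#4,12#24]
 * grunt&gt; a = load 'data' as (M:[int]);
 * grunt&gt; b = foreach a generate INVERSEMAP($0);
 * grunt&gt; dump b;
 * ([2#{(1),(11)},apache#{(open)}])
 * ([hadoop#{(apache),(12)},4#{(3)}])
 * </pre>
 */
class InverseMapJavadocSketch {
    // Placeholder body: only the javadoc layout above matters here.
}
```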
Other comments:
- doInverse should return Map<String,DataBag>
- INVERSEMAP should extend EvalFunc<Map<String,DataBag>>
- inverseMap should be a Map<String,DataBag>
{code}
DataBag bag = (DataBag)inverseMap.get(newKey);
if(bag == null) {
    DataBag dataBag = BAG_FACTORY.newDefaultBag();
    dataBag.add(TUPLE_FACTORY.newTuple(entry.getKey()));
    inverseMap.put(newKey, dataBag);
} else {
    bag.add(TUPLE_FACTORY.newTuple(entry.getKey()));
}
{code}
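You can just reuse the bag reference instead of declaring a new dataBag
variable: when bag is null, assign the new bag to it and put it in the map,
then add the tuple once after the if, which also drops the else branch.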
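Roughly what I have in mind, as a self-contained sketch rather than a drop-in patch. To keep it runnable without Pig on the classpath, a plain List<String> stands in for DataBag and the keys are added directly instead of being wrapped in tuples. The class name, method name, and sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InverseMapSketch {
    // Hypothetical stand-in for doInverse: List<String> plays the role of DataBag.
    static Map<String, List<String>> invert(Map<String, Object> input) {
        Map<String, List<String>> inverseMap = new HashMap<>();
        for (Map.Entry<String, Object> entry : input.entrySet()) {
            // Original values become the new (String) keys.
            String newKey = String.valueOf(entry.getValue());
            List<String> bag = inverseMap.get(newKey);
            if (bag == null) {
                // Reuse the same reference: create the bag once and register it.
                bag = new ArrayList<>();
                inverseMap.put(newKey, bag);
            }
            // Single add for both the new-bag and existing-bag cases.
            bag.add(entry.getKey());
        }
        return inverseMap;
    }
}
```

With the real types, the same shape would apply to the DataBag and the TUPLE_FACTORY.newTuple(entry.getKey()) call from the snippet above.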
That should be the last of the comments...
> Better Map support
> ------------------
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
> Issue Type: Improvement
> Reporter: Jonathan Coveney
> Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600.patch, PIG-2600_2.patch, PIG-2600_3.patch,
> PIG-2600_4.patch, PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to
> that value
> This would all be pretty easy. A more substantial piece of work would be to
> make Pig support non-String keys (this is especially an issue since UDFs and
> whatnot probably assume that they are all Strings). Not sure if it is worth
> it.
> I'd love to hear other things that would be useful for people!