[ https://issues.apache.org/jira/browse/PIG-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adam Kawa updated PIG-2883:
---------------------------
    Description:
HBaseStorage allows a user to load many HBase columns by specifying a column prefix. The difficulty is accessing those columns afterwards when their names are created dynamically and carry meaningful information that you also want to process (this seems to be relatively common).

Quick example:
{code}
User = LOAD 'hbase://user' USING HBaseStorage('friends:*', '-loadKey true') AS (username:bytearray, friendMap:map[]);
UserAndFriend = FOREACH User GENERATE username, friendMap#'What_should_I_put_here?';
{code}
It would be convenient to get the full list of key/value pairs (or just the keys or just the values) from a map, for example with UDFs such as MapKeysToBag, MapValuesToBag and MapEntriesToBag. With such UDFs, we can FLATTEN the returned bag and generate a relation that contains the unnested keys or values extracted from the map, e.g.:
{code}
UserAndFriend = FOREACH User GENERATE username, FLATTEN(MapKeysToBag(friendMap)) AS friendUsername;
{code}
I have already implemented such UDFs (the repository is https://github.com/kawaa/Pigitos and a fuller example is at http://bit.ly/Sf2KCP). I would love to add them to Piggybank (I have not found such functionality there).
If you think they are useful and missing, I can prepare a patch as soon as possible. Please let me know.

> MapKeysToBag and more UDFs to manipulate maps
> ---------------------------------------------
>
>                 Key: PIG-2883
>                 URL: https://issues.apache.org/jira/browse/PIG-2883
>             Project: Pig
>          Issue Type: Wish
>          Components: piggybank
>    Affects Versions: 0.10.0
>            Reporter: Adam Kawa
>            Priority: Trivial
>   Original Estimate: 3h
>  Remaining Estimate: 3h
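
To illustrate what the proposed map UDFs would compute, here is a minimal plain-Java sketch of the idea. It is not the Pigitos code and it does not use the Pig API: the class name MapUdfSketch, its method names, and the sample friend data are hypothetical, and a bag is modelled as a java.util.List. It only shows how turning the map keys into a bag and then flattening that bag yields one (username, friendUsername) row per key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapUdfSketch {

    // Models the proposed MapKeysToBag: each map key becomes one bag element.
    // Returning an empty bag for a null map is a choice made for this sketch only.
    public static List<String> mapKeysToBag(Map<String, Object> map) {
        return map == null ? new ArrayList<>() : new ArrayList<>(map.keySet());
    }

    // Models the proposed MapValuesToBag: each map value becomes one bag element.
    public static List<Object> mapValuesToBag(Map<String, Object> map) {
        return map == null ? new ArrayList<>() : new ArrayList<>(map.values());
    }

    // Models "GENERATE username, FLATTEN(bag)": one output row per bag element.
    public static List<String[]> flatten(String username, List<String> bag) {
        List<String[]> rows = new ArrayList<>();
        for (String element : bag) {
            rows.add(new String[] {username, element});
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical row from the 'friends:*' column family: column name -> cell value.
        Map<String, Object> friendMap = new LinkedHashMap<>();
        friendMap.put("alice", "2012-01-05");
        friendMap.put("bob", "2012-03-17");

        // Prints one tab-separated (username, friendUsername) pair per map key.
        for (String[] row : flatten("adam", mapKeysToBag(friendMap))) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

A real Piggybank UDF would presumably extend Pig's EvalFunc and build a DataBag of single-field tuples instead of a List, but the key-to-row expansion is the same.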