[ https://issues.apache.org/jira/browse/PIG-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987817#comment-15987817 ]
David Ongaro commented on PIG-2531:
-----------------------------------

Any progress on this?

> Filter function for IsTupleInBag and IsTupleInTuple
> ---------------------------------------------------
>
>                 Key: PIG-2531
>                 URL: https://issues.apache.org/jira/browse/PIG-2531
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: 0.9.1
>            Reporter: Florian Leibert (flo)
>            Priority: Minor
>         Attachments: PIG-2531.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> It would be nice to have a FilterFunc that filters on whether a tuple in the
> stream is part of either another tuple or a bag.
> Data (e.g. session data joined with e.g. follow-up sessions):
>
> BAG: {('/login'), ('/show'), ('/logout?user_id=2000')}, TUPLE: ('/logout?user_id=2000')
> BAG: {('/home'), ('/about')}, TUPLE: ('/admin')
> BAG: {('login')}, TUPLE: ('/logout')
>
> It would be great to be able to filter based on criteria <B1 CONTAINS T1> or
> <T1 CONTAINS T2>. In the above case, the only result of such an operation
> would be the first entry, '/logout?user_id=2000' - it should be obvious that
> this is useful.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
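
For reference, the <B1 CONTAINS T1> check described in the issue could be sketched roughly as below. This is not the attached PIG-2531.patch and it does not use Pig's FilterFunc API; it only models the intended semantics in plain Java, with a tuple represented as a list of fields and a bag as a list of tuples. The class name `TupleInBagSketch` and the method `isTupleInBag` are made up for illustration, and matching by exact field-by-field equality is an assumption. The <T1 CONTAINS T2> case is left out because the issue does not say how tuple-in-tuple containment should be defined.

```java
import java.util.Arrays;
import java.util.List;

public class TupleInBagSketch {

    // Hypothetical helper (not part of Pig): returns true when the bag holds
    // a tuple whose fields equal the fields of the given tuple, in order.
    public static boolean isTupleInBag(List<List<String>> bag, List<String> tuple) {
        if (bag == null || tuple == null) {
            return false; // assumption: missing input never matches
        }
        // List.contains relies on List.equals, which compares element by element.
        return bag.contains(tuple);
    }

    public static void main(String[] args) {
        // The three rows from the issue description.
        List<List<String>> bag1 = Arrays.asList(
                Arrays.asList("/login"),
                Arrays.asList("/show"),
                Arrays.asList("/logout?user_id=2000"));
        List<List<String>> bag2 = Arrays.asList(
                Arrays.asList("/home"),
                Arrays.asList("/about"));
        List<List<String>> bag3 = Arrays.asList(
                Arrays.asList("login"));

        System.out.println(isTupleInBag(bag1, Arrays.asList("/logout?user_id=2000")));
        System.out.println(isTupleInBag(bag2, Arrays.asList("/admin")));
        System.out.println(isTupleInBag(bag3, Arrays.asList("/logout")));
    }
}
```

Under these assumptions only the first row passes the check, which matches the result the issue description expects.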