[ 
https://issues.apache.org/jira/browse/PIG-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207581#comment-13207581
 ] 

Prashant Kommireddi commented on PIG-2531:
------------------------------------------

Thanks Florian. A few comments:

1. There appears to be a typo in the filename "IsTupelInTupel.java"; going by 
the issue title it should presumably be "IsTupleInTuple.java".
2. You could use a LinkedList to avoid the expensive remove() call on 
ArrayList, which copies the trailing array elements on every remove 
operation. This might become an issue if tuple2 is huge.
3. Also, the outer for loop can exit as soon as all of the "toMatch" 
elements have been matched. 
 
{code}
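    for (Object o : tuple1.getAll()) {
      // remove(Object) drops the first occurrence and reports whether one
      // was found, so separate contains()/indexOf() scans are not needed
      if (toMatch.remove(o)) {
        // stop as soon as every element of toMatch has been matched
        if (toMatch.isEmpty()) {
          return true;
        }
      }
    }
    return false;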
{code}
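
For reference, points 2 and 3 can be put together in a self-contained sketch. 
It assumes the two tuples are available as plain java.util.List values (the 
patch itself works on Pig Tuple objects); the class name TupleContainment 
and the method name containsAll are hypothetical and are not taken from the 
patch.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class TupleContainment {
    /**
     * Returns true once every element of tuple2 (counting duplicates)
     * has been found in tuple1. Like the snippet above, it returns false
     * when tuple2 is empty, because no match ever empties the list.
     */
    static boolean containsAll(List<?> tuple1, List<?> tuple2) {
        // Work on a copy so the caller's tuple2 is left untouched.
        // LinkedList unlinks a node on removal instead of shifting an array.
        List<Object> toMatch = new LinkedList<>(tuple2);
        for (Object o : tuple1) {
            // remove(Object) deletes the first equal element, if any
            if (toMatch.remove(o) && toMatch.isEmpty()) {
                return true; // early exit: nothing left to match
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAll(Arrays.asList("a", "b", "c", "a"),
                                       Arrays.asList("a", "a")));
        System.out.println(containsAll(Arrays.asList("a", "b"),
                                       Arrays.asList("a", "a")));
    }
}
```

Note that remove(Object) still has to scan the list to find the element, so 
the LinkedList only saves the array copy, not the search itself.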
                
> Filter function for IsTupleInBag and IsTupleInTuple
> ---------------------------------------------------
>
>                 Key: PIG-2531
>                 URL: https://issues.apache.org/jira/browse/PIG-2531
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: 0.9.1
>            Reporter: Florian Leibert (flo)
>            Priority: Minor
>         Attachments: PIG-2531.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> It would be nice to have a FilterFunc that allows filtering based on a tuple 
> in the stream being part of either another tuple or a bag. 
> Data (e.g. session data joined with e.g. follow-up sessions where)
> > BAG: {('/login'), ('/show'), ('/logout?user_id=2000')}, TUPLE: 
> > ('/logout?user_id=2000')
> > BAG: {('/home'), ('/about')}, TUPLE: ('/admin')
> > BAG: {('login')}, TUPLE: ('/logout')
> It would be great to be able to filter based on the criteria <B1 CONTAINS 
> T1> or <T1 CONTAINS T2>. In the above case, the only result of such an 
> operation would be the first entry '/logout?user_id=2000' - it should be 
> obvious that this is useful.
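
The <B1 CONTAINS T1> criterion from the description can be illustrated on the 
sample data with plain Java collections. This is only a sketch of the 
intended semantics, not the piggybank implementation: a bag is modelled as a 
list of tuples and a tuple as a list of fields, and all names 
(BagContainsSketch, Row, filter, sampleRows) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BagContainsSketch {
    /** One input record: a bag (list of tuples) and the tuple to look for. */
    static final class Row {
        final List<List<Object>> bag;
        final List<Object> tuple;
        Row(List<List<Object>> bag, List<Object> tuple) {
            this.bag = bag;
            this.tuple = tuple;
        }
    }

    /** Builds a tuple from its fields. */
    static List<Object> tuple(Object... fields) {
        return Arrays.asList(fields);
    }

    /** The three records from the issue description. */
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(Arrays.asList(tuple("/login"), tuple("/show"),
                                       tuple("/logout?user_id=2000")),
                         tuple("/logout?user_id=2000")));
        rows.add(new Row(Arrays.asList(tuple("/home"), tuple("/about")),
                         tuple("/admin")));
        rows.add(new Row(Arrays.asList(tuple("login")), tuple("/logout")));
        return rows;
    }

    /** Keeps the tuple of every record whose bag contains that tuple. */
    static List<List<Object>> filter(List<Row> rows) {
        List<List<Object>> kept = new ArrayList<>();
        for (Row r : rows) {
            // List.equals compares field by field, so contains() works here
            if (r.bag.contains(r.tuple)) {
                kept.add(r.tuple);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(filter(sampleRows()));
    }
}
```

Only the first record passes the filter, which matches the result stated in 
the description.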

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
