[ 
https://issues.apache.org/jira/browse/PIG-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142738#comment-13142738
 ] 

Alan Gates commented on PIG-2328:
---------------------------------

Uploaded new patch that follows Dmitriy's suggestion of replacing / with _ for 
file name uniqueness and fixes the javadoc issues Daniel brought up.

bq. It should be trivial to convert it into scalar, so that we get out of the 
business to figure out the symbol link name:

While I agree it would be great to have the scalar option, I want to keep the 
option of storing the filter to a file as well.  One use case I envision is 
people building a bloom filter once (say against a small lookup table) and 
using it repeatedly.
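
To make the build-once, reuse-many-times case concrete, here is a minimal 
sketch in plain Java.  It is not the code in the patch: the class name 
BloomSketch, its on-disk layout, and the double-hashing scheme are all 
assumptions made purely for illustration.  It only shows the shape of the 
workflow: build the filter from the small table, store it to a file, and have 
later jobs load that file instead of rebuilding it.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration class; not part of Pig or of the attached patch.
class BloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomSketch(int numBits, int numHashes) {
        this(new BitSet(numBits), numBits, numHashes);
    }

    private BloomSketch(BitSet bits, int numBits, int numHashes) {
        this.bits = bits;
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Double hashing (an assumption of this sketch): derive the i-th bit
    // position from two base hashes of the key.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // forced odd, never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    // May return true for a key that was never added (false positive),
    // but never returns false for a key that was added.
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Store the filter so later runs can reuse it without rebuilding.
    void save(Path file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            byte[] raw = bits.toByteArray();
            out.writeInt(numBits);
            out.writeInt(numHashes);
            out.writeInt(raw.length);
            out.write(raw);
        }
    }

    static BloomSketch load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int numBits = in.readInt();
            int numHashes = in.readInt();
            byte[] raw = new byte[in.readInt()];
            in.readFully(raw);
            return new BloomSketch(BitSet.valueOf(raw), numBits, numHashes);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build once from the keys of a small lookup table and store to a file.
        BloomSketch built = new BloomSketch(1 << 12, 3);
        for (String key : Arrays.asList("us", "ca", "mx")) {
            built.add(key);
        }
        Path file = Files.createTempFile("lookup-bloom", ".bin");
        built.save(file);

        // A later run loads the stored filter and pre-filters the large input.
        BloomSketch reused = BloomSketch.load(file);
        List<String> large = Arrays.asList("us", "fr", "ca", "jp");
        for (String key : large) {
            if (reused.mightContain(key)) {
                System.out.println("keep for join: " + key);
            }
        }
        Files.delete(file);
    }
}
```

The stored file is the only thing the later runs need, which is why I want to 
keep that option next to the scalar one.  Records that pass the filter can 
still be false positives, so the join itself remains the final check.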

I'm going to commit this patch as is.  I'll file a separate JIRA to add support 
for using the scalar in Bloom.  Daniel, if you want to submit a patch for that, 
go ahead.  It will be a while before I get to it.

Also, if no one objects, I'd like to port this to 0.10.  It doesn't touch any 
existing code, so it shouldn't break anything that already works.
                
> Add builtin UDFs for building and using bloom filters
> -----------------------------------------------------
>
>                 Key: PIG-2328
>                 URL: https://issues.apache.org/jira/browse/PIG-2328
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>             Fix For: 0.10
>
>         Attachments: PIG-bloom-2.patch, PIG-bloom-3.patch, PIG-bloom.patch
>
>
> Bloom filters are a common way to select a limited set of records before 
> moving data for a join or other heavyweight operation.  Pig should add UDFs 
> to support building and using bloom filters.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
