[ 
https://issues.apache.org/jira/browse/DRILL-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605391#comment-13605391
 ] 

David Alves commented on DRILL-13:
----------------------------------

For instance, Bloom filters can be used to perform a semi-join phase before a 
distributed join. 
Since Bloom filters are usually associated with partitions, I think Bloom 
filter capabilities would fit nicely alongside the partition information, as a 
sub-API of the SE, e.g.:

{code}
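interface PartitionInfo {

  // How rows are distributed across partitions.
  enum PartitioningType {
    NONE, ORDERED, RANDOM;
  }

  // Which kind of Bloom filter, if any, the SE keeps per partition.
  enum BloomFilterType {
    NONE, KEY, COMPOUND;
  }

  PartitioningType partitionType();

  // PartitionFunction is not defined here; it is assumed to live elsewhere
  // in the SE API.
  PartitionFunction function();

  BloomFilterType bloomFilterType();

}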
{code}
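
To make the semi-join idea concrete, here is a minimal sketch of how a Bloom 
filter built from the join keys of the smaller side could prune rows of the 
larger side before they are shuffled for the distributed join. Everything in it 
is hypothetical and not part of any proposed Drill API: the names 
BloomSemiJoinSketch, SimpleBloomFilter and semiJoinFilter, the use of String 
keys, and the filter size and hash count are all assumptions chosen for 
illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; none of these names come from Drill.
final class BloomSemiJoinSketch {

    /** Minimal Bloom filter over String keys, using double hashing. */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        // Derive the i-th bit position from two base hashes of the key.
        private int position(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // forced odd so the stride is never zero
            return Math.floorMod(h1 + i * h2, numBits);
        }

        void add(String key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(position(key, i));
            }
        }

        // False means "definitely absent"; true means "possibly present".
        boolean mightContain(String key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /**
     * Semi-join phase: keep only the large-side keys that may match a
     * small-side key. Survivors can still include false positives, so the
     * real join must run afterwards.
     */
    static List<String> semiJoinFilter(List<String> smallSideKeys,
                                       List<String> largeSideKeys) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
        for (String key : smallSideKeys) {
            filter.add(key);
        }
        List<String> survivors = new ArrayList<>();
        for (String key : largeSideKeys) {
            if (filter.mightContain(key)) {
                survivors.add(key);
            }
        }
        return survivors;
    }
}
```

The filter never drops a key that is really present on the small side, so the 
join result is unchanged; the benefit is only that fewer rows need to cross the 
network. A storage engine that reports a bloomFilterType() other than NONE could 
in principle supply such a filter per partition instead of having it built at 
query time.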

That being said, for this particular issue maybe we could try and focus on the 
main interfaces required and leave these details to other issues.
                
> Storage Engine: Define Java Interface
> -------------------------------------
>
>                 Key: DRILL-13
>                 URL: https://issues.apache.org/jira/browse/DRILL-13
>             Project: Apache Drill
>          Issue Type: Task
>            Reporter: Jacques Nadeau
>            Assignee: Jacques Nadeau
>
> We're going to need to define a storage engine API.  At a minimum, we'll need 
> to generate a Java one.  We will probably need to also create a CPP one.  
> This task is for the former.  Things that are likely to be included in the 
> Java interface are: reader (scanner), writer, capabilities interface, schema 
> interface, statistics interface, data layout and ordering

