[
https://issues.apache.org/jira/browse/DRILL-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605391#comment-13605391
]
David Alves commented on DRILL-13:
----------------------------------
For instance, Bloom filters can be used to perform a semi-join phase before a
distributed join.
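To make the idea concrete, here is a rough sketch of such a semi-join pre-filter: build a Bloom filter over the join keys of the smaller side, then drop rows from the larger side that cannot possibly match before shuffling them. Everything in it (the `BloomSemiJoinSketch` and `SimpleBloomFilter` names, the hashing scheme, the filter sizing) is hypothetical and for illustration only; none of it is an existing Drill API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Illustrative only: hypothetical names, not part of any Drill API. */
class BloomSemiJoinSketch {

    /** A tiny Bloom filter over long keys, using double hashing. */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        /** Simple bit-mixing step so that nearby keys spread across the bit set. */
        private static long mix(long x) {
            x ^= x >>> 33;
            x *= 0xff51afd7ed558ccdL;
            x ^= x >>> 33;
            return x;
        }

        /** Index of the i-th probe position for a key: (h1 + i * h2) mod numBits. */
        private int index(long key, int i) {
            long h1 = mix(key);
            long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L;
            return (int) Math.floorMod(h1 + i * h2, (long) numBits);
        }

        void add(long key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(index(key, i));
            }
        }

        /** May return true for a key that was never added, but never false for one that was. */
        boolean mightContain(long key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(index(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Keeps only the probe-side keys that might have a match on the build side. */
    static List<Long> preFilter(List<Long> probeKeys, SimpleBloomFilter buildSideFilter) {
        List<Long> survivors = new ArrayList<>();
        for (long key : probeKeys) {
            if (buildSideFilter.mightContain(key)) {
                survivors.add(key);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        // Build side: the smaller relation's join keys.
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        for (long key : new long[] {1L, 2L, 3L}) {
            filter.add(key);
        }
        // Probe side: only the survivors would need to be shipped for the real join.
        List<Long> survivors = preFilter(Arrays.asList(2L, 3L, 40L, 50L), filter);
        System.out.println(survivors);
    }
}
```

Because the filter can report false positives, the survivors are a superset of the true matches, so the actual join still has to run afterwards; the gain is only in how much data gets moved.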
Since Bloom filters are usually associated with partitions, I think Bloom filter
capabilities would go along nicely with the partition information, as a
sub-API of the storage engine (SE), e.g.:
{code}
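interface PartitionInfo {
  // How rows are distributed across partitions.
  public enum PartitioningType {
    NONE, ORDERED, RANDOM;
  }
  // Kind of bloom filter available for a partition, if any.
  public enum BloomFilterType {
    NONE, KEY, COMPOUND;
  }
  public PartitioningType partitionType();
  // PartitionFunction is assumed to be declared elsewhere in the API.
  public PartitionFunction function();
  public BloomFilterType bloomFilterType();
}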
{code}
That being said, for this particular issue, maybe we could try to focus on the
main interfaces required and leave these details to other issues.
> Storage Engine: Define Java Interface
> -------------------------------------
>
> Key: DRILL-13
> URL: https://issues.apache.org/jira/browse/DRILL-13
> Project: Apache Drill
> Issue Type: Task
> Reporter: Jacques Nadeau
> Assignee: Jacques Nadeau
>
> We're going to need to define a storage engine API. At a minimum, we'll need
> to generate a Java one. We will probably need to also create a CPP one.
> This task is for the former. Things that are likely to be included in the
> Java interface are: reader (scanner), writer, capabilities interface, schema
> interface, statistics interface, data layout and ordering.