[
https://issues.apache.org/jira/browse/DRILL-13?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603298#comment-13603298
]
immars commented on DRILL-13:
-----------------------------
Questions and thoughts on querying across heterogeneous SEs:
* about push down
It would be simpler if the capability API could be declarative. But is it
necessary for an SE to provide some stats about the current sub-DAG so that
Drill can do further optimization or further execution?
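
To make the question concrete, here is a rough sketch of what I mean by a
declarative capability API with optional stats. All names here
(CapabilitySketch, Operator, SubDagStats, StorageEngineCapabilities,
canPushDown) are made up for illustration and are not part of Drill; the
operator list is only an assumption about what an SE might declare.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; none of these names come from Drill.
final class CapabilitySketch {

    /** Logical operators an SE might declare it can execute itself. */
    enum Operator { SCAN, FILTER, PROJECT, AGGREGATE, JOIN }

    /** Optional estimates an SE could attach to a pushed-down sub-DAG. */
    static final class SubDagStats {
        final long estimatedRows;
        final long estimatedBytes;

        SubDagStats(long estimatedRows, long estimatedBytes) {
            this.estimatedRows = estimatedRows;
            this.estimatedBytes = estimatedBytes;
        }
    }

    /** Declarative part: the SE only states what it supports. */
    interface StorageEngineCapabilities {
        Set<Operator> supportedOperators();

        /** Stats are optional; an SE that cannot estimate returns empty. */
        default Optional<SubDagStats> estimate(Set<Operator> subDag) {
            return Optional.empty();
        }
    }

    /** A sub-DAG can be pushed down only if every operator in it is supported. */
    static boolean canPushDown(StorageEngineCapabilities caps, Set<Operator> subDag) {
        return caps.supportedOperators().containsAll(subDag);
    }

    public static void main(String[] args) {
        // An imaginary key-value SE that can only scan and filter.
        StorageEngineCapabilities kvStore = () -> EnumSet.of(Operator.SCAN, Operator.FILTER);
        System.out.println(canPushDown(kvStore, EnumSet.of(Operator.SCAN, Operator.FILTER)));
        System.out.println(canPushDown(kvStore, EnumSet.of(Operator.SCAN, Operator.JOIN)));
    }
}
```

With this shape the planner never calls into SE-specific logic to decide on
push down; stats stay an optional extra that an SE may or may not supply.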
* about partition & distribution and query dispatch
What are your thoughts about partition information?
It would be good if partition info could be declarative and unified too. The
master Drill daemon (the one that received the query directly from the user)
could use the partition information of different SEs to optimize better,
aggregation over join for example, before dispatching subqueries. But data
distribution seems more difficult to abstract than the capability API:
distributed-managed list-based (Cassandra), centralized-managed list-based
(HBase), simple rule-based (Redis), user-handcrafted, maybe rule-based (MySQL).
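
As a rough illustration of the list-based versus rule-based distinction, the
sketch below puts both behind one interface. Everything in it is hypothetical
(PartitionSketch, PartitionScheme, RangeListScheme, HashRuleScheme, the node
names), and it only loosely resembles how any of the stores above actually
place data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Drill or any SE.
final class PartitionSketch {

    /** Unified view: given a key, which node holds it. */
    interface PartitionScheme {
        String locate(String key);
    }

    /** List-based: an explicit table mapping range start keys to nodes. */
    static final class RangeListScheme implements PartitionScheme {
        private final TreeMap<String, String> rangeStartToNode;

        RangeListScheme(Map<String, String> rangeStartToNode) {
            if (rangeStartToNode.isEmpty()) {
                throw new IllegalArgumentException("at least one range is required");
            }
            this.rangeStartToNode = new TreeMap<>(rangeStartToNode);
        }

        @Override
        public String locate(String key) {
            // Greatest range start <= key; keys below every start fall into the first range.
            Map.Entry<String, String> entry = rangeStartToNode.floorEntry(key);
            if (entry == null) {
                entry = rangeStartToNode.firstEntry();
            }
            return entry.getValue();
        }
    }

    /** Rule-based: the node is computed from the key, no lookup table needed. */
    static final class HashRuleScheme implements PartitionScheme {
        private final List<String> nodes;

        HashRuleScheme(List<String> nodes) {
            if (nodes.isEmpty()) {
                throw new IllegalArgumentException("at least one node is required");
            }
            this.nodes = new ArrayList<>(nodes);
        }

        @Override
        public String locate(String key) {
            // floorMod keeps the index non-negative even for negative hash codes.
            return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
        }
    }

    /** True if both schemes place the key on the same node. */
    static boolean coLocated(PartitionScheme a, PartitionScheme b, String key) {
        return a.locate(key).equals(b.locate(key));
    }

    public static void main(String[] args) {
        Map<String, String> ranges = new TreeMap<>();
        ranges.put("a", "node1");
        ranges.put("m", "node2");
        PartitionScheme listBased = new RangeListScheme(ranges);
        PartitionScheme ruleBased = new HashRuleScheme(List.of("node1", "node2"));
        System.out.println(listBased.locate("kiwi"));
        System.out.println(coLocated(listBased, ruleBased, "kiwi"));
    }
}
```

A per-key lookup like this is the easy part. What the master daemon would
really want is a declarative description it can reason about without probing
keys, and that is where the four styles above diverge.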
> Storage Engine: Define Java Interface
> -------------------------------------
>
> Key: DRILL-13
> URL: https://issues.apache.org/jira/browse/DRILL-13
> Project: Apache Drill
> Issue Type: Task
> Reporter: Jacques Nadeau
> Assignee: Jacques Nadeau
>
> We're going to need to define a storage engine API. At a minimum, we'll need
> to generate a Java one. We will probably need to also create a CPP one.
> This task is for the former. Things that are likely to be included in the
> Java interface are: reader (scanner), writer, capabilities interface, schema
> interface, statistics interface, data layout and ordering.