[jira] [Commented] (DRILL-2077) Provide a clear starting point for new developers about what to start reading to learn about Drill

Chris Westin (JIRA) Thu, 29 Jan 2015 10:51:08 -0800

    [ https://issues.apache.org/jira/browse/DRILL-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297327#comment-14297327 ]


Chris Westin commented on DRILL-2077:
-------------------------------------

This should go in the new markdown pages in git.

> Provide a clear starting point for new developers about what to start reading 
> to learn about Drill
> --------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-2077
>                 URL: https://issues.apache.org/jira/browse/DRILL-2077
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>
> As part of my package-level javadocs posted in DRILL-1904, I tried to document
> the root org.apache.drill.exec package. We should have some good information
> here, as well as in the markdown file in the git repo, about the best place to
> start reading the code to understand how Drill operates.
> Here is a description I started. I think we want to make sure this is
> informative but concise. I want to get in the rest of the package docs, so I
> am leaving this here as a TODO. Please feel free to comment on, revise, or add
> to this.
> {code}
>  * A good place to start learning about Drill is exploring the query plans. A
>  * Drill physical plan is defined as a connected graph of operators that read
>  * and manipulate data. Operators are configured by implementations of the
>  * {@link PhysicalOperator} interface. These query graphs are translated into a
>  * graph of physical operators that will actually process data at query
>  * execution time. The connections between these nodes are materialized as
>  * interfaces through which data is passed between operators. Because Drill is
>  * distributed, these connections can take the form of an RPC layer between the
>  * nodes in a Drill cluster.
>  *
>  * While physical plans can be written by hand, the primary interface for Drill
>  * is SQL. Drill targets compliance with the ANSI SQL 2003 specification.
>  * Query parsing and optimization are handled by Calcite, an Apache incubator
>  * project that is also used for planning in Apache Hive. Drill defines many
>  * planning rules and optimizations that plug into the Calcite planning engine
>  * to generate optimal plans for the Drill engine.
>  *
>  * Unlike most query systems, Drill is designed to query raw files without
>  * a predefined catalog of metadata defining the types of data or columns
>  * available in the dataset. To maintain performance in a flexible schema
>  * environment, Drill uses runtime code generation to compile custom Java
>  * code whenever operators are notified of a change in schema.
> {code}
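
For readers new to the idea of a plan as a graph of operators, here is a
minimal sketch of it in Java. It is an illustration only, not Drill code:
the Operator interface and the scan, filter, project and execute helpers
are hypothetical names invented for this example. Drill's real operators
are configured through PhysicalOperator implementations, which are not
shown here. Each operator pulls rows from its upstream operator, so a plan
is simply the root of a chain of connected operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are Drill's real classes.
public class OperatorGraphSketch {

    /** An operator hands out an iterator over the rows it produces. */
    public interface Operator<T> {
        Iterator<T> open();
    }

    /** Leaf operator: reads rows from an in-memory list (stand-in for a file scan). */
    public static <T> Operator<T> scan(List<T> rows) {
        return rows::iterator;
    }

    /** Passes through only the upstream rows that satisfy the predicate. */
    public static <T> Operator<T> filter(Operator<T> input, Predicate<T> keep) {
        return () -> {
            Iterator<T> in = input.open();
            return new Iterator<T>() {
                private T buffered;
                private boolean ready;

                @Override
                public boolean hasNext() {
                    // Pull from upstream until a matching row is buffered.
                    while (!ready && in.hasNext()) {
                        T candidate = in.next();
                        if (keep.test(candidate)) {
                            buffered = candidate;
                            ready = true;
                        }
                    }
                    return ready;
                }

                @Override
                public T next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    ready = false;
                    return buffered;
                }
            };
        };
    }

    /** Applies a function to every upstream row. */
    public static <T, R> Operator<R> project(Operator<T> input, Function<T, R> fn) {
        return () -> {
            Iterator<T> in = input.open();
            return new Iterator<R>() {
                @Override
                public boolean hasNext() {
                    return in.hasNext();
                }

                @Override
                public R next() {
                    return fn.apply(in.next());
                }
            };
        };
    }

    /** Drains the root operator, which pulls rows through the whole graph. */
    public static <T> List<T> execute(Operator<T> root) {
        List<T> out = new ArrayList<>();
        root.open().forEachRemaining(out::add);
        return out;
    }

    /** Builds a three-operator plan: scan, then filter, then project. */
    public static List<Integer> demo() {
        Operator<Integer> source = scan(Arrays.asList(1, 2, 3, 4));
        Operator<Integer> evens = filter(source, x -> x % 2 == 0);
        Operator<Integer> plan = project(evens, x -> x * 10);
        return execute(plan);
    }
}
```

In this sketch the connections between operators are plain in-process
iterators. In a distributed engine such as the one the draft describes,
some of those connections would instead cross an RPC layer between nodes.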



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
