[ https://issues.apache.org/jira/browse/DRILL-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297327#comment-14297327 ]
Chris Westin commented on DRILL-2077:
-------------------------------------

This should go in the new markdown pages in git.

> Provide a clear starting point for new developers about what to start reading to learn about Drill
> --------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-2077
>                 URL: https://issues.apache.org/jira/browse/DRILL-2077
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jason Altekruse
>            Assignee: Jason Altekruse
>
> As part of my package-level Javadocs posted in DRILL-1904, I tried to document
> the root org.apache.drill.exec package. We should have some good information
> here, as well as in the markdown file in the git repo, about the best place to
> start reading the code to understand how Drill operates.
> Here is a description I started. I think we want to make sure this is
> informative but concise. I want to get in the rest of the package docs, so I
> am leaving this here as a TODO. Please feel free to comment, revise, or add to
> this.
> {code}
>  * A good place to start learning about Drill is exploring the query plans. A
>  * Drill physical plan is defined as a connected graph of operators that read
>  * and manipulate data. Operators are configured by implementations of the
>  * {@link PhysicalOperator} interface. These query graphs are translated into
>  * a graph of physical operators that will actually process data at query
>  * execution time. The connections between these nodes are materialized as
>  * interfaces where data is passed between different operators. Because Drill
>  * is distributed, these connections can take the form of an RPC layer between
>  * the nodes in a Drill cluster.
>  *
>  * While physical plans can be written by hand, the primary interface for
>  * Drill is SQL. Drill is targeted for compliance with the ANSI SQL 2003
>  * specification. Query parsing and optimization are handled by Calcite, an
>  * Apache incubator project, also used for planning in Apache Hive. Drill
>  * defines many planning rules and optimizations that plug into the Calcite
>  * planning engine to generate optimal plans for the Drill engine.
>  *
>  * Unlike most query systems, Drill is designed to query raw files without
>  * a predefined catalog of metadata defining the types of data or columns
>  * available in the dataset. To maintain performance in a flexible schema
>  * environment, Drill uses runtime code generation to compile custom Java
>  * code whenever operators are notified of a change in schema.
> {code}
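
For readers new to the idea of a plan as a connected graph of operators, a toy illustration may help. The sketch below is not Drill code: the Operator interface and the Scan, Filter, and Project classes are hypothetical names invented for this example, rows are plain int arrays, and everything runs in one process with no RPC layer. It only shows how operators can be wired parent-to-child and how rows flow from the leaf up to the root.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OperatorGraphSketch {

    /** Hypothetical stand-in for a runtime operator; NOT Drill's PhysicalOperator. */
    interface Operator {
        /** Returns the next row, or null once the input is exhausted. */
        int[] next();
    }

    /** Leaf operator: reads rows from an in-memory list (stands in for a file scan). */
    static final class Scan implements Operator {
        private final Iterator<int[]> rows;

        Scan(List<int[]> rows) {
            this.rows = rows.iterator();
        }

        @Override
        public int[] next() {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    /** Passes through only rows whose value in the given column is >= min. */
    static final class Filter implements Operator {
        private final Operator child;
        private final int column;
        private final int min;

        Filter(Operator child, int column, int min) {
            this.child = child;
            this.column = column;
            this.min = min;
        }

        @Override
        public int[] next() {
            int[] row;
            // Pull from the child until a row qualifies or the child is drained.
            while ((row = child.next()) != null) {
                if (row[column] >= min) {
                    return row;
                }
            }
            return null;
        }
    }

    /** Keeps a single column of each incoming row. */
    static final class Project implements Operator {
        private final Operator child;
        private final int column;

        Project(Operator child, int column) {
            this.child = child;
            this.column = column;
        }

        @Override
        public int[] next() {
            int[] row = child.next();
            return row == null ? null : new int[] {row[column]};
        }
    }

    /** Pulls every row out of the root operator of a plan. */
    static List<int[]> drain(Operator root) {
        List<int[]> out = new ArrayList<>();
        int[] row;
        while ((row = root.next()) != null) {
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> data = Arrays.asList(
                new int[] {1, 10}, new int[] {2, 25}, new int[] {3, 40});

        // Plan graph, root first: Project(col 0) <- Filter(col 1 >= 20) <- Scan
        Operator plan = new Project(new Filter(new Scan(data), 1, 20), 0);

        for (int[] row : drain(plan)) {
            System.out.println(Arrays.toString(row)); // prints [2] then [3]
        }
    }
}
```

In this sketch each connection between operators is an ordinary method call. In a distributed engine the same edge could instead be backed by a network exchange, which is the point the description makes about the RPC layer.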