Modified: drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html
URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff
==============================================================================
--- drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html (original)
+++ drill/site/trunk/content/drill/docs/planning-and-execution-options/index.html Thu Feb 26 01:16:43 2015
@@ -88,17 +88,17 @@ persist across all sessions.</p> <p>The following table contains planning and execution options that you can set at the system or session level:</p> -<div class="table-wrap"><table class="confluenceTable"><tbody><tr><th class="confluenceTh">Option name</th><th class="confluenceTh">Default value</th><th class="confluenceTh">Description</th></tr><tr><td valign="top" colspan="1" class="confluenceTd">exec.errors.verbose</td><td valign="top" colspan="1" class="confluenceTd"><p>false</p></td><td valign="top" colspan="1" class="confluenceTd"><p>This option enables or disables the verbose message that Drill returns when a query fails. When enabled, Drill provides additional information about failed queries.</p></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><span>exec.max_hash_table_size</span></td><td valign="top" colspan="1" class="confluenceTd">1073741824</td><td valign="top" colspan="1" class="confluenceTd"><span>The default maximum size for hash tables.</span></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">exec.min_hash_table_size</td><td valign="top" colspan="1" class="confluenceTd">65536</td><td valign="top" colspan="1" class="confluenceTd">The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data.
If you have large data sets, you can increase this hash table size to increase performance.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd">planner.add_producer_consumer</td><td valign="top" colspan="1" class="confluenceTd"><p>false</p><p> </p></td><td valign="top" colspan="1" class="confluenceTd"><p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. <span style="line-height: 1.4285715;background-color: transparent;">If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. </span></p><p>If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">planner.broadcast_threshold</td><td valign="top" colspan="1" class="confluenceTd">1000000</td><td valign="top" colspan="1" class="confluenceTd"><span style="color: rgb(34,34,34);">Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation.
(The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</span></td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><p>planner.enable_broadcast_join<br />planner.enable_hashagg<br />planner.enable_hashjoin<br />planner.enable_mergejoin<br />planner.enable_multiphase_agg<br />planner.enable_streamagg</p></td><td valign="top" colspan="1" class="confluenceTd">true</td><td valign="top" colspan="1" class="confluenceTd"><p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">planner.producer_consumer_queue_size</td><td valign="top" colspan="1" class="confluenceTd">10</td><td valign="top" colspan="1" class="confluenceTd">Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd">planner.slice_target</td><td valign="top" colspan="1" class="confluenceTd">100000</td><td valign="top" colspan="1" class="confluenceTd">The number of records manipulated within a fragment before Drill parallelizes them.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd"><p>planner.width.max_per_node</p><p> </p></td><td valign="top" colspan="1" class="confluenceTd"><p>The default depends on the number of cores on each node.</p></td><td valign="top" colspan="1" class="confluenceTd"><p>In this context "width" refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster.</p><p><span>A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</span><span> </span></p><p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster.</p><p>The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account:</p><div class="code panel pdl" style="border-width: 1px;"><div class="codeContent panelContent pdl"> +<table ><tbody><tr><th >Option name</th><th >Default value</th><th >Description</th></tr><tr><td valign="top" colspan="1" >exec.errors.verbose</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables the verbose message that Drill returns when a query fails.
When enabled, Drill provides additional information about failed queries.</p></td></tr><tr><td valign="top" colspan="1" ><span>exec.max_hash_table_size</span></td><td valign="top" colspan="1" >1073741824</td><td valign="top" colspan="1" ><span>The default maximum size for hash tables.</span></td></tr><tr><td valign="top" colspan="1" >exec.min_hash_table_size</td><td valign="top" colspan="1" >65536</td><td valign="top" colspan="1" >The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. If you have large data sets, you can increase this hash table size to increase performance.</td></tr><tr><td valign="top" colspan="1" >planner.add_producer_consumer</td><td valign="top" colspan="1" ><p>false</p><p> </p></td><td valign="top" colspan="1" ><p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. <span style="line-height: 1.4285715;background-color: transparent;">If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment. </span></p><p>If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p></td></tr><tr><td valign="top" colspan="1" >planner.broadcast_threshold</td><td valign="top" colspan="1" >1000000</td><td valign="top" colspan="1" ><span style="color: rgb(34,34,34);">Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query.
Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The "right side" of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</span></td></tr><tr><td valign="top" colspan="1" ><p>planner.enable_broadcast_join<br />planner.enable_hashagg<br />planner.enable_hashjoin<br />planner.enable_mergejoin<br />planner.enable_multiphase_agg<br />planner.enable_streamagg</p></td><td valign="top" colspan="1" >true</td><td valign="top" colspan="1" ><p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p></td></tr><tr><td valign="top" colspan="1" >planner.producer_consumer_queue_size</td><td valign="top" colspan="1" >10</td><td valign="top" colspan="1" >Determines how much data to prefetch from disk (in record batches) out of band of query execution.
The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</td></tr><tr><td valign="top" colspan="1" >planner.slice_target</td><td valign="top" colspan="1" >100000</td><td valign="top" colspan="1" >The number of records manipulated within a fragment before Drill parallelizes them.</td></tr><tr><td valign="top" colspan="1" ><p>planner.width.max_per_node</p><p> </p></td><td valign="top" colspan="1" ><p>The default depends on the number of cores on each node.</p></td><td valign="top" colspan="1" ><p>In this context "width" refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster.</p><p><span>A physical plan consists of intermediate operations, known as query "fragments," that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. 
For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</span><span> </span></p><p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster.</p><p>The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account:</p> <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[number of active drillbits (typically one per node) * number of cores per node * 0.7]]></script> -</div></div><p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p><div class="code panel pdl" style="border-width: 1px;"><div class="codeContent panelContent pdl"> +<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p> <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[1 * 4 * 0.7 = 3]]></script> -</div></div><p>When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">planner.width.max_per_query</td><td valign="top" colspan="1" class="confluenceTd">1000</td><td valign="top" colspan="1" class="confluenceTd"><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p><div class="code panel pdl" style="border-width: 1px;"><div class="codeContent panelContent pdl"> +<p>When you modify the default setting, you can supply any meaningful number. 
The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" >planner.width.max_per_query</td><td valign="top" colspan="1" >1000</td><td valign="top" colspan="1" ><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p> <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((number of nodes * width.max_per_node), width.max_per_query)]]></script> -</div></div><p>For example, on a 4-node cluster where <span><code>width.max_per_node</code> is set to 6 and </span><span><code>width.max_per_query</code> is set to 30:</span></p><div class="code panel pdl" style="border-width: 1px;"><div class="codeContent panelContent pdl"> +<p>For example, on a 4-node cluster where <span><code>width.max_per_node</code> is set to 6 and </span><span><code>width.max_per_query</code> is set to 30:</span></p> <script type="syntaxhighlighter" class="theme: Default; brush: java; gutter: false"><![CDATA[min((4 * 6), 30) = 24]]></script> -</div></div><p>In this case, the effective maximum width per query is 24, not 30.</p></td></tr><tr><td valign="top" colspan="1" class="confluenceTd">store.format</td><td valign="top" colspan="1" class="confluenceTd"> </td><td valign="top" colspan="1" class="confluenceTd">Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd">store.json.all_text_mode</td><td valign="top" colspan="1" class="confluenceTd"><p>false</p></td><td valign="top" colspan="1" class="confluenceTd"><p>This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. 
This allows complicated JSON to be read using CASE and CAST.</p></td></tr><tr><td valign="top" class="confluenceTd">store.parquet.block-size</td><td valign="top" class="confluenceTd"><p>536870912</p></td><td valign="top" class="confluenceTd">T<span style="color: rgb(34,34,34);">arget size for a parquet row group, which should be equal to or less than the configured HDFS block size. </span></td></tr></tbody></table></div> +<p>In this case, the effective maximum width per query is 24, not 30.</p></td></tr><tr><td valign="top" colspan="1" >store.format</td><td valign="top" colspan="1" > </td><td valign="top" colspan="1" >Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command.</td></tr><tr><td valign="top" colspan="1" >store.json.all_text_mode</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. This allows complicated JSON to be read using CASE and CAST.</p></td></tr><tr><td valign="top" >store.parquet.block-size</td><td valign="top" ><p>536870912</p></td><td valign="top" >T<span style="color: rgb(34,34,34);">arget size for a parquet row group, which should be equal to or less than the configured HDFS block size. </span></td></tr></tbody></table> </div>
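Reviewer note on the two width formulas in the table above: they can be checked with a small sketch. This is an illustration only, not Drill code; the function names and parameters are hypothetical. The documented example (1 * 4 * 0.7 = 3) corresponds to rounding 2.8 up, whereas the surrounding text says "rounded down" (which would give 2), so the rounding direction is left as a parameter, and rounding up is assumed as the default only because it reproduces the documented result.

```python
def default_width_per_node(active_drillbits, cores_per_node, round_up=True):
    """Sketch of: number of active drillbits * cores per node * 0.7.

    Hypothetical helper. Integer arithmetic (tenths) avoids floating-point
    surprises when scaling by 0.7.
    """
    tenths = active_drillbits * cores_per_node * 7  # capacity * 0.7, in tenths
    if round_up:
        return (tenths + 9) // 10  # ceiling of tenths / 10
    return tenths // 10  # floor of tenths / 10


def effective_width_per_query(nodes, max_per_node, max_per_query):
    """Sketch of: min((number of nodes * width.max_per_node), width.max_per_query)."""
    return min(nodes * max_per_node, max_per_query)


if __name__ == "__main__":
    # Single node, 2 cores with hyper-threading (4 logical cores).
    print(default_width_per_node(1, 4))                   # rounding up
    print(default_width_per_node(1, 4, round_up=False))   # rounding down
    # 4-node cluster, max_per_node = 6, max_per_query = 30.
    print(effective_width_per_query(4, 6, 30))
```

With rounding up, the single-node case reproduces the value shown in the table; the 4-node case is capped by the per-node product rather than by max_per_query.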
Modified: drill/site/trunk/content/drill/docs/ports-used-by-drill/index.html
URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/ports-used-by-drill/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff
==============================================================================
--- drill/site/trunk/content/drill/docs/ports-used-by-drill/index.html (original)
+++ drill/site/trunk/content/drill/docs/ports-used-by-drill/index.html Thu Feb 26 01:16:43 2015
@@ -70,7 +70,7 @@ <div class="int_text" align="left"><p>The following table provides a list of the ports that Drill uses, the port type, and a description of how Drill uses the port:</p> -<div class="table-wrap"><table class="confluenceTable"><tbody><tr><th class="confluenceTh">Port</th><th colspan="1" class="confluenceTh">Type</th><th class="confluenceTh">Description</th></tr><tr><td valign="top" class="confluenceTd">8047</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">Needed for <span style="color: rgb(34,34,34);">the Drill Web UI.</span><span style="color: rgb(34,34,34);"> </span></td></tr><tr><td valign="top" class="confluenceTd">31010</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the<br />cluster nodes. Also needed for the Drill Web UI.</td></tr><tr><td valign="top" class="confluenceTd">31011</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" class="confluenceTd">Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd">31012</td><td valign="top" colspan="1" class="confluenceTd">TCP</td><td valign="top" colspan="1" class="confluenceTd">Data port address. Used between nodes in a Drill cluster.
<br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" class="confluenceTd">46655</td><td valign="top" colspan="1" class="confluenceTd">UDP</td><td valign="top" colspan="1" class="confluenceTd">Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.</td></tr></tbody></table></div> +<table ><tbody><tr><th >Port</th><th colspan="1" >Type</th><th >Description</th></tr><tr><td valign="top" >8047</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Needed for <span style="color: rgb(34,34,34);">the Drill Web UI.</span><span style="color: rgb(34,34,34);"> </span></td></tr><tr><td valign="top" >31010</td><td valign="top" colspan="1" >TCP</td><td valign="top" >User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the<br />cluster nodes. Also needed for the Drill Web UI.</td></tr><tr><td valign="top" >31011</td><td valign="top" colspan="1" >TCP</td><td valign="top" >Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >31012</td><td valign="top" colspan="1" >TCP</td><td valign="top" colspan="1" >Data port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1" >46655</td><td valign="top" colspan="1" >UDP</td><td valign="top" colspan="1" >Used for JGroups and Infinispan.
Needed for multi-node installation of Apache Drill.</td></tr></tbody></table> </div>
Modified: drill/site/trunk/content/drill/docs/progress-reports/index.html
URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/progress-reports/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff
==============================================================================
--- drill/site/trunk/content/drill/docs/progress-reports/index.html (original)
+++ drill/site/trunk/content/drill/docs/progress-reports/index.html Thu Feb 26 01:16:43 2015
@@ -71,7 +71,7 @@ progression of the project, summary of mailing list discussions, and events:</p> <ul> -<li><a href="/confluence/display/DRILL/2014+Q1+Drill+Report">2014 Q1 Drill Report</a></li> +<li><a href="/drill/docs/2014-q1-drill-report">2014 Q1 Drill Report</a></li> </ul> </div>
Modified: drill/site/trunk/content/drill/docs/project-bylaws/index.html
URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/project-bylaws/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff
==============================================================================
--- drill/site/trunk/content/drill/docs/project-bylaws/index.html (original)
+++ drill/site/trunk/content/drill/docs/project-bylaws/index.html Thu Feb 26 01:16:43 2015
@@ -67,7 +67,7 @@ </div> -<div class="int_text" align="left"><h1 id="introduction">Introduction</h1> +<div class="int_text" align="left"><h2 id="introduction">Introduction</h2> <p>This document defines the bylaws under which the Apache Drill project operates. It defines the roles and responsibilities of the project, who may @@ -85,13 +85,13 @@ development, please refer to the <a href project</a> for more information on how Apache projects operate.</p> -<h1 id="roles-and-responsibilities">Roles and Responsibilities</h1> +<h2 id="roles-and-responsibilities">Roles and Responsibilities</h2> <p>Apache projects define a set of roles with associated rights and responsibilities.
These roles govern what tasks an individual may perform within the project. The roles are defined in the following sections.</p> -<h2 id="users">Users</h2> +<h3 id="users">Users</h3> <p>The most important participants in the project are people who use our software. The majority of our contributors start out as users and guide their @@ -102,14 +102,14 @@ in the form of bug reports and feature s in the Apache community by helping other users on mailing lists and user support forums.</p> -<h2 id="contributors">Contributors</h2> +<h3 id="contributors">Contributors</h3> <p>All of the volunteers who are contributing time, code, documentation, or resources to the Drill Project. A contributor that makes sustained, welcome contributions to the project may be invited to become a committer, though the exact timing of such invitations depends on many factors.</p> -<h2 id="committers">Committers</h2> +<h3 id="committers">Committers</h3> <p>The project's committers are responsible for the project's technical management. Committers have access to a specified set of subproject's code @@ -137,7 +137,7 @@ to become a member of the PMC. The form code. It can also include code review, helping out users on the mailing lists, documentation, etc.</p> -<h2 id="project-management-committee">Project Management Committee</h2> +<h3 id="project-management-committee">Project Management Committee</h3> <p>The PMC is responsible to the board and the ASF for the management and oversight of the Apache Drill codebase. The responsibilities of the PMC @@ -172,7 +172,7 @@ the chair resigns before the end of his recommend a new chair using lazy consensus, but the decision must be ratified by the Apache board.</p> -<h1 id="decision-making">Decision Making</h1> +<h2 id="decision-making">Decision Making</h2> <p>Within the Drill project, different types of decisions require different forms of approval. 
For example, the previous section describes several decisions @@ -180,7 +180,7 @@ which require 'lazy consensus' a performed, the types of approvals, and which types of decision require which type of approval.</p> -<h2 id="voting">Voting</h2> +<h3 id="voting">Voting</h3> <p>Decisions regarding the project are made by votes on the primary project development mailing list @@ -191,7 +191,7 @@ indicated by subject line starting with items for approval and these should be clearly separated. Voting is carried out by replying to the vote mail. Voting may take four flavors.</p> -<p><table class="confluenceTable"><tbody><tr><td valign="top" class="confluenceTd"><p>Vote</p></td><td valign="top" class="confluenceTd"><p> </p></td></tr><tr><td valign="top" class="confluenceTd"><p>+1</p></td><td valign="top" class="confluenceTd"><p>'Yes,' 'Agree,' or 'the action should be performed.' In general, this vote also indicates a willingness on the behalf of the voter in 'making it happen'.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>+0</p></td><td valign="top" class="confluenceTd"><p>This vote indicates a willingness for the action under consideration to go ahead. The voter, however will not be able to help.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>-0</p></td><td valign="top" class="confluenceTd"><p>This vote indicates that the voter does not, in general, agree with the proposed action but is not concerned enough to prevent the action going ahead.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>-1</p></td><td valign="top" class="confluenceTd"><p>This is a negative vote. On issues where consensus is required, this vote counts as a <strong>veto</strong>. All vetoes must contain an explanation of why the veto is appropriate. Vetoes with no explanation are void.
It may also be appropriate for a -1 vote to include an alternative course of action.</p></td></tr></tbody></table></p> +<p><table ><tbody><tr><td valign="top" >Vote</td><td valign="top" > </td></tr><tr><td valign="top" >+1</td><td valign="top" >'Yes,' 'Agree,' or 'the action should be performed.' In general, this vote also indicates a willingness on the behalf of the voter in 'making it happen'.</td></tr><tr><td valign="top" >+0</td><td valign="top" >This vote indicates a willingness for the action under consideration to go ahead. The voter, however will not be able to help.</td></tr><tr><td valign="top" >-0</td><td valign="top" >This vote indicates that the voter does not, in general, agree with the proposed action but is not concerned enough to prevent the action going ahead.</td></tr><tr><td valign="top" >-1</td><td valign="top" >This is a negative vote. On issues where consensus is required, this vote counts as a <strong>veto</strong>. All vetoes must contain an explanation of why the veto is appropriate. Vetoes with no explanation are void. It may also be appropriate for a -1 vote to include an alternative course of action.</td></tr></tbody></table></p> <p>All participants in the Drill project are encouraged to show their agreement with or against a particular action by voting. For technical decisions, only @@ -206,15 +206,15 @@ sent when the commit is made. Note that efforts should be made to discuss issues when they are still patches before the code is committed.</p> -<h2 id="approvals">Approvals</h2> +<h3 id="approvals">Approvals</h3> <p>These are the types of approvals that can be sought.
Different actions require different types of approvals.</p> -<table class="confluenceTable"><tbody><tr><td valign="top" class="confluenceTd"><p>Approval Type</p></td><td valign="top" class="confluenceTd"><p> </p></td></tr><tr><td valign="top" class="confluenceTd"><p>Consensus</p></td><td valign="top" class="confluenceTd"><p>For this to pass, all voters with binding votes must vote and there can be no binding vetoes (-1). Consensus votes are rarely required due to the impracticality of getting all eligible voters to cast a vote.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Lazy Consensus</p></td><td valign="top" class="confluenceTd"><p>Lazy consensus requires 3 binding +1 votes and no binding vetoes.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Lazy Majority</p></td><td valign="top" class="confluenceTd"><p>A lazy majority vote requires 3 binding +1 votes and more binding +1 votes that -1 votes.</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Lazy Approval</p></td><td valign="top" class="confluenceTd"><p>An action with lazy approval is implicitly allowed unless a -1 vote is received, at which time, depending on the type of action, either lazy majority or lazy consensus approval must be obtained.</p></td></tr></tbody></table> +<table ><tbody><tr><td valign="top" >Approval Type</td><td valign="top" > </td></tr><tr><td valign="top" >Consensus</td><td valign="top" >For this to pass, all voters with binding votes must vote and there can be no binding vetoes (-1).
Consensus votes are rarely required due to the impracticality of getting all eligible voters to cast a vote.</td></tr><tr><td valign="top" >Lazy Consensus</td><td valign="top" >Lazy consensus requires 3 binding +1 votes and no binding vetoes.</td></tr><tr><td valign="top" >Lazy Majority</td><td valign="top" >A lazy majority vote requires 3 binding +1 votes and more binding +1 votes that -1 votes.</td></tr><tr><td valign="top" >Lazy Approval</td><td valign="top" >An action with lazy approval is implicitly allowed unless a -1 vote is received, at which time, depending on the type of action, either lazy majority or lazy consensus approval must be obtained.</td></tr></tbody></table> -<h2 id="vetoes">Vetoes</h2> +<h3 id="vetoes">Vetoes</h3> <p>A valid, binding veto cannot be overruled. If a veto is cast, it must be accompanied by a valid reason explaining the reasons for the veto. The @@ -226,7 +226,7 @@ merely that the veto is valid.</p> to withdraw his or her veto. If a veto is not withdrawn, the action that has been vetoed must be reversed in a timely manner.</p> -<h2 id="actions">Actions</h2> +<h3 id="actions">Actions</h3> <p>This section describes the various actions which are undertaken within the project, the corresponding approval required for that action and those who @@ -235,7 +235,7 @@ time that a vote must remain open, measu should not be called at times when it is known that interested members of the project will be unavailable.</p> -<table class="confluenceTable"><tbody><tr><td valign="top" class="confluenceTd"><p>Action</p></td><td valign="top" class="confluenceTd"><p>Description</p></td><td valign="top" class="confluenceTd"><p>Approval</p></td><td valign="top" class="confluenceTd"><p>Binding Votes</p></td><td valign="top" class="confluenceTd"><p>Minimum Length</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Code Change</p></td><td valign="top" class="confluenceTd"><p>A change made to a codebase of the project and committed by a committer. 
This includes source code, documentation, website content, etc.</p></td><td valign="top" class="confluenceTd"><p>Consensus approval of active committers, with a minimum of one +1. The code can be committed after the first +1</p></td><td valign="top" class="confluenceTd"><p>Active committers</p></td><td valign="top" class="confluenceTd"><p>1</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Release Plan</p></td><td valign="top" class="confluenceTd"><p>Defines the timetable and actions for a release. The plan also nominates a Release Manager.</p></td><td valign="top" class="confluenceTd"><p>Lazy majority</p></td><td valign="top" class="confluenceTd"><p>Active committers</p></td><td valign="top" class="confluenceTd"><p>3</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Product Release</p></td><td valign="top" class="confluenceTd"><p>When a release of one of the project's products is ready, a vote is required to accept the release as an official release of the project.</p></td><td valign="top" class="confluenceTd"><p>Lazy Majority</p></td><td valign="top" class="confluenceTd"><p>Active PMC members</p></td><td valign="top" class="confluenceTd"><p>3</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Adoption of New Codebase</p></td><td valign="top" class="confluenceTd"><p>When the codebase for an existing, released product is to be replaced with an alternative codebase. If such a vote fails to gain approval, the existing code base will continue.
This also covers the creation of new sub-projects within the project.</p></td><td valign="top" class="confluenceTd"><p>2/3 majority</p></td><td valign="top" class="confluenceTd"><p>Active PMC members</p></td><td valign="top" class="confluenceTd"><p>6</p></td></tr><tr><td valign="top" class="confluenceTd"><p>New Committer</p></td><td valign="top" class="confluenceTd"><p>When a new committer is proposed for the project.</p></td><td valign="top" class="confluenceTd"><p>Lazy consensus</p></td><td valign="top" class="confluenceTd"><p>Active PMC members</p></td><td valign="top" class="confluenceTd"><p>3</p></td></tr><tr><td valign="top" class="confluenceTd"><p>New PMC Member</p></td><td valign="top" class="confluenceTd"><p>When a committer is proposed for the PMC.</p></td><td valign="top" class="confluenceTd"><p>Lazy consensus</p></td><td valign="top" class="confluenceTd"><p>Active PMC members</p></td><td valign="top" class="confluenceTd"><p>3</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Committer Removal</p></td><td valign="top" class="confluenceTd"><p>When removal of commit privileges is sought. <em>Note: Such actions will also be referred to the ASF board by the PMC chair.</em></p></td><td valign="top" class="confluenceTd"><p>Consensus</p></td><td valign="top" class="confluenceTd"><p>Active PMC members (excluding the committer in question if a member of the PMC).</p></td><td valign="top" class="confluenceTd"><p>6</p></td></tr><tr><td valign="top" class="confluenceTd"><p>PMC Member Removal</p></td><td valign="top" class="confluenceTd"><p>When removal of a PMC member is sought.
<em>Note: Such actions will also be referred to the ASF board by the PMC chair.</em></p></td><td valign="top" class="confluenceTd"><p>Consensus</p></td><td valign="top" class="confluenceTd"><p>Active PMC members (excluding the member in question).</p></td><td valign="top" class="confluenceTd"><p>6</p></td></tr><tr><td valign="top" class="confluenceTd"><p>Modifying Bylaws</p></td><td valign="top" class="confluenceTd"><p>Modifying this document.</p></td><td valign="top" class="confluenceTd"><p>2/3 majority</p></td><td valign="top" class="confluenceTd"><p>Active PMC members</p></td><td valign="top" class="confluenceTd"><p>6</p></td></tr></tbody></table> +<table ><tbody><tr><td valign="top" >Action</td><td valign="top" >Description</td><td valign="top" >Approval</td><td valign="top" >Binding Votes</td><td valign="top" >Minimum Length</td></tr><tr><td valign="top" >Code Change</td><td valign="top" >A change made to a codebase of the project and committed by a committer. This includes source code, documentation, website content, etc.</td><td valign="top" >Consensus approval of active committers, with a minimum of one +1. The code can be committed after the first +1</td><td valign="top" >Active committers</td><td valign="top" >1</td></tr><tr><td valign="top" >Release Plan</td><td valign="top" >Defines the timetable and actions for a release. The plan also nominates a Release Manager.</td><td valign="top" >Lazy majority</td><td valign="top" >Active committers</td><td valign="top" >3</td></tr><tr><td valign="top" >Product Release</td><td valign="top" >When a release of one of the project's products is ready, a vote is required to accept the release as an official release of the project.</td><td valign="top" >Lazy Majority</td><td valign="top" >Active PMC members</td><td valign="top" >3</td></tr><tr><td valign="top" >Adoption of New Codebase</td><td valign="top" >When the codebase for an existing, released product is to be replaced with an alternative codebase.
If such a vote fails to gain approval, the existing code base will continue. This also covers the creation of new sub-projects within the project.</td><td valign="top" >2/3 majority</td><td valign="top" >Active PMC members</td><td valign="top" >6</td></tr><tr><td valign="top" >New Committer</td><td valign="top" >When a new committer is proposed for the project.</td><td valign="top" >Lazy consensus</td><td valign="top" >Active PMC members</td><td valign="top" >3</td></tr><tr><td valign="top" >New PMC Member</td><td valign="top" >When a committer is proposed for the PMC.</td><td valign="top" >Lazy consensus</td><td valign="top" >Active PMC members</td><td valign="top" >3</td></tr><tr><td valign="top" >Committer Removal</td><td valign="top" >When removal of commit privileges is sought. <em>Note: Such actions will also be referred to the ASF board by the PMC chair.</em></td><td valign="top" >Consensus</td><td valign="top" >Active PMC members (excluding the committer in question if a member of the PMC).</td><td valign="top" >6</td></tr><tr><td valign="top" >PMC Member Removal</td><td valign="top" >When removal of a PMC member is sought.
<em>Note: Such actions will also be referred to the ASF board by the PMC chair.</em></td><td valign="top" >Consensus</td><td valign="top" >Active PMC members (excluding the member in question).</td><td valign="top" >6</td></tr><tr><td valign="top" >Modifying Bylaws</td><td valign="top" >Modifying this document.</td><td valign="top" >2/3 majority</td><td valign="top" >Active PMC members</td><td valign="top" >6</td></tr></tbody></table> </div> Modified: drill/site/trunk/content/drill/docs/query-2-using-standard-sql-functions-clauses-and-joins/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/query-2-using-standard-sql-functions-clauses-and-joins/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/query-2-using-standard-sql-functions-clauses-and-joins/index.html (original) +++ drill/site/trunk/content/drill/docs/query-2-using-standard-sql-functions-clauses-and-joins/index.html Thu Feb 26 01:16:43 2015 @@ -74,13 +74,9 @@ where id>0 order by id limit 1; +------------+------------+ - | id | type | - +------------+------------+ - | 0001 | donut | - +------------+------------+ 1 row selected (0.318 seconds) @@ -93,13 +89,9 @@ dfs.`/Users/brumsby/drill/moredonuts.jso on tbl1.id=tbl2.id; +------------+------------+ - | id | type | - +------------+------------+ - | 0001 | donut | - +------------+------------+ 1 row selected (0.395 seconds) @@ -110,13 +102,9 @@ on tbl1.id=tbl2.id; <div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=local> select type, avg(ppu) as ppu_sum from dfs.`/Users/brumsby/drill/donuts.json` group by type; +------------+------------+ - | type | ppu_sum | - +------------+------------+ - | donut | 0.55 | - +------------+------------+ 1 row selected (0.216 seconds) @@ -124,13 +112,9 @@ on tbl1.id=tbl2.id; 0: jdbc:drill:zk=local> select type, sum(sales) as sum_by_type 
from dfs.`/Users/brumsby/drill/moredonuts.json` group by type; +------------+-------------+ - | type | sum_by_type | - +------------+-------------+ - | donut | 1194 | - +------------+-------------+ 1 row selected (0.389 seconds) Modified: drill/site/trunk/content/drill/docs/query-3-selecting-nested-data-for-a-column/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/query-3-selecting-nested-data-for-a-column/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/query-3-selecting-nested-data-for-a-column/index.html (original) +++ drill/site/trunk/content/drill/docs/query-3-selecting-nested-data-for-a-column/index.html Thu Feb 26 01:16:43 2015 @@ -75,15 +75,10 @@ the <em>fourth</em> element in the array <div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=local> select topping[3] as top from dfs.`/Users/brumsby/drill/donuts.json`; +------------+ - | top | - +------------+ - | {"id":"5007","type":"Powdered Sugar"} | - -+------------+ - ++------------ 1 row selected (0.137 seconds) </code></pre></div> <p>Note that this query produces <em>one column for all of the data</em> that is nested Modified: drill/site/trunk/content/drill/docs/query-data/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/query-data/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/query-data/index.html (original) +++ drill/site/trunk/content/drill/docs/query-data/index.html Thu Feb 26 01:16:43 2015 @@ -72,21 +72,7 @@ registered with Drill. If you connected you invoked SQLLine, you can issue SQL queries against that schema. 
If you did not indicate a schema when you invoked SQLLine, you can issue the <code>USE <schema></code> statement to run your queries against a particular schema. After you -issue the <code>USE</code> statement, you can use absolute notation, such as -<code>schema.table.column</code>.</p> - -<p>Click on any of the following links for information about various data source -queries and examples:</p> - -<ul> -<li><a href="/confluence/display/DRILL/Querying+a+File+System">Querying a File System</a></li> -<li><a href="/confluence/display/DRILL/Querying+HBase">Querying HBase</a></li> -<li><a href="/confluence/display/DRILL/Querying+Hive">Querying Hive</a></li> -<li><a href="/confluence/display/DRILL/Querying+Complex+Data">Querying Complex Data</a></li> -<li><a href="/confluence/display/DRILL/Querying+the+INFORMATION_SCHEMA">Querying the INFORMATION_SCHEMA</a></li> -<li><a href="/confluence/display/DRILL/Querying+System+Tables">Querying System Tables</a></li> -<li><a href="/confluence/display/DRILL/Drill+Interfaces">Drill Interfaces</a></li> -</ul> +issue the <code>USE</code> statement, you can use absolute notation, such as <code>schema.table.column</code>.</p> <p>You may need to use casting functions in some queries. For example, you may have to cast a string <code>"100"</code> to an integer in order to apply a math function @@ -107,23 +93,19 @@ a query.</p> <p>Remember the following tips when querying data with Drill:</p> <ul> -<li><p>Include a semicolon at the end of SQL statements, except when you issue a command with an exclamation point <code>(!). -</code>Example: <code>!set maxwidth 10000</code></p></li> -<li><p>Use backticks around file and directory names that contain special characters and also around reserved words when you query a file system .<br> +<li>Include a semicolon at the end of SQL statements, except when you issue a command with an exclamation point <code>(!). 
+</code>Example: <code>!set maxwidth 10000</code></li> +<li><p>Use backticks around file and directory names that contain special characters and also around reserved words when you query a file system.<br> The following special characters require backticks:</p> <ul> <li>. (period)</li> <li>/ (forward slash)</li> -<li>_ (underscore)</li> +<li>_ (underscore) +Example: <code>SELECT * FROM dfs.default.`sample_data/my_sample.json`;</code></li> </ul></li> -</ul> - -<p>Example: <code>SELECT * FROM dfs.default.`sample_data/my_sample.json`;</code></p> - -<ul> -<li><code>CAST</code> data to <code>VARCHAR</code> if an expression in a query returns <code>VARBINARY</code> as the result type in order to view the <code>VARBINARY</code> types as readable data. If you do not use the <code>CAST</code> function, Drill returns the results as byte data.<br> -Example: <code>CAST (VARBINARY_expr as VARCHAR(50))</code></li> +<li><p><code>CAST</code> data to <code>VARCHAR</code> if an expression in a query returns <code>VARBINARY</code> as the result type in order to view the <code>VARBINARY</code> types as readable data. 
If you do not use the <code>CAST</code> function, Drill returns the results as byte data.<br> + Example: <code>CAST (VARBINARY_expr as VARCHAR(50))</code></p></li> </ul> </div> Modified: drill/site/trunk/content/drill/docs/querying-a-file-system/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-a-file-system/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-a-file-system/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-a-file-system/index.html Thu Feb 26 01:16:43 2015 @@ -75,13 +75,12 @@ plugin>.<workspace> or hdfs.logs.</p> <p>The following example shows a query on a file system database in a Hadoop distributed file system:</p> - -<p><code>SELECT * FROM hdfs.logs.`AppServerLogs/20104/Jan/01/part0001.txt`;</code></p> - +<div class="highlight"><pre><code class="language-text" data-lang="text"> SELECT * FROM hdfs.logs.`AppServerLogs/20104/Jan/01/part0001.txt`; +</code></pre></div> <p>The default <code>dfs</code> storage plugin instance registered with Drill has a <code>default</code> workspace. If you query data in the <code>default</code> workspace, you do not need to include the workspace in the query. Refer to -<a href="https://cwiki.apache.org/confluence/display/DRILL/Workspaces">Workspaces</a> for +<a href="/drill/docs/workspaces">Workspaces</a> for more information.</p> <p>Drill supports the following file types:</p> @@ -105,16 +104,6 @@ more information.</p> <p>The extensions for these file types must match the configuration settings for your registered storage plugins. 
For example, PSV files may be defined with a <code>.tbl</code> extension, while CSV files are defined with a <code>.csv</code> extension.</p> - -<p>Click on any of the following links for more information about querying -various file types:</p> - -<ul> -<li><a href="/confluence/display/DRILL/Querying+JSON+Files">Querying JSON Files</a></li> -<li><a href="/confluence/display/DRILL/Querying+Parquet+Files">Querying Parquet Files</a></li> -<li><a href="/confluence/display/DRILL/Querying+Plain+Text+Files">Querying Plain Text Files</a></li> -<li><a href="/confluence/display/DRILL/Querying+Directories">Querying Directories</a></li> -</ul> </div> Modified: drill/site/trunk/content/drill/docs/querying-complex-data/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-complex-data/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-complex-data/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-complex-data/index.html Thu Feb 26 01:16:43 2015 @@ -121,17 +121,7 @@ Drill Web UI (<code>dfs</code> storage p <div class="highlight"><pre><code class="language-text" data-lang="text">"json" : { "type" : "json" } -</code></pre></div> -<p>Click on any of the following links to see examples of complex queries:</p> - -<ul> -<li><a href="/confluence/display/DRILL/Sample+Data%3A+Donuts">Sample Data: Donuts</a></li> -<li><a href="/confluence/display/DRILL/Query+1%3A+Selecting+Flat+Data">Query 1: Selecting Flat Data</a></li> -<li><a href="/confluence/display/DRILL/Query+2%3A+Using+Standard+SQL+Functions%2C+Clauses%2C+and+Joins">Query 2: Using Standard SQL Functions, Clauses, and Joins</a></li> -<li><a href="/confluence/display/DRILL/Query+3%3A+Selecting+Nested+Data+for+a+Column">Query 3: Selecting Nested Data for a Column</a></li> -<li><a 
href="/confluence/display/DRILL/Query+4%3A+Selecting+Multiple+Columns+Within+Nested+Data">Query 4: Selecting Multiple Columns Within Nested Data</a></li> -</ul> -</div> +</code></pre></div></div> <div id="footer" class="mw"> Modified: drill/site/trunk/content/drill/docs/querying-hbase/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-hbase/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-hbase/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-hbase/index.html Thu Feb 26 01:16:43 2015 @@ -77,15 +77,14 @@ steps:</p> <li><p>Issue the following command to start the HBase shell:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">hbase shell </code></pre></div></li> -<li><p>Issue the following commands to create a “students” table and a “clicks” table with column families in HBase: </p> +<li><p>Issue the following commands to create a “students” table and a “clicks” table with column families in HBase:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">echo "create 'students','account','address'" | hbase shell -<p><code>echo "create 'students','account','address'" | hbase shell</code></p> -<div class="highlight"><pre><code class="language-text" data-lang="text">``echo "create 'clicks','clickinfo','iteminfo'" | hbase shell`` +echo "create 'clicks','clickinfo','iteminfo'" | hbase shell </code></pre></div></li> -<li><p>Issue the following command with the provided data to create a <code>testdata.txt</code> file: </p> - -<p><code>cat > testdata.txt</code></p> - +<li><p>Issue the following command with the provided data to create a <code>testdata.txt</code> file:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">cat > testdata.txt +</code></pre></div> <p><strong>Sample Data</strong></p> <div
class="highlight"><pre><code class="language-text" data-lang="text">put 'students','student1','account:name','Alice' put 'students','student1','address:street','123 Ballmer Av' @@ -150,90 +149,68 @@ put 'clicks','click9',&# put 'clicks','click9','iteminfo:quantity','10' </code></pre></div></li> <li><p>Issue the following command to verify that the data is in the <code>testdata.txt</code> file: </p> - -<p><code>cat testdata.txt | hbase shell</code></p></li> +<div class="highlight"><pre><code class="language-text" data-lang="text"> cat testdata.txt | hbase shell +</code></pre></div></li> <li><p>Issue <code>exit</code> to leave the <code>hbase shell</code>.</p></li> -<li><p>Start Drill. Refer to <a href="/confluence/pages/viewpage.action?pageId=44994063">Starting/Stopping Drill</a> for instructions.</p></li> -<li><p>Use Drill to issue the following SQL queries on the “students” and “clicks” tables: -a. Issue the following query to see the data in the “students” table: </p> -<div class="highlight"><pre><code class="language-text" data-lang="text">``SELECT * FROM hbase.`students`;`` - -The query returns binary results: - -`Query finished, fetching results ...` - -`+----------+----------+----------+-----------+----------+----------+----------+-----------+` - -`|id | name | state | street | zipcode |` - -`+----------+----------+----------+-----------+----------+-----------+----------+-----------` - -`| [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 | [B@3e08d131 |` +<li><p>Start Drill.
Refer to <a href="/drill/docs/starting-stopping-drill">Starting/Stopping Drill</a> for instructions.</p></li> +<li><p>Use Drill to issue the following SQL queries on the “students” and “clicks” tables: </p> -`| [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 |[B@1d11c878 |` - -`| [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca |[B@147db06f |` - -`| [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b |[B@2ec515d6 |` +<ol> +<li><p>Issue the following query to see the data in the “students” table: </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM hbase.`students`; +</code></pre></div> +<p>The query returns binary results:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">Query finished, fetching results ... ++----------+----------+----------+-----------+----------+----------+----------+-----------+ +|id | name | state | street | zipcode |` ++----------+----------+----------+-----------+----------+-----------+----------+----------- +| [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 |[B@3e08d131 | +| [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 |[B@1d11c878 | +| [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca |[B@147db06f | +| [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b |[B@2ec515d6 | </code></pre></div> <p>Since Drill does not require metadata, you must use the SQL <code>CAST</code> function in -some queries to get readable query results.</p> - -<p>b.
Issue the following query, that includes the <code>CAST</code> function, to see the data in the “<code>students</code>” table:</p> - -<p><code>SELECT CAST(students.clickinfo.studentid as VarChar(20)), +some queries to get readable query results.</p></li> +<li><p>Issue the following query, that includes the <code>CAST</code> function, to see the data in the “<code>students</code>” table:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT CAST(students.clickinfo.studentid as VarChar(20)), CAST(students.account.name as VarChar(20)), CAST (students.address.state as VarChar(20)), CAST (students.address.street as VarChar(20)), CAST -(students.address.zipcode as VarChar(20)), FROM hbase.students;</code></p> - -<p><strong>Note:</strong> Use the following format when you query a column in an HBase table:<br> - <code>tablename.columnfamilyname.columnname</code><br> - For more information about column families, refer to <a href="http://hbase.apache.org/book/columnfamily.html">5.6. Column +(students.address.zipcode as VarChar(20)), FROM hbase.students; +</code></pre></div> +<p><strong>Note:</strong> Use the following format when you query a column in an HBase table:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text"> tablename.columnfamilyname.columnname +</code></pre></div> +<p>For more information about column families, refer to <a href="http://hbase.apache.org/book/columnfamily.html">5.6. Column Family</a>.</p> <p>The query returns the data:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">Query finished, fetching results ...
++----------+-------+-------+------------------+---------+` +| studentid | name | state | street | zipcode |` ++----------+-------+-------+------------------+---------+` +| student1 | Alice | CA | 123 Ballmer Av | 12345 |` +| student2 | Bob | CA | 1 Infinite Loop | 12345 |` +| student3 | Frank | CA | 435 Walker Ct | 12345 |` +| student4 | Mary | CA | 56 Southern Pkwy | 12345 |` ++----------+-------+-------+------------------+---------+` +</code></pre></div></li> +<li><p>Issue the following query on the “clicks” table to find out which students clicked on google.com:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text"> SELECT CAST(clicks.clickinfo.studentid as VarChar(200)), CAST(clicks.clickinfo.url as VarChar(200)) FROM hbase.`clicks` WHERE URL LIKE '%google%'; +</code></pre></div> +<p>The query returns the data:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">Query finished, fetching results ...` -<p><code>Query finished, fetching results ...</code></p> - -<p><code>+----------+-------+-------+------------------+---------+</code></p> - -<p><code>| studentid | name | state | street | zipcode |</code></p> - -<p><code>+----------+-------+-------+------------------+---------+</code></p> - -<p><code>| student1 | Alice | CA | 123 Ballmer Av | 12345 |</code></p> - -<p><code>| student2 | Bob | CA | 1 Infinite Loop | 12345 |</code></p> - -<p><code>| student3 | Frank | CA | 435 Walker Ct | 12345 |</code></p> - -<p><code>| student4 | Mary | CA | 56 Southern Pkwy | 12345 |</code></p> - -<p><code>+----------+-------+-------+------------------+---------+</code></p></li> ++---------+-----------+-------------------------------+-----------------------+----------+----------+ +| clickid | studentid | time | url | itemtype | quantity | ++---------+-----------+-------------------------------+-----------------------+----------+----------+ +| click1 | student1 | 2014-01-01 12:01:01.000100000 | http://www.google.com | image | 1 | +|
click3 | student2 | 2014-01-01 01:02:01.000100000 | http://www.google.com | text | 2 | +| click6 | student3 | 2013-02-01 12:01:01.000100000 | http://www.google.com | image | 1 | ++---------+-----------+-------------------------------+-----------------------+----------+----------+ +</code></pre></div></li> +</ol></li> </ol> - -<p>c. Issue the following query on the “clicks” table to find out which students clicked on google.com:</p> -<div class="highlight"><pre><code class="language-text" data-lang="text"> ``SELECT CAST(clicks.clickinfo.studentid as VarChar(200)), CAST(clicks.clickinfo.url as VarChar(200)) FROM hbase.`clicks` WHERE URL LIKE '%google%';`` - - The query returns the data: - - - `Query finished, fetching results ...` - - `+---------+-----------+-------------------------------+-----------------------+----------+----------+` - - `| clickid | studentid | time | url | itemtype | quantity |` - - `+---------+-----------+-------------------------------+-----------------------+----------+----------+` - - `| click1 | student1 | 2014-01-01 12:01:01.000100000 | http://www.google.com | image | 1 |` - - `| click3 | student2 | 2014-01-01 01:02:01.000100000 | http://www.google.com | text | 2 |` - - `| click6 | student3 | 2013-02-01 12:01:01.000100000 | http://www.google.com | image | 1 |` - - `+---------+-----------+-------------------------------+-----------------------+----------+----------+` -</code></pre></div></div> +</div> <div id="footer" class="mw"> Modified: drill/site/trunk/content/drill/docs/querying-hive/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-hive/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-hive/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-hive/index.html Thu Feb 26 01:16:43 2015 @@ -77,55 +77,35 @@ download the <a href="http://doc.mapr.co
<li><p>Issue the following command to start the Hive shell:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">hive </code></pre></div></li> -<li><p>Issue the following command from the Hive shell to import the <code>customers.csv</code> file and create a table:</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">hive> create table customers(FirstName string, -LastName string,Company string,Address string, -City string,County string,State string,Zip string, -Phone string,Fax string,Email string,Web string) -row format delimited fields terminated by ',' stored as textfile; +<li><p>Issue the following command from the Hive shell create a table schema:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">hive> create table customers(FirstName string, LastName string, Company string, Address string, City string, County string, State string, Zip string, Phone string, Fax string, Email string, Web string) row format delimited fields terminated by ',' stored as textfile; </code></pre></div></li> <li><p>Issue the following command to load the customer data into the customers table: </p> - -<p><code>Hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;</code></p></li> +<div class="highlight"><pre><code class="language-text" data-lang="text">hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;` +</code></pre></div></li> <li><p>Issue <code>quit</code> or <code>exit</code> to leave the Hive shell.</p></li> -<li><p>Start Drill. Refer to <a href="/confluence/pages/viewpage.action?pageId=44994063">Starting/Stopping Drill</a> for instructions.</p></li> +<li><p>Start Drill. 
Refer to <a href="/drill/docs/starting-stopping-drill">Starting/Stopping Drill</a> for instructions.</p></li> <li><p>Issue the following query to Drill to get the first and last names of the first ten customers in the Hive table: </p> - -<p><code>0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.</code>customers<code>limit 10;</code></p> - +<div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.`customers` limit 10;` +</code></pre></div> <p>The query returns the following results:</p> - -<p><code>+------------+------------+</code></p> - -<p><code>| firstname | lastname |</code></p> - -<p><code>+------------+------------+</code></p> - -<p><code>| Essie | Vaill |</code></p> - -<p><code>| Cruz | Roudabush |</code></p> - -<p><code>| Billie | Tinnes |</code></p> - -<p><code>| Zackary | Mockus |</code></p> - -<p><code>| Rosemarie | Fifield |</code></p> - -<p><code>| Bernard | Laboy |</code></p> - -<p><code>| Sue | Haakinson |</code></p> - -<p><code>| Valerie | Pou |</code></p> - -<p><code>| Lashawn | Hasty |</code></p> - -<p><code>| Marianne | Earman |</code></p> - -<p><code>+------------+------------+</code></p> - -<p><code>10 rows selected (1.5 seconds)</code></p> - -<p><code>0: jdbc:drill:schema=hiveremote></code></p></li> +<div class="highlight"><pre><code class="language-text" data-lang="text">+------------+------------+ +| firstname | lastname | ++------------+------------+ +| Essie | Vaill | +| Cruz | Roudabush | +| Billie | Tinnes | +| Zackary | Mockus | +| Rosemarie | Fifield | +| Bernard | Laboy | +| Sue | Haakinson | +| Valerie | Pou | +| Lashawn | Hasty | +| Marianne | Earman | ++------------+------------+ +10 rows selected (1.5 seconds) +0: jdbc:drill:schema=hiveremote> +</code></pre></div></li> </ol> </div> Modified: drill/site/trunk/content/drill/docs/querying-json-files/index.html URL: 
http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-json-files/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-json-files/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-json-files/index.html Thu Feb 26 01:16:43 2015 @@ -73,9 +73,8 @@ data. Use SQL syntax to query the sample <p>To view the data in the <code>employee.json</code> file, submit the following SQL query to Drill:</p> - -<p><code>0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`;</code></p> - +<div class="highlight"><pre><code class="language-text" data-lang="text"> 0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json`; +</code></pre></div> <p>The query returns the following results:</p> <p><strong>Example of partial output</strong></p> Modified: drill/site/trunk/content/drill/docs/querying-parquet-files/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-parquet-files/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-parquet-files/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-parquet-files/index.html Thu Feb 26 01:16:43 2015 @@ -67,13 +67,13 @@ </div> -<div class="int_text" align="left"><p>Your Drill installation includes a <code>sample-date</code> directory with Parquet files +<div class="int_text" align="left"><p>Your Drill installation includes a <code>sample-data</code> directory with Parquet files that you can query. Use SQL syntax to query the <code>region.parquet</code> and <code>nation.parquet</code> files in the <code>sample-data</code> directory.</p> -<p><strong>Note:</strong> Your Drill installation location may differ from the examples used here. 
The examples assume that Drill was installed in embedded mode on your machine following the <a href="https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes">Apache Drill in 10 Minutes </a>tutorial. If you installed Drill in distributed mode, or your <code>sample-data</code> directory differs from the location used in the examples, make sure to change the <code>sample-data</code> directory to the correct location before you run the queries.</p> +<p><strong>Note:</strong> Your Drill installation location may differ from the examples used here. The examples assume that Drill was installed in embedded mode on your machine following the <a href="/drill/docs/apache-drill-in-10-minutes/">Apache Drill in 10 Minutes </a>tutorial. If you installed Drill in distributed mode, or your <code>sample-data</code> directory differs from the location used in the examples, make sure to change the <code>sample-data</code> directory to the correct location before you run the queries.</p> -<h4 id="region-file">Region File</h4> +<h2 id="region-file">Region File</h2> <p>If you followed the Apache Drill in 10 Minutes instructions to install Drill in embedded mode, the path to the parquet file varies between operating @@ -83,18 +83,15 @@ systems.</p> your operating system:</p> <ul> -<li><p>Linux<br> -<code>SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample- -data/region.parquet`;</code></p> - -<ul> -<li>Mac OS X<br> -<code>SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample- -data/region.parquet`;</code></li> -<li>Windows<br> -<code>SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample- -data\region.parquet`;</code></li> -</ul></li> +<li><p>Linux </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample-data/region.parquet`; +</code></pre></div></li> +<li><p>Mac OS X </p> +<div class="highlight"><pre><code class="language-text" 
data-lang="text">SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample-data/region.parquet`; +</code></pre></div></li> +<li><p>Windows </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample-data\region.parquet`; +</code></pre></div></li> </ul> <p>The query returns the following results:</p> @@ -110,7 +107,7 @@ data\region.parquet`;</code></li> 5 rows selected (0.165 seconds) 0: jdbc:drill:zk=local> </code></pre></div> -<h4 id="nation-file">Nation File</h4> +<h2 id="nation-file">Nation File</h2> <p>If you followed the Apache Drill in 10 Minutes instructions to install Drill in embedded mode, the path to the parquet file varies between operating @@ -120,15 +117,15 @@ systems.</p> your operating system:</p> <ul> -<li><p>Linux<br> -<code>SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample- -data/nation.parquet`;</code></p></li> -<li><p>Mac OS X<br> -<code>SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample- -data/nation.parquet`;</code></p></li> -<li><p>Windows<br> -<code>SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample- -data\nation.parquet`;</code></p></li> +<li><p>Linux </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM dfs.`/opt/drill/apache-drill-0.4.0-incubating/sample-data/nation.parquet`; +</code></pre></div></li> +<li><p>Mac OS X </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM dfs.`/Users/max/drill/apache-drill-0.4.0-incubating/sample-data/nation.parquet`; +</code></pre></div></li> +<li><p>Windows </p> +<div class="highlight"><pre><code class="language-text" data-lang="text">SELECT * FROM dfs.`C:\drill\apache-drill-0.4.0-incubating\sample-data\nation.parquet`; +</code></pre></div></li> </ul> <p>The query returns the following results:</p> Modified: 
drill/site/trunk/content/drill/docs/querying-plain-text-files/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-plain-text-files/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-plain-text-files/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-plain-text-files/index.html Thu Feb 26 01:16:43 2015 @@ -108,9 +108,9 @@ records:</p> </code></pre></div> <p>Drill recognizes each row as an array of values and returns one column for each row.</p> +<div class="highlight"><pre><code class="language-text" data-lang="text"> 0: jdbc:drill:zk=local> select * from dfs.`/Users/brumsby/drill/plays.csv`; -<p>0: jdbc:drill:zk=local> select * from dfs.<code>/Users/brumsby/drill/plays.csv</code>;</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">+------------+ ++------------+ | columns | +------------+ | ["1599","As You Like It"] | @@ -128,10 +128,9 @@ each row.</p> <p>You can use the <code>COLUMNS[n]</code> syntax in the SELECT list to return these CSV rows in a more readable, column by column, format. (This syntax uses a zero- based index, so the first column is column <code>0</code>.)</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`/Users/brumsby/drill/plays.csv`; -<p>0: jdbc:drill:zk=local> select columns[0], columns[1] -from dfs.<code>/Users/brumsby/drill/plays.csv</code>;</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">+------------+------------+ ++------------+------------+ | EXPR$0 | EXPR$1 | +------------+------------+ | 1599 | As You Like It | @@ -165,10 +164,10 @@ from dfs.`/Users/brumsby/drill/plays.csv <p>You cannot refer to the aliases in subsequent clauses of the query. 
Use the original <code>columns[n]</code> syntax, as shown in the WHERE clause for the following example:</p> +<div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=local> select columns[0] as `Year`, columns[1] as Play +from dfs.`/Users/brumsby/drill/plays.csv` where columns[0]>1599; -<p>0: jdbc:drill:zk=local> select columns[0] as <code>Year</code>, columns[1] as Play -from dfs.<code>/Users/brumsby/drill/plays.csv</code> where columns[0]>1599;</p> -<div class="highlight"><pre><code class="language-text" data-lang="text">+------------+------------+ ++------------+------------+ | Year | Play | +------------+------------+ | 1601 | Twelfth Night | Modified: drill/site/trunk/content/drill/docs/querying-system-tables/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/querying-system-tables/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/querying-system-tables/index.html (original) +++ drill/site/trunk/content/drill/docs/querying-system-tables/index.html Thu Feb 26 01:16:43 2015 @@ -126,7 +126,7 @@ requests.</p> <p>Query the drillbits, version, and options tables in the sys database.</p> -<h6 id="query-the-drillbits-table.">Query the drillbits table.</h6> +<h3 id="query-the-drillbits-table.">Query the drillbits table.</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=10.10.100.113:5181> select * from drillbits; +------------------+------------+--------------+------------+---------+ | host | user_port | control_port | data_port | current| @@ -138,23 +138,23 @@ requests.</p> 3 rows selected (0.146 seconds) </code></pre></div> <ul> -<li><p>host<br> -The name of the node running the Drillbit service.</p></li> -<li><p>user-port<br> +<li>host<br> +The name of the node running the Drillbit service.</li> +<li>user-port<br> The user port address, 
used between nodes in a cluster for connecting to -external clients and for the Drill Web UI. </p></li> -<li><p>control_port<br> +external clients and for the Drill Web UI.<br></li> +<li>control_port<br> The control port address, used between nodes for multi-node installation of -Apache Drill.</p></li> -<li><p>data_port<br> +Apache Drill.</li> +<li>data_port<br> The data port address, used between nodes for multi-node installation of -Apache Drill.</p></li> -<li><p>current<br> +Apache Drill.</li> +<li>current<br> True means the Drillbit is connected to the session or client running the -query. This Drillbit is the Foreman for the current session. </p></li> +query. This Drillbit is the Foreman for the current session.<br></li> </ul> -<h6 id="query-the-version-table.">Query the version table.</h6> +<h3 id="query-the-version-table.">Query the version table.</h3> <div class="highlight"><pre><code class="language-text" data-lang="text">0: jdbc:drill:zk=10.10.100.113:5181> select * from version; +------------+----------------+-------------+-------------+------------+ | commit_id | commit_message | commit_time | build_email | build_time | @@ -164,21 +164,21 @@ query. This Drillbit is the Foreman for 1 row selected (0.144 seconds) </code></pre></div> <ul> -<li><p>commit_id<br> +<li>commit_id<br> The github id of the release you are running. 
For example, <https://github.com -/apache/drill/commit/e3ab2c1760ad34bda80141e2c3108f7eda7c9104></p></li> -<li><p>commit_message<br> -The message explaining the change.</p></li> -<li><p>commit_time<br> -The date and time of the change.</p></li> -<li><p>build_email<br> +/apache/drill/commit/e3ab2c1760ad34bda80141e2c3108f7eda7c9104></li> +<li>commit_message<br> +The message explaining the change.</li> +<li>commit_time<br> +The date and time of the change.</li> +<li>build_email<br> The email address of the person who made the change, which is unknown in this -example.</p></li> -<li><p>build_time<br> -The time that the release was built.</p></li> +example.</li> +<li>build_time<br> +The time that the release was built.</li> </ul> -<h6 id="query-the-options-table.">Query the options table.</h6> +<h3 id="query-the-options-table.">Query the options table.</h3> <p>Drill provides system, session, and boot options that you can query.</p> @@ -201,27 +201,27 @@ The time that the release was built.</p> 10 rows selected (0.334 seconds) </code></pre></div> <ul> -<li><p>name<br> -The name of the option.</p></li> -<li><p>kind<br> -The data type of the option value.</p></li> -<li><p>type<br> -The type of options in the output: system, session, or boot.</p></li> -<li><p>num_val<br> -The default value, which is of the long or int data type; otherwise, null.</p></li> -<li><p>string_val<br> -The default value, which is a string; otherwise, null.</p></li> -<li><p>bool_val<br> -The default value, which is true or false; otherwise, null.</p></li> -<li><p>float_val<br> +<li>name<br> +The name of the option.</li> +<li>kind<br> +The data type of the option value.</li> +<li>type<br> +The type of options in the output: system, session, or boot.</li> +<li>num_val<br> +The default value, which is of the long or int data type; otherwise, null.</li> +<li>string_val<br> +The default value, which is a string; otherwise, null.</li> +<li>bool_val<br> +The default value, which is true or false; 
otherwise, null.</li> +<li>float_val<br> The default value, which is of the double, float, or long double data type; -otherwise, null.</p></li> +otherwise, null.</li> </ul> -<p>For information about how to configure Drill system and session options, see<a href="https://cwiki.apache.org/confluence/display/DR%0AILL/Planning+and+Execution+Options"> +<p>For information about how to configure Drill system and session options, see<a href="/drill/docs/planning-and-execution-options"> Planning and Execution Options</a>.</p> -<p>For information about how to configure Drill start-up options, see<a href="https://cwiki.apache.org/confluence/display/DRILL/Start-Up+Options"> Start-Up +<p>For information about how to configure Drill start-up options, see<a href="/drill/docs/start-up-options"> Start-Up Options</a>.</p> </div> Modified: drill/site/trunk/content/drill/docs/registering-a-file-system/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/registering-a-file-system/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/registering-a-file-system/index.html (original) +++ drill/site/trunk/content/drill/docs/registering-a-file-system/index.html Thu Feb 26 01:16:43 2015 @@ -81,9 +81,11 @@ the following steps:</p> <ol> <li>Navigate to <code>[http://localhost:8047](http://localhost:8047/)</code>, and select the <strong>Storage</strong> tab.</li> -<li>In the New Storage Plugin window, enter a unique name and then click <strong>Create</strong>. </li> -<li><p>In the Configuration window, provide the following configuration information for the type of file system that you are configuring as a data source. -a. 
Local file system example:</p> +<li>In the New Storage Plugin window, enter a unique name and then click <strong>Create</strong>.</li> +<li><p>In the Configuration window, provide the following configuration information for the type of file system that you are configuring as a data source.</p> + +<ol> +<li><p>Local file system example:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">{ "type": "file", "enabled": true, @@ -93,16 +95,16 @@ a. Local file system example:</p> "location": "/user/max/donuts", "writable": false, "storageformat": null - } + } }, - "formats" : { - "json" : { - "type" : "json" - } + "formats" : { + "json" : { + "type" : "json" + } + } } -} -</code></pre></div> -<p>b. Distributed file system example:</p> +</code></pre></div></li> +<li><p>Distributed file system example:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">{ "type" : "file", "enabled" : true, @@ -120,15 +122,16 @@ a. Local file system example:</p> } } } -</code></pre></div> +</code></pre></div></li> +</ol> + <p>To connect to a Hadoop file system, you must include the IP address of the name node and the port number.</p></li> <li><p>Click <strong>Enable</strong>.</p></li> </ol> <p>Once you have configured a storage plugin instance for the file system, you -can issue Drill queries against it. 
For information about querying a file -system, refer to <a href="https://cwiki.apache.org/confluence/%0Adisplay/DRILL/Connecting+to+Data+Sources#ConnectingtoDataSources-%0AQueryingaFileSystem">Querying a File System</a>.</p> +can issue Drill queries against it.</p> </div> Modified: drill/site/trunk/content/drill/docs/registering-hbase/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/registering-hbase/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/registering-hbase/index.html (original) +++ drill/site/trunk/content/drill/docs/registering-hbase/index.html Thu Feb 26 01:16:43 2015 @@ -95,9 +95,7 @@ type as “hbase” in the Drill W </ol> <p>Once you have configured a storage plugin instance for the HBase, you can -issue Drill queries against it. For information about querying an HBase data -source, refer to <a href="https://cwiki.apache.org/confluence/display/DRILL/Querying+HBase">Querying -HBase</a>.</p> +issue Drill queries against it.</p> </div> Modified: drill/site/trunk/content/drill/docs/registering-hive/index.html URL: http://svn.apache.org/viewvc/drill/site/trunk/content/drill/docs/registering-hive/index.html?rev=1662344&r1=1662343&r2=1662344&view=diff ============================================================================== --- drill/site/trunk/content/drill/docs/registering-hive/index.html (original) +++ drill/site/trunk/content/drill/docs/registering-hive/index.html Thu Feb 26 01:16:43 2015 @@ -76,9 +76,9 @@ metastore connection information.</p> <p>Currently, Drill only works with Hive version 0.12. 
To access Hive tables using custom SerDes or InputFormat/OutputFormat, all nodes running Drillbits must have the SerDes or InputFormat/OutputFormat <code>JAR</code> files in the -<code><drill_installation_directory>/jars/3rdparty</code>folder.</p> +<code><drill_installation_directory>/jars/3rdparty</code> folder.</p> -<p>Hive Remote Metastore</p> +<h2 id="hive-remote-metastore">Hive Remote Metastore</h2> <p>In this configuration, the Hive metastore runs as a separate service outside of Hive. Drill communicates with the Hive metastore through Thrift. The @@ -91,7 +91,7 @@ in the Drill Web UI to configure a conne <p>To register a remote Hive metastore with Drill, complete the following steps:</p> <ol> -<li><p>Issue the following command to start the Hive metastore service on the system specified in the <code>hive.metastore.uris</code>: </p> +<li><p>Issue the following command to start the Hive metastore service on the system specified in the <code>hive.metastore.uris</code>:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">hive --service metastore </code></pre></div></li> <li><p>Navigate to <a href="http://localhost:8047/">http://localhost:8047</a>, and select the <strong>Storage</strong> tab.</p></li> @@ -113,17 +113,15 @@ in the Drill Web UI to configure a conne </ol> <p>Once you have configured a storage plugin instance for a Hive data source, you -can issue Drill queries against it. For information about querying a Hive data -source, refer to <a href="https://cwiki.apache.org/confluence/display/DRILL/Querying+Hive">Querying -Hive</a>.</p> +can <a href="/drill/docs/querying-hive/">query Hive tables</a>.</p> -<h3 id="hive-embedded-metastore">Hive Embedded Metastore</h3> +<h2 id="hive-embedded-metastore">Hive Embedded Metastore</h2> <p>In this configuration, the Hive metastore is embedded within the Drill process. Provide the metastore database configuration settings in the Drill Web UI. 
Before you register Hive, verify that the driver you use to connect to the Hive metastore is in the Drill classpath located in <code>/<drill installation -dirctory>/lib/.</code>If the driver is not there, copy the driver to <code>/<drill +dirctory>/lib/.</code> If the driver is not there, copy the driver to <code>/<drill installation directory>/lib</code> on the Drill node. For more information about storage types and configurations, refer to <a href="/confluence/display/Hive/AdminManual+MetastoreAdmin">AdminManual MetastoreAdmin</a>.</p> @@ -132,8 +130,8 @@ MetastoreAdmin</a>.</p> steps:</p> <ol> -<li><p>Navigate to <code>[http://localhost:8047](http://localhost:8047/),</code>and select the <strong>Storage</strong> tab</p></li> -<li><p>In the disabled storage plugins section, click <strong>Update</strong> next to <code>hive</code> instance.</p></li> +<li>Navigate to <code>[http://localhost:8047](http://localhost:8047/)</code>, and select the <strong>Storage</strong> tab</li> +<li>In the disabled storage plugins section, click <strong>Update</strong> next to <code>hive</code> instance.</li> <li><p>In the configuration window, add the database configuration settings.</p> <p><strong>Example</strong></p> @@ -152,11 +150,6 @@ steps:</p> <div class="highlight"><pre><code class="language-text" data-lang="text">export HADOOP_CLASSPATH=/<directory path>/hadoop/hadoop-0.20.2 </code></pre></div></li> </ol> - -<p>Once you have configured a storage plugin instance for the Hive, you can issue -Drill queries against it. For information about querying a Hive data source, -refer to <a href="https://cwiki.apache.org/confluence/display/DRILL/Querying+Hive">Querying -Hive</a>.</p> </div>