http://git-wip-us.apache.org/repos/asf/impala/blob/b4ad38a9/docs/build/html/topics/impala_porting.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_porting.html b/docs/build/html/topics/impala_porting.html
index b0cc056..c464e28 100644
--- a/docs/build/html/topics/impala_porting.html
+++ b/docs/build/html/topics/impala_porting.html
@@ -1,28 +1,54 @@
+<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html
- SYSTEM "about:legacy-compat">
-<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.12x"><meta name="version" content="Impala 2.12x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="porting"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Porting SQL from Other Database Systems to Impala</title></head><body id="porting"><main role="main"><article role="article" aria-labelledby="ariaid-title1">
+ PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+
+<meta name="copyright" content="(C) Copyright 2018" />
+<meta name="DC.rights.owner" content="(C) Copyright 2018" />
+<meta name="DC.Type" content="concept" />
+<meta name="DC.Title" content="Porting SQL from Other Database Systems to Impala" />
+<meta name="DC.Relation" scheme="URI" content="../topics/impala_langref.html" />
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="DC.Format" content="XHTML" />
+<meta name="DC.Identifier" content="porting" />
+<link rel="stylesheet" type="text/css" href="../commonltr.css" />
+<title>Porting SQL from Other Database Systems to Impala</title>
+</head>
+<body id="porting">
+
 <h1 class="title topictitle1" id="ariaid-title1">Porting SQL from Other Database Systems to Impala</h1>
+
 <div class="body conbody">
- <p class="p">
-
- Although Impala uses standard SQL for queries, you might need to modify SQL source when bringing applications
- to Impala, due to variations in data types, built-in functions, vendor language extensions, and
- Hadoop-specific syntax. Even when SQL is working correctly, you might make further minor modifications for
- best performance.
- </p>
+ <p class="p"> Although Impala uses standard SQL for queries, you might need to modify
+ SQL source when bringing applications to Impala, due to variations in data
+ types, built-in functions, vendor language extensions, and Hadoop-specific
+ syntax. Even when SQL is working correctly, you might make further minor
+ modifications for best performance.
</p>
+
 <p class="p toc inpage"></p>
+
 </div>
- <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="porting__porting_ddl_dml">
+
+ <div class="related-links">
+<div class="familylinks">
+<div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref.html">Impala SQL Language Reference</a></div>
+</div>
+</div><div class="topic concept nested1" aria-labelledby="ariaid-title2" id="porting_ddl_dml">
 <h2 class="title topictitle2" id="ariaid-title2">Porting DDL and DML Statements</h2>
+
 <div class="body conbody">
 <p class="p">
@@ -32,23 +58,29 @@
 account for the Impala partitioning scheme and Hadoop file formats.
 </p>
+
 <p class="p">
 Expect SQL queries to have a much higher degree of compatibility. With modest rewriting to address vendor extensions and features not yet supported in Impala, you might be able to run identical or almost-identical query text on both systems.
 </p>
+
 <p class="p">
 Therefore, consider separating out the DDL into a separate Impala-specific setup script. Focus your reuse and ongoing tuning efforts on the code for SQL queries.
 </p>
+
 </div>
- </article>
- <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="porting__porting_data_types">
+ </div>
+
+
+ <div class="topic concept nested1" aria-labelledby="ariaid-title3" id="porting_data_types">
 <h2 class="title topictitle2" id="ariaid-title3">Porting Data Types from Other Database Systems</h2>
+
 <div class="body conbody">
 <ul class="ul">
@@ -64,8 +96,10 @@
 However, for performance reasons, it is still preferable to use <code class="ph codeph">STRING</code> columns where practical.)
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 For national language character types such as <code class="ph codeph">NCHAR</code>, <code class="ph codeph">NVARCHAR</code>, or
@@ -73,8 +107,10 @@
 some string manipulation operations only work correctly with ASCII data. See <a class="xref" href="impala_string.html#string">STRING Data Type</a> for details.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Change any <code class="ph codeph">DATE</code>, <code class="ph codeph">DATETIME</code>, or <code class="ph codeph">TIME</code> columns to
@@ -87,6 +123,7 @@
 <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for conversion functions for different date and time formats.
 </p>
+
 <p class="p">
 You might also need to adapt date- and time-related literal values and format strings to use the supported Impala date and time formats. If you have date and time literals with different separators or
@@ -97,16 +134,20 @@
 <a class="xref" href="impala_string_functions.html#string_functions">Impala String Functions</a> for string conversion functions such as <code class="ph codeph">regexp_replace()</code>.
 </p>
+
 <p class="p">
 Instead of <code class="ph codeph">SYSDATE</code>, call the function <code class="ph codeph">NOW()</code>.
 </p>
+
 <p class="p">
 Instead of adding or subtracting directly from a date value to produce a value <var class="keyword varname">N</var> days in the past or future, use an <code class="ph codeph">INTERVAL</code> expression, for example <code class="ph codeph">NOW() + INTERVAL 30 DAYS</code>.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Although Impala supports <code class="ph codeph">INTERVAL</code> expressions for datetime arithmetic, as shown in
@@ -117,17 +158,21 @@
 <code class="ph codeph">DEADLINES</code> with an <code class="ph codeph">INT</code> column <code class="ph codeph">TIME_PERIOD</code>, you could construct dates N days in the future like so:
 </p>
+
 <pre class="pre codeblock"><code>SELECT NOW() + INTERVAL time_period DAYS from deadlines;</code></pre>
 </li>
+
 <li class="li">
 <p class="p">
 For <code class="ph codeph">YEAR</code> columns, change to the smallest Impala integer type that has sufficient range. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Change any <code class="ph codeph">DECIMAL</code> and <code class="ph codeph">NUMBER</code> types. If fixed-point precision is not
@@ -138,8 +183,10 @@
 to manipulate them. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 <code class="ph codeph">FLOAT</code>, <code class="ph codeph">DOUBLE</code>, and <code class="ph codeph">REAL</code> types are supported in
@@ -148,8 +195,10 @@
 <code class="ph codeph">DOUBLE</code> behind the scenes.) See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Most integer types from other systems have equivalents in Impala, perhaps under different names such as
@@ -158,26 +207,32 @@
 Remove any precision specifications. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Remove any <code class="ph codeph">UNSIGNED</code> constraints. All Impala numeric types are signed. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 For any types holding bitwise values, use an integer type with enough range to hold all the relevant bits within a positive integer. See <a class="xref" href="impala_datatypes.html#datatypes">Data Types</a> for details about ranges, casting, and so on for the various numeric data types.
 </p>
+
 <p class="p">
 For example, <code class="ph codeph">TINYINT</code> has a maximum positive value of 127, not 256, so to manipulate 8-bit bitfields as positive numbers switch to the next largest type <code class="ph codeph">SMALLINT</code>.
 </p>
+
 <pre class="pre codeblock"><code>[localhost:21000] > select cast(127*2 as tinyint);
 +--------------------------+
 | cast(127 * 2 as tinyint) |
@@ -199,8 +254,10 @@
 <p class="p">
 Impala does not support notation such as <code class="ph codeph">b'0101'</code> for bit literals.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 For BLOB values, use <code class="ph codeph">STRING</code> to represent <code class="ph codeph">CLOB</code> or
@@ -208,14 +265,18 @@
 such as <code class="ph codeph">BLOB</code>, <code class="ph codeph">RAW</code> <code class="ph codeph">BINARY</code>, and <code class="ph codeph">VARBINARY</code> do not currently have an equivalent in Impala.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 For Boolean-like types such as <code class="ph codeph">BOOL</code>, use the Impala <code class="ph codeph">BOOLEAN</code> type.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Because Impala currently does not support composite or nested types, any spatial data types in other
@@ -224,8 +285,10 @@
 practical, separate spatial types into separate tables so that Impala can still work with the non-spatial data.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Take out any <code class="ph codeph">DEFAULT</code> clauses. Impala can use data files produced from many different
@@ -236,8 +299,10 @@
 <code class="ph codeph">NVL</code> to substitute some other value for <code class="ph codeph">NULL</code> fields; see <a class="xref" href="impala_conditional_functions.html#conditional_functions">Impala Conditional Functions</a> for details.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Take out any constraints from your <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code>
@@ -251,6 +316,7 @@
 substitute some other value for <code class="ph codeph">NULL</code> fields; see <a class="xref" href="impala_conditional_functions.html#conditional_functions">Impala Conditional Functions</a> for details.
 </p>
+
 <p class="p">
 Do as much verification as practical before loading data into Impala. After data is loaded into Impala, you can do further verification using SQL queries to check if values have expected ranges, if values
@@ -258,8 +324,10 @@
 re-run earlier stages of the ETL process, or do an <code class="ph codeph">INSERT ... SELECT</code> statement in Impala to copy the faulty data to a new table and transform or filter out the bad values.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Take out any <code class="ph codeph">CREATE INDEX</code>, <code class="ph codeph">DROP INDEX</code>, and <code class="ph codeph">ALTER
@@ -269,8 +337,10 @@
 read operations for data warehouse-style queries, and therefore does not support indexes for its tables.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 Calls to built-in functions with out-of-range or otherwise incorrect arguments, return
@@ -280,6 +350,7 @@
 rather than <code class="ph codeph">NULL</code>. For example, unsupported <code class="ph codeph">CAST</code> operations do not raise an error in Impala:
 </p>
+
 <pre class="pre codeblock"><code>select cast('foo' as int);
 +--------------------+
 | cast('foo' as int) |
@@ -288,13 +359,16 @@
 +--------------------+</code></pre>
 </li>
+
 <li class="li">
 <p class="p">
 For any other type not supported in Impala, you could represent their values in string format and write UDFs to process them. See <a class="xref" href="impala_udf.html#udfs">Impala User-Defined Functions (UDFs)</a> for details.
 </p>
+
 </li>
+
 <li class="li">
 <p class="p">
 To detect the presence of unsupported or unconvertable data types in data files, do initial testing
@@ -302,6 +376,7 @@
 fail immediately if they encounter disallowed type conversions. See <a class="xref" href="impala_abort_on_error.html#abort_on_error">ABORT_ON_ERROR Query Option</a> for details.
 For example:
 </p>
+
 <pre class="pre codeblock"><code>set abort_on_error=true;
 select count(*) from (select * from t1);
 -- The above query will fail if the data files for T1 contain any
@@ -309,171 +384,200 @@ select count(*) from (select * from t1);
 -- For example, if T1.C1 is defined as INT but the column contains
 -- floating-point values like 1.1, the query will return an error.</code></pre>
 </li>
+
 </ul>
+
 </div>
- </article>
- <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="porting__porting_statements">
+ </div>
+
+
+ <div class="topic concept nested1" aria-labelledby="ariaid-title4" id="porting_statements">
 <h2 class="title topictitle2" id="ariaid-title4">SQL Statements to Remove or Adapt</h2>
+
 <div class="body conbody">
- <p class="p">
- Some SQL statements or clauses that you might be familiar with are not currently supported in Impala:
- </p>
+ <p class="p"> The following SQL statements or clauses are not currently supported or
+ supported with limitations in Impala: </p>
+
 <ul class="ul">
 <li class="li">
- <p class="p">
- Impala has no <code class="ph codeph">DELETE</code> statement. Impala is intended for data warehouse-style operations
- where you do bulk moves and transforms of large quantities of data. Instead of using
- <code class="ph codeph">DELETE</code>, use <code class="ph codeph">INSERT OVERWRITE</code> to entirely replace the contents of a
- table or partition, or use <code class="ph codeph">INSERT ... SELECT</code> to copy a subset of data (everything but
- the rows you intended to delete) from one table to another. See <a class="xref" href="impala_dml.html#dml">DML Statements</a> for
- an overview of Impala DML statements.
- </p>
+ <p class="p"> Impala supports the <code class="ph codeph">DELETE</code> statement only for
+ Kudu tables. </p>
+
+ <p class="p">Impala is intended for data warehouse-style operations where you do
+ bulk moves and transforms of large quantities of data.
When not
+ using Kudu tables, instead of <code class="ph codeph">DELETE</code>, use
+ <code class="ph codeph">INSERT OVERWRITE</code> to entirely replace the contents
+ of a table or partition, or use <code class="ph codeph">INSERT ... SELECT</code>
+ to copy a subset of data (everything but the rows you intended to
+ delete) from one table to another. See <a class="xref" href="impala_dml.html#dml">DML Statements</a> for an overview of Impala DML
+ statements. </p>
+
 </li>
 <li class="li">
- <p class="p">
- Impala has no <code class="ph codeph">UPDATE</code> statement. Impala is intended for data warehouse-style operations
- where you do bulk moves and transforms of large quantities of data. Instead of using
- <code class="ph codeph">UPDATE</code>, do all necessary transformations early in the ETL process, such as in the job
- that generates the original data, or when copying from one table to another to convert to a particular
- file format or partitioning scheme. See <a class="xref" href="impala_dml.html#dml">DML Statements</a> for an overview of Impala DML
- statements.
- </p>
+ <p class="p"> Impala supports the <code class="ph codeph">UPDATE</code> statement only for
+ Kudu tables.</p>
+
+ <p class="p">When not using Kudu tables, instead of <code class="ph codeph">UPDATE</code>, do
+ all necessary transformations early in the ETL process, such as in
+ the job that generates the original data, or when copying from one
+ table to another to convert to a particular file format or
+ partitioning scheme. See <a class="xref" href="impala_dml.html#dml">DML Statements</a> for an
+ overview of Impala DML statements. </p>
+
 </li>
 <li class="li">
- <p class="p">
- Impala has no transactional statements, such as <code class="ph codeph">COMMIT</code> or <code class="ph codeph">ROLLBACK</code>.
- Impala effectively works like the <code class="ph codeph">AUTOCOMMIT</code> mode in some database systems, where
- changes take effect as soon as they are made.
- </p>
+ <p class="p"> Impala has no transactional statements, such as
+ <code class="ph codeph">COMMIT</code> or <code class="ph codeph">ROLLBACK</code>. </p>
+
+ <p class="p">Impala effectively works like the <code class="ph codeph">AUTOCOMMIT</code> mode
+ in some database systems, where changes take effect as soon as they
+ are made. </p>
+
 </li>
 <li class="li">
- <p class="p">
- If your database, table, column, or other names conflict with Impala reserved words, use different
- names or quote the names with backticks. See <a class="xref" href="impala_reserved_words.html#reserved_words">Impala Reserved Words</a>
- for the current list of Impala reserved words.
- </p>
- <p class="p">
- Conversely, if you use a keyword that Impala does not recognize, it might be interpreted as a table or
- column alias. For example, in <code class="ph codeph">SELECT * FROM t1 NATURAL JOIN t2</code>, Impala does not
- recognize the <code class="ph codeph">NATURAL</code> keyword and interprets it as an alias for the table
- <code class="ph codeph">t1</code>. If you experience any unexpected behavior with queries, check the list of reserved
- words to make sure all keywords in join and <code class="ph codeph">WHERE</code> clauses are recognized.
+ <p class="p"> If your database, table, column, or other names conflict with
+ Impala reserved words, use different names or quote the names with
+ backticks. </p>
+
+ <p class="p">See <a class="xref" href="impala_reserved_words.html#reserved_words">Impala Reserved Words</a> for the
+ current list of Impala reserved words. </p>
+
+ <p class="p"> Conversely, if you use a keyword that Impala does not recognize,
+ it might be interpreted as a table or column alias. </p>
+
+ <p class="p">For example, in <code class="ph codeph">SELECT * FROM t1 NATURAL JOIN t2</code>,
+ Impala does not recognize the <code class="ph codeph">NATURAL</code> keyword and
+ interprets it as an alias for the table <code class="ph codeph">t1</code>.
If you
+ experience any unexpected behavior with queries, check the list of
+ reserved words to make sure all keywords in join and
+ <code class="ph codeph">WHERE</code> clauses are supported keywords in Impala.
 </p>
+
 </li>
 <li class="li">
- <p class="p">
- Impala supports subqueries only in the <code class="ph codeph">FROM</code> clause of a query, not within the
- <code class="ph codeph">WHERE</code> clauses. Therefore, you cannot use clauses such as <code class="ph codeph">WHERE
- <var class="keyword varname">column</var> IN (<var class="keyword varname">subquery</var>)</code>. Also, Impala does not allow
- <code class="ph codeph">EXISTS</code> or <code class="ph codeph">NOT EXISTS</code> clauses (although <code class="ph codeph">EXISTS</code> is a
- reserved keyword).
- </p>
+ <p class="p">Impala has some restrictions on subquery support. See <a href="impala_subqueries.html"><span class="keyword">Subqueries in Impala SELECT Statements</span></a> for the current details.</p>
+
 </li>
 <li class="li">
- <p class="p">
- Impala supports <code class="ph codeph">UNION</code> and <code class="ph codeph">UNION ALL</code> set operators, but not
- <code class="ph codeph">INTERSECT</code>. <span class="ph">Prefer <code class="ph codeph">UNION ALL</code> over <code class="ph codeph">UNION</code> when you know the
+ <p class="p"> Impala supports <code class="ph codeph">UNION</code> and <code class="ph codeph">UNION
+ ALL</code> set operators, but not <code class="ph codeph">INTERSECT</code>.
</p>
+
+ <p class="p"><span class="ph">Prefer <code class="ph codeph">UNION ALL</code> over <code class="ph codeph">UNION</code> when you know the
 data sets are disjoint or duplicate values are not a problem; <code class="ph codeph">UNION ALL</code> is more efficient because it avoids materializing and sorting the entire result set to eliminate duplicate values.</span>
 </p>
- </li>
- <li class="li">
- <p class="p">
- Within queries, Impala requires query aliases for any subqueries:
- </p>
-<pre class="pre codeblock"><code>-- Without the alias 'contents_of_t1' at the end, query gives syntax error.
-select count(*) from (select * from t1) contents_of_t1;</code></pre>
 </li>
+ <li class="li"><p class="p">Impala requires query aliases for the subqueries used as inline
+ views in the <code class="ph codeph">FROM</code> clause. </p>
+<div class="p">For example,
+ without the alias <code class="ph codeph">contents_of_t1</code> at the end, the
+ following query gives a syntax
+ error:<pre class="pre codeblock"><code>SELECT COUNT(*) FROM (SELECT * FROM t1) contents_of_t1;</code></pre></div>
+Aliases
+ are not required for the subqueries used in other parts of queries.
+ For
+ example:<pre class="pre codeblock"><code>SELECT * FROM functional.alltypes WHERE id = (SELECT MIN(id) FROM functional.alltypes);
+</code></pre></li>
+
 <li class="li">
- <p class="p">
- When an alias is declared for an expression in a query, that alias cannot be referenced again within
- the same query block:
- </p>
-<pre class="pre codeblock"><code>-- Can't reference AVERAGE twice in the SELECT list where it's defined.
-select avg(x) as average, average+1 from t1 group by x;
-ERROR: AnalysisException: couldn't resolve column reference: 'average'
+ <p class="p"> When an alias is declared for an expression in a query, that alias
+ cannot be referenced again within the same <code class="ph codeph">SELECT</code>
+ list.</p>
--- Although it can be referenced again later in the same query.
-select avg(x) as average from t1 group by x having average > 3;</code></pre>
- <p class="p">
- For Impala, either repeat the expression again, or abstract the expression into a <code class="ph codeph">WITH</code>
- clause, creating named columns that can be referenced multiple times anywhere in the base query:
- </p>
-<pre class="pre codeblock"><code>-- The following 2 query forms are equivalent.
-select avg(x) as average, avg(x)+1 from t1 group by x;
-with avg_t as (select avg(x) average from t1 group by x) select average, average+1 from avg_t;</code></pre>
+ <p class="p">For example, the <code class="ph codeph">average</code> alias cannot be
+ referenced twice in the <code class="ph codeph">SELECT</code> list as below. You
+ will receive an error:</p>
+
+ <pre class="pre codeblock"><code>SELECT AVG(x) AS average, average+1 FROM t1 GROUP BY x;</code></pre>
+ <div class="p">An alias can be referenced again in the same query if not in the
+ <code class="ph codeph">SELECT</code> list. For example, the
+ <code class="ph codeph">average</code> alias can be referenced twice as shown
+ below:<pre class="pre codeblock"><code>SELECT AVG(x) AS average FROM t1 GROUP BY x HAVING average > 3;</code></pre></div>
 </li>
 <li class="li">
- <p class="p">
- Impala does not support certain rarely used join types that are less appropriate for high-volume tables
- used for data warehousing. In some cases, Impala supports join types but requires explicit syntax to
- ensure you do not do inefficient joins of huge tables by accident. For example, Impala does not support
- natural joins or anti-joins, and requires the <code class="ph codeph">CROSS JOIN</code> operator for Cartesian
- products. See <a class="xref" href="impala_joins.html#joins">Joins in Impala SELECT Statements</a> for details on the syntax for Impala join clauses.
- </p>
+ <p class="p"> Impala does not support <code class="ph codeph">NATURAL JOIN</code>, and it does
+ not support the <code class="ph codeph">USING</code> clause in joins. See <a class="xref" href="impala_joins.html#joins">Joins in Impala SELECT Statements</a> for details on the syntax for
+ Impala join clauses. </p>
+
 </li>
 <li class="li">
- <p class="p">
- Impala has a limited choice of partitioning types. Partitions are defined based on each distinct
- combination of values for one or more partition key columns. Impala does not redistribute or check data
- to create evenly distributed partitions; you must choose partition key columns based on your knowledge
- of the data volume and distribution. Adapt any tables that use range, list, hash, or key partitioning
- to use the Impala partition syntax for <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code>
- statements. Impala partitioning is similar to range partitioning where every range has exactly one
- value, or key partitioning where the hash function produces a separate bucket for every combination of
- key values. See <a class="xref" href="impala_partitioning.html#partitioning">Partitioning for Impala Tables</a> for usage details, and
- <a class="xref" href="impala_create_table.html#create_table">CREATE TABLE Statement</a> and
- <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a> for syntax.
- </p>
- <div class="note note note_note"><span class="note__title notetitle">Note:</span>
- Because the number of separate partitions is potentially higher than in other database systems, keep a
- close eye on the number of partitions and the volume of data in each one; scale back the number of
- partition key columns if you end up with too many partitions with a small volume of data in each one.
- Remember, to distribute work for a query across a cluster, you need at least one HDFS block per node.
- HDFS blocks are typically multiple megabytes, <span class="ph">especially</span> for Parquet
- files. Therefore, if each partition holds only a few megabytes of data, you are unlikely to see much
- parallelism in the query because such a small amount of data is typically processed by a single node.
- </div>
+ <p class="p"> Impala supports a limited choice of partitioning types. </p>
+
+ <p class="p">Partitions are defined based on each distinct combination of values
+ for one or more partition key columns. Impala does not redistribute
+ or check data to create evenly distributed partitions. You must
+ choose partition key columns based on your knowledge of the data
+ volume and distribution. Adapt any tables that use range, list,
+ hash, or key partitioning to use the Impala partition syntax for
+ <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code>
+ statements. </p>
+
+ <p class="p">Impala partitioning is similar to range partitioning where every
+ range has exactly one value, or key partitioning where the hash
+ function produces a separate bucket for every combination of key
+ values. See <a class="xref" href="impala_partitioning.html#partitioning">Partitioning for Impala Tables</a> for
+ usage details, and <a class="xref" href="impala_create_table.html#create_table">CREATE TABLE Statement</a> and <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a> for syntax. </p>
+
+ <div class="note note"><span class="notetitle">Note:</span> Because the number of separate partitions is potentially higher
+ than in other database systems, keep a close eye on the number of
+ partitions and the volume of data in each one; scale back the number
+ of partition key columns if you end up with too many partitions with
+ a small volume of data in each one. <p class="p">To distribute work for a
+ query across a cluster, you need at least one HDFS block per node.
+ HDFS blocks are typically multiple megabytes, <span class="ph">especially</span> for Parquet files.
+ Therefore, if each partition holds only a few megabytes of data,
+ you are unlikely to see much parallelism in the query because such
+ a small amount of data is typically processed by a single node.
+ </p>
+</div>
+
 </li>
 <li class="li">
- <p class="p">
- For <span class="q">"top-N"</span> queries, Impala uses the <code class="ph codeph">LIMIT</code> clause rather than comparing against a
- pseudocolumn named <code class="ph codeph">ROWNUM</code> or <code class="ph codeph">ROW_NUM</code>. See
- <a class="xref" href="impala_limit.html#limit">LIMIT Clause</a> for details.
- </p>
+ <p class="p"> For the <span class="q">"top-N"</span> queries, Impala uses the
+ <code class="ph codeph">LIMIT</code> clause rather than comparing against a
+ pseudo column named <code class="ph codeph">ROWNUM</code> or
+ <code class="ph codeph">ROW_NUM</code>. </p>
+
+ <p class="p">See <a class="xref" href="impala_limit.html#limit">LIMIT Clause</a> for details. </p>
+
 </li>
+
 </ul>
+
 </div>
- </article>
- <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="porting__porting_antipatterns">
+ </div>
+
+
+ <div class="topic concept nested1" aria-labelledby="ariaid-title5" id="porting_antipatterns">
+
+ <h2 class="title topictitle2" id="ariaid-title5">SQL Constructs to Double-check</h2>
- <h2 class="title topictitle2" id="ariaid-title5">SQL Constructs to Doublecheck</h2>
 <div class="body conbody">
- <p class="p">
- Some SQL constructs that are supported have behavior or defaults more oriented towards convenience than
- optimal performance. Also, sometimes machine-generated SQL, perhaps issued through JDBC or ODBC
- applications, might have inefficiencies or exceed internal Impala limits.
As you port SQL code, be alert
- and change these things where appropriate:
- </p>
+ <p class="p"> Some SQL constructs that are supported have behavior or defaults more
+ oriented towards convenience than optimal performance. Also, sometimes
+ machine-generated SQL, perhaps issued through JDBC or ODBC applications,
+ might have inefficiencies or exceed internal Impala limits. As you port
+ SQL code, examine and possibly update the following where appropriate: </p>
+
 <ul class="ul">
 <li class="li">
@@ -485,39 +589,46 @@ with avg_t as (select avg(x) average from t1 group by x) select average, average
 <a class="xref" href="impala_parquet.html#parquet">Using the Parquet File Format with Impala Tables</a>, for details about the file format most heavily optimized for large-scale data warehouse queries.
 </p>
+
 </li>
+
 <li class="li">
- <p class="p">
- A <code class="ph codeph">CREATE TABLE</code> statement with no <code class="ph codeph">PARTITIONED BY</code> clause stores all the
- data files in the same physical location, which can lead to scalability problems when the data volume
- becomes large.
- </p>
- <p class="p">
- On the other hand, adapting tables that were already partitioned in a different database system could
- produce an Impala table with a high number of partitions and not enough data in each one, leading to
- underutilization of Impala's parallel query features.
- </p>
+ <p class="p"> Adapting tables that were already partitioned in a different
+ database system could produce an Impala table with a high number of
+ partitions and not enough data in each one, leading to
+ underutilization of Impala's parallel query features. </p>
+
 <p class="p">
 See <a class="xref" href="impala_partitioning.html#partitioning">Partitioning for Impala Tables</a> for details about setting up partitioning and tuning the performance of queries on partitioned tables.
 </p>
+
 </li>
+
 <li class="li">
- <p class="p">
- The <code class="ph codeph">INSERT ...
VALUES</code> syntax is suitable for setting up toy tables with a few rows for - functional testing, but because each such statement creates a separate tiny file in HDFS, it is not a - scalable technique for loading megabytes or gigabytes (let alone petabytes) of data. Consider revising - your data load process to produce raw data files outside of Impala, then setting up Impala external - tables or using the <code class="ph codeph">LOAD DATA</code> statement to use those data files instantly in Impala - tables, with no conversion or indexing stage. See <a class="xref" href="impala_tables.html#external_tables">External Tables</a> and - <a class="xref" href="impala_load_data.html#load_data">LOAD DATA Statement</a> for details about the Impala techniques for working with - data files produced outside of Impala; see <a class="xref" href="impala_tutorial.html#tutorial_etl">Data Loading and Querying Examples</a> for examples - of ETL workflow for Impala. - </p> + <p class="p"> The <code class="ph codeph">INSERT ... VALUES</code> syntax is suitable for + setting up toy tables with a few rows for functional testing when + used with HDFS. Each such statement creates a separate tiny file in + HDFS, and it is not a scalable technique for loading megabytes or + gigabytes (let alone petabytes) of data. </p> + + <p class="p">Consider revising your data load process to produce raw data files + outside of Impala, then setting up Impala external tables or using + the <code class="ph codeph">LOAD DATA</code> statement to use those data files + instantly in Impala tables, with no conversion or indexing stage. 
+ See <a class="xref" href="impala_tables.html#external_tables">External Tables</a> and <a class="xref" href="impala_load_data.html#load_data">LOAD DATA Statement</a> for details about the + Impala techniques for working with data files produced outside of + Impala; see <a class="xref" href="impala_tutorial.html#tutorial_etl">Data Loading and Querying Examples</a> for + examples of ETL workflow for Impala. </p> + + <p class="p"><code class="ph codeph">INSERT</code> works for Kudu tables, although it is not + particularly fast.</p> + </li> + <li class="li"> <p class="p"> If your ETL process is not optimized for Hadoop, you might end up with highly fragmented small data @@ -526,24 +637,22 @@ with avg_t as (select avg(x) average from t1 group by x) select average, average new table and reorganize into a more efficient layout in the same operation. See <a class="xref" href="impala_insert.html#insert">INSERT Statement</a> for details about the <code class="ph codeph">INSERT</code> statement. </p> - <p class="p"> - You can do <code class="ph codeph">INSERT ... SELECT</code> into a table with a more efficient file format (see - <a class="xref" href="impala_file_formats.html#file_formats">How Impala Works with Hadoop File Formats</a>) or from an unpartitioned table into a partitioned - one (see <a class="xref" href="impala_partitioning.html#partitioning">Partitioning for Impala Tables</a>). - </p> + + <p class="p"> You can do <code class="ph codeph">INSERT ... SELECT</code> into a table with a + more efficient file format (see <a class="xref" href="impala_file_formats.html#file_formats">How Impala Works with Hadoop File Formats</a>) or from an + unpartitioned table into a partitioned one. See <a class="xref" href="impala_partitioning.html#partitioning">Partitioning for Impala Tables</a>. 
</p> + </li> + <li class="li"> - <p class="p"> - The number of expressions allowed in an Impala query might be smaller than for some other database - systems, causing failures for very complicated queries (typically produced by automated SQL - generators). Where practical, keep the number of expressions in the <code class="ph codeph">WHERE</code> clauses to - approximately 2000 or fewer. As a workaround, set the query option - <code class="ph codeph">DISABLE_CODEGEN=true</code> if queries fail for this reason. See - <a class="xref" href="impala_disable_codegen.html#disable_codegen">DISABLE_CODEGEN Query Option</a> for details. - </p> + <p class="p"> Complex queries may have high codegen time. As a workaround, set + the query option <code class="ph codeph">DISABLE_CODEGEN=true</code> if queries + fail for this reason. See <a class="xref" href="impala_disable_codegen.html#disable_codegen">DISABLE_CODEGEN Query Option</a> for details. </p> + </li> + <li class="li"> <p class="p"> If practical, rewrite <code class="ph codeph">UNION</code> queries to use the <code class="ph codeph">UNION ALL</code> operator @@ -551,53 +660,58 @@ with avg_t as (select avg(x) average from t1 group by x) select average, average data sets are disjoint or duplicate values are not a problem; <code class="ph codeph">UNION ALL</code> is more efficient because it avoids materializing and sorting the entire result set to eliminate duplicate values.</span> </p> + </li> + </ul> + </div> - </article> - <article class="topic concept nested1" aria-labelledby="ariaid-title6" id="porting__porting_next"> + </div> + + + <div class="topic concept nested1" aria-labelledby="ariaid-title6" id="porting_next"> <h2 class="title topictitle2" id="ariaid-title6">Next Porting Steps after Verifying Syntax and Semantics</h2> + <div class="body conbody"> - <p class="p"> - Throughout this section, some of the decisions you make during the porting process also have a substantial - impact on performance. 
After your SQL code is ported and working correctly, doublecheck the - performance-related aspects of your schema design, physical layout, and queries to make sure that the - ported application is taking full advantage of Impala's parallelism, performance-related SQL features, and - integration with Hadoop components. - </p> + <p class="p"> Some of the decisions you make during the porting process can have an + impact on performance. After your SQL code is ported and working + correctly, examine the performance-related aspects of your schema + design, physical layout, and queries to make sure that the ported + application is taking full advantage of Impala's parallelism, + performance-related SQL features, and integration with Hadoop + components. The following are a few of the areas you should examine:</p> + <ul class="ul"> - <li class="li"> - Have you run the <code class="ph codeph">COMPUTE STATS</code> statement on each table involved in join queries? Have - you also run <code class="ph codeph">COMPUTE STATS</code> for each table used as the source table in an <code class="ph codeph">INSERT - ... SELECT</code> or <code class="ph codeph">CREATE TABLE AS SELECT</code> statement? - </li> + <li class="li"> For optimal performance, we recommend that you run + <code class="ph codeph">COMPUTE STATS</code> on all tables.</li> - <li class="li"> - Are you using the most efficient file format for your data volumes, table structure, and query - characteristics? - </li> - <li class="li"> - Are you using partitioning effectively? That is, have you partitioned on columns that are often used for - filtering in <code class="ph codeph">WHERE</code> clauses? Have you partitioned at the right granularity so that there - is enough data in each partition to parallelize the work for each query? 
- </li> + <li class="li"> Use the most efficient file format for your data volumes, table + structure, and query characteristics.</li> + + + <li class="li"> Partition on columns that are often used for filtering in + <code class="ph codeph">WHERE</code> clauses.</li> + + + <li class="li"> Your ETL process should produce a relatively small number of + multi-megabyte data files rather than a huge number of small + files.</li> - <li class="li"> - Does your ETL process produce a relatively small number of multi-megabyte data files (good) rather than a - huge number of small files (bad)? - </li> </ul> - <p class="p"> - See <a class="xref" href="impala_performance.html#performance">Tuning Impala for Performance</a> for details about the whole performance tuning - process. - </p> + + <p class="p"> See <a class="xref" href="impala_performance.html#performance">Tuning Impala for Performance</a> for details + about the performance tuning process. </p> + </div> - </article> -</article></main></body></html> \ No newline at end of file + + </div> + +</body> +</html> \ No newline at end of file
http://git-wip-us.apache.org/repos/asf/impala/blob/b4ad38a9/docs/build/html/topics/impala_ports.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_ports.html b/docs/build/html/topics/impala_ports.html index 3594704..9a9b19c 100644 --- a/docs/build/html/topics/impala_ports.html +++ b/docs/build/html/topics/impala_ports.html @@ -1,11 +1,30 @@ +<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html - SYSTEM "about:legacy-compat"> -<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.12x"><meta name="version" content="Impala 2.12x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="ports"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Ports Used by Impala</title></head><body id="ports"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> +<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> + +<meta name="copyright" content="(C) Copyright 2018" /> +<meta name="DC.rights.owner" content="(C) Copyright 2018" /> +<meta name="DC.Type" content="concept" /> +<meta name="DC.Title" content="Ports Used by Impala" /> +<meta name="prodname" content="Impala" /> +<meta name="prodname" content="Impala" /> +<meta name="version" content="Impala 3.0.x" /> +<meta name="version" content="Impala 3.0.x" /> +<meta name="DC.Format" content="XHTML" /> +<meta name="DC.Identifier" content="ports" /> +<link rel="stylesheet" type="text/css" 
href="../commonltr.css" /> +<title>Ports Used by Impala</title> +</head> +<body id="ports"> + <h1 class="title topictitle1" id="ariaid-title1">Ports Used by Impala</h1> + - <div class="body conbody" id="ports__conbody_ports"> + <div class="body conbody" id="conbody_ports"> <p class="p"> @@ -13,409 +32,605 @@ on each system. </p> - <table class="table"><caption></caption><colgroup><col style="width:18.181818181818183%"><col style="width:27.27272727272727%"><col style="width:9.090909090909092%"><col style="width:18.181818181818183%"><col style="width:27.27272727272727%"></colgroup><thead class="thead"> + + +<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" class="table" frame="border" border="1" rules="all"><colgroup><col style="width:18.181818181818183%" /><col style="width:27.27272727272727%" /><col style="width:9.090909090909092%" /><col style="width:18.181818181818183%" /><col style="width:27.27272727272727%" /></colgroup><thead class="thead" style="text-align:left;"> <tr class="row"> - <th class="entry nocellnorowborder" id="ports__conbody_ports__entry__1"> + <th class="entry nocellnorowborder" style="vertical-align:top;" id="d143935e75"> Component </th> - <th class="entry nocellnorowborder" id="ports__conbody_ports__entry__2"> + + <th class="entry nocellnorowborder" style="vertical-align:top;" id="d143935e78"> Service </th> - <th class="entry nocellnorowborder" id="ports__conbody_ports__entry__3"> + + <th class="entry nocellnorowborder" style="vertical-align:top;" id="d143935e81"> Port </th> - <th class="entry nocellnorowborder" id="ports__conbody_ports__entry__4"> + + <th class="entry nocellnorowborder" style="vertical-align:top;" id="d143935e84"> Access Requirement </th> - <th class="entry nocellnorowborder" id="ports__conbody_ports__entry__5"> + + <th class="entry cell-norowborder" style="vertical-align:top;" id="d143935e87"> Comment </th> + </tr> - </thead><tbody class="tbody"> + + </thead> +<tbody class="tbody"> <tr class="row"> 
- <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Impala Daemon Frontend Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 21000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Used to transmit commands and receive results by <code class="ph codeph">impala-shell</code> and some ODBC drivers. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Impala Daemon Frontend Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 21050 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Used to transmit commands and receive results by applications, such as Business Intelligence tools, using JDBC, the Beeswax query editor in Hue, and some ODBC drivers. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Impala Daemon Backend Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 22000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> - <p class="p"> - Internal use only. Impala daemons use this port to communicate with each other. - </p> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> + <p class="p"> Internal use only. Impala daemons use this port for Thrift + based communication with each other. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> StateStoreSubscriber Service Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 23000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. Impala daemons listen on this port for updates from the statestore daemon. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Catalog Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> StateStoreSubscriber Service Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 23020 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. The catalog daemon listens on this port for updates from the statestore daemon. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Impala Daemon HTTP Server Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 25000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Impala web interface for administrators to monitor and troubleshoot. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala StateStore Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> StateStore HTTP Server Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 25010 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> StateStore web interface for administrators to monitor and troubleshoot. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Catalog Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Catalog HTTP Server Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 25020 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Catalog service web interface for administrators to monitor and troubleshoot. New in Impala 1.2 and higher. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala StateStore Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> StateStore Service Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 24000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. The statestore daemon listens on this port for registration/unregistration requests. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Catalog Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> StateStore Service Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 26000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. The catalog service uses this port to communicate with the Impala daemons. New in Impala 1.2 and higher. </p> + </td> + </tr> + + <tr class="row"> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> + <p class="p"> Impala Daemon </p> + + </td> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> + <p class="p">KRPC Port</p> + + </td> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> + <p class="p">27000</p> + + </td> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> + <p class="p">Internal</p> + + </td> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> + <p class="p">Internal use only. 
Impala daemons use this port for KRPC based + communication with each other.</p> + + </td> + + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Daemon </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Llama Callback Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 28000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. Impala daemons use to communicate with Llama. New in <span class="keyword">Impala 1.3</span> and higher. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Llama ApplicationMaster </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Llama Thrift Admin Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 15002 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. New in <span class="keyword">Impala 1.3</span> and higher. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Llama ApplicationMaster </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Llama Thrift Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 15000 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry nocellnorowborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> Internal </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cell-norowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Internal use only. New in <span class="keyword">Impala 1.3</span> and higher. 
</p> + </td> + </tr> + <tr class="row"> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__1 "> + <td class="entry row-nocellborder" style="vertical-align:top;" headers="d143935e75 "> <p class="p"> Impala Llama ApplicationMaster </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__2 "> + + <td class="entry row-nocellborder" style="vertical-align:top;" headers="d143935e78 "> <p class="p"> Llama HTTP Port </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__3 "> + + <td class="entry row-nocellborder" style="vertical-align:top;" headers="d143935e81 "> <p class="p"> 15001 </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__4 "> + + <td class="entry row-nocellborder" style="vertical-align:top;" headers="d143935e84 "> <p class="p"> External </p> + </td> - <td class="entry nocellnorowborder" headers="ports__conbody_ports__entry__5 "> + + <td class="entry cellrowborder" style="vertical-align:top;" headers="d143935e87 "> <p class="p"> Llama service web interface for administrators to monitor and troubleshoot. New in <span class="keyword">Impala 1.3</span> and higher. 
</p> + </td> + </tr> - </tbody></table> + + </tbody> +</table> +</div> + </div> -</article></main></body></html> \ No newline at end of file + +</body> +</html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/impala/blob/b4ad38a9/docs/build/html/topics/impala_prefetch_mode.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_prefetch_mode.html b/docs/build/html/topics/impala_prefetch_mode.html index 503cf05..b7f2a7b 100644 --- a/docs/build/html/topics/impala_prefetch_mode.html +++ b/docs/build/html/topics/impala_prefetch_mode.html @@ -1,8 +1,28 @@ +<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html - SYSTEM "about:legacy-compat"> -<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.12x"><meta name="version" content="Impala 2.12x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="prefetch_mode"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>PREFETCH_MODE Query Option (Impala 2.6 or higher only)</title></head><body id="prefetch_mode"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> +<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> + +<meta name="copyright" content="(C) Copyright 2018" /> +<meta name="DC.rights.owner" content="(C) Copyright 2018" /> +<meta name="DC.Type" content="concept" /> +<meta 
name="DC.Title" content="PREFETCH_MODE Query Option (Impala 2.6 or higher only)" /> +<meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html" /> +<meta name="prodname" content="Impala" /> +<meta name="prodname" content="Impala" /> +<meta name="version" content="Impala 3.0.x" /> +<meta name="version" content="Impala 3.0.x" /> +<meta name="DC.Format" content="XHTML" /> +<meta name="DC.Identifier" content="prefetch_mode" /> +<link rel="stylesheet" type="text/css" href="../commonltr.css" /> +<title>PREFETCH_MODE Query Option (Impala 2.6 or higher only)</title> +</head> +<body id="prefetch_mode"> + <h1 class="title topictitle1" id="ariaid-title1">PREFETCH_MODE Query Option (<span class="keyword">Impala 2.6</span> or higher only)</h1> + @@ -14,34 +34,48 @@ join query processing. </p> + <p class="p"> <strong class="ph b">Type:</strong> numeric (0, 1) or corresponding mnemonic strings (<code class="ph codeph">NONE</code>, <code class="ph codeph">HT_BUCKET</code>). </p> + <p class="p"> <strong class="ph b">Default:</strong> 1 (equivalent to <code class="ph codeph">HT_BUCKET</code>) </p> + <p class="p"> <strong class="ph b">Added in:</strong> <span class="keyword">Impala 2.6.0</span> </p> + <p class="p"> <strong class="ph b">Usage notes:</strong> </p> + <p class="p"> The default mode is 1, which means that hash table buckets are prefetched during join query processing. </p> + <p class="p"> <strong class="ph b">Related information:</strong> </p> + <p class="p"> <a class="xref" href="impala_joins.html#joins">Joins in Impala SELECT Statements</a>, <a class="xref" href="impala_perf_joins.html#perf_joins">Performance Considerations for Join Queries</a>. 
</p> + </div> -<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div></div></nav></article></main></body></html> \ No newline at end of file + +<div class="related-links"> +<div class="familylinks"> +<div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div> +</div> +</div></body> +</html> \ No newline at end of file