[42/51] [partial] impala git commit: [DOCS] Impala 3.1 Docs to be published

mikeb Thu, 06 Dec 2018 15:15:02 -0800

http://git-wip-us.apache.org/repos/asf/impala/blob/b4ad38a9/docs/build/html/topics/impala_complex_types.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_complex_types.html 
b/docs/build/html/topics/impala_complex_types.html
index 1920363..119508e 100644
--- a/docs/build/html/topics/impala_complex_types.html
+++ b/docs/build/html/topics/impala_complex_types.html
@@ -1,9 +1,29 @@
+<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html
-  SYSTEM "about:legacy-compat">
-<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_datatypes.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.12x"><meta name="version" content="Impala 2.12x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" 
content="complex_types"><link rel="stylesheet" type="text/css" 
href="../commonltr.css"><title>Complex Types (Impala 2.3 or higher 
only)</title></head><body id="complex_types"><main role="main"><article 
role="article" aria-labelledby="complex_types__nested_types">
+  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+
+<meta name="copyright" content="(C) Copyright 2018" />
+<meta name="DC.rights.owner" content="(C) Copyright 2018" />
+<meta name="DC.Type" content="concept" />
+<meta name="DC.Title" content="Complex Types (Impala 2.3 or higher only)" />
+<meta name="DC.Relation" scheme="URI" 
content="../topics/impala_datatypes.html" />
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="DC.Format" content="XHTML" />
+<meta name="DC.Identifier" content="complex_types" />
+<link rel="stylesheet" type="text/css" href="../commonltr.css" />
+<title>Complex Types (Impala 2.3 or higher only)</title>
+</head>
+<body id="complex_types">
+
 
   <h1 class="title topictitle1" id="complex_types__nested_types">Complex Types 
(<span class="keyword">Impala 2.3</span> or higher only)</h1>
 
+
   
 
   <div class="body conbody">
@@ -19,64 +39,85 @@
       and higher. The Hive <code class="ph codeph">UNION</code> type is not 
currently supported.
     </p>
 
+
     <p class="p toc inpage"></p>
 
+
     <p class="p">
       Once you understand the basics of complex types, refer to the individual 
type topics when you need to refresh your memory about syntax
       and examples:
     </p>
 
+
     <ul class="ul">
       <li class="li">
         <a class="xref" href="impala_array.html#array">ARRAY Complex Type 
(Impala 2.3 or higher only)</a>
       </li>
 
+
       <li class="li">
         <a class="xref" href="impala_struct.html#struct">STRUCT Complex Type 
(Impala 2.3 or higher only)</a>
       </li>
 
+
       <li class="li">
         <a class="xref" href="impala_map.html#map">MAP Complex Type (Impala 
2.3 or higher only)</a>
       </li>
+
     </ul>
 
+
   </div>
 
-  <nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_datatypes.html">Data Types</a></div></div></nav><article 
class="topic concept nested1" aria-labelledby="ariaid-title2" 
id="complex_types__complex_types_benefits">
+
+  <div class="related-links">
+<div class="familylinks">
+<div class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_datatypes.html">Data Types</a></div>
+</div>
+</div><div class="topic concept nested1" aria-labelledby="ariaid-title2" 
id="complex_types_benefits">
 
     <h2 class="title topictitle2" id="ariaid-title2">Benefits of Impala 
Complex Types</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
         The reasons for using Impala complex types include the following:
       </p>
 
+
       <ul class="ul">
         <li class="li">
           <p class="p">
             You already have data produced by Hive or another non-Impala 
component that uses the complex type column names. You might need to
             convert the underlying data to Parquet to use it with Impala.
           </p>
+
         </li>
 
+
         <li class="li">
           <p class="p">
             Your data model originates with a non-SQL programming language or 
a NoSQL data management system. For example, if you are
             representing Python data expressed as nested lists, dictionaries, 
and tuples, those data structures correspond closely to Impala
             <code class="ph codeph">ARRAY</code>, <code class="ph 
codeph">MAP</code>, and <code class="ph codeph">STRUCT</code> types.
           </p>
+
         </li>
 
+
         <li class="li">
           <p class="p">
             Your analytic queries involving multiple tables could benefit from 
greater locality during join processing. By packing more
             related data items within each HDFS data block, complex types let 
join queries avoid the network overhead of the traditional
             Hadoop shuffle or broadcast join techniques.
           </p>
+
         </li>
+
       </ul>
 
+
       <p class="p">
         The Impala complex type support produces result sets with all scalar 
values, and the scalar components of complex types can be used
         with all SQL clauses, such as <code class="ph codeph">GROUP BY</code>, 
<code class="ph codeph">ORDER BY</code>, all kinds of joins, subqueries, and 
inline
@@ -84,14 +125,18 @@
         programming languages to deconstruct the underlying data structures.
       </p>
 
+
     </div>
 
-  </article>
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title3" 
id="complex_types__complex_types_overview">
+  </div>
+
+
+  <div class="topic concept nested1" aria-labelledby="ariaid-title3" 
id="complex_types_overview">
 
     <h2 class="title topictitle2" id="ariaid-title3">Overview of Impala 
Complex Types</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -102,6 +147,7 @@
         has a name.
       </p>
 
+
       <p class="p">
         The elements of an <code class="ph codeph">ARRAY</code> or <code 
class="ph codeph">MAP</code>, or the fields of a <code class="ph 
codeph">STRUCT</code>, can also be other
         complex types. You can construct elaborate data structures with up to 
100 levels of nesting. For example, you can make an
@@ -111,6 +157,7 @@
         properties of these types.
       </p>
 
+
       <p class="p">
         When visualizing your data model in familiar SQL terms, you can think 
of each <code class="ph codeph">ARRAY</code> or <code class="ph 
codeph">MAP</code> as a
         miniature table, and each <code class="ph codeph">STRUCT</code> as a 
row within such a table. By default, the table represented by an
@@ -120,6 +167,7 @@
 
       </p>
 
+
       <p class="p">
         The <code class="ph codeph">ITEM</code> and <code class="ph 
codeph">VALUE</code> names are only required for the very simplest kinds of 
<code class="ph codeph">ARRAY</code>
         and <code class="ph codeph">MAP</code> columns, ones that hold only 
scalar values. When the elements within the <code class="ph 
codeph">ARRAY</code> or
@@ -129,6 +177,7 @@
 
 
 
+
       <p class="p">
         You write most queries that process complex type columns using 
familiar join syntax, even though the data for both sides of the join
         resides in a single table. The join notation brings together the 
scalar values from a row with the values from the complex type
@@ -137,6 +186,7 @@
 
       </p>
 
+
       <p class="p">
         Behind the scenes, Impala ensures that the processing for each row is 
done efficiently on a single host, without the network traffic
         involved in broadcast or shuffle joins. The most common type of join 
query for tables with complex type columns is <code class="ph codeph">INNER
@@ -144,7 +194,8 @@
         examples in this section use either the <code class="ph codeph">INNER 
JOIN</code> clause or the equivalent comma notation.
       </p>
 
-      <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+      <div class="note note"><span class="notetitle">Note:</span>
         <p class="p">
           Although Impala can query complex types that are present in Parquet 
files, Impala currently cannot create new Parquet files
           containing complex types. Therefore, the discussion and examples 
presume that you are working with existing Parquet data produced
@@ -152,20 +203,26 @@
           files with complex type columns.
         </p>
 
+
         <p class="p">
           For learning purposes, you can create empty tables with complex type 
columns and practice query syntax, even if you do not have
           sample data with the required structure.
         </p>
+
       </div>
 
+
     </div>
 
-  </article>
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title4" 
id="complex_types__complex_types_design">
+  </div>
+
+
+  <div class="topic concept nested1" aria-labelledby="ariaid-title4" 
id="complex_types_design">
 
     <h2 class="title topictitle2" id="ariaid-title4">Design Considerations for 
Complex Types</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -175,14 +232,18 @@
         type data using Impala SQL syntax.
       </p>
 
+
       <p class="p toc inpage"></p>
 
+
     </div>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title5" 
id="complex_types_design__complex_types_vs_rdbms">
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title5" 
id="complex_types_vs_rdbms">
 
       <h3 class="title topictitle3" id="ariaid-title5">How Complex Types 
Differ from Traditional Data Warehouse Schemas</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -190,15 +251,18 @@
           relational database management systems or data warehouses, a schema 
with complex types has the following differences:
         </p>
 
+
         <ul class="ul">
           <li class="li">
             <p class="p">
               Logically, related values can now be grouped tightly together in 
the same table.
             </p>
 
+
             <p class="p">
               In traditional data warehousing, related values were typically 
arranged in one of two ways:
             </p>
+
             <ul class="ul">
               <li class="li">
                 <p class="p">
@@ -207,8 +271,10 @@
                   expensive because the related data had to be retrieved from 
separate locations. (In the case of distributed Hadoop
                   queries, the joined tables might even be transmitted between 
different hosts in a cluster.)
                 </p>
+
               </li>
 
+
               <li class="li">
                 <p class="p">
                   Flattened into a single denormalized table. Although this 
layout eliminated some potential performance issues by removing
@@ -216,8 +282,11 @@
                   cause performance issues in other parts of the workflow, 
such as longer ETL cycles or more expensive full-table scans
                   during queries.
                 </p>
+
               </li>
+
             </ul>
+
             <p class="p">
               Complex types represent a middle ground that addresses these 
performance and volume concerns. By physically locating related
               data within the same data files, complex types increase locality 
and reduce the expense of join queries. By associating an
@@ -227,17 +296,23 @@
               <code class="ph codeph">MAP</code> types lets you model familiar 
constructs such as fact and dimension tables from a data warehouse, and
               wide tables representing sparse matrixes.
             </p>
+
           </li>
+
         </ul>
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title6" 
id="complex_types_design__complex_types_physical">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title6" 
id="complex_types_physical">
 
       <h3 class="title topictitle3" id="ariaid-title6">Physical Storage for 
Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -248,54 +323,69 @@
           (possibly large) values of the composite columns.
         </p>
 
+
         <p class="p">
           Within each Parquet data file, the constituent parts of complex type 
columns are stored in column-oriented format:
         </p>
 
+
         <ul class="ul">
           <li class="li">
             <p class="p">
               Each field of a <code class="ph codeph">STRUCT</code> type is 
stored like a column, with all the scalar values adjacent to each other and
               encoded, compressed, and so on using the Parquet space-saving 
techniques.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               For an <code class="ph codeph">ARRAY</code> containing scalar 
values, all those values (represented by the <code class="ph codeph">ITEM</code>
               pseudocolumn) are stored adjacent to each other.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               For a <code class="ph codeph">MAP</code>, the values of the 
<code class="ph codeph">KEY</code> pseudocolumn are stored adjacent to each 
other. If the
               <code class="ph codeph">VALUE</code> pseudocolumn is a scalar 
type, its values are also stored adjacent to each other.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               If an <code class="ph codeph">ARRAY</code> element, <code 
class="ph codeph">STRUCT</code> field, or <code class="ph codeph">MAP</code> 
<code class="ph codeph">VALUE</code> part is
               another complex type, the column-oriented storage applies to the 
next level down (or the next level after that, and so on for
               deeply nested types) where the final elements, fields, or values 
are of scalar types.
             </p>
+
           </li>
+
         </ul>
 
+
         <p class="p">
           The numbers represented by the <code class="ph codeph">POS</code> 
pseudocolumn of an <code class="ph codeph">ARRAY</code> are not physically 
stored in the
           data files. They are synthesized at query time based on the order of 
the <code class="ph codeph">ARRAY</code> elements associated with each row.
         </p>
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title7" 
id="complex_types_design__complex_types_file_formats">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title7" 
id="complex_types_file_formats">
 
       <h3 class="title topictitle3" id="ariaid-title7">File Format Support for 
Impala Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -303,15 +393,6 @@
           for details about the performance benefits and physical layout of 
this file format.
         </p>
 
-        <p class="p">
-          Each table, or each partition within a table, can have a separate 
file format, and you can change file format at the table or
-          partition level through an <code class="ph codeph">ALTER 
TABLE</code> statement. Because this flexibility makes it difficult to 
guarantee ahead
-          of time that all the data files for a table or partition are in a 
compatible format, Impala does not throw any errors when you
-          change the file format for a table or partition using <code 
class="ph codeph">ALTER TABLE</code>. Any errors come at runtime when Impala
-          actually processes a table or partition that contains nested types 
and is not in one of the supported formats. If a query on a
-          partitioned table only processes some partitions, and all those 
partitions are in one of the supported formats, the query
-          succeeds.
-        </p>
 
         <p class="p">
           Because Impala does not parse the data structures containing nested 
types for unsupported formats such as text, Avro,
@@ -321,41 +402,54 @@
           nested type data and Impala queries on that table will generate 
errors.
         </p>
 
-        <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
-          <p class="p">
+
+        <p class="p">
             The one exception to the preceding rule is <code class="ph 
codeph">COUNT(*)</code> queries on RCFile tables that include complex types.
             Such queries are allowed in <span class="keyword">Impala 
2.6</span> and higher.
-          </p>
-        </div>
+        </p>
+
 
         <p class="p">
-          You can perform DDL operations (even <code class="ph codeph">CREATE 
TABLE</code>) for tables involving complex types in file formats other than
-          Parquet. The DDL support lets you set up intermediate tables in your 
ETL pipeline, to be populated by Hive, before the final stage
-          where the data resides in a Parquet table and is queryable by 
Impala. Also, you can have a partitioned table with complex type
-          columns that uses a non-Parquet format, and use <code class="ph 
codeph">ALTER TABLE</code> to change the file format to Parquet for individual
-          partitions. When you put Parquet data files into those partitions, 
Impala can execute queries against that data as long as the
-          query does not involve any of the non-Parquet partitions.
+          You can perform DDL operations for tables involving complex types in
+          most file formats other than Parquet. You cannot create tables in
+          Impala with complex types using text files.
         </p>
 
+
+        <p class="p">
+          You can have a partitioned table with complex type columns that uses
+          a non-Parquet format, and use <code class="ph codeph">ALTER 
TABLE</code> to change
+          the file format to Parquet for individual partitions. When you put
+          Parquet data files into those partitions, Impala can execute queries
+          against that data as long as the query does not involve any of the
+          non-Parquet partitions.
+        </p>
+
+
         <p class="p">
           If you use the <span class="keyword cmdname">parquet-tools</span> 
command to examine the structure of a Parquet data file that includes complex
           types, you see that both <code class="ph codeph">ARRAY</code> and 
<code class="ph codeph">MAP</code> are represented as a <code class="ph 
codeph">Bag</code> in Parquet
           terminology, with all fields marked <code class="ph 
codeph">Optional</code> because Impala allows any column to be nullable.
         </p>
 
+
         <p class="p">
           Impala supports both 2-level and 3-level encoding within each 
Parquet data file. When constructing Parquet data files outside
           Impala, use either encoding style but do not mix 2-level and 3-level 
encoding within the same data file.
         </p>
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title8" 
id="complex_types_design__complex_types_vs_normalization">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title8" 
id="complex_types_vs_normalization">
 
       <h3 class="title topictitle3" id="ariaid-title8">Choosing Between 
Complex Types and Normalized Tables</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -363,6 +457,7 @@
           decision.
         </p>
 
+
         <ul class="ul">
           <li class="li">
             <p class="p">
@@ -370,24 +465,30 @@
               between tables. Your business intelligence tools might already 
be optimized for dealing with this kind of multi-table scenario
               through join queries.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               If you are pulling data from Impala into an application written 
in a programming language that has data structures analogous
               to the complex types, such as Python or Java, complex types in 
Impala could simplify data interchange and improve
               understandability and reliability of your program logic.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               You might already be faced with existing infrastructure or 
receive high volumes of data that assume one layout or the other.
               For example, complex types are popular with web-oriented 
applications, whether to keep information about an online user
               all in one place for convenient lookup and analysis, or to deal 
with sparse or constantly evolving data fields.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               If some parts of the data change over time while related data 
remains constant, using multiple normalized tables lets you
@@ -395,12 +496,15 @@
               together, such as in JSON files, using complex types can save 
the overhead of splitting the related items across multiple
               tables.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               From a performance perspective:
             </p>
+
             <ul class="ul">
               <li class="li">
                 <p class="p">
@@ -410,8 +514,10 @@
                   from that column, only the data for the relevant parts of 
the column type hierarchy.
 
                 </p>
+
               </li>
 
+
               <li class="li">
                 <p class="p">
                   Complex types avoid the possibility of expensive join 
queries when data from fact and dimension tables is processed in
@@ -419,8 +525,10 @@
                   block, and therefore does not need to be transmitted across 
the network when joining fields that are all part of the same
                   row.
                 </p>
+
               </li>
 
+
               <li class="li">
                 <p class="p">
                   The tradeoff with complex types is that fewer rows fit in 
each data block. Whether it is better to have more data blocks
@@ -430,26 +538,28 @@
                   size by including complex columns might produce more data 
blocks and thus spread the work more evenly across the cluster.
                   See <a class="xref" 
href="impala_scalability.html#scalability">Scalability Considerations for 
Impala</a> for more on this advanced topic.
                 </p>
+
               </li>
+
             </ul>
-          </li>
-        </ul>
 
-      </div>
+          </li>
 
-    </article>
+        </ul>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title9" 
id="complex_types_design__complex_types_hive">
 
-      <h3 class="title topictitle3" id="ariaid-title9">Differences Between 
Impala and Hive Complex Types</h3>
+      </div>
 
-      <div class="body conbody">
 
+    </div>
 
 
+    <div class="topic concept nested2" aria-labelledby="ariaid-title9" 
id="complex_types_hive">
 
+      <h3 class="title topictitle3" id="ariaid-title9">Differences Between 
Impala and Hive Complex Types</h3>
 
 
+      <div class="body conbody">
 
         <p class="p">
           Impala can query Parquet tables containing <code class="ph 
codeph">ARRAY</code>, <code class="ph codeph">STRUCT</code>, and <code 
class="ph codeph">MAP</code> columns
@@ -458,25 +568,31 @@
         </p>
 
         <p class="p">
-          The syntax for specifying <code class="ph codeph">ARRAY</code>, 
<code class="ph codeph">STRUCT</code>, and <code class="ph codeph">MAP</code> 
types in a <code class="ph codeph">CREATE
-          TABLE</code> statement is compatible between Impala and Hive.
+          Impala supports a subset of the syntax that Hive supports for
+          specifying <code class="ph codeph">ARRAY</code>, <code class="ph 
codeph">STRUCT</code>, and
+            <code class="ph codeph">MAP</code> types in the <code class="ph 
codeph">CREATE TABLE</code>
+          statements.
         </p>
 
+
         <p class="p">
           Because Impala <code class="ph codeph">STRUCT</code> columns include 
user-specified field names, you use the <code class="ph 
codeph">NAMED_STRUCT()</code>
           constructor in Hive rather than the <code class="ph 
codeph">STRUCT()</code> constructor when you populate an Impala <code class="ph 
codeph">STRUCT</code>
           column using a Hive <code class="ph codeph">INSERT</code> statement.
         </p>
 
+
         <p class="p">
           The Hive <code class="ph codeph">UNION</code> type is not currently 
supported in Impala.
         </p>
 
+
         <p class="p">
           While Impala usually aims for a high degree of compatibility with 
HiveQL query syntax, Impala syntax differs from Hive for queries
           involving complex types. The differences are intended to provide 
extra flexibility for queries involving these kinds of tables.
         </p>
 
+
         <ul class="ul">
           <li class="li">
             Impala uses dot notation for referring to element names or 
elements within complex types, and join notation for
@@ -484,18 +600,21 @@
             VIEW</code> clause and <code class="ph codeph">EXPLODE()</code> 
function of HiveQL.
           </li>
 
+
           <li class="li">
             Using join notation lets you use all the kinds of join queries 
with complex type columns. For example, you can use a
             <code class="ph codeph">LEFT OUTER JOIN</code>, <code class="ph 
codeph">LEFT ANTI JOIN</code>, or <code class="ph codeph">LEFT SEMI JOIN</code> 
query to evaluate
             different scenarios where the complex columns do or do not contain 
any elements.
           </li>
 
+
           <li class="li">
             You can include references to collection types inside subqueries 
and inline views. For example, you can construct a
             <code class="ph codeph">FROM</code> clause where one of the <span 
class="q">"tables"</span> is a subquery against a complex type column, or use a 
subquery
             against a complex type column as the argument to an <code 
class="ph codeph">IN</code> or <code class="ph codeph">EXISTS</code> clause.
           </li>
 
+
           <li class="li">
             The Impala pseudocolumn <code class="ph codeph">POS</code> lets 
you retrieve the position of elements in an array along with the elements
             themselves, equivalent to the <code class="ph 
codeph">POSEXPLODE()</code> function of HiveQL. You do not use index notation 
to retrieve a
@@ -503,39 +622,50 @@
             specify which elements to return.
           </li>
 
+
           <li class="li">
             <p class="p">
               Join clauses involving complex type columns do not require an 
<code class="ph codeph">ON</code> or <code class="ph codeph">USING</code> 
clause. Impala
               implicitly applies the join key so that the correct array 
entries or map elements are associated with the correct row from the
               table.
             </p>
+
           </li>
 
+
           <li class="li">
             <p class="p">
               Impala does not currently support the <code class="ph 
codeph">UNION</code> complex type.
             </p>
+
           </li>
+
         </ul>
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title10" 
id="complex_types_design__complex_types_limits">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title10" 
id="complex_types_limits">
 
       <h3 class="title topictitle3" id="ariaid-title10">Limitations and 
Restrictions for Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
           Complex type columns can only be used in tables or partitions with 
the Parquet file format.
         </p>
 
+
         <p class="p">
           Complex type columns cannot be used as partition key columns in a 
partitioned table.
         </p>
 
+
         <p class="p">
           When you use complex types with the <code class="ph codeph">ORDER 
BY</code>, <code class="ph codeph">GROUP BY</code>, <code class="ph 
codeph">HAVING</code>, or
           <code class="ph codeph">WHERE</code> clauses, you cannot refer to 
the column name by itself. Instead, you refer to the names of the scalar
@@ -543,32 +673,38 @@
           <code class="ph codeph">VALUE</code> pseudocolumns, or the field 
names from a <code class="ph codeph">STRUCT</code>.
         </p>
 
+
         <p class="p">
           The maximum depth of nesting for complex types is 100 levels.
         </p>
 
+
         <p class="p">
             The maximum length of the column definition for any complex type, 
including declarations for any nested types,
             is 4000 characters.
           </p>
 
+
         <p class="p">
           For ideal performance and scalability, use small or medium-sized 
collections, where all the complex columns contain at most a few
           hundred megabytes per row. Remember, all the columns of a row are 
stored in the same HDFS data block, whose size in Parquet files
           typically ranges from 256 MB to 1 GB.
         </p>
 
+
         <p class="p">
           Including complex type columns in a table introduces some overhead 
that might make queries that do not reference those columns
           somewhat slower than Impala queries against tables without any 
complex type columns. Expect at most a 2x slowdown compared to
           tables that do not have any complex type columns.
         </p>
 
+
         <p class="p">
           Currently, the <code class="ph codeph">COMPUTE STATS</code> 
statement does not collect any statistics for columns containing complex types.
           Impala uses heuristics to construct execution plans involving 
complex type columns.
         </p>
 
+
         <p class="p">
           Currently, Impala built-in functions and user-defined functions 
cannot accept complex types as parameters or produce them as
           function return values. (When the complex type values are 
materialized in an Impala result set, the result set contains the scalar
@@ -577,6 +713,7 @@
           scalar data items <em class="ph i">can</em> be used with built-in 
functions and UDFs as usual.)
         </p>
 
+
         <p class="p">
         Impala currently cannot write new data files containing complex type 
columns.
         Therefore, although the <code class="ph codeph">SELECT</code> 
statement works for queries
@@ -586,6 +723,7 @@
         ETL mechanism such as MapReduce jobs, Spark jobs, Pig, and so on.
       </p>
 
+
         <p class="p">
           Currently, Impala can query complex type columns only from Parquet 
tables or Parquet partitions within partitioned tables.
           Although you can use complex types in tables with Avro, text, and 
other file formats as part of your ETL pipeline, for example as
@@ -595,16 +733,21 @@
           <a class="xref" 
href="impala_complex_types.html#complex_types_file_formats">File Format Support 
for Impala Complex Types</a> for more details.
         </p>
 
+
       </div>
 
-    </article>
 
-  </article>
+    </div>
+
+
+  </div>
+
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title11" 
id="complex_types__complex_types_using">
+  <div class="topic concept nested1" aria-labelledby="ariaid-title11" 
id="complex_types_using">
 
     <h2 class="title topictitle2" id="ariaid-title11">Using Complex Types from 
SQL</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -614,14 +757,18 @@
         number of Parquet tables, and use Hive, Spark, Pig, or another mechanism 
outside Impala to populate the tables with data.
       </p>
 
+
       <p class="p toc inpage"></p>
 
+
     </div>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title12" 
id="complex_types_using__nested_types_ddl">
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title12" 
id="nested_types_ddl">
 
       <h3 class="title topictitle3" id="ariaid-title12">Complex Type Syntax 
for DDL Statements</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -629,6 +776,7 @@
           statements, now includes complex types in addition to primitive 
types:
         </p>
 
+
 <pre class="pre codeblock"><code>  primitive_type
 | array_type
 | map_type
@@ -640,31 +788,31 @@
         </p>
 
         <p class="p">
-          Array, struct, and map column type declarations are specified in the 
<code class="ph codeph">CREATE TABLE</code> statement. You can also add or
-          change the type of complex columns through the <code class="ph 
codeph">ALTER TABLE</code> statement.
-        </p>
+          <code class="ph codeph">Array</code>, <code class="ph 
codeph">struct</code>, and
+            <code class="ph codeph">map</code> column type declarations are 
specified in the
+            <code class="ph codeph">CREATE TABLE</code> statement. You can 
also add or change
+          the type of complex columns through the <code class="ph 
codeph">ALTER TABLE</code>
+          statement. </p>
 
-        <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
-          <p class="p">
-            Currently, Impala queries allow complex types only in tables that 
use the Parquet format. If an Impala query encounters complex
-            types in a table or partition using another file format, the query 
returns a runtime error.
-          </p>
+        <p class="p"> Currently, Impala queries allow complex types only in 
tables that
+          use the Parquet format. If an Impala query encounters complex types 
in
+          a table or partition using another file format, the query returns a
+          runtime error. </p>
+
+        <p class="p"> You can use <code class="ph codeph">ALTER TABLE ... SET 
FILEFORMAT PARQUET</code>
+          to change the file format of an existing table containing complex
+          types to Parquet, after which Impala can query it. Make sure to load
+          Parquet files into the table after changing the file format, because
+          the <code class="ph codeph">ALTER TABLE ... SET FILEFORMAT</code> 
statement does not
+          convert existing data to the new file format. </p>
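 
         <p class="p"> A minimal sketch of that sequence follows. The table name
           <code class="ph codeph">staging_contacts</code> and the HDFS path are hypothetical,
           and the Parquet files are assumed to have been produced outside
           Impala with a schema that matches the table. </p>
 
 <pre class="pre codeblock"><code>-- Assumes staging_contacts already exists in a non-Parquet format
 -- (for example, created through Hive) and has complex type columns.
 -- This changes only the table metadata; existing data files are not converted.
 ALTER TABLE staging_contacts SET FILEFORMAT PARQUET;
 
 -- Move externally produced Parquet files into the table (path is an assumption).
 LOAD DATA INPATH '/user/etl/contacts_parquet' INTO TABLE staging_contacts;
 </code></pre>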
 
-          <p class="p">
-            The Impala DDL support for complex types works for all file 
formats, so that you can create tables using text or other
-            non-Parquet formats for Hive to use as staging tables in an ETL 
cycle that ends with the data in a Parquet table. You can also
-            use <code class="ph codeph">ALTER TABLE ... SET FILEFORMAT 
PARQUET</code> to change the file format of an existing table containing complex
-            types to Parquet, after which Impala can query it. Make sure to 
load Parquet files into the table after changing the file
-            format, because the <code class="ph codeph">ALTER TABLE ... SET 
FILEFORMAT</code> statement does not convert existing data to the new file
-            format.
-          </p>
-        </div>
 
         <p class="p">
         Partitioned tables can contain complex type columns.
         All the partition key columns must be scalar types.
       </p>
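 
         <p class="p">
           For illustration, a partitioned table with a complex type column might be 
           declared as follows. The table and column names are hypothetical. The 
           <code class="ph codeph">ARRAY</code> appears only in the regular column list, while the 
           partition key is a scalar <code class="ph codeph">INT</code>.
         </p>
 
 <pre class="pre codeblock"><code>CREATE TABLE contacts_by_year
 (
   id BIGINT,
   name STRING,
   phones ARRAY&lt;STRING&gt;
 )
 PARTITIONED BY (signup_year INT)
 STORED AS PARQUET;
 </code></pre>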
 
+
         <p class="p">
           Because use cases for Impala complex types require that you already 
have Parquet data files produced outside of Impala, you can
           use the Impala <code class="ph codeph">CREATE TABLE LIKE 
PARQUET</code> syntax to produce a table with columns that match the structure 
of an
@@ -673,13 +821,15 @@
           resulting table is still text.
         </p>
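 
         <p class="p">
           A sketch of such a statement is shown here. The table name and file path are 
           hypothetical. The explicit <code class="ph codeph">STORED AS PARQUET</code> clause is 
           included because the resulting table would otherwise use the text format.
         </p>
 
 <pre class="pre codeblock"><code>-- The path is an assumption; point it at one of your own Parquet data files.
 CREATE TABLE contacts_from_file
   LIKE PARQUET '/user/etl/contacts_parquet/part-00000.parquet'
   STORED AS PARQUET;
 </code></pre>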
 
+
         <p class="p">
           Because the complex columns are omitted from the result set of an 
Impala <code class="ph codeph">SELECT *</code> or <code class="ph codeph">SELECT
           <var class="keyword varname">col_name</var></code> query, and 
because Impala currently does not support writing Parquet files with complex 
type
           columns, you cannot use the <code class="ph codeph">CREATE TABLE AS 
SELECT</code> syntax to create a table with nested type columns.
         </p>
 
-        <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+        <div class="note note"><span class="notetitle">Note:</span>
           <p class="p">
             Once you have a table set up with complex type columns, use the 
<code class="ph codeph">DESCRIBE</code> and <code class="ph codeph">SHOW CREATE 
TABLE</code>
             statements to see the correct notation with <code class="ph 
codeph">&lt;</code> and <code class="ph codeph">&gt;</code> delimiters and 
comma and colon
@@ -689,21 +839,25 @@
             referring to items within the complex type columns. In the <code 
class="ph codeph">FROM</code> clause, you use join notation to construct
             table aliases for any referenced <code class="ph 
codeph">ARRAY</code> and <code class="ph codeph">MAP</code> columns.
           </p>
+
         </div>
 
 
 
+
         <p class="p">
           For example, when defining a table that holds contact information, 
you might represent phone numbers differently depending on the
           expected layout and relationships of the data, and how well you can 
predict those properties in advance.
         </p>
 
+
         <p class="p">
           Here are different ways that you might represent phone numbers in a 
traditional relational schema, with equivalent representations
           using complex types.
         </p>
 
-        <figure class="fig fignone" 
id="nested_types_ddl__complex_types_phones_flat_fixed"><figcaption><span 
class="fig--title-label">Figure 1. </span>Traditional Relational Representation 
of Phone Numbers: Single Table</figcaption>
+
+        <div class="fig fignone" 
id="nested_types_ddl__complex_types_phones_flat_fixed"><span 
class="figcap"><span class="fig--title-label">Figure 1. </span>Traditional 
Relational Representation of Phone Numbers: Single Table</span>
 
           
 
@@ -714,6 +868,7 @@
             corresponding column is <code class="ph codeph">NULL</code> for 
that row.
           </p>
 
+
 <pre class="pre codeblock"><code>
 CREATE TABLE contacts_fixed_phones
 (
@@ -726,9 +881,10 @@ CREATE TABLE contacts_fixed_phones
 ) STORED AS PARQUET;
 </code></pre>
 
-        </figure>
+        </div>
 
-        <figure class="fig fignone" 
id="nested_types_ddl__complex_types_phones_array"><figcaption><span 
class="fig--title-label">Figure 2. </span>An Array of Phone Numbers</figcaption>
+
+        <div class="fig fignone" 
id="nested_types_ddl__complex_types_phones_array"><span class="figcap"><span 
class="fig--title-label">Figure 2. </span>An Array of Phone Numbers</span>
 
           
 
@@ -740,6 +896,7 @@ CREATE TABLE contacts_fixed_phones
             <code class="ph codeph">ARRAY</code> where each element is a <code 
class="ph codeph">STRUCT</code>.)
           </p>
 
+
 <pre class="pre codeblock"><code>
 CREATE TABLE contacts_array_of_phones
 (
@@ -751,9 +908,10 @@ CREATE TABLE contacts_array_of_phones
 
 </code></pre>
 
-        </figure>
+        </div>
+
 
-        <figure class="fig fignone" 
id="nested_types_ddl__complex_types_phones_map"><figcaption><span 
class="fig--title-label">Figure 3. </span>A Map of Phone Numbers</figcaption>
+        <div class="fig fignone" 
id="nested_types_ddl__complex_types_phones_map"><span class="figcap"><span 
class="fig--title-label">Figure 3. </span>A Map of Phone Numbers</span>
 
           
 
@@ -764,6 +922,7 @@ CREATE TABLE contacts_array_of_phones
             <code class="ph codeph">'mobile'</code>. A query could filter the 
data based on the key values, or display the key values in reports.
           </p>
 
+
 <pre class="pre codeblock"><code>
 CREATE TABLE contacts_unlimited_phones
 (
@@ -772,9 +931,10 @@ CREATE TABLE contacts_unlimited_phones
 
 </code></pre>
 
-        </figure>
+        </div>
+
 
-        <figure class="fig fignone" 
id="nested_types_ddl__complex_types_phones_flat_normalized"><figcaption><span 
class="fig--title-label">Figure 4. </span>Traditional Relational Representation 
of Phone Numbers: Normalized Tables</figcaption>
+        <div class="fig fignone" 
id="nested_types_ddl__complex_types_phones_flat_normalized"><span 
class="figcap"><span class="fig--title-label">Figure 4. </span>Traditional 
Relational Representation of Phone Numbers: Normalized Tables</span>
 
           
 
@@ -785,6 +945,7 @@ CREATE TABLE contacts_unlimited_phones
             number, such as whether it is a home, work, or mobile phone.
           </p>
 
+
           <p class="p">
             The flexibility of this approach comes with some drawbacks. 
Reconstructing all the data for a particular person requires a join
             query, which might require performance tuning on Hadoop because 
the data from each table might be transmitted from a different
@@ -792,6 +953,7 @@ CREATE TABLE contacts_unlimited_phones
             table.
           </p>
 
+
           <p class="p">
             This example illustrates a traditional database schema to store 
contact info normalized across two tables. The fact table
             establishes the identity and basic information about a person. A 
dimension table stores information only about phone numbers,
@@ -800,6 +962,7 @@ CREATE TABLE contacts_unlimited_phones
             to represent all sorts of details about each phone number.
           </p>
 
+
 <pre class="pre codeblock"><code>
 CREATE TABLE fact_contacts (id BIGINT, name STRING, address STRING) STORED AS 
PARQUET;
 CREATE TABLE dim_phones
@@ -819,9 +982,10 @@ CREATE TABLE dim_phones
 STORED AS PARQUET;
 </code></pre>
 
-        </figure>
+        </div>
+
 
-        <figure class="fig fignone" 
id="nested_types_ddl__complex_types_phones_array_struct"><figcaption><span 
class="fig--title-label">Figure 5. </span>Phone Numbers Represented as an Array 
of Structs</figcaption>
+        <div class="fig fignone" 
id="nested_types_ddl__complex_types_phones_array_struct"><span 
class="figcap"><span class="fig--title-label">Figure 5. </span>Phone Numbers 
Represented as an Array of Structs</span>
 
           
 
@@ -834,6 +998,7 @@ STORED AS PARQUET;
             table from the previous example.
           </p>
 
+
           <p class="p">
             You can do all the same kinds of queries with the complex type 
schema as with the normalized schema from the previous example.
             The advantages of the complex type design are in the areas of 
convenience and performance. Now your backup and ETL processes
@@ -842,6 +1007,7 @@ STORED AS PARQUET;
             single host without requiring network transmission.
           </p>
 
+
 <pre class="pre codeblock"><code>
 CREATE TABLE contacts_detailed_phones
 (
@@ -862,16 +1028,20 @@ CREATE TABLE contacts_detailed_phones
 
 </code></pre>
 
-        </figure>
+        </div>
+
 
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title13" 
id="complex_types_using__complex_types_sql">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title13" 
id="complex_types_sql">
 
       <h3 class="title topictitle3" id="ariaid-title13">SQL Statements that 
Support Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -885,6 +1055,7 @@ CREATE TABLE contacts_detailed_phones
           containing complex type columns into a table, and query Parquet 
tables containing complex types.
         </p>
 
+
         <p class="p">
         Impala currently cannot write new data files containing complex type 
columns.
         Therefore, although the <code class="ph codeph">SELECT</code> 
statement works for queries
@@ -894,20 +1065,25 @@ CREATE TABLE contacts_detailed_phones
         ETL mechanism such as MapReduce jobs, Spark jobs, Pig, and so on.
       </p>
 
+
         <p class="p toc inpage"></p>
 
+
       </div>
 
-      <article class="topic concept nested3" aria-labelledby="ariaid-title14" 
id="complex_types_sql__complex_types_ddl">
+
+      <div class="topic concept nested3" aria-labelledby="ariaid-title14" 
id="complex_types_ddl">
 
         <h4 class="title topictitle4" id="ariaid-title14">DDL Statements and 
Complex Types</h4>
 
+
         <div class="body conbody">
 
           <p class="p">
             Column specifications for complex or nested types use <code 
class="ph codeph">&lt;</code> and <code class="ph codeph">&gt;</code> 
delimiters:
           </p>
 
+
 <pre class="pre codeblock"><code>-- What goes inside the &lt; &gt; for an 
ARRAY is a single type, either a scalar or another
 -- complex type (ARRAY, STRUCT, or MAP).
 CREATE TABLE array_t
@@ -950,12 +1126,15 @@ STORED AS PARQUET;
 
         </div>
 
-      </article>
 
-      <article class="topic concept nested3" aria-labelledby="ariaid-title15" 
id="complex_types_sql__complex_types_queries">
+      </div>
+
+
+      <div class="topic concept nested3" aria-labelledby="ariaid-title15" 
id="complex_types_queries">
 
         <h4 class="title topictitle4" id="ariaid-title15">Queries and Complex 
Types</h4>
 
+
         <div class="body conbody">
 
 
@@ -969,12 +1148,14 @@ STORED AS PARQUET;
             columns with complex types are skipped.
           </p>
 
+
           <p class="p">
             The following example shows how referring directly to a complex 
type column returns an error, while <code class="ph codeph">SELECT *</code> on
             the same table succeeds, but only retrieves the scalar columns.
           </p>
 
-          <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+          <div class="note note"><span class="notetitle">Note:</span>
       Many of the complex type examples refer to tables
       such as <code class="ph codeph">CUSTOMER</code> and <code class="ph 
codeph">REGION</code>
       adapted from the tables used in the TPC-H benchmark.
@@ -984,6 +1165,7 @@ STORED AS PARQUET;
 
 
 
+
 <pre class="pre codeblock"><code>SELECT c_orders FROM customer LIMIT 1;
 ERROR: AnalysisException: Expr 'c_orders' in select list returns a complex 
type 'ARRAY&lt;STRUCT&lt;o_orderkey:BIGINT,o_orderstatus:STRING, ... 
l_receiptdate:STRING,l_shipinstruct:STRING,l_shipmode:STRING,l_comment:STRING&gt;&gt;&gt;&gt;'.
 Only scalar types are allowed in the select list.
@@ -1043,6 +1225,7 @@ DESC select_star_customer;
 
 
 
+
 <pre class="pre codeblock"><code>SELECT id, address.city FROM customers WHERE 
address.zip = 94305;
 </code></pre>
 
@@ -1052,6 +1235,7 @@ DESC select_star_customer;
 
 
 
+
 <pre class="pre codeblock"><code>select r_name, r_nations.item.n_name from 
region, region.r_nations limit 7;
 +--------+----------------+
 | r_name | item.n_name    |
@@ -1073,6 +1257,7 @@ DESC select_star_customer;
             <code class="ph codeph">MAP_FIELD.VALUE</code>, which have zero, 
one, or many instances for each row from the containing table.
           </p>
 
+
 <pre class="pre codeblock"><code>DESCRIBE table_0;
 +---------+-----------------------+
 | name    | type                  |
@@ -1114,6 +1299,7 @@ LIMIT 10;
 
 
 
+
 <pre class="pre codeblock"><code>SELECT id, phone_numbers.area_code FROM 
contact_info_many_structs INNER JOIN contact_info_many_structs.phone_numbers 
phone_numbers LIMIT 3;
 </code></pre>
 
@@ -1131,7 +1317,8 @@ LIMIT 10;
 
 
 
-          <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+          <div class="note note"><span class="notetitle">Note:</span>
       Many of the complex type examples refer to tables
       such as <code class="ph codeph">CUSTOMER</code> and <code class="ph 
codeph">REGION</code>
       adapted from the tables used in the TPC-H benchmark.
@@ -1139,11 +1326,13 @@ LIMIT 10;
       for the table definitions.
       </div>
 
+
           <p class="p">
             For example, the following queries work equivalently. They each 
return customer and order data for customers that have at least
             one order.
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c.c_name, o.o_orderkey FROM customer 
c, c.c_orders o LIMIT 5;
 +--------------------+------------+
 | c_name             | o_orderkey |
@@ -1172,6 +1361,7 @@ SELECT c.c_name, o.o_orderkey FROM customer c INNER JOIN 
c.c_orders o LIMIT 5;
             <code class="ph codeph">C_ORDERS</code> array):
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c.c_custkey, o.o_orderkey
   FROM customer c LEFT OUTER JOIN c.c_orders o
 LIMIT 5;
@@ -1193,6 +1383,7 @@ LIMIT 5;
             information in the right-hand table.)
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c.c_custkey, c.c_name
   FROM customer c LEFT ANTI JOIN c.c_orders o
 LIMIT 5;
@@ -1214,12 +1405,14 @@ LIMIT 5;
             You can also perform correlated subqueries to examine the 
properties of complex type columns for each row in the result set.
           </p>
 
+
           <p class="p">
             Count the number of orders per customer. Note the correlated 
reference to the table alias <code class="ph codeph">C</code>. The
             <code class="ph codeph">COUNT(*)</code> operation applies to all 
the elements of the <code class="ph codeph">C_ORDERS</code> array for the 
corresponding
             row, avoiding the need for a <code class="ph codeph">GROUP 
BY</code> clause.
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c_name, howmany FROM customer c, 
(SELECT COUNT(*) howmany FROM c.c_orders) v LIMIT 5;
 +--------------------+---------+
 | c_name             | howmany |
@@ -1236,6 +1429,7 @@ LIMIT 5;
             Count the number of orders per customer, ignoring any customers 
that have not placed any orders:
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c_name, howmany_orders
 FROM
   customer c,
@@ -1260,6 +1454,7 @@ LIMIT 5;
             from each row of the <code class="ph codeph">CUSTOMER</code> 
table.
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c_name, o_orderkey, howmany_line_items
 FROM
   customer c,
@@ -1284,6 +1479,7 @@ LIMIT 5;
             the original <code class="ph codeph">CUSTOMER</code> table, and 
only apply to the complex columns associated with that row.
           </p>
 
+
 <pre class="pre codeblock"><code>SELECT c_name, howmany, average_price, 
most_items
 FROM
   customer c,
@@ -1308,6 +1504,7 @@ LIMIT 5;
             another <code class="ph codeph">ARRAY</code> of <code class="ph 
codeph">STRUCT</code>:
           </p>
 
+
 <pre class="pre codeblock"><code>-- How many orders does each customer have?
 -- The type of the ARRAY column doesn't matter, this is just counting the 
elements.
 SELECT c_custkey, count(*)
@@ -1361,14 +1558,18 @@ LIMIT 5;
 
         </div>
 
-      </article>
 
-    </article>
+      </div>
+
+
+    </div>
+
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title16" 
id="complex_types_using__pseudocolumns">
+    <div class="topic concept nested2" aria-labelledby="ariaid-title16" 
id="pseudocolumns">
 
       <h3 class="title topictitle3" id="ariaid-title16">Pseudocolumns for 
ARRAY and MAP Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -1378,6 +1579,7 @@ LIMIT 5;
           part of qualified column names in queries:
         </p>
 
+
         <ul class="ul">
           <li class="li">
             <code class="ph codeph">ITEM</code>: The value of an array 
element. If the <code class="ph codeph">ARRAY</code> contains <code class="ph 
codeph">STRUCT</code> elements,
@@ -1385,32 +1587,40 @@ LIMIT 5;
             <code class="ph codeph"><var class="keyword 
varname">array_name</var>.<var class="keyword varname">field_name</var></code>.
           </li>
 
+
           <li class="li">
             <code class="ph codeph">POS</code>: The position of an element 
within an array.
           </li>
 
+
           <li class="li">
             <code class="ph codeph">KEY</code>: The value forming the first 
part of a key-value pair in a map. It is not necessarily unique.
           </li>
 
+
           <li class="li">
             <code class="ph codeph">VALUE</code>: The data item forming the 
second part of a key-value pair in a map. If the <code class="ph 
codeph">VALUE</code> part
             of the <code class="ph codeph">MAP</code> element is a <code 
class="ph codeph">STRUCT</code>, you can refer to either
             <code class="ph codeph"><var class="keyword 
varname">map_name</var>.VALUE.<var class="keyword 
varname">field_name</var></code> or use the shorthand
             <code class="ph codeph"><var class="keyword 
varname">map_name</var>.<var class="keyword varname">field_name</var></code>.
           </li>
+
         </ul>
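 
         <p class="p">
           The following sketch uses all four pseudocolumns in qualified names. The 
           table <code class="ph codeph">pseudo_demo</code> and its columns are hypothetical and 
           exist only to illustrate the notation.
         </p>
 
 <pre class="pre codeblock"><code>-- Hypothetical table:
 --   pseudo_demo (id BIGINT, tags ARRAY&lt;STRING&gt;, attrs MAP&lt;STRING,STRING&gt;)
 
 -- ITEM and POS apply to the ARRAY column.
 SELECT d.id, t.pos, t.item FROM pseudo_demo d, d.tags t;
 
 -- KEY and VALUE apply to the MAP column.
 SELECT d.id, a.key, a.value FROM pseudo_demo d, d.attrs a;
 </code></pre>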
 
 
 
+
         <p class="p toc inpage"></p>
 
+
       </div>
 
-      <article class="topic concept nested3" aria-labelledby="item__pos" 
id="pseudocolumns__item">
+
+      <div class="topic concept nested3" aria-labelledby="item__pos" id="item">
 
         <h4 class="title topictitle4" id="item__pos">ITEM and POS 
Pseudocolumns</h4>
 
+
         <div class="body conbody">
 
           <p class="p">
@@ -1423,6 +1633,7 @@ LIMIT 5;
             <code class="ph codeph">SELECT</code> list, or the <code class="ph 
codeph">WHERE</code> or other clauses.
           </p>
 
+
           <p class="p">
             This example shows a table with two <code class="ph 
codeph">ARRAY</code> columns whose elements are of the scalar type
             <code class="ph codeph">STRING</code>. When referring to the 
values of the array elements in the <code class="ph codeph">SELECT</code> list,
@@ -1430,6 +1641,7 @@ LIMIT 5;
             within the array, the individual elements have no defined names.
           </p>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE persons_of_interest
 (
 person_id BIGINT,
@@ -1457,12 +1669,14 @@ WHERE associates.item LIKE '% MacGuffin';
             <code class="ph codeph">POS</code> pseudocolumn lets you filter or 
reorder the result set based on the sequence of array elements.
           </p>
 
+
           <p class="p">
             The following example uses a table from a flattened version of the 
TPC-H schema. The <code class="ph codeph">REGION</code> table only has a
             few rows, such as one row for Europe and one for Asia. The row for 
each region represents all the countries in that region as an
             <code class="ph codeph">ARRAY</code> of <code class="ph 
codeph">STRUCT</code> elements:
           </p>
 
+
 <pre class="pre codeblock"><code>[localhost:21000] &gt; desc region;
 
+-------------+--------------------------------------------------------------------+
 | name        | type                                                           
    |
@@ -1480,6 +1694,7 @@ WHERE associates.item LIKE '% MacGuffin';
             refer to the <code class="ph codeph">POS</code> pseudocolumn in 
the select list:
           </p>
 
+
 <pre class="pre codeblock"><code>[localhost:21000] &gt; SELECT r1.r_name, 
r2.n_name, <strong class="ph b">r2.POS</strong>
                   &gt; FROM region r1 INNER JOIN r1.r_nations r2
                   &gt; WHERE r1.r_name = 'ASIA';
@@ -1499,6 +1714,7 @@ WHERE associates.item LIKE '% MacGuffin';
             ordering of results from the complex type column or to filter 
certain elements from the array:
           </p>
 
+
 <pre class="pre codeblock"><code>[localhost:21000] &gt; SELECT r1.r_name, 
r2.n_name, r2.POS
                   &gt; FROM region r1 INNER JOIN r1.r_nations r2
                   &gt; WHERE r1.r_name = 'ASIA'
@@ -1526,12 +1742,15 @@ WHERE associates.item LIKE '% MacGuffin';
 
         </div>
 
-      </article>
 
-      <article class="topic concept nested3" aria-labelledby="key__value" 
id="pseudocolumns__key">
+      </div>
+
+
+      <div class="topic concept nested3" aria-labelledby="key__value" id="key">
 
         <h4 class="title topictitle4" id="key__value">KEY and VALUE 
Pseudocolumns</h4>
 
+
         <div class="body conbody">
 
           <p class="p">
@@ -1542,6 +1761,7 @@ WHERE associates.item LIKE '% MacGuffin';
             <code class="ph codeph"><var class="keyword 
varname">map_column</var>.KEY</code> and <code class="ph codeph"><var 
class="keyword varname">map_column</var>.VALUE</code>.
           </p>
 
+
           <p class="p">
             The <code class="ph codeph">KEY</code> must always be a scalar 
type, such as <code class="ph codeph">STRING</code>, <code class="ph 
codeph">BIGINT</code>, or
             <code class="ph codeph">TIMESTAMP</code>. It can be <code 
class="ph codeph">NULL</code>. Values of the <code class="ph codeph">KEY</code> 
field are not necessarily unique
@@ -1549,6 +1769,7 @@ WHERE associates.item LIKE '% MacGuffin';
             clauses in the query, and loop through the result set to process 
all the values matching any specified keys.
           </p>
 
+
           <p class="p">
             The <code class="ph codeph">VALUE</code> can be either a scalar 
type or another complex type. If the <code class="ph codeph">VALUE</code> is a
             <code class="ph codeph">STRUCT</code>, you can construct a 
qualified name
@@ -1559,6 +1780,7 @@ WHERE associates.item LIKE '% MacGuffin';
             <code class="ph codeph"><var class="keyword 
varname">table_alias</var>.KEY</code> and <code class="ph codeph"><var 
class="keyword varname">table_alias</var>.VALUE</code>
           </p>
 
+
           <p class="p">
             The following example shows different ways to access a <code 
class="ph codeph">MAP</code> column using the <code class="ph 
codeph">KEY</code> and
             <code class="ph codeph">VALUE</code> pseudocolumns. The <code 
class="ph codeph">DETAILS</code> column has a <code class="ph 
codeph">STRING</code> first part with short,
@@ -1569,7 +1791,8 @@ WHERE associates.item LIKE '% MacGuffin';
             underlying values.
           </p>
 
-          <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+          <div class="note note"><span class="notetitle">Note:</span>
             If you find that the single-item nature of the <code class="ph 
codeph">VALUE</code> makes it difficult to model your data accurately, the
             solution is typically to add some nesting to the complex type. For 
example, to have several sets of key-value pairs, make the
             column an <code class="ph codeph">ARRAY</code> whose elements are 
<code class="ph codeph">MAP</code>. To make a set of key-value pairs that holds 
more
@@ -1577,6 +1800,7 @@ WHERE associates.item LIKE '% MacGuffin';
             or a <code class="ph codeph">STRUCT</code>.
           </div>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE dream_journal
 (
   dream_id BIGINT,
@@ -1609,6 +1833,7 @@ WHERE
             <code class="ph codeph">VALUE</code> pseudocolumn directly, you 
use dot notation to refer to the <code class="ph codeph">STRUCT</code> fields 
inside it.
           </p>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE better_dream_journal
 (
   dream_id BIGINT,
@@ -1637,16 +1862,20 @@ WHERE
 
         </div>
 
-      </article>
 
-    </article>
+      </div>
+
+
+    </div>
+
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title19" 
id="complex_types_using__complex_types_etl">
+    <div class="topic concept nested2" aria-labelledby="ariaid-title19" 
id="complex_types_etl">
 
 
 
       <h3 class="title topictitle3" id="ariaid-title19">Loading Data 
Containing Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -1656,12 +1885,14 @@ WHERE
           files.
         </p>
 
+
         <p class="p">
           If you have created a Hive table with the Parquet file format and 
containing complex types, use the same table for Impala queries
           with no changes. If you have such a Hive table in some other format, 
use a Hive <code class="ph codeph">CREATE TABLE AS SELECT ... STORED AS
           PARQUET</code> or <code class="ph codeph">INSERT ... SELECT</code> 
statement to produce an equivalent Parquet table that Impala can query.
         </p>
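 
         <p class="p">
           One possible form of that conversion is sketched below. Both table names are 
           hypothetical, and the first statement is meant to be run in Hive rather than 
           in Impala.
         </p>
 
 <pre class="pre codeblock"><code>-- In Hive: copy a table from another file format into a new Parquet table.
 CREATE TABLE contacts_parquet STORED AS PARQUET
   AS SELECT * FROM contacts_other_format;
 
 -- In Impala: pick up the table that was created through Hive.
 INVALIDATE METADATA contacts_parquet;
 </code></pre>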
 
+
         <p class="p">
           If you have existing Parquet data files containing complex types, 
located outside of any Impala or Hive table, such as data files
           created by Spark jobs, you can use an Impala <code class="ph 
codeph">CREATE TABLE ... STORED AS PARQUET</code> statement, followed by an 
Impala
@@ -1670,6 +1901,7 @@ WHERE
           files.
         </p>
 
+
         <p class="p">
           Perhaps the simplest way to get started with complex type data is to 
take a denormalized table containing duplicated values, and
           use an <code class="ph codeph">INSERT ... SELECT</code> statement to 
copy the data into a Parquet table and condense the repeated values into
@@ -1680,21 +1912,26 @@ WHERE
           match the field names from the <code class="ph codeph">CREATE 
TABLE</code> statement.
         </p>
 
-        <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+        <div class="note note"><span class="notetitle">Note:</span>
           Because Hive currently cannot construct individual rows using 
complex types through the <code class="ph codeph">INSERT ... VALUES</code> 
syntax,
           you prepare the data in flat form in a separate table, then copy it 
to the table with complex columns using <code class="ph codeph">INSERT ...
           SELECT</code> and the complex type constructors. See <a class="xref" 
href="impala_complex_types.html#complex_types_ex_hive_etl">Constructing Parquet 
Files with Complex Columns Using Hive</a> for
           examples.
         </div>
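 
         <p class="p">
           As a brief sketch of that flat-to-complex copy, the following Hive statement 
           uses the <code class="ph codeph">array()</code> and <code class="ph codeph">map()</code> 
           constructors. All table and column names are hypothetical, and 
           <code class="ph codeph">contacts_complex</code> is assumed to be a Parquet table.
         </p>
 
 <pre class="pre codeblock"><code>-- Run in Hive. Assumed table layouts:
 --   flat_contacts    (id BIGINT, name STRING, home_phone STRING, work_phone STRING)
 --   contacts_complex (id BIGINT, name STRING,
 --                     phones ARRAY&lt;STRING&gt;, phones_by_type MAP&lt;STRING,STRING&gt;)
 INSERT INTO TABLE contacts_complex
   SELECT id, name,
          array(home_phone, work_phone),
          map('home', home_phone, 'work', work_phone)
   FROM flat_contacts;
 </code></pre>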
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title20" 
id="complex_types_using__complex_types_nesting">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title20" 
id="complex_types_nesting">
 
       <h3 class="title topictitle3" id="ariaid-title20">Using Complex Types as 
Nested Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -1705,10 +1942,12 @@ WHERE
           <code class="ph codeph">STRUCT</code>, elements of an <code 
class="ph codeph">ARRAY</code>, and keys and values of a <code class="ph 
codeph">MAP</code>.
         </p>
 
+
         <p class="p">
           Schemas involving complex types typically use some level of nesting 
for the complex type columns.
         </p>
 
+
         <p class="p">
           For example, to model a relationship like a dimension table and a 
fact table, you typically use an <code class="ph codeph">ARRAY</code> where
           each array element is a <code class="ph codeph">STRUCT</code>. The 
<code class="ph codeph">STRUCT</code> fields represent what would traditionally 
be columns
@@ -1718,6 +1957,7 @@ WHERE
 
 
 
+
         <p class="p">
           Perhaps the only use case for a top-level <code class="ph 
codeph">STRUCT</code> would be to allow <code class="ph 
codeph">STRUCT</code> fields with the
           same name as columns to coexist in the same table. The following 
example shows how a table could have a column named
@@ -1726,6 +1966,7 @@ WHERE
           conflict.
         </p>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE struct_namespaces
 (
   id BIGINT
@@ -1746,6 +1987,7 @@ select id, s1.id, s2.id from struct_namespaces;
           structures where each row contains only a few data values drawn from 
a large set of possible choices.
         </p>
 
+
         <p class="p">
           Although you can use an <code class="ph codeph">ARRAY</code> of 
scalar values as the top-level column in a table, such a simple array is
           typically of limited use for analytic queries. The only property of 
the array elements, aside from the element value, is the
@@ -1754,6 +1996,7 @@ select id, s1.id, s2.id from struct_namespaces;
           of scalar values.
         </p>
 
+
         <p class="p">
           If you are considering having multiple <code class="ph 
codeph">ARRAY</code> or <code class="ph codeph">MAP</code> columns, with 
related items under the same
           position in each <code class="ph codeph">ARRAY</code> or the same 
key in each <code class="ph codeph">MAP</code>, prefer to use a <code class="ph 
codeph">STRUCT</code> to
@@ -1764,6 +2007,7 @@ select id, s1.id, s2.id from struct_namespaces;
           notation to refer to the relevant fields rather than a sequence of 
join clauses.
         </p>
 
+
         <p class="p">
           For example, here is a table with several complex type columns all 
at the top level and containing only scalar types. To retrieve
           every data item for the row requires a separate join for each <code 
class="ph codeph">ARRAY</code> or <code class="ph codeph">MAP</code> column. 
The fields of
@@ -1772,6 +2016,7 @@ select id, s1.id, s2.id from struct_namespaces;
           <code class="ph codeph">FIELD2</code>.
         </p>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE complex_types_top_level
 (
   id BIGINT,
@@ -1825,6 +2070,7 @@ from
           <code class="ph codeph">STRUCT</code>.
         </p>
 
+
 <pre class="pre codeblock"><code>CREATE TABLE nesting_demo
 (
   user_id BIGINT,
@@ -1843,6 +2089,7 @@ STORED AS PARQUET;
           names within each <code class="ph codeph">STRUCT</code> for easy 
readability:
         </p>
 
+
 <pre class="pre codeblock"><code>DESCRIBE nesting_demo;
 +----------------+-----------------------------+
 | name           | type                        |
@@ -1879,6 +2126,7 @@ STORED AS PARQUET;
 
 
 
+
 <pre class="pre codeblock"><code>SELECT
 -- The lone scalar field doesn't require any dot notation or join clauses.
     user_id
@@ -1920,14 +2168,18 @@ FROM
           <code class="ph codeph">MAP</code> items by running comparisons 
against the <code class="ph codeph">KEY</code> part in the <code class="ph 
codeph">WHERE</code> clause.
         </p>
 
+
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title21" 
id="complex_types_using__complex_types_views">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title21" 
id="complex_types_views">
 
       <h3 class="title topictitle3" id="ariaid-title21">Accessing Complex Type 
Data in Flattened Form Using Views</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -1941,6 +2193,7 @@ FROM
 
 
 
+
         <p class="p">
           For example, the variation of the TPC-H schema containing complex 
types has a table <code class="ph codeph">REGION</code>. This table has 5
           rows, corresponding to 5 regions such as <code class="ph 
codeph">NORTH AMERICA</code> and <code class="ph codeph">AFRICA</code>. Each 
row has an
@@ -1948,6 +2201,7 @@ FROM
           region.
         </p>
 
+
 <pre class="pre codeblock"><code>DESCRIBE region;
 +-------------+-------------------------+
 | name        | type                    |
@@ -1970,6 +2224,7 @@ FROM
           still keeping the data in a single table rather than normalizing 
across multiple tables.
         </p>
 
+
         <p class="p">
           To use this table with a JDBC or ODBC application that expected 
scalar columns, we could create a view that represented the result
           set as a set of scalar columns (three columns from the original 
table, plus three more from the <code class="ph codeph">STRUCT</code> fields of
@@ -1980,6 +2235,7 @@ FROM
 
 
 
+
 <pre class="pre codeblock"><code>CREATE VIEW region_view AS
   SELECT
     r_regionkey,
@@ -1998,6 +2254,7 @@ FROM
           nation.
         </p>
 
+
 <pre class="pre codeblock"><code>-- Retrieve info such as the nation name from 
the original R_NATIONS array elements.
 select n_name from region_view where r_name in ('EUROPE', 'ASIA');
 +----------------+
@@ -2043,14 +2300,18 @@ SELECT r_regionkey, r_name, n_nationkey, n_name FROM 
region_view LIMIT 7;
 
       </div>
 
-    </article>
 
-  </article>
+    </div>
+
+
+  </div>
+
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title22" 
id="complex_types__complex_types_examples">
+  <div class="topic concept nested1" aria-labelledby="ariaid-title22" 
id="complex_types_examples">
 
     <h2 class="title topictitle2" id="ariaid-title22">Tutorials and Examples 
for Complex Types</h2>
 
+
     
 
     <div class="body conbody">
@@ -2059,14 +2320,18 @@ SELECT r_regionkey, r_name, n_nationkey, n_name FROM 
region_view LIMIT 7;
         The following examples illustrate the query syntax for some common use 
cases involving complex type columns.
       </p>
 
+
       <p class="p toc inpage"></p>
 
+
     </div>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title23" 
id="complex_types_examples__complex_sample_schema">
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title23" 
id="complex_sample_schema">
 
       <h3 class="title topictitle3" id="ariaid-title23">Sample Schema and Data 
for Experimenting with Impala Complex Types</h3>
 
+
       <div class="body conbody">
 
 
@@ -2076,6 +2341,7 @@ SELECT r_regionkey, r_name, n_nationkey, n_name FROM 
region_view LIMIT 7;
           the complex type feature use these tables, adapted from the schema 
used for TPC-H testing:
         </p>
 
+
 <pre class="pre codeblock"><code>SHOW TABLES;
 +----------+
 | name     |
@@ -2174,6 +2440,7 @@ DESCRIBE supplier;
           The volume of data used in the following examples is:
         </p>
 
+
 <pre class="pre codeblock"><code>SELECT count(*) FROM customer;
 +----------+
 | count(*) |
@@ -2206,9 +2473,11 @@ SELECT count(*) FROM supplier;
 
       </div>
 
+
       
 
-    </article>
+    </div>
+
 
     
 
@@ -2216,10 +2485,11 @@ SELECT count(*) FROM supplier;
 
     
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title24" 
id="complex_types_examples__complex_types_ex_hive_etl">
+    <div class="topic concept nested2" aria-labelledby="ariaid-title24" 
id="complex_types_ex_hive_etl">
 
       <h3 class="title topictitle3" id="ariaid-title24">Constructing Parquet 
Files with Complex Columns Using Hive</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -2229,10 +2499,12 @@ SELECT count(*) FROM supplier;
           format.
         </p>
 
+
         <p class="p">
           <strong class="ph b">Create table with <code class="ph 
codeph">ARRAY</code> in Impala, load data in Hive, query in Impala:</strong>
         </p>
 
+
         <p class="p">
           This example shows the cycle of creating the tables and querying the 
complex data in Impala, and using Hive (either the
           <code class="ph codeph">hive</code> shell or <code class="ph 
codeph">beeline</code>) for the data loading step. The data starts in 
flattened, denormalized
@@ -2240,6 +2512,7 @@ SELECT count(*) FROM supplier;
           analytic queries on the Parquet table, using join notation to unpack 
the <code class="ph codeph">ARRAY</code> column.
         </p>
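Before the SQL, a rough mental model of the join notation may help. The following is a Python sketch, not Impala code: the in-memory row layout, the sample rows, and the helper name `unpack_array` are all assumptions made for illustration. It pairs each row with every element of that row's own array column, which is approximately what the comma-style join against the `ARRAY` column produces.

```python
# Hypothetical in-memory stand-in for a table with an ARRAY<STRING> column.
complex_array = [
    {"country": "Canada", "city": ["Toronto", "Vancouver"]},
    {"country": "France", "city": ["Paris"]},
    {"country": "Atlantis", "city": []},  # empty array: contributes no rows
]


def unpack_array(rows, array_col):
    """Pair each row with every element of its own array column."""
    for row in rows:
        # Scalar columns are repeated once per array element.
        scalars = {k: v for k, v in row.items() if k != array_col}
        for item in row[array_col]:
            yield {**scalars, "item": item}


flattened = list(unpack_array(complex_array, "city"))
for r in flattened:
    print(r["country"], r["item"])
```

In this model a row whose array is empty produces no output rows, which mirrors inner-join behavior.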
 
+
 <pre class="pre codeblock"><code>/* Initial DDL and loading of flat, 
denormalized data happens in impala-shell */
 CREATE TABLE flat_array (country STRING, city STRING);
 INSERT INTO flat_array VALUES
     ('Canada', 'Toronto') , ('Canada', 'Vancouver') , ('Canada', "St. John\'s")
   , ('Canada', 'Saint John') , ('Canada', 'Montreal') , ('Canada', 'Halifax')
@@ -2299,6 +2572,7 @@ SELECT country, city.item FROM complex_array, 
complex_array.city
           <strong class="ph b">Create table with <code class="ph 
codeph">STRUCT</code> and <code class="ph codeph">ARRAY</code> in Impala, load 
data in Hive, query in Impala:</strong>
         </p>
 
+
         <p class="p">
           This example shows the cycle of creating the tables and querying the 
complex data in Impala, and using Hive (either the
           <code class="ph codeph">hive</code> shell or <code class="ph 
codeph">beeline</code>) for the data loading step. The data starts in 
flattened, denormalized
@@ -2309,6 +2583,7 @@ SELECT country, city.item FROM complex_array, 
complex_array.city
 
 
 
+
 <pre class="pre codeblock"><code>/* Initial DDL and loading of flat, 
denormalized data happens in impala-shell */
 
 CREATE TABLE flat_struct_array (continent STRING, country STRING, city STRING);
@@ -2374,14 +2649,17 @@ SELECT t1.continent, t1.country.name, t2.item
 
       </div>
 
-    </article>
+
+    </div>
+
 
     
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title25" 
id="complex_types_examples__complex_denormalizing">
+    <div class="topic concept nested2" aria-labelledby="ariaid-title25" 
id="complex_denormalizing">
 
       <h3 class="title topictitle3" id="ariaid-title25">Flattening Normalized 
Tables into a Single Table with Complex Types</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
@@ -2390,17 +2668,20 @@ SELECT t1.continent, t1.country.name, t2.item
           of rows as in the original normalized table, and put all the 
associated data from the other table in a single new column.
         </p>
 
+
         <p class="p">
           In this flattening scenario, you might frequently use a column that 
is an <code class="ph codeph">ARRAY</code> consisting of
           <code class="ph codeph">STRUCT</code> elements, where each field 
within the <code class="ph codeph">STRUCT</code> corresponds to a column name 
from the table
           that you are combining.
         </p>
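The reshaping described above can be sketched outside SQL. This is a minimal Python illustration, not how Impala or Hive perform the load: the table contents, the column names, and the helper `nest_children` are hypothetical. Each parent row is kept once, and the matching child rows are collected into a list of dicts, standing in for an `ARRAY` of `STRUCT` elements.

```python
from collections import defaultdict

# Hypothetical normalized tables, represented as lists of dicts.
employees = [
    {"employee_id": 1, "name": "Ann"},
    {"employee_id": 2, "name": "Bob"},
]
phone_numbers = [
    {"employee_id": 1, "category": "home", "number": "555-0100"},
    {"employee_id": 1, "category": "work", "number": "555-0101"},
]


def nest_children(parents, children, key, new_col):
    """Return one row per parent, with matching children as a list of structs."""
    grouped = defaultdict(list)
    for child in children:
        # Each struct field corresponds to a column of the child table;
        # the join key is dropped because the parent row already carries it.
        struct = {k: v for k, v in child.items() if k != key}
        grouped[child[key]].append(struct)
    return [{**parent, new_col: grouped.get(parent[key], [])} for parent in parents]


nested = nest_children(employees, phone_numbers, "employee_id", "phones")
for row in nested:
    print(row)
```

The result has the same number of rows as the parent table, and a parent with no matching child rows gets an empty list.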
 
+
         <p class="p">
           The following example shows a traditional normalized layout using 
two tables, and then an equivalent layout using complex types in
           a single table.
         </p>
 
+
 <pre class="pre codeblock"><code>/* Traditional relational design */
 
 -- This table just stores numbers, allowing us to look up details about the 
employee
@@ -2470,24 +2751,29 @@ STORED AS PARQUET;
 
       </div>
 
-    </article>
 
-    <article class="topic concept nested2" aria-labelledby="ariaid-title26" 
id="complex_types_examples__complex_inference">
+    </div>
+
+
+    <div class="topic concept nested2" aria-labelledby="ariaid-title26" 
id="complex_inference">
 
       <h3 class="title topictitle3" id="ariaid-title26">Interchanging Complex 
Type Tables and Data Files with Hive and Other Components</h3>
 
+
       <div class="body conbody">
 
         <p class="p">
           You can produce Parquet data files through several Hadoop components 
and APIs.
         </p>
 
+
         <p class="p">
           If you have a Hive-created Parquet table that includes <code 
class="ph codeph">ARRAY</code>, <code class="ph codeph">STRUCT</code>, or <code 
class="ph codeph">MAP</code>
           columns, Impala can query that same table in <span 
class="keyword">Impala 2.3</span> and higher, subject to the usual restriction 
that all other
           columns are of data types supported by Impala, and also that the 
file type of the table must be Parquet.
         </p>
 
+
         <p class="p">
           If you have a Parquet data file produced outside of Impala, Impala 
can automatically deduce the appropriate table structure using
           the syntax <code class="ph codeph">CREATE TABLE ... LIKE PARQUET 
'<var class="keyword varname">hdfs_path_of_parquet_file</var>'</code>. In <span 
class="keyword">Impala 2.3</span>
@@ -2495,6 +2781,7 @@ STORED AS PARQUET;
           <code class="ph codeph">MAP</code> types.
         </p>
 
+
 <pre class="pre codeblock"><code>/* In impala-shell, find the HDFS data 
directory of the original table.
 DESCRIBE FORMATTED tpch_nested_parquet.customer;
 ...
@@ -2599,8 +2886,12 @@ describe customer_ctlp;
 
       </div>
 
-    </article>
 
-  </article>
+    </div>
+
+
+  </div>
+
 
-</article></main></body></html>
\ No newline at end of file
+</body>
+</html>
\ No newline at end of file


http://git-wip-us.apache.org/repos/asf/impala/blob/b4ad38a9/docs/build/html/topics/impala_components.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_components.html 
b/docs/build/html/topics/impala_components.html
index c6ee7fb..eb68b8b 100644
--- a/docs/build/html/topics/impala_components.html
+++ b/docs/build/html/topics/impala_components.html
@@ -1,8 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html
-  SYSTEM "about:legacy-compat">
-<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2018"><meta name="DC.rights.owner" content="(C) Copyright 2018"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_concepts.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="prodname" 
content="Impala"><meta name="version" content="Impala 2.12x"><meta 
name="version" content="Impala 2.12x"><meta name="version" content="Impala 
2.12x"><meta name="version" content="Impala 2.12x"><meta name="version" 
content="Impala 2.12x"><meta name="DC.Format" content="XHTML"><meta 
name="DC.Identifier" content="intro_components"><link rel="stylesheet" 
type="text/css" href="../commonltr.css"><title>Components of the Impala 
Server</title></head><body id="intro_components"><main 
 role="main"><article role="article" aria-labelledby="ariaid-title1">
+  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+
+<meta name="copyright" content="(C) Copyright 2018" />
+<meta name="DC.rights.owner" content="(C) Copyright 2018" />
+<meta name="DC.Type" content="concept" />
+<meta name="DC.Title" content="Components of the Impala Server" />
+<meta name="DC.Relation" scheme="URI" content="../topics/impala_concepts.html" 
/>
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="prodname" content="Impala" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="version" content="Impala 3.0.x" />
+<meta name="DC.Format" content="XHTML" />
+<meta name="DC.Identifier" content="intro_components" />
+<link rel="stylesheet" type="text/css" href="../commonltr.css" />
+<title>Components of the Impala Server</title>
+</head>
+<body id="intro_components">
+
 
   <h1 class="title topictitle1" id="ariaid-title1">Components of the Impala 
Server</h1>
+
   
   
 
@@ -13,13 +39,21 @@
       different daemon processes that run on specific hosts within your <span 
class="keyword"></span> cluster.
     </p>
 
+
     <p class="p toc inpage"></p>
+
   </div>
 
-  <nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_concepts.html">Impala Concepts and 
Architecture</a></div></div></nav><article class="topic concept nested1" 
aria-labelledby="ariaid-title2" id="intro_components__intro_impalad">
+
+  <div class="related-links">
+<div class="familylinks">
+<div class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_concepts.html">Impala Concepts and Architecture</a></div>
+</div>
+</div><div class="topic concept nested1" aria-labelledby="ariaid-title2" 
id="intro_impalad">
 
     <h2 class="title topictitle2" id="ariaid-title2">The Impala Daemon</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -30,6 +64,7 @@
         central coordinator node.
       </p>
 
+
       <p class="p">
         You can submit a query to the Impala daemon running on any DataNode, 
and that instance of the daemon serves as the
         <dfn class="term">coordinator node</dfn> for that query. The other 
nodes transmit partial results back to the
@@ -39,11 +74,13 @@
         submitting each query to a different Impala daemon in round-robin 
style, using the JDBC or ODBC interfaces.
       </p>
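A client-side round-robin policy of the kind mentioned above might look like the following Python sketch. It only chooses which host each query would be sent to and does not open any JDBC or ODBC connection; the host names, the query strings, and the function `assign_coordinators` are assumptions for illustration.

```python
from itertools import cycle

# Hypothetical hosts, each running an Impala daemon.
IMPALAD_HOSTS = [
    "impalad-1.example.com",
    "impalad-2.example.com",
    "impalad-3.example.com",
]


def assign_coordinators(queries, hosts):
    """Pair each query with the next host in rotation."""
    if not hosts:
        raise ValueError("at least one host is required")
    picker = cycle(hosts)
    return [(query, next(picker)) for query in queries]


queries = ["SELECT 1", "SELECT 2", "SELECT 3", "SELECT 4"]
assignments = assign_coordinators(queries, IMPALAD_HOSTS)
for query, host in assignments:
    print(host, "<-", query)
```

Whichever daemon receives a query acts as its coordinator, so rotating through the hosts spreads the coordinator role across the cluster.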
 
+
       <p class="p">
         The Impala daemons are in constant communication with the <dfn 
class="term">statestore</dfn>, to confirm which nodes
         are healthy and can accept new work.
       </p>
 
+
       <p class="p">
         They also receive broadcast messages from the <span class="keyword 
cmdname">catalogd</span> daemon (introduced in Impala 1.2)
         whenever any Impala node in the cluster creates, alters, or drops any 
type of object, or when an
@@ -52,24 +89,30 @@
         METADATA</code> statements that were needed to coordinate metadata 
across nodes prior to Impala 1.2.
       </p>
 
+
       <p class="p">
         In <span class="keyword">Impala 2.9</span> and higher, you can control 
which hosts act as query coordinators
         and which act as query executors, to improve scalability for highly 
concurrent workloads on large clusters.
-        See <a class="xref" href="impala_scalability.html">Scalability 
Considerations for Impala</a> for details.
+        See <a class="xref" href="impala_dedicated_coordinator.html">How to 
Configure Impala with Dedicated Coordinators</a> for details.
       </p>
 
+
       <p class="p">
         <strong class="ph b">Related information:</strong> <a class="xref" 
href="impala_config_options.html#config_options">Modifying Impala Startup 
Options</a>,
         <a class="xref" href="impala_processes.html#processes">Starting 
Impala</a>, <a class="xref" href="impala_timeouts.html#impalad_timeout">Setting 
the Idle Query and Idle Session Timeouts for impalad</a>,
         <a class="xref" href="impala_ports.html#ports">Ports Used by 
Impala</a>, <a class="xref" href="impala_proxy.html#proxy">Using Impala through 
a Proxy for High Availability</a>
       </p>
+
     </div>
-  </article>
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title3" 
id="intro_components__intro_statestore">
+  </div>
+
+
+  <div class="topic concept nested1" aria-labelledby="ariaid-title3" 
id="intro_statestore">
 
     <h2 class="title topictitle2" id="ariaid-title3">The Impala Statestore</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -81,14 +124,27 @@
         requests to the unreachable node.
       </p>
 
+
       <p class="p">
-        Because the statestore's purpose is to help when things go wrong, it 
is not critical to the normal
-        operation of an Impala cluster. If the statestore is not running or 
becomes unreachable, the Impala daemons
-        continue running and distributing work among themselves as usual; the 
cluster just becomes less robust if
-        other Impala daemons fail while the statestore is offline. When the 
statestore comes back online, it re-establishes
-        communication with the Impala daemons and resumes its monitoring 
function.
+        Because the statestore's purpose is to help when things go wrong and
+        to broadcast metadata to coordinators, it is not always critical to the
+        normal operation of an Impala cluster. If the statestore is not running
+        or becomes unreachable, the Impala daemons continue running and
+        distributing work among themselves as usual when working with the data
+        known to Impala. The cluster just becomes less robust if other Impala
+        daemons fail, and metadata becomes less consistent as it changes while
+        the statestore is offline. When the statestore comes back online, it
+        re-establishes communication with the Impala daemons and resumes its
+        monitoring and broadcasting functions.
       </p>
 
+
+      <p class="p">
+        If you issue a DDL statement while the statestore is down, the queries
+        that access the new object the DDL created will fail.
+      </p>
+
+
       <p class="p">
         Most considerations for load balancing and high availability apply to 
the <span class="keyword cmdname">impalad</span> daemon.
         The <span class="keyword cmdname">statestored</span> and <span 
class="keyword cmdname">catalogd</span> daemons do not have special
@@ -99,22 +155,28 @@
         Impala service.
       </p>
 
+
       <p class="p">
         <strong class="ph b">Related information:</strong>
       </p>
 
+
       <p class="p">
         <a class="xref" 
href="impala_scalability.html#statestore_scalability">Scalability 
Considerations for the Impala Statestore</a>,
         <a class="xref" 
href="impala_config_options.html#config_options">Modifying Impala Startup 
Options</a>, <a class="xref" href="impala_processes.html#processes">Starting 
Impala</a>,
         <a class="xref" 
href="impala_timeouts.html#statestore_timeout">Increasing the Statestore 
Timeout</a>, <a class="xref" href="impala_ports.html#ports">Ports Used by 
Impala</a>
       </p>
+
     </div>
-  </article>
 
-  <article class="topic concept nested1" aria-labelledby="ariaid-title4" 
id="intro_components__intro_catalogd">
+  </div>
+
+
+  <div class="topic concept nested1" aria-labelledby="ariaid-title4" 
id="intro_catalogd">
 
     <h2 class="title topictitle2" id="ariaid-title4">The Impala Catalog 
Service</h2>
 
+
     <div class="body conbody">
 
       <p class="p">
@@ -125,6 +187,7 @@
         <span class="keyword cmdname">catalogd</span> services on the same 
host.
       </p>
 
+
       <p class="p">
         The catalog service avoids the need to issue
         <code class="ph codeph">REFRESH</code> and <code class="ph 
codeph">INVALIDATE METADATA</code> statements when the metadata changes are
@@ -133,12 +196,14 @@
         before executing a query there.
       </p>
 
+
       <p class="p">
         This feature touches a number of aspects of Impala:
       </p>
 
 
 
+
       <ul class="ul" id="intro_catalogd__catalogd_xrefs">
         <li class="li">
           <p class="p">
@@ -146,8 +211,10 @@
             <a class="xref" href="impala_processes.html#processes">Starting 
Impala</a>, for usage information for the
             <span class="keyword cmdname">catalogd</span> daemon.
           </p>
+
         </li>
 
+
         <li class="li">
           <p class="p">
             The <code class="ph codeph">REFRESH</code> and <code class="ph 
codeph">INVALIDATE METADATA</code> statements are not needed
@@ -159,9 +226,12 @@
             <a class="xref" 
href="impala_invalidate_metadata.html#invalidate_metadata">INVALIDATE METADATA 
Statement</a> for the latest usage information for
             those statements.
           </p>
+
         </li>
+
       </ul>
 
+
       <div class="p">
         Use the <code class="ph codeph">--load_catalog_in_background</code> option 
to control when
         the metadata of a table is loaded.
@@ -174,6 +244,7 @@
             <code class="ph codeph">load_catalog_in_background</code> is
             <code class="ph codeph">false</code>.
           </li>
+
           <li class="li">
             If set to <code class="ph codeph">true</code>, the catalog service 
attempts to
             load metadata for a table even if no query needed that metadata. So
@@ -188,16 +259,22 @@
                 and can lead to seemingly random long-running queries that 
are
                 difficult to diagnose.
               </li>
+
               <li class="li">
                 Impala may load metadata for tables that are possibly never
                 used, potentially increasing catalog size and consequently 
memory
                 usage for both the catalog service and the Impala daemon.
               </li>
+
             </ul>
+
           </li>
+
         </ul>
+
       </div>
 
+
       <p class="p">
         Most considerations for load balancing and high availability apply to 
the <span class="keyword cmdname">impalad</span> daemon.
         The <span class="keyword cmdname">statestored</span> and <span 
class="keyword cmdname">catalogd</span> daemons do not have special
@@ -208,7 +285,8 @@
         Impala service.
       </p>
 
-      <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+
+      <div class="note note"><span class="notetitle">Note:</span>
         <p class="p">
         In Impala 1.2.4 and higher, you can specify a table name with <code 
class="ph codeph">INVALIDATE METADATA</code> after
         the table is created in Hive, allowing you to make individual tables 
visible to Impala without doing a full
@@ -216,12 +294,18 @@
         mechanism faster and more responsive, especially during Impala 
startup. See
         <a class="xref" 
href="../shared/../topics/impala_new_features.html#new_features_124">New 
Features in Impala 1.2.4</a> for details.
       </p>
+
       </div>
 
+
       <p class="p">
         <strong class="ph b">Related information:</strong> <a class="xref" 
href="impala_config_options.html#config_options">Modifying Impala Startup 
Options</a>,
         <a class="xref" href="impala_processes.html#processes">Starting 
Impala</a>, <a class="xref" href="impala_ports.html#ports">Ports Used by 
Impala</a>
       </p>
+
     </div>
-  </article>
-</article></main></body></html>
\ No newline at end of file
+
+  </div>
+
+</body>
+</html>
\ No newline at end of file
