HAWQ-1296 - initial draft of hawq getting started guide (closes #98)

Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/be34a833
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/be34a833
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/be34a833

Branch: refs/heads/master
Commit: be34a8339d0bb1b9dea5079af18e5ffe5a65fd46
Parents: 2e6e0f3
Author: Lisa Owen <lo...@pivotal.io>
Authored: Mon Apr 24 14:28:03 2017 -0700
Committer: David Yozie <yo...@apache.org>
Committed: Mon Apr 24 14:28:03 2017 -0700

----------------------------------------------------------------------
 .../source/subnavs/apache-hawq-nav.erb          |  24 ++
 .../gettingstarted/basicdbadmin.html.md.erb     | 233 ++++++++++++++++
 .../gettingstarted/basichawqadmin.html.md.erb   | 225 ++++++++++++++++
 .../gettingstarted/dataandscripts.html.md.erb   | 266 +++++++++++++++++++
 .../tutorial/gettingstarted/imgs/addprop.png    | Bin 0 -> 28885 bytes
 .../gettingstarted/imgs/advhawqsite.png         | Bin 0 -> 81027 bytes
 .../gettingstarted/imgs/ambariconsole.png       | Bin 0 -> 217655 bytes
 .../tutorial/gettingstarted/imgs/ambbgops.png   | Bin 0 -> 122558 bytes
 .../gettingstarted/imgs/hawqcfgsadv.png         | Bin 0 -> 175848 bytes
 .../gettingstarted/imgs/hawqsvcacts.png         | Bin 0 -> 36751 bytes
 .../gettingstarted/imgs/hawqsvccheckout.png     | Bin 0 -> 276107 bytes
 .../gettingstarted/imgs/orangerestart.png       | Bin 0 -> 46796 bytes
 .../gettingstarted/introhawqenv.html.md.erb     | 188 +++++++++++++
 .../gettingstarted/introhawqtbls.html.md.erb    | 222 ++++++++++++++++
 .../gettingstarted/intropxfhdfs.html.md.erb     | 224 ++++++++++++++++
 markdown/tutorial/overview.html.md.erb          |  46 ++++
 16 files changed, 1428 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/book/master_middleman/source/subnavs/apache-hawq-nav.erb
----------------------------------------------------------------------
diff --git a/book/master_middleman/source/subnavs/apache-hawq-nav.erb 
b/book/master_middleman/source/subnavs/apache-hawq-nav.erb
index a32c9ef..a69da5c 100644
--- a/book/master_middleman/source/subnavs/apache-hawq-nav.erb
+++ b/book/master_middleman/source/subnavs/apache-hawq-nav.erb
@@ -39,6 +39,30 @@
         </ul>
       </li>
       <li class="has_submenu">
+        <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/overview.html">Getting 
Started with HAWQ Tutorial</a>
+          <ul>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/introhawqenv.html">Lesson
 1 - Runtime Environment</a>
+            </li>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/basichawqadmin.html">Lesson
 2 - Cluster Administration</a>
+            </li>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/basicdbadmin.html">Lesson
 3 - Database Administration</a>
+            </li>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/dataandscripts.html">Lesson
 4 - Sample Data Set and HAWQ Schemas</a>
+            </li>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/introhawqtbls.html">Lesson
 5 - HAWQ Tables</a>
+            </li>
+            <li>
+              <a 
href="/docs/userguide/2.2.0.0-incubating/tutorial/gettingstarted/intropxfhdfs.html">Lesson
 6 - HAWQ Extension Framework (PXF)</a>
+            </li>
+          </ul>
+        </li>
+
+      <li class="has_submenu">
         <span>
           Running a HAWQ Cluster
         </span>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/basicdbadmin.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/basicdbadmin.html.md.erb 
b/markdown/tutorial/gettingstarted/basicdbadmin.html.md.erb
new file mode 100644
index 0000000..04fdaab
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/basicdbadmin.html.md.erb
@@ -0,0 +1,233 @@
+---
+title: Lesson 3 - Database Administration
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+The HAWQ `gpadmin` user, and other users who are granted the necessary privileges, can execute SQL commands to create HAWQ databases and tables. You can invoke these commands from scripts and programs, or interactively from the `psql` client utility.
+
+This lesson introduces basic HAWQ database administration commands and tasks 
using `psql`. You will create a database and a simple table, and add data to 
and query the table.
+
+## <a id="tut_adminprereq"></a> Prerequisites
+
+Ensure that you have [Set Up your HAWQ Runtime 
Environment](introhawqenv.html#tut_runtime_setup) and that your HAWQ cluster is 
up and running.
+
+
+## <a id="tut_ex_createdb"></a>Exercise: Create the HAWQ Tutorial Database
+
+In this exercise, you use the `psql` command line utility to create a HAWQ 
database.
+
+1. Start the `psql` subsystem:
+
+    ``` shell
+    gpadmin@master$ psql -d postgres
+    ```
+
+    You enter the `psql` interpreter, connecting to the `postgres` database. 
`postgres` is a default template database created during HAWQ installation.
+    
+    ``` sql
+    psql (8.2.15)
+    Type "help" for help.
+
+    postgres=# 
+    ```
+    
+    The `psql` prompt is the database name followed by `=#` or `=>`. `=#` 
identifies the session as that of a database superuser. The default `psql` 
prompt for a non-superuser is `=>`.
+
+2. Create a database named `hawqgsdb`:
+
+    ``` sql
+    postgres=# CREATE DATABASE hawqgsdb;
+    CREATE DATABASE
+    ```
+    
+    The `;` at the end of the `CREATE DATABASE` statement instructs `psql` to execute the command. SQL statements that span multiple lines are not executed until the terminating `;` is entered.
+
+3. Connect to the `hawqgsdb` database you just created:
+
+    ``` sql
+    postgres=# \c hawqgsdb
+    You are now connected to database "hawqgsdb" as user "gpadmin".
+    hawqgsdb=#
+    ```
+
+4. Use the `psql` `\l` meta-command to list all HAWQ databases:
+
+    ``` sql
+    hawqgsdb=# \l
+                         List of databases
+          Name       |  Owner  | Encoding | Access privileges 
+    -----------------+---------+----------+-------------------
+     hawqgsdb        | gpadmin | UTF8     | 
+     postgres        | gpadmin | UTF8     | 
+     template0       | gpadmin | UTF8     | 
+     template1       | gpadmin | UTF8     | 
+    (4 rows)
+    ```
+    
+    HAWQ creates two additional template databases during installation, 
`template0` and `template1`, as you see above. Your HAWQ cluster may list 
additional databases.
+
+5. Exit `psql`:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+
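+As noted in Step 2, `psql` runs nothing until it encounters the terminating `;`. A minimal multi-line sketch (note that `psql` changes the prompt to `-#` on continuation lines):
+
+``` sql
+hawqgsdb=# SELECT 1 +
+hawqgsdb-# 2;
+ ?column? 
+----------
+        3
+(1 row)
+```
+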
+## <a id="tut_ex_usepsql"></a>Exercise: Use psql for Table Operations
+
+You manage and access HAWQ databases and tables via the `psql` utility, an 
interactive front-end to the HAWQ database. In this exercise, you use `psql` to 
create, add data to, and query a simple HAWQ table.
+
+1. Start the `psql` subsystem:
+
+    ``` shell
+    gpadmin@master$ psql -d hawqgsdb
+    ```
+
+    The `-d hawqgsdb` option instructs `psql` to connect directly to the 
`hawqgsdb` database.
+  
+
+2. Create a table named `first_tbl` that has a single integer column named `i`:
+
+    ``` sql
+    hawqgsdb=# CREATE TABLE first_tbl( i int );
+    CREATE TABLE 
+    ```
+
+3. Display descriptive information about table `first_tbl`:
+
+    ``` sql
+    hawqgsdb=# \d first_tbl
+    Append-Only Table "public.first_tbl"
+     Column |  Type   | Modifiers 
+    --------+---------+-----------
+     i      | integer | 
+    Compression Type: None
+    Compression Level: 0
+    Block Size: 32768
+    Checksum: f
+    Distributed randomly
+    ```
+    
+    `first_tbl` is a table in the HAWQ `public` schema. `first_tbl` has a 
single integer column, was created with no compression, and is distributed 
randomly.
+
+4. Add some data to `first_tbl`:
+
+    ``` sql
+    hawqgsdb=# INSERT INTO first_tbl VALUES(1);
+    INSERT 0 1
+    hawqgsdb=# INSERT INTO first_tbl VALUES(2);
+    INSERT 0 1 
+    ```
+    
+    Each `INSERT` command adds a row to `first_tbl`, the first adding a row 
with the value `i=1`, and the second, a row with the value `i=2`. Each `INSERT` 
also displays the number of rows added (1).
+
+5. HAWQ provides several built-in functions for data manipulation. The `generate_series(<start>, <end>)` function generates a series of numbers beginning with `<start>` and finishing at `<end>`. Use the `generate_series()` HAWQ built-in function to add rows for `i=3`, `i=4`, and `i=5` to `first_tbl`:
+
+    ``` sql
+    hawqgsdb=# INSERT INTO first_tbl SELECT generate_series(3, 5);
+    INSERT 0 3
+    ```
+    
+    This `INSERT` command uses the `generate_series()` built-in function to add 3 rows to `first_tbl`, with `i` values of 3, 4, and 5.
+        
+6. Perform a query to return all rows in the `first_tbl` table:
+
+    ``` sql
+    hawqgsdb=# SELECT * FROM first_tbl;
+     i  
+    ----
+      1
+      2
+      3
+      4
+      5
+    (5 rows)
+    ```
+    
+    The `SELECT *` command queries `first_tbl`, returning all columns and all 
rows. `SELECT` also displays the total number of rows returned in the query.
+
+7. Perform a query to return column `i` for all rows in `first_tbl` where `i` is greater than 3:
+
+    ``` sql
+    hawqgsdb=# SELECT i FROM first_tbl WHERE i>3;
+     i  
+    ----
+      4
+      5
+    (2 rows)
+    ```
+    
+    The `SELECT` command returns the 2 rows (`i=4` and `i=5`) in the table 
where `i` is larger than 3 and displays the value of `i`.
+
+8. Exit the `psql` subsystem:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+    
+9. `psql` includes an option, `-c`, to run a single SQL command from the shell command line. Perform the same query you ran in Step 7 using the `-c <sql-command>` option:
+
+    ``` shell
+    gpadmin@master$ psql -d hawqgsdb -c 'SELECT i FROM first_tbl WHERE i>3'
+    ```
+    
+    Notice that you enclose the SQL command in single quotes.
+
+10. Set the HAWQ `PGDATABASE` environment variable to identify `hawqgsdb`:
+
+    ``` shell
+    gpadmin@master$ export PGDATABASE=hawqgsdb
+    ```
+
+    `$PGDATABASE` identifies the default database to which to connect when 
invoking the HAWQ `psql` command.
+
+11. Re-run the query from the command line, this time omitting the `-d` option:
+
+    ``` shell
+    gpadmin@master$ psql -c 'SELECT i FROM first_tbl WHERE i>3'
+    ```
+    
+    When no database is specified on the command line, `psql` attempts to 
connect to the database identified by `$PGDATABASE`.
+
+12. Add the `PGDATABASE` setting to your `.bash_profile`:
+
+    ``` shell
+    export PGDATABASE=hawqgsdb
+    ```  
+
+    
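+As a quick sanity check combining the commands above (a sketch, run against the `hawqgsdb` database), aggregate the five rows you inserted earlier:
+
+``` sql
+hawqgsdb=# SELECT count(*), sum(i) FROM first_tbl;
+ count | sum 
+-------+-----
+     5 |  15
+(1 row)
+```
+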
+## <a id="tut_dbadmin_summary"></a>Summary
+You created the database you will use in later lessons. You also created, inserted data into, and queried a simple HAWQ table using `psql`.
+
+For information on SQL command support in HAWQ, refer to the [SQL 
Command](../../reference/SQLCommandReference.html) reference. 
+
+For detailed information on the `psql` subsystem, refer to the 
[psql](../../reference/cli/client_utilities/psql.html) reference page. 
Commonly-used `psql` meta\-commands are identified in the table below.
+
+| Action | Command |
+|--------|---------|
+| List databases | `\l` |
+| List tables in current database | `\dt` |
+| Describe a specific table | `\d <table-name>` |
+| Execute an SQL script | `\i <script-name>` |
+| Quit/Exit | `\q` |
+
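+For example (a sketch; `my_queries.sql` is a hypothetical file containing SQL statements), the `\i` meta-command runs a script from within a `psql` session:
+
+``` sql
+hawqgsdb=# \i my_queries.sql
+```
+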
+Lesson 4 introduces the Retail demo, a more complicated data set used in 
upcoming lessons. You will download and examine the data set and work files. 
You will also load some of the data set into HDFS.
+ 
+**Lesson 4**: [Sample Data Set and HAWQ Schemas](dataandscripts.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/basichawqadmin.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/basichawqadmin.html.md.erb 
b/markdown/tutorial/gettingstarted/basichawqadmin.html.md.erb
new file mode 100644
index 0000000..84ecb5f
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/basichawqadmin.html.md.erb
@@ -0,0 +1,225 @@
+---
+title: Lesson 2 - Cluster Administration
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+The HAWQ `gpadmin` administrative user has super-user capabilities on all HAWQ databases, as well as access to all HAWQ cluster management commands.
+
+HAWQ configuration parameters affect the behavior of both the HAWQ cluster and individual HAWQ nodes.
+
+This lesson introduces basic HAWQ cluster administration tasks. You will view 
and update HAWQ configuration parameters.
+
+**Note**: Before installing HAWQ, you or your administrator chose to configure and manage the HAWQ cluster either from the command line or with the Ambari UI. This lesson includes exercises for both management modes; although you are introduced to both, command line and Ambari HAWQ cluster management modes should not be mixed.
+
+## <a id="tut_adminprereq"></a> Prerequisites
+
+Ensure that you have [Set Up your HAWQ Runtime 
Environment](introhawqenv.html#tut_runtime_setup) and that your HAWQ cluster is 
up and running.
+
+## <a id="tut_ex_cmdline_cfg"></a>Exercise: View and Update HAWQ Configuration 
from the Command Line
+
+If you choose to manage your HAWQ cluster from the command line, you will 
perform many administrative functions using the `hawq` utility. The `hawq` 
command line utility provides subcommands including `start`, `stop`, `config`, 
and `state`.
+
+In this exercise, you will use the command line to view and set HAWQ server 
configuration parameters. 
+
+Perform the following steps to view the HAWQ HDFS filespace URL and set the 
`pljava_classpath` server configuration parameter:
+
+1. The `hawq_dfs_url` configuration parameter identifies the HDFS NameNode (or 
HDFS NameService if HDFS High Availability is enabled) host, port, and the HAWQ 
filespace location within HDFS. Display the value of this parameter:
+
+    ``` shell
+    gpadmin@master$ hawq config -s hawq_dfs_url
+    GUC           : hawq_dfs_url
+    Value  : <hdfs-namenode>:8020/hawq_data
+    ```
+    
+    Make note of the `<hdfs-namenode>` hostname or IP address returned; you will need it in *Lesson 6: HAWQ Extension Framework (PXF)*.
+
+2. The HAWQ PL/Java `pljava_classpath` server configuration parameter 
identifies the classpath used by the HAWQ PL/Java extension. View the current 
`pljava_classpath` configuration parameter setting:
+
+    ``` shell
+    gpadmin@master$ hawq config -s pljava_classpath
+    GUC                : pljava_classpath
+    Value   :
+    ```
+    
+    The value is currently not set, as indicated by the empty `Value`.
+
+3. Your HAWQ installation includes an example PL/Java JAR file. Set 
`pljava_classpath` to include the `examples.jar` file installed with HAWQ:
+
+    ``` shell
+    gpadmin@master$ hawq config -c pljava_classpath -v 'examples.jar'
+    GUC pljava_classpath does not exist in hawq-site.xml
+    Try to add it with value: examples.jar
+    GUC            : pljava_classpath
+    Value   : examples.jar
+    ```
+
+    The message 'GUC pljava\_classpath does not exist in hawq-site.xml; Try to add it with value: examples.jar' indicates that HAWQ found no previous setting for `pljava_classpath` and added the configuration parameter with the value `examples.jar` that you provided with the `-v` option.
+
+4. You must reload the HAWQ configuration after changing a configuration parameter: 
+
+    ``` shell
+    gpadmin@master$ hawq stop cluster --reload
+    20170411:19:58:17:428600 hawq_stop:master:gpadmin-[INFO]:-Prepare to do 
'hawq stop'
+    20170411:19:58:17:428600 hawq_stop:master:gpadmin-[INFO]:-You can find log 
in:
+    20170411:19:58:17:428600 
hawq_stop:master:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20170411.log
+    20170411:19:58:17:428600 hawq_stop:master:gpadmin-[INFO]:-GPHOME is set to:
+    20170411:19:58:17:428600 hawq_stop:master:gpadmin-[INFO]:-/usr/local/hawq/.
+    20170411:19:58:17:428600 hawq_stop:master:gpadmin-[INFO]:-Reloading 
configuration without restarting hawq cluster
+
+    Continue with HAWQ service stop Yy|Nn (default=N):
+    > 
+    ```
+    
+    Reloading configuration does not actually stop the cluster, as noted in 
the INFO messages above.
+    
+    HAWQ prompts you to confirm the operation. Enter `y` to confirm:
+    
+    ``` shell
+    > y
+    20170411:19:58:22:428600 hawq_stop:master:gpadmin-[INFO]:-No standby host 
configured
+    20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Reload hawq 
cluster
+    20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Reload hawq 
master
+    20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Master reloaded 
successfully
+    20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Reload hawq 
segment
+   20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Reload segments 
in list: ['segment']
+   20170411:19:58:23:428600 hawq_stop:master:gpadmin-[INFO]:-Total segment 
number is: 1
+..
+    20170411:19:58:25:428600 hawq_stop:master:gpadmin-[INFO]:-1 of 1 segments 
reload successfully
+    20170411:19:58:25:428600 hawq_stop:master:gpadmin-[INFO]:-Segments 
reloaded successfully
+    20170411:19:58:25:428600 hawq_stop:master:gpadmin-[INFO]:-Cluster reloaded 
successfully
+    ```
+
+    Configuration parameter value changes made by `hawq config` are 
system-wide; they are propagated to all segments across the cluster.
+
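+After the reload, you can also confirm that the new setting is active from any `psql` session. A sketch; `SHOW` reports the current value of a server configuration parameter:
+
+``` sql
+hawqgsdb=# SHOW pljava_classpath;
+ pljava_classpath 
+------------------
+ examples.jar
+(1 row)
+```
+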
+
+## <a id="tut_ex_hawqstatecmdline"></a>Exercise: View the State of Your HAWQ 
Cluster via Ambari
+
+You may choose to use Ambari to manage the HAWQ deployment. The Ambari Web UI 
provides a graphical front-end to HAWQ cluster management activities.
+
+Perform the following steps to view the state of your HAWQ cluster via the 
Ambari web console:
+
+1. Open the Ambari web UI in a browser: 
+
+    ``` shell
+    http://<ambari-server-node>:8080
+    ```
+    
+    The Ambari server listens on port 8080.
+
+2. Log in to the Ambari UI using the Ambari user credentials.
+
+    The Ambari UI dashboard window displays.
+
+3. Select the **HAWQ** service from the service list in the left pane.
+
+    The HAWQ service page **Summary** tab is displayed.  This page includes a 
**Summary** pane identifying the HAWQ master and all HAWQ segment nodes in your 
cluster. The **Metrics** pane includes a set of HAWQ-specific metrics tiles.
+
+4. Perform a HAWQ service check operation by selecting the **Run Service 
Check** item from the **Service Actions** button drop-down menu and 
**Confirm**ing the operation.
+
+    ![HAWQ Service Actions](imgs/hawqsvcacts.png)
+
+    The **Background Operations Running** dialog displays. This dialog 
identifies all service-related operations performed on your HAWQ cluster.
+    
+    ![Ambari Background Operations](imgs/ambbgops.png)
+    
+5. Select the most recent **HAWQ Service Check** operation from the top of the 
**Operations** column. Select the HAWQ master host name from the **HAWQ Service 
Check** dialog, and then select the **Check HAWQ** task.
+
+    ![HAWQ Service Check Output](imgs/hawqsvccheckout.png)
+
+    The **Check HAWQ** task dialog displays the output of the service check 
operation. This operation returns the state of your HAWQ cluster, as well as 
the results of HAWQ database operation tests performed by Ambari.
+
+
+## <a id="tut_ex_ambari_cfg"></a>Exercise: View and Update HAWQ Configuration 
via Ambari
+
+Perform the following steps to view the HDFS NameNode and set the HAWQ PL/Java `pljava_classpath` configuration parameter and value via Ambari:
+
+1. Navigate to the **HAWQ** service page.
+    
+2. Select the **Configs** tab to view the current HAWQ-specific configuration 
settings.
+
+    HAWQ general settings displayed include master and segment data and temp 
directory locations, as well as specific resource management parameters.
+    
+3. Select the **Advanced** tab to view additional HAWQ parameter settings.
+
+    ![HAWQ Advanced Configs](imgs/hawqcfgsadv.png)
+
+    The **General** drop down pane opens. This pane displays information including the HAWQ master hostname and master and segment port numbers.
+    
+4. Locate the **HAWQ DFS URL** configuration parameter setting in the 
**General** pane. This value should match that returned by `hawq config -s 
hawq_dfs_url` in the previous exercise. Make note of the HDFS NameNode hostname 
or IP address if you have not done so previously.
+
+    **Note**: The **HDFS** service, **Configs > Advanced Configs** tab also 
identifies the HDFS NameNode hostname.
+    
+5. **Advanced \<config\>** and **Custom \<config\>** drop down panes provide access to advanced configuration settings for HAWQ and other cluster components. Select the **Advanced hawq-site** drop down.
+
+    ![Advanced hawq-site](imgs/advhawqsite.png)
+
+    Specific HAWQ configuration parameters and values are displayed in the 
pane. Hover the mouse cursor over the value field to display a tooltip 
description of a specific configuration parameter.
+
+6. Select the **Custom hawq-site** drop down.
+
+    Currently configured custom parameters and values are displayed in the 
pane.  If no configuration parameters are set, the pane will be empty.
+
+7. Select **Add Property ...**.
+
+    The **Add Property** dialog displays. This dialog includes **Type**, 
**Key**, and **Value** entry fields.
+
+8. Select the single property add mode (single label icon) in the **Add Property** dialog and fill in the fields:
+
+    **Key**: pljava_classpath  
+    **Value**: examples.jar
+    
+    ![Add Property](imgs/addprop.png)
+    
+9. **Add** the custom property, then **Save** the updated configuration, optionally providing a **Note** in the **Save Configuration** dialog.
+    
+    ![Restart Button](imgs/orangerestart.png)
+    
+    Notice the now orange-colored **Restart** button in the right corner of 
the window. You must restart the HAWQ service after adding or updating 
configuration parameter values through Ambari.
+
+10. Select the orange **Restart** button to **Restart All Affected** HAWQ nodes.
+
+    You can monitor the restart operation from the **Background Operations 
Running** dialog.
+
+11. When the restart operation completes, log out of the Ambari console by clicking the **admin** button and selecting the **Sign out** drop down menu item.
+
+## <a id="tut_hawqadmin_summary"></a>Summary
+
+In this lesson, you viewed the state of the HAWQ cluster and learned how to 
change cluster configuration parameters. 
+
+For additional information on HAWQ server configuration parameters, see 
[Server Configuration Parameter Reference](../../reference/HAWQSiteConfig.html).
+
+The following table identifies HAWQ management commands used in the tutorial 
exercises. For detailed information on specific HAWQ management commands, refer 
to the [HAWQ Management Tools 
Reference](../../reference/cli/management_tools.html).
+
+<a id="topic_table_clustmgmtcmd"></a>
+
+| Action | Command |
+|--------|---------|
+| Get HAWQ cluster status | `$ hawq state` |
+| Start/stop/restart HAWQ \<object\> (cluster, master, segment, standby, allsegments) | `$ hawq start <object>` <p> `$ hawq stop <object>` <p> `$ hawq restart <object>` |
+| List all HAWQ configuration parameters and their current settings | `$ hawq config -l` |
+| Display the current setting of a specific HAWQ configuration parameter | `$ hawq config -s <param-name>` |
+| Add/change the value of a HAWQ configuration parameter (command-line managed HAWQ clusters only) | `$ hawq config -c <param-name> -v <value>` |
+| Reload HAWQ configuration | `$ hawq stop cluster --reload` |
+
+
+Lesson 3 introduces basic HAWQ database administration activities and commands.
+
+**Lesson 3**: [Database Administration](basicdbadmin.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/dataandscripts.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/dataandscripts.html.md.erb 
b/markdown/tutorial/gettingstarted/dataandscripts.html.md.erb
new file mode 100644
index 0000000..d50162a
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/dataandscripts.html.md.erb
@@ -0,0 +1,266 @@
+---
+title: Lesson 4 - Sample Data Set and HAWQ Schemas
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+The sample Retail demo data set used in the tutorial exercises models an 
online retail store operation. The store carries different categories of 
products. Customers order the products. The company delivers the products to 
the customers.
+
+This and later exercises operate on this example data set. The data set is 
provided in a set of `gzip`'d `.tsv` (tab-separated values) text files. The 
exercises also reference scripts and other supporting files that operate on the 
data set.
+
+In this section, you are introduced to the Retail demo data schema. You will 
download and examine the data set and work files. You will also load some of 
the data into HDFS.
+
+## <a id="tut_dataset_prereq"></a>Prerequisites
+
+Ensure that you have [Created the HAWQ Tutorial 
Database](basicdbadmin.html#tut_ex_createdb) and that your HAWQ cluster is up 
and running.
+
+
+## <a id="tut_exdownloadfilessteps"></a>Exercise: Download the Retail Demo 
Data and Script Files
+
+Perform the following steps to download the sample data set and scripts:
+
+1. Open a terminal window and log in to the HAWQ master node as the `gpadmin` 
user:
+
+    ``` shell
+    $ ssh gpadmin@<master>
+    ```
+
+2. Create a working directory for the data files and scripts:
+
+    ``` shell
+    gpadmin@master$ mkdir /tmp/hawq_getstart
+    gpadmin@master$ cd /tmp/hawq_getstart
+    ```
+    
+    You may choose a different base work directory. If you do, ensure that all 
path components up to and including the `hawq_getstart` directory have read and 
execute permissions for all.
+
+3. Download the tutorial work and data files from GitHub, checking out the appropriate tag/branch:
+
+    ``` shell
+    gpadmin@master$ git clone 
https://github.com/pivotalsoftware/hawq-samples.git
+    Cloning into 'hawq-samples'...
+    remote: Counting objects: 42, done.
+    remote: Total 42 (delta 0), reused 0 (delta 0), pack-reused 42
+    Unpacking objects: 100% (42/42), done.
+    Checking out files: 100% (18/18), done.
+    gpadmin@master$ cd hawq-samples
+    gpadmin@master$ git checkout hawq2x_tutorial
+    ```
+
+4. Save the path to the work files base directory:
+
+    ``` shell
+    gpadmin@master$ export HAWQGSBASE=/tmp/hawq_getstart/hawq-samples
+    ```
+    
+    (If you chose a different base work directory, modify the command as 
appropriate.) 
+    
+5. Add the `$HAWQGSBASE` environment variable setting to your `.bash_profile`.
+
+6. Examine the tutorial files. Exercises in this guide reference data files and SQL and shell scripts residing in the `hawq-samples` repository. Specifically:
+  
+    | Directory | Content |
+    |-----------|---------|
+    | datasets/retail/ | Retail demo data set data files (`.tsv.gz` format) |
+    | tutorials/getstart/ | *Getting Started with HAWQ* guide work files |
+    | tutorials/getstart/hawq/ | SQL and shell scripts used by the HAWQ tables exercises |
+    | tutorials/getstart/pxf/ | SQL and shell scripts used by the PXF exercises |
+    <p>
+
+    (`hawq-samples` repository directories not mentioned in the table above 
are not used by the *Getting Started with HAWQ* exercises.)
+
+
+## <a id="tut_dsschema_ex"></a>Exercise: Create the Retail Demo HAWQ Schema
+
+A HAWQ schema is a namespace within a database. It contains named objects such as tables, data types, functions, and operators. You access these objects by qualifying their names with the schema name, using the form `<schema-name>.<object-name>`.
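+
+For example, a table named `sales` in a schema named `finance` (hypothetical names, used here only to illustrate the syntax) would be referenced as:
+
+``` sql
+-- 'finance' and 'sales' are illustrative names; they are not part of the tutorial data set
+SELECT * FROM finance.sales;
+```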
+
+Perform the following steps to create the Retail demo data schema:
+
+1. Start the `psql` subsystem:
+
+    ``` shell
+    gpadmin@master$ psql
+    hawqgsdb=#
+    ```
+    
+    You are connected to the `hawqgsdb` database.
+
+2. List the HAWQ schemas:
+
+    ``` sql
+    hawqgsdb=# \dn
+           List of schemas
+            Name        |  Owner  
+    --------------------+---------
+     hawq_toolkit       | gpadmin
+     information_schema | gpadmin
+     pg_aoseg           | gpadmin
+     pg_bitmapindex     | gpadmin
+     pg_catalog         | gpadmin
+     pg_toast           | gpadmin
+     public             | gpadmin
+    (7 rows)
+    ```
+    
+    Every database includes a schema named `public`. Database objects you 
create without specifying a schema are created in the default schema. The 
default HAWQ schema is the `public` schema, unless you explicitly set it to 
another schema. (More about this later.)
+
+3. Display the tables in the `public` schema:
+
+    ``` sql
+    hawqgsdb=# \dt public.*
+               List of relations
+     Schema |    Name   | Type  |  Owner  |   Storage   
+    --------+-----------+-------+---------+-------------
+     public | first_tbl | table | gpadmin | append only
+    (1 row)
+    ```
+    
+    In Lesson 3, you created the `first_tbl` table in the `public` schema.
+
+4. Create a schema named `retail_demo` to represent the Retail demo namespace:
+
+    ``` sql
+    hawqgsdb=# CREATE SCHEMA retail_demo;
+    CREATE SCHEMA
+    ```
+
+5. The `search_path` server configuration parameter identifies the order in which HAWQ searches schemas when resolving unqualified object names. Set the schema search path so that the new `retail_demo` schema is first:
+
+    ``` sql
+    hawqgsdb=# SET search_path TO retail_demo, public;
+    SET
+    ```
+    
+    `retail_demo`, the first schema in your `search_path`, becomes your 
default schema.
+    
+    **Note**: Setting `search_path` in this manner sets the parameter only for 
the current `psql` session. You must re-set `search_path` in subsequent `psql` 
sessions.
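+
+    If you prefer not to re-set `search_path` in every session, you can persist the setting at the role level. (This is a sketch only; it assumes your HAWQ build supports the PostgreSQL-style `ALTER ROLE ... SET` syntax.)
+
+    ``` sql
+    -- persist the search path for the gpadmin role across future sessions
+    hawqgsdb=# ALTER ROLE gpadmin SET search_path TO retail_demo, public;
+    ```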
+
+6. Create another table named `first_tbl`:
+
+    ``` sql
+    hawqgsdb=# CREATE TABLE first_tbl( i int );
+    CREATE TABLE
+    hawqgsdb=# INSERT INTO first_tbl SELECT generate_series(100,103);
+    INSERT 0 4
+    hawqgsdb=# SELECT * FROM first_tbl;
+      i  
+    -----
+     100
+     101
+     102
+     103
+    (4 rows)
+    ```
+    
+    HAWQ creates this table named `first_tbl` in your default schema because no schema was explicitly identified for the table. Your default schema is `retail_demo` due to your current `search_path` schema ordering.
+
+7. Verify that this `first_tbl` was created in the `retail_demo` schema by displaying the tables in this schema:
+
+    ``` sql
+    hawqgsdb=# \dt retail_demo.*
+                         List of relations
+       Schema    |         Name         | Type  |  Owner  |   Storage   
+    -------------+----------------------+-------+---------+-------------
+     retail_demo | first_tbl            | table | gpadmin | append only
+    (1 row)
+    ```
+
+8. Query the `first_tbl` table that you created in Lesson 3:
+
+    ``` sql
+    hawqgsdb=# SELECT * from public.first_tbl;
+      i 
+    ---
+     1
+     2
+     3
+     4
+     5
+    (5 rows)
+    ```
+
+    You must prepend the table name with `public.` to explicitly identify the 
`first_tbl` table in which you are interested. 
+
+9. Exit `psql`:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+
+## <a id="tut_loadhdfs_ex"></a>Exercise: Load the Dimension Data to HDFS
+
+The Retail demo data set includes the entities described in the table below. A 
fact table consists of business facts. Orders and order line items are fact 
tables. Dimension tables provide descriptive information for the measurements 
in a fact table. The other entities are represented in dimension tables. 
+
+|   Entity   | Description  |
+|---------------------|----------------------------|
+| customers\_dim  |  Customer data: first/last name, id, gender  |
+| customer\_addresses\_dim  |  Address and phone number of each customer |
+| email\_addresses\_dim  |  Customer e-mail addresses |
+| categories\_dim  |  Product category name, id |
+| products\_dim  |  Product details including name, id, category, and price |
+| date\_dim  |  Date information including year, quarter, month, week, day of 
week |
+| payment\_methods  |  Payment method code, id |
+| orders  |  Details of an order such as the id, payment method, billing 
address, day/time, and other fields. Each order is associated with a specific 
customer. |
+| order\_lineitems  |  Details of an order line item such as the id, item id, 
category, store, shipping address, and other fields. Each line item references 
a specific product from a specific order from a specific customer. |
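+
+In a star schema such as this one, queries typically join a fact table to a dimension table on a shared key. The pattern, sketched against the entities above (the corresponding tables are created in later lessons; the exact column names are illustrative), looks like:
+
+``` sql
+-- join the orders fact data to the customer dimension on the shared customer key
+SELECT o.order_id, c.customer_id
+  FROM orders o JOIN customers_dim c ON o.customer_id = c.customer_id;
+```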
+
+Perform the following steps to load the Retail demo dimension data into HDFS 
for later consumption:
+
+1. Navigate to the PXF script directory:
+
+    ``` shell
+    gpadmin@master$ cd $HAWQGSBASE/tutorials/getstart/pxf
+    ```
+
+2. Using the provided script, load the sample data files representing 
dimension data into an HDFS directory named `/retail_demo`. The script removes 
any existing `/retail_demo` directory and contents before loading the data: 
+
+    ``` shell
+    gpadmin@master$ ./load_data_to_HDFS.sh
+    running: sudo -u hdfs hdfs dfs -rm -r -f -skipTrash /retail_demo
+    sudo -u hdfs hdfs dfs -mkdir /retail_demo/categories_dim
+    sudo -u hdfs hdfs dfs -put 
/tmp/hawq_getstart/hawq-samples/datasets/retail/categories_dim.tsv.gz 
/retail_demo/categories_dim/
+    sudo -u hdfs hdfs dfs -mkdir /retail_demo/customer_addresses_dim
+    sudo -u hdfs hdfs dfs -put 
/tmp/hawq_getstart/hawq-samples/datasets/retail/customer_addresses_dim.tsv.gz 
/retail_demo/customer_addresses_dim/
+    ...
+    ```
+
+    `load_data_to_HDFS.sh` loads the dimension data `.tsv.gz` files directly into HDFS. Each file is loaded to its respective `/retail_demo/<basename>/<basename>.tsv.gz` file path.
+
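+    The per-file work the script performs is roughly equivalent to the following loop (a sketch only; the actual script may differ in file selection and error handling):
+
+    ``` shell
+    # sketch of the per-file load performed by load_data_to_HDFS.sh
+    for f in $HAWQGSBASE/datasets/retail/*_dim.tsv.gz; do
+        base=$(basename $f .tsv.gz)
+        sudo -u hdfs hdfs dfs -mkdir -p /retail_demo/$base
+        sudo -u hdfs hdfs dfs -put $f /retail_demo/$base/
+    done
+    ```
+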
+3. View the contents of the HDFS `/retail_demo` directory hierarchy:
+
+    ``` shell
+    gpadmin@master$ sudo -u hdfs hdfs dfs -ls /retail_demo/*
+    -rw-r--r--   3 hdfs hdfs        590 2017-04-10 19:59 
/retail_demo/categories_dim/categories_dim.tsv.gz
+    Found 1 items
+    -rw-r--r--   3 hdfs hdfs   53995977 2017-04-10 19:59 
/retail_demo/customer_addresses_dim/customer_addresses_dim.tsv.gz
+    Found 1 items
+    -rw-r--r--   3 hdfs hdfs    4646775 2017-04-10 19:59 
/retail_demo/customers_dim/customers_dim.tsv.gz
+    Found 1 items
+    ...
+    ```
+    
+    Because the retail demo data exists only as `.tsv.gz` files in HDFS, you cannot immediately query the data using HAWQ. In Lesson 6, you create HAWQ external tables that reference these data files, after which you can query them via PXF.
+
+## <a id="tut_dataset_summary"></a>Summary
+
+In this lesson, you downloaded the tutorial data set and work files, created 
the `retail_demo` HAWQ schema, and loaded the Retail demo dimension data into 
HDFS. 
+
+In Lessons 5 and 6, you will create and query HAWQ internal and external 
tables in the `retail_demo` schema.
+
+**Lesson 5**: [HAWQ Tables](introhawqtbls.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/addprop.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/addprop.png 
b/markdown/tutorial/gettingstarted/imgs/addprop.png
new file mode 100644
index 0000000..930bc92
Binary files /dev/null and b/markdown/tutorial/gettingstarted/imgs/addprop.png 
differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/advhawqsite.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/advhawqsite.png 
b/markdown/tutorial/gettingstarted/imgs/advhawqsite.png
new file mode 100644
index 0000000..4d4afa0
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/advhawqsite.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/ambariconsole.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/ambariconsole.png 
b/markdown/tutorial/gettingstarted/imgs/ambariconsole.png
new file mode 100644
index 0000000..45a5202
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/ambariconsole.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/ambbgops.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/ambbgops.png 
b/markdown/tutorial/gettingstarted/imgs/ambbgops.png
new file mode 100644
index 0000000..9882371
Binary files /dev/null and b/markdown/tutorial/gettingstarted/imgs/ambbgops.png 
differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/hawqcfgsadv.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/hawqcfgsadv.png 
b/markdown/tutorial/gettingstarted/imgs/hawqcfgsadv.png
new file mode 100644
index 0000000..5bccd19
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/hawqcfgsadv.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/hawqsvcacts.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/hawqsvcacts.png 
b/markdown/tutorial/gettingstarted/imgs/hawqsvcacts.png
new file mode 100644
index 0000000..775220c
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/hawqsvcacts.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/hawqsvccheckout.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/hawqsvccheckout.png 
b/markdown/tutorial/gettingstarted/imgs/hawqsvccheckout.png
new file mode 100644
index 0000000..70d91b1
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/hawqsvccheckout.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/imgs/orangerestart.png
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/imgs/orangerestart.png 
b/markdown/tutorial/gettingstarted/imgs/orangerestart.png
new file mode 100644
index 0000000..94f7836
Binary files /dev/null and 
b/markdown/tutorial/gettingstarted/imgs/orangerestart.png differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/introhawqenv.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/introhawqenv.html.md.erb 
b/markdown/tutorial/gettingstarted/introhawqenv.html.md.erb
new file mode 100644
index 0000000..1749d2c
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/introhawqenv.html.md.erb
@@ -0,0 +1,188 @@
+---
+title: Lesson 1 - Runtime Environment
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This section introduces you to the HAWQ runtime environment. You will examine 
your HAWQ installation, set up your HAWQ environment, and execute HAWQ 
management commands. If installed in your environment, you will also explore 
the Ambari management console.
+
+## <a id="tut_runtime_usercred"></a>Prerequisites
+
+- Install a HAWQ commercial product distribution, a HAWQ sandbox virtual machine or Docker environment, or build and install HAWQ from source. Ensure that your HAWQ installation is configured appropriately.
+
+- Make note of the HAWQ master node hostname or IP address.
+
+- The HAWQ administrative user is named `gpadmin`. This is the user account 
from which you will administer your HAWQ cluster. To perform the exercises in 
this tutorial, you must:
+
+    - Obtain the `gpadmin` user credentials.
+
+    - Ensure that your HAWQ runtime environment is configured such that the HAWQ admin user `gpadmin` can run commands as the Hadoop system accounts (`hdfs`, `hadoop`) via `sudo` without providing a password.
+
+    - Obtain the Ambari UI user name and password (required only if Ambari is installed in your HAWQ deployment). The default Ambari user name and password are both `admin`.
+
+## <a id="tut_runtime_setup"></a> Exercise: Set Up your HAWQ Runtime 
Environment
+
+HAWQ installs a script that you can use to set up your HAWQ cluster 
environment. The `greenplum_path.sh` script, located in your HAWQ root install 
directory, sets `$PATH` and other environment variables to find HAWQ files.  
Most importantly, `greenplum_path.sh` sets the `$GPHOME` environment variable 
to point to the root directory of the HAWQ installation.  If you installed HAWQ 
from a product distribution or are running a HAWQ sandbox environment, the HAWQ 
root is typically `/usr/local/hawq`. If you built HAWQ from source or 
downloaded the tarball, your `$GPHOME` may differ.
+
+Perform the following steps to set up your HAWQ runtime environment:
+
+1. Log in to the HAWQ master node using the `gpadmin` user credentials; you may not need to provide a password:
+    ``` shell
+    $ ssh gpadmin@<master>
+    Password:
+    gpadmin@master$ 
+    ```
+
+2. Set up your HAWQ operating environment by sourcing the `greenplum_path.sh` 
file. If you built HAWQ from source or downloaded the tarball, substitute the 
path to the installed or extracted `greenplum_path.sh` file \(for example 
`/opt/hawq-2.1.0.0/greenplum_path.sh`\):
+
+    ``` shell
+    gpadmin@master$ source /usr/local/hawq/greenplum_path.sh
+    ```
+    
+    Sourcing `greenplum_path.sh` sets:
+    - `$GPHOME`
+    - `$PATH` to include the HAWQ `$GPHOME/bin/` directory 
+    - `$LD_LIBRARY_PATH` to include the HAWQ libraries in `$GPHOME/lib/`
+    
+    
+    ``` shell
+    gpadmin@master$ echo $GPHOME
+    /usr/local/hawq/.
+    gpadmin@master$ echo $PATH
+    
/usr/local/hawq/./bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/gpadmin/bin
+    gpadmin@master$ echo $LD_LIBRARY_PATH
+    /usr/local/hawq/./lib
+    ```
+    
+    **Note**: You must source `greenplum_path.sh` before invoking any HAWQ 
commands. 
+
+3. Edit your (`gpadmin`) `.bash_profile` or other shell initialization file to 
source `greenplum_path.sh` on login.  For example, add:
+
+    ``` shell
+    source /usr/local/hawq/greenplum_path.sh
+    ```
+    
+4. Set the HAWQ-specific environment variables relevant to your deployment in your shell initialization file. These include `PGDATABASE`, `PGHOST`, `PGOPTIONS`, `PGPORT`, and `PGUSER`. You may not need to set any of these environment variables. For example, if you use a custom HAWQ master port number, make this port number the default by setting the `PGPORT` environment variable in your shell initialization file; add:
+
+    ``` shell
+    export PGPORT=5432
+    ```
+    
+    Setting `PGPORT` simplifies `psql` invocation by providing a default for 
the port option value.
+    
+    Similarly, setting `PGDATABASE` simplifies `psql` invocation by providing 
a default for the database option value.
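+
+    For example, with both defaults set, the two invocations below are equivalent (this sketch assumes the tutorial database `hawqgsdb`, which you create in a later lesson, and the default master port `5432`):
+
+    ``` shell
+    gpadmin@master$ psql -d hawqgsdb -p 5432
+    gpadmin@master$ psql    # PGDATABASE and PGPORT supply the defaults
+    ```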
+
+
+5. Examine your HAWQ installation:
+
+    ``` shell
+    gpadmin@master$ ls $GPHOME
+    bin  docs  etc  greenplum_path.sh  include  lib  sbin  share
+    ```
+    
+    The HAWQ command line utilities are located in `$GPHOME/bin`. 
`$GPHOME/lib` includes HAWQ and PostgreSQL libraries.
+  
+6. View the current state of your HAWQ cluster, and if it is not already 
running, start the cluster. In practice, you will perform different procedures 
depending upon whether you manage your cluster from the command line or use 
Ambari. While you are introduced to both in this tutorial, lessons will focus 
on command line instructions, as not every HAWQ deployment will utilize 
Ambari.<p>
+
+    *Command Line*:
+
+    ``` shell
+    gpadmin@master$ hawq state
+    Failed to connect to database, this script can only be run when the 
database is up.
+    ```
+    
+    If your cluster is not running, start it:
+    
+    ``` shell
+    gpadmin@master$ hawq start cluster
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Prepare to do 
'hawq start'
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-You can find 
log in:
+    20170411:15:54:47:357122 
hawq_start:master:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_start_20170411.log
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-GPHOME is set 
to:
+    20170411:15:54:47:357122 
hawq_start:master:gpadmin-[INFO]:-/usr/local/hawq/.
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start hawq with 
args: ['start', 'cluster']
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Gathering 
information and validating the environment...
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-No standby host 
configured
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start all the 
nodes in hawq cluster
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Starting master 
node 'master'
+    20170411:15:54:47:357122 hawq_start:master:gpadmin-[INFO]:-Start master 
service
+    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Master started 
successfully
+    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start all the 
segments in hawq cluster
+    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start segments 
in list: ['segment']
+    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Start segment 
service
+    20170411:15:54:48:357122 hawq_start:master:gpadmin-[INFO]:-Total segment 
number is: 1
+    .....
+    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-1 of 1 segments 
start successfully
+    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-Segments 
started successfully
+    20170411:15:54:53:357122 hawq_start:master:gpadmin-[INFO]:-HAWQ cluster 
started successfully
+    ```
+    
+    Get the status of your cluster:
+    
+    ``` shell
+    gpadmin@master$ hawq state
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:-- HAWQ instance 
status summary
+    20170411:16:39:18:370305 
hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Master 
instance                                = Active
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   No Standby 
master defined                           
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
segment instance count from config file  = 1
+    20170411:16:39:18:370305 
hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
 
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Segment 
Status                                    
+    20170411:16:39:18:370305 
hawq_state:master:gpadmin-[INFO]:------------------------------------------------------
 
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
segments count from catalog      = 1
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
segment valid (at master)        = 1
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
segment failures (at master)     = 0
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
number of postmaster.pid files missing   = 0
+    20170411:16:39:18:370305 hawq_state:master:gpadmin-[INFO]:--   Total 
number of postmaster.pid files found     = 1
+    ```
+    
+    State information returned includes the status of the master node, the standby master, the number of segment instances, and counts of valid and failed segments.<p>
+
+    *Ambari*:
+    
+    If your deployment includes an Ambari server, perform the following steps 
to start and view the current state of your HAWQ cluster. 
+    
+    1. Start the Ambari management console by entering the following URL in 
your favorite (supported) browser window:
+
+        ``` shell
+        <ambari-server-node>:8080
+        ```
+
+    2. Log in with the Ambari credentials (default `admin`:`admin`) and view 
the Ambari dashboard:
+
+        ![Ambari Dashboard](imgs/ambariconsole.png)
+ 
+        The Ambari dashboard provides an at-a-glance status of the health of 
your HAWQ cluster. A list of each running service and its status is provided in 
the left panel. The main display area includes a set of configurable tiles 
providing specific information about your cluster, including HAWQ segment 
status, HDFS disk usage, and resource manager metrics. 
+        
+    3. Navigate to the **HAWQ** service listed in the left pane. If the 
service is not running (i.e. no green checkmark to the left of the service 
name), start your HAWQ cluster by clicking the **HAWQ** service name, and then 
selecting the **Start** operation from the **Service Actions** menu button.
+
+    4. Log out of the Ambari console by clicking the **admin** button and selecting the **Sign out** drop-down menu item.
+
+## <a id="tut_runtime_sumary"></a>Summary
+Your HAWQ cluster is now running. For additional information:
+
+- [HAWQ Files and 
Directories](../../admin/setuphawqopenv.html#hawq_env_files_and_dirs) 
identifies HAWQ files and directories and their install locations.
+- [Environment 
Variables](../../reference/HAWQEnvironmentVariables.html#optionalenvironmentvariables)
 includes a complete list of HAWQ deployment-specific environment variables.
+- [Running a HAWQ Cluster](../../admin/RunningHAWQ.html) provides an overview 
of the components comprising a HAWQ cluster, including the users 
(administrative and operating), deployment systems (HAWQ master, standby, and 
segments), databases, and data sources.
+
+Lesson 2 introduces basic HAWQ cluster administration activities and commands.
+ 
+**Lesson 2**: [Cluster Administration](basichawqadmin.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/introhawqtbls.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/introhawqtbls.html.md.erb 
b/markdown/tutorial/gettingstarted/introhawqtbls.html.md.erb
new file mode 100644
index 0000000..c2a72dc
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/introhawqtbls.html.md.erb
@@ -0,0 +1,222 @@
+---
+title: Lesson 5 - HAWQ Tables
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+HAWQ writes data to, and reads data from, HDFS natively. HAWQ tables are 
similar to tables in any relational database, except that table rows (data) are 
distributed across the different segments in the cluster.
+
+In this exercise, you will run scripts that use the SQL `CREATE TABLE` command 
to create HAWQ tables. You will load the Retail demo fact data into the HAWQ 
tables using the SQL `COPY` command. You will then perform simple and complex 
queries on the data.
+
+
+## <a id="tut_introhawqtblprereq"></a>Prerequisites
+
+Ensure that you have:
+
+- [Set Up your HAWQ Runtime Environment](introhawqenv.html#tut_runtime_setup)
+- [Created the HAWQ Tutorial Database](basicdbadmin.html#tut_ex_createdb)
+- [Downloaded the Retail Data and Script 
Files](dataandscripts.html#tut_exdownloadfilessteps)
+- [Created the Retail Demo HAWQ Schema](dataandscripts.html#tut_dsschema_ex)
+- Started your HAWQ cluster.
+
+## <a id="tut_excreatehawqtblsteps"></a>Exercise: Create, Add Data to, and 
Query HAWQ Retail Demo Tables
+
+Perform the following steps to create and load HAWQ tables from the sample 
Retail demo data set. 
+
+1. Navigate to the HAWQ script directory:
+
+    ``` shell
+    gpadmin@master$ cd $HAWQGSBASE/tutorials/getstart/hawq
+    ```
+
+2. Create tables for the Retail demo fact data using the script provided:
+    
+    ``` shell
+    gpadmin@master$ psql -f ./create_hawq_tables.sql 
+    psql:./create_hawq_tables.sql:2: NOTICE:  table "order_lineitems_hawq" 
does not exist, skipping
+    DROP TABLE
+    CREATE TABLE
+    psql:./create_hawq_tables.sql:41: NOTICE:  table "orders_hawq" does not 
exist, skipping
+    DROP TABLE
+    CREATE TABLE
+    ```
+       
+    **Note**: The `create_hawq_tables.sql` script deletes each table before attempting to create it. If this is your first time performing this exercise, you can safely ignore the `psql` "table does not exist, skipping" messages.
+    
+3. Take a look at the `create_hawq_tables.sql` script; for example:
+
+    ``` shell
+    gpadmin@master$ vi create_hawq_tables.sql
+    ```
+
+    Notice the use of the `retail_demo.` schema name prefix to the 
`order_lineitems_hawq` table name:
+    
+    ``` sql
+    DROP TABLE IF EXISTS retail_demo.order_lineitems_hawq;
+    CREATE  TABLE retail_demo.order_lineitems_hawq
+    (
+        order_id TEXT,
+        order_item_id TEXT,
+        product_id TEXT,
+        product_name TEXT,
+        customer_id TEXT,
+        store_id TEXT,
+        item_shipment_status_code TEXT,
+        order_datetime TEXT,
+        ship_datetime TEXT,
+        item_return_datetime TEXT,
+        item_refund_datetime TEXT,
+        product_category_id TEXT,
+        product_category_name TEXT,
+        payment_method_code TEXT,
+        tax_amount TEXT,
+        item_quantity TEXT,
+        item_price TEXT,
+        discount_amount TEXT,
+        coupon_code TEXT,
+        coupon_amount TEXT,
+        ship_address_line1 TEXT,
+        ship_address_line2 TEXT,
+        ship_address_line3 TEXT,
+        ship_address_city TEXT,
+        ship_address_state TEXT,
+        ship_address_postal_code TEXT,
+        ship_address_country TEXT,
+        ship_phone_number TEXT,
+        ship_customer_name TEXT,
+        ship_customer_email_address TEXT,
+        ordering_session_id TEXT,
+        website_url TEXT
+    )
+    WITH (appendonly=true, compresstype=zlib) DISTRIBUTED RANDOMLY;
+    ```
+    
+    The `CREATE TABLE` statement above creates a table named 
`order_lineitems_hawq` in the `retail_demo` schema. `order_lineitems_hawq` has 
several columns. `order_id` and `customer_id` provide keys into the orders fact 
and customers dimension tables. The data in `order_lineitems_hawq` is 
distributed randomly and is compressed using the `zlib` compression algorithm.
+    
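+    `DISTRIBUTED RANDOMLY` spreads rows evenly across the HAWQ segments. HAWQ also supports hash distribution on one or more columns, which co-locates rows that share a key. A hypothetical alternative declaration for this table (not used by the tutorial scripts) could end with:
+
+    ``` sql
+    -- hypothetical alternative: hash-distribute on the order key so that
+    -- line items for the same order land on the same segment
+    WITH (appendonly=true, compresstype=zlib) DISTRIBUTED BY (order_id);
+    ```
+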
+    The `create_hawq_tables.sql` script also creates the `orders_hawq` fact 
table.
+
+4. Take a look at the `load_hawq_tables.sh` script:
+
+    ``` shell
+    gpadmin@master$ vi load_hawq_tables.sh
+    ```
+
+    Again, notice the use of the `retail_demo.` schema name prefix to the 
table names. 
+    
+    Examine the `psql -c` `COPY` commands:
+    
+    ``` shell
+    zcat $DATADIR/order_lineitems.tsv.gz | psql -d hawqgsdb -c "COPY 
retail_demo.order_lineitems_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
+    zcat $DATADIR/orders.tsv.gz | psql -d hawqgsdb -c "COPY 
retail_demo.orders_hawq FROM STDIN DELIMITER E'\t' NULL E'';"
+    ```
+    The `load_hawq_tables.sh` shell script uses the `zcat` command to uncompress the `.tsv.gz` data files. The SQL `COPY` command copies its standard input (that is, the output of the `zcat` command) to the HAWQ table. The `COPY` command also identifies the `DELIMITER` used in the file (tab) and the `NULL` string (the empty string).
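+    
+    You can try the same uncompress-and-stream pattern without HAWQ; this hypothetical example builds a tiny `.tsv.gz` file and streams it to a stand-in consumer (`wc -l` takes the place of `psql -c "COPY ..."`):
+
+    ``` shell
+    # hypothetical demo: zcat streams the uncompressed rows; no temporary file is needed
+    printf '1\tfirst\n2\tsecond\n' | gzip > /tmp/demo.tsv.gz
+    zcat /tmp/demo.tsv.gz | wc -l
+    ```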
+    
+5. Use the `load_hawq_tables.sh` script to load the Retail demo fact data into the newly-created tables. This process may take some time to complete.
+
+    ``` shell
+    gpadmin@master$ ./load_hawq_tables.sh
+    ```
+
+6. Use the provided script to verify that the Retail demo fact tables were 
loaded successfully:
+
+    ``` shell
+    gpadmin@master$ ./verify_load_hawq_tables.sh
+    ```
+
+    The output of the `verify_load_hawq_tables.sh` script should match the 
following:
+
+    ``` shell                                              
+        Table Name                |    Count 
+    ------------------------------+------------------------
+     order_lineitems_hawq         |   744196
+     orders_hawq                  |   512071
+    ------------------------------+------------------------
+    ```
+    
+7. Run a query on the `order_lineitems_hawq` table that returns the 
`product_id`, `item_quantity`, `item_price`, and `coupon_amount` for all order 
line items associated with order id `8467975147`:
+
+    ``` shell
+    gpadmin@master$ psql
+    hawqgsdb=# SELECT product_id, item_quantity, item_price, coupon_amount 
+                 FROM retail_demo.order_lineitems_hawq 
+                 WHERE order_id='8467975147' ORDER BY item_price;
+     product_id | item_quantity | item_price | coupon_amount 
+    ------------+---------------+------------+---------------
+     1611429    | 1             | 11.38      | 0.00000
+     1035114    | 1             | 12.95      | 0.15000
+     1382850    | 1             | 17.56      | 0.50000
+     1562908    | 1             | 18.50      | 0.00000
+     1248913    | 1             | 34.99      | 0.50000
+     741706     | 1             | 45.99      | 0.00000
+    (6 rows)
+    ```
+    
+    The `ORDER BY` clause identifies the sort column, `item_price`. If you do not specify an `ORDER BY` column, the rows are returned in an unspecified order.
+
+9. Determine the top three postal codes by order revenue by running the 
following query on the `orders_hawq` table:
+
+    ``` sql
+    hawqgsdb=# SELECT billing_address_postal_code,
+                 sum(total_paid_amount::float8) AS total,
+                 sum(total_tax_amount::float8) AS tax
+               FROM retail_demo.orders_hawq
+                 GROUP BY billing_address_postal_code
+                 ORDER BY total DESC LIMIT 3;
+    ```
+    
+    Notice the use of the `sum()` aggregate function to add the order totals 
(`total_paid_amount`) and tax totals (`total_tax_amount`) for all orders. These 
totals are summed separately for each `billing_address_postal_code` group.
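The `GROUP BY`-with-`sum()` logic can be sketched as follows; this Python illustration uses invented order rows, not the actual Retail data:

```python
from collections import defaultdict

# Illustration only: GROUP BY postal code, summing two columns per group.
orders = [
    ("48001", 100.00, 6.00),
    ("15329", 50.00, 3.00),
    ("48001", 25.00, 1.50),
]

totals = defaultdict(lambda: [0.0, 0.0])
for postal_code, paid, tax in orders:
    totals[postal_code][0] += paid   # sum(total_paid_amount)
    totals[postal_code][1] += tax    # sum(total_tax_amount)

# ORDER BY total DESC
for postal_code, (total, tax) in sorted(totals.items(), key=lambda kv: -kv[1][0]):
    print(postal_code, total, tax)
```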
+    
+    Compare your output to the following:
+ 
+    ``` pre
+     billing_address_postal_code |   total   |    tax    
+    -----------------------------+-----------+-----------
+     48001                       | 111868.32 | 6712.0992
+     15329                       | 107958.24 | 6477.4944
+     42714                       | 103244.58 | 6194.6748
+    (3 rows)
+    ```
+
+10. Run the following query on the `orders_hawq` and `order_lineitems_hawq` 
tables to display the `order_id`, `product_id`, `item_quantity`, and 
`item_price` for all line items identifying a `product_id` of `1869831`:
+
+    ``` sql
+    hawqgsdb=# SELECT retail_demo.order_lineitems_hawq.order_id, product_id, item_quantity, item_price
+                 FROM retail_demo.order_lineitems_hawq, retail_demo.orders_hawq
+               WHERE retail_demo.order_lineitems_hawq.order_id=retail_demo.orders_hawq.order_id AND retail_demo.order_lineitems_hawq.product_id=1869831
+                 ORDER BY retail_demo.order_lineitems_hawq.order_id, product_id;
+      order_id  | product_id | item_quantity | item_price 
+    ------------+------------+---------------+------------
+     4831097728 | 1869831    | 1             | 11.87
+     6734073469 | 1869831    | 1             | 11.87
+    (2 rows)
+    ```
+   
+11. Exit the `psql` subsystem:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+
+## <a id="tut_introhawqtbl_summary"></a>Summary
+In this lesson, you created and loaded Retail order and order line item data 
into HAWQ fact tables. You also queried these tables, learning how to filter 
the data to your needs. 
+
+In Lesson 6, you use PXF external tables to similarly access dimension data 
stored in HDFS.
+ 
+**Lesson 6**: [HAWQ Extension Framework (PXF)](intropxfhdfs.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/gettingstarted/intropxfhdfs.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/gettingstarted/intropxfhdfs.html.md.erb 
b/markdown/tutorial/gettingstarted/intropxfhdfs.html.md.erb
new file mode 100644
index 0000000..029ff2b
--- /dev/null
+++ b/markdown/tutorial/gettingstarted/intropxfhdfs.html.md.erb
@@ -0,0 +1,224 @@
+---
+title: Lesson 6 - HAWQ Extension Framework (PXF)
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Data in many HAWQ deployments may already reside in external sources. The HAWQ 
Extension Framework (PXF) provides access to this external data via built-in 
connectors called plug-ins. PXF plug-ins facilitate mapping a data source to a 
HAWQ external table definition. PXF is installed with HDFS, Hive, HBase, and 
JSON plug-ins.
+
+In this exercise, you use the PXF HDFS plug-in to: 
+
+- Create PXF external table definitions
+- Perform queries on the data you loaded into HDFS
+- Run more complex queries on HAWQ and PXF tables
+
+## <a id="tut_intropxfprereq"></a>Prerequisites
+
+Ensure that you have:
+
+- [Set Up your HAWQ Runtime Environment](introhawqenv.html#tut_runtime_setup)
+- [Created the HAWQ Tutorial Database](basicdbadmin.html#tut_ex_createdb)
+- [Downloaded the Retail Data and Script 
Files](dataandscripts.html#tut_exdownloadfilessteps)
+- [Created the Retail Demo HAWQ Schema](dataandscripts.html#tut_dsschema_ex)
+- [Loaded the Dimension Data to HDFS](dataandscripts.html#tut_loadhdfs_ex)
+- [Created the HAWQ Retail Demo Fact 
Tables](introhawqtbls.html#tut_excreatehawqtblsteps)
+- Started your HAWQ cluster. 
+
+You should also retrieve the hostname or IP address of the HDFS NameNode that 
you noted in [View and Update HAWQ 
Configuration](basichawqadmin.html#tut_ex_cmdline_cfg).
+
+## <a id="tut_excreatepxftblsteps"></a>Exercise: Create and Query PXF External 
Tables
+
+Perform the following steps to create HAWQ external table definitions to read 
the dimension data you previously loaded into HDFS.
+
+1. Log in to the HAWQ master node as the `gpadmin` user:
+
+    ``` shell
+    $ ssh gpadmin@<master>
+    ```
+
+2. Navigate to the PXF script directory:
+
+    ``` shell
+    gpadmin@master$ cd $HAWQGSBASE/tutorials/getstart/pxf
+    ```
+
+3. Start the `psql` subsystem:
+
+    ``` shell
+    gpadmin@master$ psql
+    hawqgsdb=#
+    ```
+
+4. Create a HAWQ external table definition to represent the Retail demo 
`customers_dim` dimension data you loaded into HDFS in Lesson 4; substitute 
your NameNode hostname or IP address in the \<namenode\> field of the 
`LOCATION` clause:
+
+    ``` sql
+    hawqgsdb=# CREATE EXTERNAL TABLE retail_demo.customers_dim_pxf
+                (customer_id TEXT, first_name TEXT,
+                 last_name TEXT, gender TEXT)
+               LOCATION ('pxf://<namenode>:51200/retail_demo/customers_dim/customers_dim.tsv.gz?profile=HdfsTextSimple')
+               FORMAT 'TEXT' (DELIMITER = E'\t');
+    CREATE EXTERNAL TABLE
+    ```
+
+    The `LOCATION` clause of a `CREATE EXTERNAL TABLE` statement specifying 
the `pxf` protocol must include:
+    - The hostname or IP address of your HAWQ cluster's HDFS \<namenode\>.
+    - The location and/or name of the external data source. You specified the 
HDFS file path to the `customers_dim` data file above.
+    - The PXF `profile` to use to access the external data. The PXF HDFS 
plug-in supports the `HdfsTextSimple` profile to access delimited text format 
data.
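The pieces of the `LOCATION` URI fit together as sketched below; the host, port, path, and profile values here are placeholders matching the example above, not values read from your cluster:

```python
# Illustration only: assembling a pxf:// LOCATION URI from its parts.
namenode = "namenode.example.com"  # placeholder for your HDFS NameNode host
path = "retail_demo/customers_dim/customers_dim.tsv.gz"
profile = "HdfsTextSimple"

location = f"pxf://{namenode}:51200/{path}?profile={profile}"
print(location)
```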
+
+    The `FORMAT` clause of a `CREATE EXTERNAL TABLE` statement specifying the 
`pxf` protocol and `HdfsTextSimple` profile must identify `TEXT` format and 
include the `DELIMITER` character used to access the external data source. You 
identified a tab delimiter character above.
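Parsing tab-delimited text in the way the `TEXT` format with `DELIMITER = E'\t'` implies can be sketched with Python's `csv` module; the sample lines below are invented:

```python
import csv
import io

# Illustration only: splitting tab-delimited rows into fields, as the
# TEXT format with a tab DELIMITER does for the external data.
sample = "1\tJane\tDoe\tF\n2\tJohn\tRoe\tM\n"
reader = csv.reader(io.StringIO(sample), delimiter="\t")
rows = list(reader)
print(rows[0])
```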
+
+5. The `create_pxf_tables.sql` SQL script creates HAWQ external table 
definitions for the remainder of the Retail dimension data. In another terminal 
window, edit `create_pxf_tables.sql`, replacing each occurrence of NAMENODE 
with the hostname or IP address you specified in the previous step. For example:
+
+    ``` shell
+    gpadmin@master$ cd $HAWQGSBASE/tutorials/getstart/pxf
+    gpadmin@master$ vi create_pxf_tables.sql
+    ```
+
+6. Run the `create_pxf_tables.sql` SQL script to create the remainder of the 
HAWQ external table definitions, then exit the `psql` subsystem:
+
+    ``` sql
+    hawqgsdb=# \i create_pxf_tables.sql
+    hawqgsdb=# \q
+    ```
+       
+    **Note**: The `create_pxf_tables.sql` script deletes each external table 
before attempting to create it. If this is your first time performing this 
exercise, you can safely ignore the `psql` "table does not exist, skipping" 
messages.
+    
+7. Run the following script to verify that you successfully created the 
external table definitions:
+
+    ``` shell
+    gpadmin@master$ ./verify_create_pxf_tables.sh 
+    ```
+        
+    The output of the script should match the following:
+
+    ``` pre
+        Table Name                 |    Count 
+    -------------------------------+------------------------
+     customers_dim_pxf             |   401430  
+     categories_dim_pxf            |   56 
+     customer_addresses_dim_pxf    |   1130639
+     email_addresses_dim_pxf       |   401430
+     payment_methods_pxf           |   5
+     products_dim_pxf              |   698911
+    -------------------------------+------------------------
+    ```
+
+8. Display the allowed payment methods by running the following query on the 
`payment_methods_pxf` table:
+
+    ``` shell
+    gpadmin@master$ psql
+    hawqgsdb=# SELECT * FROM retail_demo.payment_methods_pxf;
+     payment_method_id | payment_method_code 
+    -------------------+---------------------
+                     4 | GiftCertificate
+                     3 | CreditCard
+                     5 | FreeReplacement
+                     2 | Credit
+                     1 | COD
+    (5 rows)
+    ```
+
+9. Run the following query on the `customers_dim_pxf` and 
`customer_addresses_dim_pxf` tables to display the names of all male customers 
in the 06119 zip code:
+
+    ``` sql
+    hawqgsdb=# SELECT last_name, first_name
+                 FROM retail_demo.customers_dim_pxf, retail_demo.customer_addresses_dim_pxf
+               WHERE retail_demo.customers_dim_pxf.customer_id=retail_demo.customer_addresses_dim_pxf.customer_id AND
+                 retail_demo.customer_addresses_dim_pxf.zip_code='06119' AND 
+                 retail_demo.customers_dim_pxf.gender='M';
+    ```
+
+    Compare your output to the following:
+ 
+    ``` pre
+     last_name | first_name 
+    -----------+------------
+     Gigliotti | Maurice
+     Detweiler | Rashaad
+     Nusbaum   | Morton
+     Mann      | Damian
+     ...
+    ```
+
+10. Exit the `psql` subsystem:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+
+
+## <a id="tut_exhawqpxfquerysteps"></a>Exercise: Query HAWQ and PXF Tables
+
+Often, data will reside in both HAWQ tables and external data sources. In 
these instances, you can use both HAWQ internal and PXF external tables to 
relate and query the data.
+
+Perform the following steps to identify the names and email addresses of all 
customers who made gift certificate purchases, providing an overall order total 
for such purchases. The orders fact data resides in a HAWQ-managed table and 
the customers data resides in HDFS.
+
+1. Start the `psql` subsystem:
+
+    ``` shell
+    gpadmin@master$ psql
+    hawqgsdb=#
+    ```
+
+2. The orders fact data is accessible via the `orders_hawq` table created in 
the previous lesson. The customers data is accessible via the 
`customers_dim_pxf` table created in the previous exercise. Using these 
internal and external HAWQ tables, construct a query to identify the names and 
email addresses of all customers who made gift certificate purchases; also 
include an overall order total for such purchases:
+
+    ``` sql
+    hawqgsdb=# SELECT substring(retail_demo.orders_hawq.customer_email_address for 37) AS email_address, last_name,
+                 sum(retail_demo.orders_hawq.total_paid_amount::float8) AS gift_cert_total
+               FROM retail_demo.customers_dim_pxf, retail_demo.orders_hawq
+               WHERE retail_demo.orders_hawq.payment_method_code='GiftCertificate' AND
+                     retail_demo.orders_hawq.customer_id=retail_demo.customers_dim_pxf.customer_id
+               GROUP BY retail_demo.orders_hawq.customer_email_address, last_name ORDER BY last_name;
+    ```
+    
+    The `SELECT` statement above combines columns from the internal HAWQ 
`orders_hawq` table and the PXF external `customers_dim_pxf` table. The query 
joins the two tables on `customer_id` to associate each order with a customer, 
restricting the results to orders whose `payment_method_code` is 
`GiftCertificate`.
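The join-filter-aggregate logic of this query can be sketched in plain Python; the customer and order rows below are invented for the illustration:

```python
from collections import defaultdict

# Illustration only: join orders to customers on customer_id, keep
# gift-certificate orders, and total the paid amounts per customer.
customers = {101: "Aaron", 102: "Abad"}          # customer_id -> last_name
orders = [
    (101, "GiftCertificate", 17.16),
    (101, "CreditCard", 40.00),                  # filtered out by WHERE
    (102, "GiftCertificate", 14.97),
]

gift_totals = defaultdict(float)
for customer_id, method, paid in orders:
    if method == "GiftCertificate" and customer_id in customers:
        gift_totals[customers[customer_id]] += paid   # sum(...) GROUP BY

print(dict(gift_totals))
```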
+    
+    Query output:
+    
+    ``` pre
+                 email_address             |   last_name    |   gift_cert_total
+    ---------------------------------------+----------------+-------------------
+     christopher.aa...@phpmydirectory.com  | Aaron          |             17.16
+     libbie.aa...@qatarw.com               | Aaron          |            102.33
+     jay.aa...@aljsad.net                  | Aaron          |             72.36
+     marybelle.a...@idividi.com.mk         | Abad           |             14.97
+     suellen.a...@anatranny.com            | Abad           |            125.93
+     luvenia.a...@mediabiz.de              | Abad           |            107.99
+     ...
+    ```
+    
+    Enter `q` at any time to exit the query results.
+
+3. Exit the `psql` subsystem:
+
+    ``` sql
+    hawqgsdb=# \q
+    ```
+
+## <a id="tut_intropxf_summary"></a>Summary    
+In this lesson, you created PXF external tables to access HDFS data and 
queried these tables. You also performed a query using this external data and 
the HAWQ internal fact tables created previously, executing business logic on 
both your managed and unmanaged data.
+
+For additional information about PXF, refer to [Using PXF with Unmanaged 
Data](../../pxf/HawqExtensionFrameworkPXF.html).
+
+Refer to [Accessing HDFS File Data](../../pxf/HDFSFileDataPXF.html) for 
detailed information about the PXF HDFS Plug-in.
+
+This lesson wraps up the *Getting Started with HAWQ* tutorial. Now that you 
are familiar with basic environment set-up, cluster, database, and data 
management activities, you should feel more confident interacting with your 
HAWQ cluster.
+ 
+**Next Steps**: View HAWQ documentation related to [Running a HAWQ 
Cluster](../../admin/RunningHAWQ.html).

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/be34a833/markdown/tutorial/overview.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/tutorial/overview.html.md.erb 
b/markdown/tutorial/overview.html.md.erb
new file mode 100644
index 0000000..7216b62
--- /dev/null
+++ b/markdown/tutorial/overview.html.md.erb
@@ -0,0 +1,46 @@
+---
+title: Getting Started with HAWQ
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## <a id="tut_getstartov"></a>Overview
+
+This tutorial provides a quick introduction to get you up and running with 
your HAWQ installation.  You will be introduced to basic HAWQ functionality, 
including cluster management, database creation, and simple querying. You will 
also become acquainted with using the HAWQ Extension Framework (PXF) to access 
and query external HDFS data sources.
+
+
+## <a id="tut_getstartov_prereq"></a>Prerequisites
+
+Ensure that you have a running HAWQ 2.x single or multi-node cluster. You may 
choose to use a:
+
+- HAWQ commercial product distribution, such as [Pivotal 
HDB](https://pivotal.io/pivotal-hdb).
+- [HAWQ sandbox virtual 
machine](https://network.pivotal.io/products/pivotal-hdb) or [HAWQ docker 
environment](https://github.com/apache/incubator-hawq/tree/master/contrib/hawq-docker).
+- HAWQ installation you built from 
[source](https://cwiki.apache.org/confluence/display/HAWQ/Build+and+Install).
+
+## <a id="tut_hawqexlist"></a>Lessons 
+
+This guide includes the following content and exercises:
+
+[Lesson 1: Runtime Environment](gettingstarted/introhawqenv.html) - Examine 
and set up the HAWQ runtime environment.  
+[Lesson 2: Cluster Administration](gettingstarted/basichawqadmin.html) - 
Perform common HAWQ cluster management activities.  
+[Lesson 3: Database Administration](gettingstarted/basicdbadmin.html) - 
Perform common HAWQ database management activities.  
+[Lesson 4: Sample Data Set and HAWQ 
Schemas](gettingstarted/dataandscripts.html) - Download tutorial data and work 
files, create the Retail demo schema, load data to HDFS.  
+[Lesson 5: HAWQ Tables](gettingstarted/introhawqtbls.html) - Create and query 
HAWQ-managed tables.  
+[Lesson 6: HAWQ Extension Framework (PXF)](gettingstarted/intropxfhdfs.html) - 
Use PXF to access external HDFS data.  
