[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612454#comment-15612454
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85376355
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612400#comment-15612400
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user lisakowen commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85374612
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612406#comment-15612406
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85375160
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612373#comment-15612373
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/41

HAWQ-1107 - incorporate kavinder's comments

incorporated kavinder's comments on HDFS plug in doc restructure.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/pxfhdfs-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/41.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #41


commit e16a4a46b6ab2a180e99f5fc793bbabb4f4cbfec
Author: Lisa Owen 
Date:   2016-10-27T16:10:29Z

incorporate kavinder's comments




> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612361#comment-15612361
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85372086
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612370#comment-15612370
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85372290
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612358#comment-15612358
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85370681
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -151,184 +477,120 @@ To enable HCatalog query integration in HAWQ, 
perform the following steps:
 postgres=# GRANT ALL ON PROTOCOL pxf TO "role";
 ``` 
 
-3.  To query a Hive table with HCatalog integration, simply query HCatalog 
directly from HAWQ. The query syntax is:
 
-``` sql
-postgres=# SELECT * FROM hcatalog.hive-db-name.hive-table-name;
-```
+To query a Hive table with HCatalog integration, query HCatalog directly 
from HAWQ. The query syntax is:
--- End diff --

It's a bit awkward to drop out of the procedure and into free-form 
discussion of the various operations.  I think it might be better to put the 
previous 3-step procedure into a new subsection like "Enabling HCatalog 
Integration" and then putting the remaining non-procedural content into "Usage" 
?


> add PXF HiveText and HiveRC profile examples to the documentation
> -
>
> Key: HAWQ-1071
> URL: https://issues.apache.org/jira/browse/HAWQ-1071
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF Hive documentation includes an example for only the Hive 
> profile.  add examples for HiveText and HiveRC profiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612362#comment-15612362
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85368842
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612357#comment-15612357
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85367789
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612366#comment-15612366
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85369947
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612365#comment-15612365
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85365540
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
--- End diff --

Just a suggestion, but I think this would read better as a 2-column 
term/definition table.  You could even make it a 3-column table to describe 
which PXF plug-ins are used with each format.


> add PXF HiveText and HiveRC profile examples to the documentation
> -
>
> Key: HAWQ-1071
> URL: https://issues.apache.org/jira/browse/HAWQ-1071
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF Hive documentation includes an example for only the Hive 
> profile.  add examples for HiveText and HiveRC profiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612363#comment-15612363
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85367290
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612367#comment-15612367
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85367943
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612359#comment-15612359
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85365959
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
--- End diff --

Also consider term/definition table here.


> add PXF HiveText and HiveRC profile examples to the documentation
> -
>
> Key: HAWQ-1071
> URL: https://issues.apache.org/jira/browse/HAWQ-1071
> Project: Apache HAWQ
>   

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612356#comment-15612356
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85368752
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612364#comment-15612364
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85366470
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -2,121 +2,450 @@
 title: Accessing Hive Data
 ---
 
-This topic describes how to access Hive data using PXF. You have several 
options for querying data stored in Hive. You can create external tables in PXF 
and then query those tables, or you can easily query Hive tables by using HAWQ 
and PXF's integration with HCatalog. HAWQ accesses Hive table metadata stored 
in HCatalog.
+Apache Hive is a distributed data warehousing infrastructure.  Hive 
facilitates managing large data sets supporting multiple data formats, 
including comma-separated value (.csv), RC, ORC, and parquet. The PXF Hive 
plug-in reads data stored in Hive, as well as HDFS or HBase.
+
+This section describes how to use PXF to access Hive data. Options for 
querying data stored in Hive include:
+
+-  Creating an external table in PXF and querying that table
+-  Querying Hive tables via PXF's integration with HCatalog
 
 ## Prerequisites
 
-Check the following before using PXF to access Hive:
+Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See 
[Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation 
information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster 
nodes.
--   Test PXF on HDFS before connecting to Hive or HBase.
+-   You have tested PXF on HDFS.
 -   You are running the Hive Metastore service on a machine in your 
cluster. 
 -   You have set the `hive.metastore.uris` property in the `hive-site.xml` 
on the NameNode.
 
+## Hive File Formats
+
+Hive supports several file formats:
+
+-   TextFile - flat file with data in comma-, tab-, or space-separated 
value format or JSON notation
+-   SequenceFile - flat file consisting of binary key/value pairs
+-   RCFile - record columnar data consisting of binary key/value pairs; 
high row compression rate
+-   ORCFile - optimized row columnar data with stripe, footer, and 
postscript sections; reduces data size
+-   Parquet - compressed columnar data representation
+-   Avro - JSON-defined, schema-based data serialization format
+
+Refer to [File 
Formats](https://cwiki.apache.org/confluence/display/Hive/FileFormats) for 
detailed information about the file formats supported by Hive.
+
+The PXF Hive plug-in supports the following profiles for accessing the 
Hive file formats listed above. These include:
+
+- `Hive`
+- `HiveText`
+- `HiveRC`
+
+## Data Type Mapping
+
+### Primitive Data Types
+
+To represent Hive data in HAWQ, map data values that use a primitive data 
type to HAWQ columns of the same type.
+
+The following table summarizes external mapping rules for Hive primitive 
types.
+
+| Hive Data Type  | Hawq Data Type |
+|---|---|
+| boolean| bool |
+| int   | int4 |
+| smallint   | int2 |
+| tinyint   | int2 |
+| bigint   | int8 |
+| decimal  |  numeric  |
+| float   | float4 |
+| double   | float8 |
+| string   | text |
+| binary   | bytea |
+| char   | bpchar |
+| varchar   | varchar |
+| timestamp   | timestamp |
+| date   | date |
+
+
+### Complex Data Types
+
+Hive supports complex data types including array, struct, map, and union. 
PXF maps each of these complex types to `text`.  While HAWQ does not natively 
support these types, you can create HAWQ functions or application code to 
extract subcomponents of these complex data types.
+
+An example using complex data types is provided later in this topic.
+
+
+## Sample Data Set
+
+Examples used in this topic will operate on a common data set. This simple 
data set models a retail sales operation and includes fields with the following 
names and data types:
+
+- location - text
+- month - text
+- number\_of\_orders - integer
+- total\_sales - double
+
+Prepare the sample data set for use:
+
+1. First, create a text file:
+
+```
+$ vi /tmp/pxf_hive_datafile.txt
+```
+
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of 
the comma `,` to separate the four field values:
+
+```
+Prague,Jan,101,4875.33
+

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612360#comment-15612360
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/39#discussion_r85371576
  
--- Diff: pxf/HivePXF.html.md.erb ---
@@ -151,184 +477,120 @@ To enable HCatalog query integration in HAWQ, 
perform the following steps:
 postgres=# GRANT ALL ON PROTOCOL pxf TO "role";
 ``` 
 
-3.  To query a Hive table with HCatalog integration, simply query HCatalog 
directly from HAWQ. The query syntax is:
 
-``` sql
-postgres=# SELECT * FROM hcatalog.hive-db-name.hive-table-name;
-```
+To query a Hive table with HCatalog integration, query HCatalog directly 
from HAWQ. The query syntax is:
+
+``` sql
+postgres=# SELECT * FROM hcatalog.hive-db-name.hive-table-name;
+```
 
-For example:
+For example:
 
-``` sql
-postgres=# SELECT * FROM hcatalog.default.sales;
-```
-
-4.  To obtain a description of a Hive table with HCatalog integration, you 
can use the `psql` client interface.
--   Within HAWQ, use either the `\d
 hcatalog.hive-db-name.hive-table-name` or `\d+ 
hcatalog.hive-db-name.hive-table-name` commands to describe a 
single table. For example, from the `psql` client interface:
-
-``` shell
-$ psql -d postgres
-postgres=# \d hcatalog.default.test
-
-PXF Hive Table "default.test"
-Column|  Type  
---+
- name | text
- type | text
- supplier_key | int4
- full_price   | float8 
-```
--   Use `\d hcatalog.hive-db-name.*` to describe the whole database 
schema. For example:
-
-``` shell
-postgres=# \d hcatalog.default.*
-
-PXF Hive Table "default.test"
-Column|  Type  
---+
- type | text
- name | text
- supplier_key | int4
- full_price   | float8
-
-PXF Hive Table "default.testabc"
- Column | Type 
-+--
- type   | text
- name   | text
-```
--   Use `\d hcatalog.*.*` to describe the whole schema:
-
-``` shell
-postgres=# \d hcatalog.*.*
-
-PXF Hive Table "default.test"
-Column|  Type  
---+
- type | text
- name | text
- supplier_key | int4
- full_price   | float8
-
-PXF Hive Table "default.testabc"
- Column | Type 
-+--
- type   | text
- name   | text
-
-PXF Hive Table "userdb.test"
-  Column  | Type 
---+--
- address  | text
- username | text
- 
-```
-
-**Note:** When using `\d` or `\d+` commands in the `psql` HAWQ client, 
`hcatalog` will not be listed as a database. If you use other `psql` compatible 
clients, `hcatalog` will be listed as a database with a size value of `-1` 
since `hcatalog` is not a real database in HAWQ.
-
-5.  Alternatively, you can use the **pxf\_get\_item\_fields** user-defined 
function (UDF) to obtain Hive table descriptions from other client interfaces 
or third-party applications. The UDF takes a PXF profile and a table pattern 
string as its input parameters.
-
-**Note:** Currently the only supported input profile is `'Hive'`.
-
-For example, the following statement returns a description of a 
specific table. The description includes path, itemname (table), fieldname, and 
fieldtype.
+``` sql
+postgres=# SELECT * FROM hcatalog.default.sales_info;
+```
+
+To obtain a description of a Hive table with HCatalog integration, you can 
use the `psql` client interface.
+
+-   Within HAWQ, use either the `\d
 hcatalog.hive-db-name.hive-table-name` or `\d+ 
hcatalog.hive-db-name.hive-table-name` commands to describe a single 
table. For example, from the `psql` client interface:
+
+``` shell
+$ psql -d postgres
+```
 
 ``` sql
-postgres=# select * from pxf_get_item_fields('Hive','default.test');
+postgres=# \d hcatalog.default.sales_info_rcfile;
 ```
-
-``` pre
-  path   | itemname |  

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612342#comment-15612342
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user lisakowen commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85371514
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612340#comment-15612340
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user lisakowen commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85371358
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612281#comment-15612281
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85362384
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612279#comment-15612279
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85358483
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612280#comment-15612280
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85361807
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
--- End diff --

You don't necessarily have to run hdfs commands as `sudo -u hdfs` if the 
current user has the hdfs client and permissions.


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612282#comment-15612282
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85362806
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,506 +2,449 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes. See [Installing 
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
+If you find that the pre-defined PXF HDFS profiles do not meet your needs, 
you may choose to create a custom HDFS profile from the existing HDFS 
serialization and deserialization classes. Refer to [Adding and Updating 
Profiles](ReadWritePXF.html#addingandupdatingprofiles) for information on 
creating a custom profile.
+
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, and so forth. 
+
+The HDFS file system command syntax is `hdfs dfs  []`. 
Invoked with no options, `hdfs dfs` lists the file system options supported by 
the tool.
+
+`hdfs dfs` options used in this topic are:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+Examples:
+
+Create a directory in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -mkdir -p /data/exampledir
 ```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+Copy a text file to HDFS:
 
-``` sql
-INSERT INTO table_name ...;
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/example.txt /data/exampledir/
 ```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This topic describes the following:
-
--   FORMAT clause
--   Profile
--   Accessor
--   Resolver
--   Avro
-
-**Note:** For more details about the API and classes, see [PXF External 
Tables and 
API](PXFExternalTableandAPIReference.html#pxfexternaltableandapireference).
-
-### FORMAT clause
-
-Use one of the following formats to read data with any PXF connector:
-
--   `FORMAT 'TEXT'`: Use with plain delimited text files on HDFS.
--   `FORMAT 'CSV'`: Use with comma-separated value files on HDFS.

[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609898#comment-15609898
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/40

HAWQ-1071 - subnav changes for pxf enhancement work

removed all submenus from the subnav

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/subnav-pxfhive-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit c3f381265b2c48b89f98863888ccd00b2926880c
Author: Lisa Owen 
Date:   2016-10-24T19:43:04Z

subnav chgs for hive plugin content restructure

commit 54445c6815a166e4e275455ea64221322087
Author: Lisa Owen 
Date:   2016-10-26T16:36:31Z

remove submenu from pxf hive plugin subnav




> add PXF HiveText and HiveRC profile examples to the documentation
> -
>
> Key: HAWQ-1071
> URL: https://issues.apache.org/jira/browse/HAWQ-1071
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF Hive documentation includes an example for only the Hive 
> profile.  add examples for HiveText and HiveRC profiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1071) add PXF HiveText and HiveRC profile examples to the documentation

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609892#comment-15609892
 ] 

ASF GitHub Bot commented on HAWQ-1071:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/39

HAWQ-1071 - add examples for HiveText and HiveRC plugins

added examples, restructured content, added hive command line section.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/pxfhive-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/39.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #39


commit 0398a62fefd3627273927f938b4d082a25bf3003
Author: Lisa Owen 
Date:   2016-09-26T21:37:04Z

restructure PXF Hive pulug-in page; add more relevant examples

commit 457d703a3f5c057e241acf985fbc35da34f6a075
Author: Lisa Owen 
Date:   2016-09-26T22:40:10Z

PXF Hive plug-in mods

commit 822d7545e746490e55507866c62dca5ea2d5349a
Author: Lisa Owen 
Date:   2016-10-03T22:19:03Z

clean up some extra whitespace

commit 8c986b60b8db3edd77c10f23704cc9174c52a803
Author: Lisa Owen 
Date:   2016-10-11T18:37:34Z

include list of hive profile names in file format section

commit 150fa67857871d58ea05eb14c023215c932ab7b1
Author: Lisa Owen 
Date:   2016-10-11T19:03:39Z

link to CREATE EXTERNAL TABLE ref page

commit 5cdd8f8c35a51360fe3bfdedeff796bf1e0f31f3
Author: Lisa Owen 
Date:   2016-10-11T20:27:17Z

sql commands all caps

commit 67e8b9699c9eec64d04ce9e6048ffb385f7f3573
Author: Lisa Owen 
Date:   2016-10-11T20:33:35Z

use <> for optional args

commit 54b2c01a80d477cc093d7eb1ed2aa8c0bf762d36
Author: Lisa Owen 
Date:   2016-10-22T00:16:24Z

fix some duplicate ids

commit 284c3ec2db38e8d9020826e3bf292efad76c1819
Author: Lisa Owen 
Date:   2016-10-26T15:38:37Z

restructure to use numbered steps

commit 2a38a0322abda804cfd4fc8aa39f142f0d83ea11
Author: Lisa Owen 
Date:   2016-10-26T17:20:28Z

note/notice




> add PXF HiveText and HiveRC profile examples to the documentation
> -
>
> Key: HAWQ-1071
> URL: https://issues.apache.org/jira/browse/HAWQ-1071
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF Hive documentation includes an example for only the Hive 
> profile.  add examples for HiveText and HiveRC profiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609382#comment-15609382
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/33


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609383#comment-15609383
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/38


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608979#comment-15608979
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/38

HAWQ-1107 - more subnav changes for HDFS plugin

remove all submenus

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/subnav-pxfhdfs-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/38.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #38






> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606649#comment-15606649
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/34


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606507#comment-15606507
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r84997781
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
-```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, etc. 
 
-``` sql
-INSERT INTO table_name ...;
-```
+The HDFS file system command is `hdfs dfs  []`. Invoked 
with no options, `hdfs dfs` lists the file system options supported by the tool.
--- End diff --

command -> command syntax


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606512#comment-15606512
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r84999604
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
-```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, etc. 
 
-``` sql
-INSERT INTO table_name ...;
-```
+The HDFS file system command is `hdfs dfs  []`. Invoked 
with no options, `hdfs dfs` lists the file system options supported by the tool.
+
+`hdfs dfs` options used in this section are identified in the table below:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+### Create Data Files
+
+Perform the following steps to create data files used in subsequent 
exercises:
+
+1. Create an HDFS directory for PXF example data files:
+
+``` shell
+ $ sudo -u hdfs hdfs dfs -mkdir -p /data/pxf_examples
+```
+
+2. Create a delimited plain text file:
+
+``` shell
+$ vi /tmp/pxf_hdfs_simple.txt
--- End diff --

Does it make sense to change these into `echo` commands so they can just be 
cut/pasted?  Like:

$ echo 'Prague,Jan,101,4875.33
Rome,Mar,87,1557.39
Bangalore,May,317,8936.99
Beijing,Jul,411,11600.67' >> pxf_hdfs_simple.txt


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content 

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606511#comment-15606511
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r84996425
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
--- End diff --

Add an XREF here.


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606513#comment-15606513
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85002565
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
-```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, etc. 
 
-``` sql
-INSERT INTO table_name ...;
-```
+The HDFS file system command is `hdfs dfs  []`. Invoked 
with no options, `hdfs dfs` lists the file system options supported by the tool.
+
+`hdfs dfs` options used in this section are identified in the table below:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+### Create Data Files
+
+Perform the following steps to create data files used in subsequent 
exercises:
+
+1. Create an HDFS directory for PXF example data files:
+
+``` shell
+ $ sudo -u hdfs hdfs dfs -mkdir -p /data/pxf_examples
+```
+
+2. Create a delimited plain text file:
+
+``` shell
+$ vi /tmp/pxf_hdfs_simple.txt
+```
+
+3. Copy and paste the following data into `pxf_hdfs_simple.txt`:
+
+``` pre
+Prague,Jan,101,4875.33
+Rome,Mar,87,1557.39
+Bangalore,May,317,8936.99
+Beijing,Jul,411,11600.67
+```
+
+Notice the use of the comma `,` to separate the four data fields.
+
+4. Add the data file to HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/pxf_hdfs_simple.txt 
/data/pxf_examples/
+```
+
+5. Display the contents of the `pxf_hdfs_simple.txt` file stored in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -cat /data/pxf_examples/pxf_hdfs_simple.txt
+```
+
+6. Create a second delimited plain text file:
+
+``` shell
+$ vi /tmp/pxf_hdfs_multi.txt
+```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This 

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606514#comment-15606514
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85003214
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
-```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, etc. 
 
-``` sql
-INSERT INTO table_name ...;
-```
+The HDFS file system command is `hdfs dfs  []`. Invoked 
with no options, `hdfs dfs` lists the file system options supported by the tool.
+
+`hdfs dfs` options used in this section are identified in the table below:
+
+| Option  | Description |
+|---|-|
+| `-cat`| Display file contents. |
+| `-mkdir`| Create directory in HDFS. |
+| `-put`| Copy file from local file system to HDFS. |
+
+### Create Data Files
+
+Perform the following steps to create data files used in subsequent 
exercises:
+
+1. Create an HDFS directory for PXF example data files:
+
+``` shell
+ $ sudo -u hdfs hdfs dfs -mkdir -p /data/pxf_examples
+```
+
+2. Create a delimited plain text file:
+
+``` shell
+$ vi /tmp/pxf_hdfs_simple.txt
+```
+
+3. Copy and paste the following data into `pxf_hdfs_simple.txt`:
+
+``` pre
+Prague,Jan,101,4875.33
+Rome,Mar,87,1557.39
+Bangalore,May,317,8936.99
+Beijing,Jul,411,11600.67
+```
+
+Notice the use of the comma `,` to separate the four data fields.
+
+4. Add the data file to HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/pxf_hdfs_simple.txt 
/data/pxf_examples/
+```
+
+5. Display the contents of the `pxf_hdfs_simple.txt` file stored in HDFS:
+
+``` shell
+$ sudo -u hdfs hdfs dfs -cat /data/pxf_examples/pxf_hdfs_simple.txt
+```
+
+6. Create a second delimited plain text file:
+
+``` shell
+$ vi /tmp/pxf_hdfs_multi.txt
+```
 
-To read the data in the files or to write based on the existing format, 
use `FORMAT`, `PROFILE`, or one of the classes.
-
-This 

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606515#comment-15606515
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r85003579
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -415,93 +312,101 @@ The following example uses the Avro schema shown in 
[Sample Avro Schema](#topic_
 {"name":"street", "type":"string"},
 {"name":"city", "type":"string"}]
 }
-  }, {
-   "name": "relationship",
-"type": {
-"type": "enum",
-"name": "relationshipEnum",
-"symbols": 
["MARRIED","LOVE","FRIEND","COLLEAGUE","STRANGER","ENEMY"]
-}
-  }, {
-"name" : "md5",
-"type": {
-"type" : "fixed",
-"name" : "md5Fixed",
-"size" : 4
-}
   } ],
   "doc:" : "A basic schema for storing messages"
 }
 ```
 
- Sample Avro Data 
(JSON)
+### Sample Avro Data 
(JSON)
+
+Create a text file named `pxf_hdfs_avro.txt`:
+
+``` shell
+$ vi /tmp/pxf_hdfs_avro.txt
+```
+
+Enter the following data into `pxf_hdfs_avro.txt`:
 
 ``` pre
-{"id":1, "username":"john","followers":["kate", "santosh"], "rank":null, 
"relationship": "FRIEND", "fmap": {"kate":10,"santosh":4},
-"address":{"street":"renaissance drive", "number":1,"city":"san jose"}, 
"md5":\u3F00\u007A\u0073\u0074}
+{"id":1, "username":"john","followers":["kate", "santosh"], 
"relationship": "FRIEND", "fmap": {"kate":10,"santosh":4}, 
"address":{"number":1, "street":"renaissance drive", "city":"san jose"}}
+
+{"id":2, "username":"jim","followers":["john", "pam"], "relationship": 
"COLLEAGUE", "fmap": {"john":3,"pam":3}, "address":{"number":9, "street":"deer 
creek", "city":"palo alto"}}
+```
+
+The sample data uses a comma `,` to separate top level records and a colon 
`:` to separate map/key values and record field name/values.
 
-{"id":2, "username":"jim","followers":["john", "pam"], "rank":3, 
"relationship": "COLLEAGUE", "fmap": {"john":3,"pam":3}, 
-"address":{"street":"deer creek", "number":9,"city":"palo alto"}, 
"md5":\u0010\u0021\u0003\u0004}
+Convert the text file to Avro format. There are various ways to perform 
the conversion programmatically and via the command line. In this example, we 
use the [Java Avro tools](http://avro.apache.org/releases.html), and the jar 
file resides in the current directory:
+
+``` shell
+$ java -jar ./avro-tools-1.8.1.jar fromjson --schema-file 
/tmp/avro_schema.avsc /tmp/pxf_hdfs_avro.txt > /tmp/pxf_hdfs_avro.avro
 ```
 
-To map this Avro file to an external table, the top-level primitive fields 
("id" of type long and "username" of type string) are mapped to their 
equivalent HAWQ types (bigint and text). The remaining complex fields are 
mapped to text columns:
+The generated Avro binary data file is written to 
`/tmp/pxf_hdfs_avro.avro`. Copy this file to HDFS:
 
-``` sql
-gpadmin=# CREATE EXTERNAL TABLE avro_complex 
-  (id bigint, 
-  username text, 
-  followers text, 
-  rank int, 
-  fmap text, 
-  address text, 
-  relationship text,
-  md5 bytea) 
-LOCATION ('pxf://namehost:51200/tmp/avro_complex?PROFILE=Avro')
-FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+``` shell
+$ sudo -u hdfs hdfs dfs -put /tmp/pxf_hdfs_avro.avro /data/pxf_examples/
 ```
+### Querying Avro Data
+
+Create a queryable external table from this Avro file:
 
-The above command uses default delimiters for separating components of the 
complex types. This command is equivalent to the one above, but it explicitly 
sets the delimiters using the Avro profile parameters:
+-  Map the top-level primitive fields, `id` (type long) and `username` 
(type string), to their equivalent HAWQ types (bigint and text). 
+-  Map the remaining complex fields to type text.
+-  Explicitly set the record, map, and collection delimiters using the 
Avro profile custom options:
 
 ``` sql
-gpadmin=# CREATE EXTERNAL TABLE avro_complex 
-  (id bigint, 
-  username text, 
-  followers text, 
-  rank int, 
-  fmap text, 
-  address text, 
-  relationship text,
-  md5 bytea) 
-LOCATION 
('pxf://localhost:51200/tmp/avro_complex?PROFILE=Avro_DELIM=,_DELIM=:_DELIM=:')
-FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+gpadmin=# CREATE EXTERNAL TABLE pxf_hdfs_avro(id bigint, username text, 
followers text, fmap text, relationship text, address text)
+LOCATION 
('pxf://namenode:51200/data/pxf_examples/pxf_hdfs_avro.avro?PROFILE=Avro_DELIM=,_DELIM=:_DELIM=:')
+  

[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606508#comment-15606508
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/33#discussion_r84997631
  
--- Diff: pxf/HDFSFileDataPXF.html.md.erb ---
@@ -2,388 +2,282 @@
 title: Accessing HDFS File Data
 ---
 
-## Prerequisites
+HDFS is the primary distributed storage mechanism used by Apache Hadoop 
applications. The PXF HDFS plug-in reads file data stored in HDFS.  The plug-in 
supports plain delimited and comma-separated-value format text files.  The HDFS 
plug-in also supports the Avro binary format.
 
-Before working with HDFS file data using HAWQ and PXF, you should perform 
the following operations:
+This section describes how to use PXF to access HDFS data, including how 
to create and query an external table from files in the HDFS data store.
 
--   Test PXF on HDFS before connecting to Hive or HBase.
--   Ensure that all HDFS users have read permissions to HDFS services and 
that write permissions have been limited to specific users.
+## Prerequisites
 
-## Syntax
+Before working with HDFS file data using HAWQ and PXF, ensure that:
 
-The syntax for creating an external HDFS file is as follows: 
+-   The HDFS plug-in is installed on all cluster nodes.
+-   All HDFS users have read permissions to HDFS services and that write 
permissions have been restricted to specific users.
 
-``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
-( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://host[:port]/path-to-data?[=value...]')
-  FORMAT '[TEXT | CSV | CUSTOM]' ();
-```
+## HDFS File Formats
 
-where `` is:
+The PXF HDFS plug-in supports reading the following file formats:
 
-``` pre
-   
FRAGMENTER=fragmenter_class=accessor_class=resolver_class]
- | PROFILE=profile-name
-```
+- Text File - comma-separated value (.csv) or delimited format plain text 
file
+- Avro - JSON-defined, schema-based data serialization format
 
-**Note:** Omit the `FRAGMENTER` parameter for `READABLE` external tables.
+The PXF HDFS plug-in includes the following profiles to support the file 
formats listed above:
 
-Use an SQL `SELECT` statement to read from an HDFS READABLE table:
+- `HdfsTextSimple` - text files
+- `HdfsTextMulti` - text files with embedded line feeds
+- `Avro` - Avro files
 
-``` sql
-SELECT ... FROM table_name;
-```
 
-Use an SQL `INSERT` statement to add data to an HDFS WRITABLE table:
+## HDFS Shell Commands
+Hadoop includes command-line tools that interact directly with HDFS.  
These tools support typical file system operations including copying and 
listing files, changing file permissions, etc. 
--- End diff --

Change "etc." to "and so forth."


> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602963#comment-15602963
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/34

HAWQ-1107 - subnav chgs for pxf hdfs plugin content restructure

subnav changes

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/subnav-pxfhdfs-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/34.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #34


commit f350e41fa419e9fb661f4ccb6e8793b7d9e9a40b
Author: Lisa Owen 
Date:   2016-10-24T19:30:37Z

subna chgs for pxf hdfs plugin content restructure




> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1107) PXF HDFS documentation - restructure content and include more examples

2016-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602949#comment-15602949
 ] 

ASF GitHub Bot commented on HAWQ-1107:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/33

HAWQ-1107 - enhance PXF HDFS plugin documentation

added more examples, restructured the content, removed SequenceWritable 
references.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/pxfhdfs-enhance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/33.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #33


commit 9ca277927bebd9c8d79bdf4619dfaf94a695c838
Author: Lisa Owen 
Date:   2016-10-14T22:29:22Z

start restructuring HDFS plug-in page

commit 2da7a92a3e8431335a48005d55a70c9eba333e16
Author: Lisa Owen 
Date:   2016-10-17T23:27:23Z

more content and rearranging of pxf hdfs plugin page

commit 5a941a70bda0e8466b5aa5dd2885840fce14c522
Author: Lisa Owen 
Date:   2016-10-18T16:57:09Z

more rework of hdfs plug in page

commit fd029d568589f5a4e2461d92437963d97f7d3198
Author: Lisa Owen 
Date:   2016-10-20T19:20:21Z

remove SerialWritable, use namenode for host

commit 6ba64f94d5b11397c98f46eb14d5c6e48d17a6cc
Author: Lisa Owen 
Date:   2016-10-20T21:12:43Z

use more descriptive file names

commit 86d13b312ea8591949b8a811973937ab60f74df9
Author: Lisa Owen 
Date:   2016-10-20T22:36:01Z

more mods to HDFS plugin docs




> PXF HDFS documentation - restructure content and include more examples
> --
>
> Key: HAWQ-1107
> URL: https://issues.apache.org/jira/browse/HAWQ-1107
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> the current PXF HDFS documentation does not include any runnable examples.  
> add runnable examples for all (HdfsTextSimple, HdfsTextMulti, SerialWritable, 
> Avro) profiles.  restructure the content as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589313#comment-15589313
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/25


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589318#comment-15589318
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/23


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589308#comment-15589308
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/27


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589261#comment-15589261
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r84117499
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
--- End diff --

Ok - thanks.  I think in other cases PDFs of the actual docs are included.  
This might only be in the Windows downloads.


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587174#comment-15587174
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83978174
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
+
+## Internal
--- End diff --

Change title to "Internal Functions"?


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587169#comment-15587169
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83979056
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
--- End diff --

Global edit:  Change "For additional information on" to "For additional 
information about"


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587168#comment-15587168
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83977628
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
--- End diff --

Global:  change "an SQL" to "a SQL" (pronounced 'sequel')


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587170#comment-15587170
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83978854
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
+
+## Internal
+
+Many HAWQ internal functions are written in C. These functions are 
declared during initialization of the database cluster and statically linked to 
the HAWQ server. See [Built-in Functions and 
Operators](../query/functions-operators.html#topic29) for detailed information 
on HAWQ internal functions.
+
+While users cannot define new internal functions, they can create aliases 
for existing internal functions.
+
+The following example creates a new function named `all_caps` that will be 
defined as an alias for the `upper` HAWQ internal function:
+
+
+``` sql
+gpadmin=# CREATE FUNCTION all_caps (text) RETURNS text AS 'upper'
+LANGUAGE internal STRICT;
+CREATE FUNCTION
+gpadmin=# SELECT all_caps('change me');
+ all_caps  
+---
+ CHANGE ME
+(1 row)
+
+```
+
+For more information on aliasing internal functions, refer to [Internal 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-internal.html) in 
the PostgreSQL documentation.
+
+## C
+
+User-defined functions written in C must be compiled into shared libraries 
to be loaded by the HAWQ server on demand. This dynamic loading distinguishes C 
language functions from internal functions that are written in C.
--- End diff --

Avoid passive voice here:  "You must compile user-defined functions written 
in C into shared libraries so that the HAWQ server can load them on demand."


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587173#comment-15587173
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83978549
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
+
+## Internal
+
+Many HAWQ internal functions are written in C. These functions are 
declared during initialization of the database cluster and statically linked to 
the HAWQ server. See [Built-in Functions and 
Operators](../query/functions-operators.html#topic29) for detailed information 
on HAWQ internal functions.
+
+While users cannot define new internal functions, they can create aliases 
for existing internal functions.
+
+The following example creates a new function named `all_caps` that will be 
defined as an alias for the `upper` HAWQ internal function:
--- End diff --

Edit:  change "that will be defined as an" to "that is an"


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587171#comment-15587171
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83978465
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
+
+## Internal
+
+Many HAWQ internal functions are written in C. These functions are 
declared during initialization of the database cluster and statically linked to 
the HAWQ server. See [Built-in Functions and 
Operators](../query/functions-operators.html#topic29) for detailed information 
on HAWQ internal functions.
+
+While users cannot define new internal functions, they can create aliases 
for existing internal functions.
--- End diff --

Reword:  **You** cannot define new internal functions, **but you** can 
create...


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587175#comment-15587175
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83978153
  
--- Diff: plext/builtin_langs.html.md.erb ---
@@ -0,0 +1,110 @@
+---
+title: Using HAWQ Built-In Languages
+---
+
+This section provides an introduction to using the HAWQ built-in languages.
+
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages. HAWQ also supports user-defined aliases for internal functions.
+
+
+## Enabling Built-in Language Support
+
+Support for SQL, internal, and C language user-defined functions is 
enabled by default for all HAWQ databases.
+
+## SQL
+
+SQL functions execute an arbitrary list of SQL statements. The SQL 
statements in the body of an SQL function must be separated by semicolons. The 
final statement in a non-void-returning SQL function must be a 
[SELECT](../reference/sql/SELECT.html) that returns data of the type specified 
by the function's return type. The function will return a single or set of rows 
corresponding to this last SQL query.
+
+The following example creates and calls an SQL function to count the 
number of rows of the database named `orders`:
+
+``` sql
+gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
+ SELECT count(*) FROM orders;
+$$ LANGUAGE SQL;
+CREATE FUNCTION
+gpadmin=# select count_orders();
+ my_count 
+--
+   830513
+(1 row)
+```
+
+For additional information on creating SQL functions, refer to [Query 
Language (SQL) 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the 
PostgreSQL documentation.
+
+## Internal
+
+Many HAWQ internal functions are written in C. These functions are 
declared during initialization of the database cluster and statically linked to 
the HAWQ server. See [Built-in Functions and 
Operators](../query/functions-operators.html#topic29) for detailed information 
on HAWQ internal functions.
+
+While users cannot define new internal functions, they can create aliases 
for existing internal functions.
+
+The following example creates a new function named `all_caps` that will be 
defined as an alias for the `upper` HAWQ internal function:
+
+
+``` sql
+gpadmin=# CREATE FUNCTION all_caps (text) RETURNS text AS 'upper'
+LANGUAGE internal STRICT;
+CREATE FUNCTION
+gpadmin=# SELECT all_caps('change me');
+ all_caps  
+---
+ CHANGE ME
+(1 row)
+
+```
+
+For more information on aliasing internal functions, refer to [Internal 
Functions](https://www.postgresql.org/docs/8.2/static/xfunc-internal.html) in 
the PostgreSQL documentation.
+
+## C
--- End diff --

This id value is the same as the previous one - should be unique.  Also 
change header to "C Functions"?


> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587172#comment-15587172
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/25#discussion_r83976414
  
--- Diff: plext/UsingProceduralLanguages.html.md.erb ---
@@ -1,13 +1,16 @@
 ---
-title: Using Procedural Languages and Extensions in HAWQ
+title: Using Languages and Extensions in HAWQ
 ---
 
-HAWQ allows user-defined functions to be written in other languages 
besides SQL and C. These other languages are generically called *procedural 
languages* (PLs).
+HAWQ supports user-defined functions created with the SQL and C built-in 
languages, including supporting user-defined aliases for internal functions.
--- End diff --

This needs a bit of an edit:  HAWQ supports user-defined functions **that 
are** created with the SQL and C built-in languages, **and also supports** 
user-defined aliases for internal functions.



> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587132#comment-15587132
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user lisakowen commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83977371
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
--- End diff --

users will download the readme from pivnet.  the link at the end of the 
readme points to a datadirect page from which one could navigate to the links i 
have included.

i don't see any other docs when i untar the download package.


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587108#comment-15587108
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user lisakowen commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83976521
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
+
+ Connection Data Source
+The information required by the HAWQ ODBC driver to connect to a database 
is typically stored in a named data source. Depending on your platform, you may 
use 
[GUI](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_through_a_GUI_14.html%23)
 or [command 
line](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_in_the_UNIX_2fLinux_odbc_13.html%23)
 tools to create your data source definition. On Linux, ODBC data sources are 
typically defined in a file named `odbc.ini`. 
+
+Commonly-specified HAWQ ODBC data source connection properties include:
+
+| Property Name| Value 
Description 

|

+|---|-|
+| Database | name of the database to which you want to connect |
+| Driver   | full path to the ODBC driver library file 
  |
+| HostName  | HAWQ master host name
 |
+| MaxLongVarcharSize  | maximum size of columns of type long varchar   

   |
+| Password  | password used to connect to the specified 
database
   |
+| PortNumber  | HAWQ master database port number   
|
+
+Refer to [Connection Option 
Descriptions](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fgreenplum-connection-option-descriptions.html%23)
 for a list of ODBC connection properties supported by the HAWQ DataDirect ODBC 
driver.
+
+Example HAWQ DataDirect ODBC driver data source definition:
+
+``` shell
+[HAWQ-201]
+Driver=/usr/local/hawq_drivers/odbc/lib/ddgplm27.so
+Description=DataDirect 7.1 Greenplum Wire Protocol - for HAWQ
+Database=getstartdb
+HostName=hdm1
+PortNumber=5432
+Password=changeme
+MaxLongVarcharSize=8192
+```
+
+The first line, `[HAWQ-201]`, identifies the name of the data source.
+
+ODBC connection properties may also be specified in a connection string 
identifying either a data 

[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587103#comment-15587103
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/26


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587081#comment-15587081
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83974367
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
--- End diff --

Are you sure the datadirect link contains the same info available in the 
HAWQ ODBC download?


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587078#comment-15587078
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83974668
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
+
+ Connection Data Source
+The information required by the HAWQ ODBC driver to connect to a database 
is typically stored in a named data source. Depending on your platform, you may 
use 
[GUI](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_through_a_GUI_14.html%23)
 or [command 
line](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_in_the_UNIX_2fLinux_odbc_13.html%23)
 tools to create your data source definition. On Linux, ODBC data sources are 
typically defined in a file named `odbc.ini`. 
+
+Commonly-specified HAWQ ODBC data source connection properties include:
+
+| Property Name| Value 
Description 

|

+|---|-|
+| Database | name of the database to which you want to connect |
+| Driver   | full path to the ODBC driver library file 
  |
+| HostName  | HAWQ master host name
 |
+| MaxLongVarcharSize  | maximum size of columns of type long varchar   

   |
+| Password  | password used to connect to the specified 
database
   |
+| PortNumber  | HAWQ master database port number   
|
+
+Refer to [Connection Option 
Descriptions](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fgreenplum-connection-option-descriptions.html%23)
 for a list of ODBC connection properties supported by the HAWQ DataDirect ODBC 
driver.
+
+Example HAWQ DataDirect ODBC driver data source definition:
+
+``` shell
+[HAWQ-201]
+Driver=/usr/local/hawq_drivers/odbc/lib/ddgplm27.so
+Description=DataDirect 7.1 Greenplum Wire Protocol - for HAWQ
+Database=getstartdb
+HostName=hdm1
+PortNumber=5432
+Password=changeme
+MaxLongVarcharSize=8192
+```
+
+The first line, `[HAWQ-201]`, identifies the name of the data source.
+
+ODBC connection properties may also be specified in a connection string 
identifying either a data 

[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587079#comment-15587079
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83974918
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
+
+ Connection Data Source
+The information required by the HAWQ ODBC driver to connect to a database 
is typically stored in a named data source. Depending on your platform, you may 
use 
[GUI](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_through_a_GUI_14.html%23)
 or [command 
line](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_in_the_UNIX_2fLinux_odbc_13.html%23)
 tools to create your data source definition. On Linux, ODBC data sources are 
typically defined in a file named `odbc.ini`. 
+
+Commonly-specified HAWQ ODBC data source connection properties include:
+
+| Property Name| Value 
Description 

|

+|---|-|
+| Database | name of the database to which you want to connect |
+| Driver   | full path to the ODBC driver library file 
  |
+| HostName  | HAWQ master host name
 |
+| MaxLongVarcharSize  | maximum size of columns of type long varchar   

   |
+| Password  | password used to connect to the specified 
database
   |
+| PortNumber  | HAWQ master database port number   
|
+
+Refer to [Connection Option 
Descriptions](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fgreenplum-connection-option-descriptions.html%23)
 for a list of ODBC connection properties supported by the HAWQ DataDirect ODBC 
driver.
+
+Example HAWQ DataDirect ODBC driver data source definition:
+
+``` shell
+[HAWQ-201]
+Driver=/usr/local/hawq_drivers/odbc/lib/ddgplm27.so
+Description=DataDirect 7.1 Greenplum Wire Protocol - for HAWQ
+Database=getstartdb
+HostName=hdm1
+PortNumber=5432
+Password=changeme
+MaxLongVarcharSize=8192
+```
+
+The first line, `[HAWQ-201]`, identifies the name of the data source.
+
+ODBC connection properties may also be specified in a connection string 
identifying either a data 

[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587077#comment-15587077
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83974424
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
+
+ Connection Data Source
+The information required by the HAWQ ODBC driver to connect to a database 
is typically stored in a named data source. Depending on your platform, you may 
use 
[GUI](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_through_a_GUI_14.html%23)
 or [command 
line](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_in_the_UNIX_2fLinux_odbc_13.html%23)
 tools to create your data source definition. On Linux, ODBC data sources are 
typically defined in a file named `odbc.ini`. 
+
+Commonly-specified HAWQ ODBC data source connection properties include:
+
+| Property Name| Value 
Description 

|

+|---|-|
+| Database | name of the database to which you want to connect |
+| Driver   | full path to the ODBC driver library file 
  |
+| HostName  | HAWQ master host name
 |
+| MaxLongVarcharSize  | maximum size of columns of type long varchar   

   |
+| Password  | password used to connect to the specified 
database
   |
+| PortNumber  | HAWQ master database port number   
|
--- End diff --

Let's initial-capitalize the second column.


> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587080#comment-15587080
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/23#discussion_r83974977
  
--- Diff: clientaccess/g-database-application-interfaces.html.md.erb ---
@@ -1,8 +1,96 @@
 ---
-title: ODBC/JDBC Application Interfaces
+title: HAWQ Database Drivers and APIs
 ---
 
+You may want to connect your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The database application programming 
interfaces most commonly used with HAWQ are the Postgres and ODBC and JDBC APIs.
 
-You may want to deploy your existing Business Intelligence (BI) or 
Analytics applications with HAWQ. The most commonly used database application 
programming interfaces with HAWQ are the ODBC and JDBC APIs. 
+HAWQ provides the following connectivity tools for connecting to the 
database:
+
+  - ODBC driver
+  - JDBC driver
+  - `libpq` - PostgreSQL C API
+
+## HAWQ Drivers
+
+ODBC and JDBC drivers for HAWQ are available as a separate download from 
Pivotal Network [Pivotal 
Network](https://network.pivotal.io/products/pivotal-hdb).
+
+### ODBC Driver
+
+The ODBC API specifies a standard set of C interfaces for accessing 
database management systems.  For additional information on using the ODBC API, 
refer to the [ODBC Programmer's 
Reference](https://msdn.microsoft.com/en-us/library/ms714177(v=vs.85).aspx) 
documentation.
+
+HAWQ supports the DataDirect ODBC Driver. Installation instructions for 
this driver are provided on the Pivotal Network driver download page. Refer to 
[HAWQ ODBC 
Driver](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fthe-greenplum-wire-protocol-driver.html%23)
 for HAWQ-specific ODBC driver information.
+
+ Connection Data Source
+The information required by the HAWQ ODBC driver to connect to a database 
is typically stored in a named data source. Depending on your platform, you may 
use 
[GUI](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_through_a_GUI_14.html%23)
 or [command 
line](http://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2FData_Source_Configuration_in_the_UNIX_2fLinux_odbc_13.html%23)
 tools to create your data source definition. On Linux, ODBC data sources are 
typically defined in a file named `odbc.ini`. 
+
+Commonly-specified HAWQ ODBC data source connection properties include:
+
+| Property Name| Value 
Description 

|

+|---|-|
+| Database | name of the database to which you want to connect |
+| Driver   | full path to the ODBC driver library file 
  |
+| HostName  | HAWQ master host name
 |
+| MaxLongVarcharSize  | maximum size of columns of type long varchar   

   |
+| Password  | password used to connect to the specified 
database
   |
+| PortNumber  | HAWQ master database port number   
|
+
+Refer to [Connection Option 
Descriptions](http://media.datadirect.com/download/docs/odbc/allodbc/#page/odbc%2Fgreenplum-connection-option-descriptions.html%23)
 for a list of ODBC connection properties supported by the HAWQ DataDirect ODBC 
driver.
+
+Example HAWQ DataDirect ODBC driver data source definition:
+
+``` shell
+[HAWQ-201]
+Driver=/usr/local/hawq_drivers/odbc/lib/ddgplm27.so
+Description=DataDirect 7.1 Greenplum Wire Protocol - for HAWQ
+Database=getstartdb
+HostName=hdm1
+PortNumber=5432
+Password=changeme
+MaxLongVarcharSize=8192
+```
+
+The first line, `[HAWQ-201]`, identifies the name of the data source.
+
+ODBC connection properties may also be specified in a connection string 
identifying either a data 

[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586400#comment-15586400
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/27

HAWQ-1096 - add subnav entry for built-in languages

add subnav for new topic

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/subnav-builtin-langs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/27.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #27






> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1096) document the HAWQ built-in languages (SQL, C, internal)

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586041#comment-15586041
 ] 

ASF GitHub Bot commented on HAWQ-1096:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/25

HAWQ-1096 - add content for hawq built-in languages

add content for sql, c, and internal hawq built in languages

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/builtin-langs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/25.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #25


commit 504c662be21dc344a161b81a9c627a8f6d7861cd
Author: Lisa Owen 
Date:   2016-10-05T21:33:36Z

add file discussing hawq built-in languages

commit 8e27e9093f1d27277d676386144ee895ad004f86
Author: Lisa Owen 
Date:   2016-10-05T21:34:36Z

include built-in languages in PL lang landing page

commit bd85fdbc31cb463855c2606fde48d803dccb3de2
Author: Lisa Owen 
Date:   2016-10-05T21:47:11Z

c user-defined function example - add _c to function name to avoid confusion

commit 1332870d01d2f8da2f8284ac167253d7005c6dfd
Author: Lisa Owen 
Date:   2016-10-10T22:24:20Z

builtin langs -  clarify and add some links




> document the HAWQ built-in languages (SQL, C, internal)
> ---
>
> Key: HAWQ-1096
> URL: https://issues.apache.org/jira/browse/HAWQ-1096
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
>
> the HAWQ docs do not discuss the built-in languages supported by HAWQ - SQL, 
> C and internal.  add content to introduce these languages with relevant 
> examples and links. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1095) enhance database driver and API documentation

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586004#comment-15586004
 ] 

ASF GitHub Bot commented on HAWQ-1095:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/23

HAWQ-1095 - enhance database api docs

add content for jdbc, odbc, libpq


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/dbapiinfo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/23.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23


commit 2c0f4b19bb2baef545467c9d39f097344c6358b2
Author: Lisa Owen 
Date:   2016-10-04T19:25:29Z

restructure db API section; add libpq and links to driver and api docs

commit f066326f0241050a22a8b592fcaae3aab037c504
Author: Lisa Owen 
Date:   2016-10-04T20:36:41Z

clarify some statements

commit fbb0571df9cdb1ba05a2ba970b560cb6388b72eb
Author: Lisa Owen 
Date:   2016-10-04T23:11:07Z

hawq supports datadirect drivers

commit df2aaed3aab20b9d0fffa0c62df8a23c33864065
Author: Lisa Owen 
Date:   2016-10-04T23:26:02Z

update driver names

commit 245633e69bd0017f43a5cc20e82c9a5fc23b4079
Author: Lisa Owen 
Date:   2016-10-05T21:56:36Z

provide locations of libpq lib and include file

commit 57d76d2b86014f772754ca70cab95e4c337a71a2
Author: Lisa Owen 
Date:   2016-10-07T16:02:22Z

add jdbc connection string and example

commit 70e45af7d24a6699840eec176603b4b835121bef
Author: Lisa Owen 
Date:   2016-10-07T23:48:39Z

flesh out jdbc section; add connection URL specs

commit 3288da3e8ce51482e1d6e6913a237cbf5fc0bc8e
Author: Lisa Owen 
Date:   2016-10-10T19:08:48Z

db drivers and apis - flesh out odbc section




> enhance database driver and API documentation
> -
>
> Key: HAWQ-1095
> URL: https://issues.apache.org/jira/browse/HAWQ-1095
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> docs contain very brief references to JDBC/ODBC and none at all to libpq.  
> add more content in these areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1056) "hawq check" help output and documentation updates needed

2016-09-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508041#comment-15508041
 ] 

ASF GitHub Bot commented on HAWQ-1056:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/9


> "hawq check" help output and documentation updates needed
> -
>
> Key: HAWQ-1056
> URL: https://issues.apache.org/jira/browse/HAWQ-1056
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools, Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> help output and reference documentation for "hawq check" --hadoop option is 
> not clear.  specifically, this option should identify the full path to the 
> hadoop installation.
> additionally, the [-h | --host ] option appears to be missing in 
> both areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1056) "hawq check" help output and documentation updates needed

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494197#comment-15494197
 ] 

ASF GitHub Bot commented on HAWQ-1056:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/9

Feature/hawqcheck hadoopopt

some cleanup to documentation for "hawq check" command.  fixes the 
documentation part of HAWQ-1056.

- add -h, --host  option
- clarify value of --hadoop, --hadoop-home option  value 
should be the full install path to hadoop
- modify the examples to use relevant values for hadoop_home

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/hawqcheck-hadoopopt

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/9.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9


commit 0f642f1ce67bd570d2043174bbdaed990c7840bb
Author: Lisa Owen 
Date:   2016-09-14T21:11:38Z

clarify use of hawq check --hadoop option

commit 6704cc0b7a358fafba08b3fd66a5a12b5bb97f85
Author: Lisa Owen 
Date:   2016-09-14T21:21:07Z

hawq check --hadoop option - misc cleanup

commit 016630163015e782ef630338998ef1696f5f005e
Author: Lisa Owen 
Date:   2016-09-15T17:52:32Z

hawq check - add h/host option, cleanup

commit 4a617974cf04d1b1758bdfbc490116b60bdefb79
Author: Lisa Owen 
Date:   2016-09-15T18:38:36Z

hawq check - hadoop home is optional




> "hawq check" help output and documentation updates needed
> -
>
> Key: HAWQ-1056
> URL: https://issues.apache.org/jira/browse/HAWQ-1056
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools, Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
> Fix For: 2.0.1.0-incubating
>
>
> help output and reference documentation for "hawq check" --hadoop option is 
> not clear.  specifically, this option should identify the full path to the 
> hadoop installation.
> additionally, the [-h | --host ] option appears to be missing in 
> both areas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1019) clarify database application interfaces discussion

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440281#comment-15440281
 ] 

ASF GitHub Bot commented on HAWQ-1019:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq-docs/pull/3


> clarify database application interfaces discussion
> --
>
> Key: HAWQ-1019
> URL: https://issues.apache.org/jira/browse/HAWQ-1019
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: Lei Chang
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> discussion of drivers for database application interfaces needs to be 
> clarified.
> relevant incubator-hawq-docs file:  
> clientaccess/g-database-application-interfaces.html.md.erb 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1019) clarify database application interfaces discussion

2016-08-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440251#comment-15440251
 ] 

ASF GitHub Bot commented on HAWQ-1019:
--

GitHub user lisakowen opened a pull request:

https://github.com/apache/incubator-hawq-docs/pull/3

misc doc updates clarifying APIs

updates to clarify database application interfaces.  fixes HAWQ-1019

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lisakowen/incubator-hawq-docs 
feature/dbappif-fixes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq-docs/pull/3.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3


commit bec5e3e1855bed4f0f8a5ef2a41b94c3c92f17fc
Author: Lisa Owen 
Date:   2016-08-26T23:59:00Z

misc doc updates clarifying APIs




> clarify database application interfaces discussion
> --
>
> Key: HAWQ-1019
> URL: https://issues.apache.org/jira/browse/HAWQ-1019
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: Lei Chang
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> discussion of drivers for database application interfaces needs to be 
> clarified.
> relevant incubator-hawq-docs file:  
> clientaccess/g-database-application-interfaces.html.md.erb 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-945) catalog:char/varchar test cases fail due to locale settings.

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389139#comment-15389139
 ] 

ASF GitHub Bot commented on HAWQ-945:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/809


> catalog:char/varchar test cases fail due to locale settings.
> 
>
> Key: HAWQ-945
> URL: https://issues.apache.org/jira/browse/HAWQ-945
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Paul Guo
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-945) catalog:char/varchar test cases fail due to locale settings.

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389133#comment-15389133
 ] 

ASF GitHub Bot commented on HAWQ-945:
-

Github user radarwave commented on the issue:

https://github.com/apache/incubator-hawq/pull/809
  
+1


> catalog:char/varchar test cases fail due to locale settings.
> 
>
> Key: HAWQ-945
> URL: https://issues.apache.org/jira/browse/HAWQ-945
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Paul Guo
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-945) catalog:char/varchar test cases fail due to locale settings.

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389129#comment-15389129
 ] 

ASF GitHub Bot commented on HAWQ-945:
-

Github user yaoj2 commented on the issue:

https://github.com/apache/incubator-hawq/pull/809
  
+1


> catalog:char/varchar test cases fail due to locale settings.
> 
>
> Key: HAWQ-945
> URL: https://issues.apache.org/jira/browse/HAWQ-945
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Paul Guo
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-945) catalog:char/varchar test cases fail due to locale settings.

2016-07-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389125#comment-15389125
 ] 

ASF GitHub Bot commented on HAWQ-945:
-

GitHub user paul-guo- opened a pull request:

https://github.com/apache/incubator-hawq/pull/809

HAWQ-945. catalog:char/varchar test cases fail due to locale settings.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-guo-/incubator-hawq test3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/809.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #809


commit 1e3cdd448a30e8db62ad47e428d318f78acd1a34
Author: Paul Guo 
Date:   2016-07-22T07:50:37Z

HAWQ-945. catalog:char/varchar test cases fail due to locale settings.




> catalog:char/varchar test cases fail due to locale settings.
> 
>
> Key: HAWQ-945
> URL: https://issues.apache.org/jira/browse/HAWQ-945
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Paul Guo
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388921#comment-15388921
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user linwen closed the pull request at:

https://github.com/apache/incubator-hawq/pull/804


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388905#comment-15388905
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user paul-guo- commented on the issue:

https://github.com/apache/incubator-hawq/pull/804
  
+1


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388640#comment-15388640
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user sansanichfb commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71808630
  
--- Diff: src/backend/catalog/external/externalmd.c ---
@@ -84,7 +84,7 @@ List *ParsePxfEntries(StringInfo json, char *profile, Oid 
dboid)
{
struct json_object *jsonItem = 
json_object_array_get_idx(jsonItems, i);
PxfItem *pxfItem = ParsePxfItem(jsonItem, profile);
-   if (dboid != NULL)
+   if (dboid != InvalidOid)
--- End diff --

Warning.


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388639#comment-15388639
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user sansanichfb commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71808615
  
--- Diff: src/backend/utils/adt/pxf_functions.c ---
@@ -43,7 +44,7 @@ pxf_item_fields_enum_start(text *profile, text *pattern)
char *profile_cstr = text_to_cstring(profile);
char *pattern_cstr = text_to_cstring(pattern);
 
-   items = get_pxf_item_metadata(profile_cstr, pattern_cstr, NULL);
+   items = get_pxf_item_metadata(profile_cstr, pattern_cstr, InvalidOid);
--- End diff --

One more warning.


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388172#comment-15388172
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71760909
  
--- Diff: src/backend/access/external/libchurl.c ---
@@ -312,6 +312,14 @@ CHURL_HANDLE churl_init_upload(const char* url, 
CHURL_HEADERS headers)
context->upload = true;
clear_error_buffer(context);
 
+   /* needed to resolve pxf service address */
+   struct curl_slist *resolve_hosts = NULL;
+   char *pxf_host_entry = (char *) palloc0(strlen(pxf_service_address) + 
strlen(LocalhostIpV4Entry) + 1);
+   strcat(pxf_host_entry, pxf_service_address);
--- End diff --

Yes. I was suggesting that we add the below OPT only when 
pxf_service_address is not based on IP address


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388167#comment-15388167
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user sansanichfb commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71760410
  
--- Diff: src/backend/access/external/libchurl.c ---
@@ -312,6 +312,14 @@ CHURL_HANDLE churl_init_upload(const char* url, 
CHURL_HEADERS headers)
context->upload = true;
clear_error_buffer(context);
 
+   /* needed to resolve pxf service address */
+   struct curl_slist *resolve_hosts = NULL;
+   char *pxf_host_entry = (char *) palloc0(strlen(pxf_service_address) + 
strlen(LocalhostIpV4Entry) + 1);
+   strcat(pxf_host_entry, pxf_service_address);
--- End diff --

For case when user created external table referring to "localhost" it's 
needed.


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388163#comment-15388163
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user shivzone commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71760169
  
--- Diff: src/backend/access/external/libchurl.c ---
@@ -312,6 +312,14 @@ CHURL_HANDLE churl_init_upload(const char* url, 
CHURL_HEADERS headers)
context->upload = true;
clear_error_buffer(context);
 
+   /* needed to resolve pxf service address */
+   struct curl_slist *resolve_hosts = NULL;
+   char *pxf_host_entry = (char *) palloc0(strlen(pxf_service_address) + 
strlen(LocalhostIpV4Entry) + 1);
+   strcat(pxf_host_entry, pxf_service_address);
--- End diff --

This step might be unnecssary if the pxf_service_address itself is based on 
the IP address


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-944) Numutils.c: pg_ltoa and pg_itoa functions allocate unnecessary amount of bytes

2016-07-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388000#comment-15388000
 ] 

ASF GitHub Bot commented on HAWQ-944:
-

Github user kavinderd commented on the issue:

https://github.com/apache/incubator-hawq/pull/808
  
@paul-guo- @xunzhang Please review, I think I covered all invocations of 
the two functions.


> Numutils.c: pg_ltoa and pg_itoa functions allocate unnecessary amount of bytes
> --
>
> Key: HAWQ-944
> URL: https://issues.apache.org/jira/browse/HAWQ-944
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Core
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
>Priority: Minor
>
> The current implementations of {{pg_ltoa}} and {{pg_itoa}} allocate a 33 byte 
> char array and set the input pointer to that array. This is far too many 
> bytes than needed to translate an int16 or int32 to a string
> int32 -> 10 bytes maximum + 1 sign bit + '\0' = 12 bytes
> int16 ->  5 bytes maximum  + 1 sign bit + '\0' = 7 bytes
> When HAWQ/Greenplum forked from Postgres the two functions simply delegated 
> to {{sprintf}} so an optimization was introduced that involved the 33 byte 
> solution. Postgres itself implemented these functions in commit 
> https://github.com/postgres/postgres/commit/4fc115b2e981f8c63165ca86a23215380a3fda66
>  that require a 12 byte maximum char pointer.
> This is a minor improvement that can be made to the HAWQ codebase and it's 
> relatively little effort to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-897) Add feature test for create table distribution with new test framework

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387150#comment-15387150
 ] 

ASF GitHub Bot commented on HAWQ-897:
-

Github user yaoj2 commented on the issue:

https://github.com/apache/incubator-hawq/pull/807
  
LGTM


> Add feature test for create table distribution with new test framework
> --
>
> Key: HAWQ-897
> URL: https://issues.apache.org/jira/browse/HAWQ-897
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387148#comment-15387148
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user jiny2 commented on the issue:

https://github.com/apache/incubator-hawq/pull/804
  
LGTM +1


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-897) Add feature test for create table distribution with new test framework

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387147#comment-15387147
 ] 

ASF GitHub Bot commented on HAWQ-897:
-

Github user linwen commented on the issue:

https://github.com/apache/incubator-hawq/pull/807
  
+1


> Add feature test for create table distribution with new test framework
> --
>
> Key: HAWQ-897
> URL: https://issues.apache.org/jira/browse/HAWQ-897
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-934) Populate canSetTag of PlannedStmt from Query object

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387135#comment-15387135
 ] 

ASF GitHub Bot commented on HAWQ-934:
-

Github user hsyuan commented on the issue:

https://github.com/apache/incubator-hawq/pull/799
  
@changleicn @wengyanqing @paul-guo- 
Please take a look.


> Populate canSetTag of PlannedStmt from Query object
> ---
>
> Key: HAWQ-934
> URL: https://issues.apache.org/jira/browse/HAWQ-934
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Venkatesh
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ generated an error if a single query resulted in multiple query plans 
> because of rule transformation and the plans were produced by PQO. This is 
> because of an incorrect directive in the plan to lock the same resource more 
> than once.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-897) Add feature test for create table distribution with new test framework

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387136#comment-15387136
 ] 

ASF GitHub Bot commented on HAWQ-897:
-

GitHub user jiny2 opened a pull request:

https://github.com/apache/incubator-hawq/pull/807

HAWQ-897. Add feature test for create table distribution with new test 
framework



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiny2/incubator-hawq HAWQ-897

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/807.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #807


commit 5fb757d8597b6dfa3e64cda81260e4d819e1793c
Author: YI JIN 
Date:   2016-07-21T04:28:24Z

HAWQ-897. Add feature test for create table distribution with new test 
framework




> Add feature test for create table distribution with new test framework
> --
>
> Key: HAWQ-897
> URL: https://issues.apache.org/jira/browse/HAWQ-897
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-938) Remove ivy.xml in gpopt and read orca version from header file

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387128#comment-15387128
 ] 

ASF GitHub Bot commented on HAWQ-938:
-

Github user hsyuan commented on the issue:

https://github.com/apache/incubator-hawq/pull/806
  
Thanks, will take care of it.


> Remove ivy.xml in gpopt and read orca version from header file
> --
>
> Key: HAWQ-938
> URL: https://issues.apache.org/jira/browse/HAWQ-938
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
> Fix For: 2.0.1.0-incubating
>
>
> Currently, if we want to upgrade orca or gpos, we need change the orca SHA as 
> well as the version number in ivy.xml. The function gp_opt_version() returns 
> version number that is read from ivy.xml, which is not a right way. It should 
> only be dependent on the source file of orca and gpos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387080#comment-15387080
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

GitHub user linwen reopened a pull request:

https://github.com/apache/incubator-hawq/pull/804

HAWQ-940. Fix Kerberos ticket expired for libyarn operations

Please review, thanks! 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/linwen/incubator-hawq hawq_940

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/804.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #804


commit 0942630825f6c84da155bd2d9aec4831f7e4d049
Author: Wen Lin 
Date:   2016-07-20T09:07:01Z

HAWQ-940. Fix Kerberos ticket expired for libyarn operations




> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387079#comment-15387079
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user linwen commented on the issue:

https://github.com/apache/incubator-hawq/pull/804
  
Another way of fix is to add the ticket check in resource broker process 
loop, for every time interval, login() is called. But this fix has to keep 
another variable to record the last updated time, which is duplicate with 
login(). 


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387063#comment-15387063
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user linwen commented on the issue:

https://github.com/apache/incubator-hawq/pull/804
  
After re-think this fix, I think it's better to check ticket expiration in 
resource broker process loop.
so close it. 


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387064#comment-15387064
 ] 

ASF GitHub Bot commented on HAWQ-940:
-

Github user linwen closed the pull request at:

https://github.com/apache/incubator-hawq/pull/804


> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use a same kerberos keyfile. 
> Whenever a hdfs operation is triggered, a function named login() is called, 
> in login() function, this ticket is initialized by "kinit". 
> But for libyarn, login() function is only called during the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period(24 hours in kerberos's configure file, krb.conf), this ticket will 
> expire, and HAWQ fails to register itself in Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-943) Various issues in hawq register feature_test cases

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387059#comment-15387059
 ] 

ASF GitHub Bot commented on HAWQ-943:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/805


> Various issues in hawq register feature_test cases
> --
>
> Key: HAWQ-943
> URL: https://issues.apache.org/jira/browse/HAWQ-943
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> 1) Do not assume the test database is postgres.
> Use HAWQ_DB which is defined in sql_util.h
> 2) Use error immune options when creating a new hdfs file or directory.
> e.g. mkdir -p, put -f.
> Since nonexistence of those files/directories are not guaranteed.
> e.g. Previous test run was terminated by ctrl+c.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-943) Various issues in hawq register feature_test cases

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386997#comment-15386997
 ] 

ASF GitHub Bot commented on HAWQ-943:
-

Github user ictmalili commented on the issue:

https://github.com/apache/incubator-hawq/pull/805
  
LGTM. +1


> Various issues in hawq register feature_test cases
> --
>
> Key: HAWQ-943
> URL: https://issues.apache.org/jira/browse/HAWQ-943
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Paul Guo
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> 1) Do not assume the test database is postgres.
> Use HAWQ_DB which is defined in sql_util.h
> 2) Use error immune options when creating a new hdfs file or directory.
> e.g. mkdir -p, put -f.
> Since nonexistence of those files/directories are not guaranteed.
> e.g. Previous test run was terminated by ctrl+c.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-936) Add GUC for array expansion in ORCA optimizer

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386950#comment-15386950
 ] 

ASF GitHub Bot commented on HAWQ-936:
-

Github user changleicn commented on the issue:

https://github.com/apache/incubator-hawq/pull/800
  
LGTM


> Add GUC for array expansion in ORCA optimizer
> -
>
> Key: HAWQ-936
> URL: https://issues.apache.org/jira/browse/HAWQ-936
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Venkatesh
> Fix For: 2.0.1.0-incubating
>
>
> Consider the query with the following pattern select * from foo where foo.a 
> IN {1,2,3,...}. Currently, when the number of constants in the IN subquery is 
> large, the query optimization time is unacceptable. This is stopping 
> customers from turning Orca on by default since many of the queries are 
> generated queries with such a pattern.
> The root cause is due to the expansion of the IN subquery into an expression 
> in disjunctive normal form. The objective of this story is to disable this 
> expansion when the number of constants in the IN list is large.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-938) Remove ivy.xml in gpopt and read orca version from header file

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386947#comment-15386947
 ] 

ASF GitHub Bot commented on HAWQ-938:
-

Github user changleicn commented on the issue:

https://github.com/apache/incubator-hawq/pull/806
  
@paul-guo- to review.


> Remove ivy.xml in gpopt and read orca version from header file
> --
>
> Key: HAWQ-938
> URL: https://issues.apache.org/jira/browse/HAWQ-938
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
> Fix For: 2.0.1.0-incubating
>
>
> Currently, if we want to upgrade orca or gpos, we need change the orca SHA as 
> well as the version number in ivy.xml. The function gp_opt_version() returns 
> version number that is read from ivy.xml, which is not a right way. It should 
> only be dependent on the source file of orca and gpos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-927) Send Projection Info Data from HAWQ to PXF

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386817#comment-15386817
 ] 

ASF GitHub Bot commented on HAWQ-927:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/796


> Send Projection Info Data from HAWQ to PXF
> --
>
> Key: HAWQ-927
> URL: https://issues.apache.org/jira/browse/HAWQ-927
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: External Tables, PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
> Fix For: backlog
>
>
> To achieve column projection at the level of PXF or the underlying readers we 
> need to first send this data as a Header/Param to PXF. Currently, PXF has no 
> knowledge whether a query requires all columns or a subset of columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-936) Add GUC for array expansion in ORCA optimizer

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386816#comment-15386816
 ] 

ASF GitHub Bot commented on HAWQ-936:
-

Github user hsyuan commented on the issue:

https://github.com/apache/incubator-hawq/pull/800
  
This PR will give https://github.com/apache/incubator-hawq/pull/795 a free 
ride.


> Add GUC for array expansion in ORCA optimizer
> -
>
> Key: HAWQ-936
> URL: https://issues.apache.org/jira/browse/HAWQ-936
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Venkatesh
> Fix For: 2.0.1.0-incubating
>
>
> Consider the query with the following pattern select * from foo where foo.a 
> IN {1,2,3,...}. Currently, when the number of constants in the IN subquery is 
> large, the query optimization time is unacceptable. This is stopping 
> customers from turning Orca on by default since many of the queries are 
> generated queries with such a pattern.
> The root cause is due to the expansion of the IN subquery into an expression 
> in disjunctive normal form. The objective of this story is to disable this 
> expansion when the number of constants in the IN list is large.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-927) Send Projection Info Data from HAWQ to PXF

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386809#comment-15386809
 ] 

ASF GitHub Bot commented on HAWQ-927:
-

Github user shivzone commented on the issue:

https://github.com/apache/incubator-hawq/pull/796
  
+1


> Send Projection Info Data from HAWQ to PXF
> --
>
> Key: HAWQ-927
> URL: https://issues.apache.org/jira/browse/HAWQ-927
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: External Tables, PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
> Fix For: backlog
>
>
> To achieve column projection at the level of PXF or the underlying readers we 
> need to first send this data as a Header/Param to PXF. Currently, PXF has no 
> knowledge whether a query requires all columns or a subset of columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-938) Remove ivy.xml in gpopt and read orca version from header file

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386634#comment-15386634
 ] 

ASF GitHub Bot commented on HAWQ-938:
-

Github user hsyuan commented on the issue:

https://github.com/apache/incubator-hawq/pull/806
  
@changleicn @yaoj2 @wengyanqing 
Please take a look.


> Remove ivy.xml in gpopt and read orca version from header file
> --
>
> Key: HAWQ-938
> URL: https://issues.apache.org/jira/browse/HAWQ-938
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
> Fix For: 2.0.1.0-incubating
>
>
> Currently, if we want to upgrade orca or gpos, we need change the orca SHA as 
> well as the version number in ivy.xml. The function gp_opt_version() returns 
> version number that is read from ivy.xml, which is not a right way. It should 
> only be dependent on the source file of orca and gpos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-938) Remove ivy.xml in gpopt and read orca version from header file

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386632#comment-15386632
 ] 

ASF GitHub Bot commented on HAWQ-938:
-

GitHub user hsyuan opened a pull request:

https://github.com/apache/incubator-hawq/pull/806

HAWQ-938. Remove ivy.xml in gpopt and read orca version from header file

The old mechanism extracted the version numbers from the Ivy config file,
which doesn't do the right thing if you build without Ivy. Using the
version headers is simpler, anyway. Also removed `ivy.xml` and 
`ivy-build.xml`
under `gpopt` folder.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsyuan/incubator-hawq HAWQ-938

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/806.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #806


commit 2a2a89cc6b950d4067a8e4d8a5e79b2f7b1cf839
Author: Haisheng Yuan 
Date:   2016-07-20T20:14:25Z

HAWQ-938. Remove ivy.xml in gpopt and read orca version from header file

The old mechanism extracted the version numbers from the Ivy config file,
which doesn't do the right thing if you build without Ivy. Using the
version headers is simpler, anyway.




> Remove ivy.xml in gpopt and read orca version from header file
> --
>
> Key: HAWQ-938
> URL: https://issues.apache.org/jira/browse/HAWQ-938
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Optimizer
>Reporter: Haisheng Yuan
>Assignee: Haisheng Yuan
> Fix For: 2.0.1.0-incubating
>
>
> Currently, if we want to upgrade orca or gpos, we need change the orca SHA as 
> well as the version number in ivy.xml. The function gp_opt_version() returns 
> version number that is read from ivy.xml, which is not a right way. It should 
> only be dependent on the source file of orca and gpos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-927) Send Projection Info Data from HAWQ to PXF

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386188#comment-15386188
 ] 

ASF GitHub Bot commented on HAWQ-927:
-

Github user xunzhang commented on the issue:

https://github.com/apache/incubator-hawq/pull/796
  
LGTM. Remember rebasing the commit before checking in.  


> Send Projection Info Data from HAWQ to PXF
> --
>
> Key: HAWQ-927
> URL: https://issues.apache.org/jira/browse/HAWQ-927
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: External Tables, PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
> Fix For: backlog
>
>
> To achieve column projection at the level of PXF or the underlying readers we 
> need to first send this data as a Header/Param to PXF. Currently, PXF has no 
> knowledge whether a query requires all columns or a subset of columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-927) Send Projection Info Data from HAWQ to PXF

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386155#comment-15386155
 ] 

ASF GitHub Bot commented on HAWQ-927:
-

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/796#discussion_r71559534
  
--- Diff: src/backend/access/external/pxfheaders.c ---
@@ -158,6 +166,29 @@ static void add_tuple_desc_httpheader(CHURL_HEADERS 
headers, Relation rel)
pfree(formatter.data);
 }
 
+static void add_projection_desc_httpheader(CHURL_HEADERS headers, 
ProjectionInfo *projInfo) {
+   int i;
+   char long_number[sizeof(int32) * 8];
+   int *varNumbers = projInfo->pi_varNumbers;
+   StringInfoData formatter;
+   initStringInfo();
+
+/* Convert the number of projection columns to a string */
+pg_ltoa(list_length(projInfo->pi_targetlist), long_number);
+churl_headers_append(headers, "X-GP-ATTRS-PROJ", long_number);
+
+   for(i = 0; i < list_length(projInfo->pi_targetlist); i++) {
--- End diff --

Yes it will be in another PR related to 
[this](https://issues.apache.org/jira/browse/HAWQ-583?jql=project%20%3D%20HAWQ%20AND%20resolution%20%3D%20Unresolved%20AND%20assignee%20%3D%20kavinderd%20ORDER%20BY%20priority%20DESC)
 Jira.


> Send Projection Info Data from HAWQ to PXF
> --
>
> Key: HAWQ-927
> URL: https://issues.apache.org/jira/browse/HAWQ-927
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: External Tables, PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
> Fix For: backlog
>
>
> To achieve column projection at the level of PXF or the underlying readers we 
> need to first send this data as a Header/Param to PXF. Currently, PXF has no 
> knowledge whether a query requires all columns or a subset of columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-927) Send Projection Info Data from HAWQ to PXF

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386147#comment-15386147
 ] 

ASF GitHub Bot commented on HAWQ-927:
-

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/796#discussion_r71558813
  
--- Diff: src/backend/access/external/fileam.c ---
@@ -454,6 +454,21 @@ external_stopscan(FileScanDesc scan)
}
 }
 
+/* 
+ * external_getnext_init - prepare ExternalSelectDesc struct 
before external_getnext
+/* 
+ */
+
+ExternalSelectDesc
+external_getnext_init(PlanState *state) {
+   ExternalSelectDesc desc = (ExternalSelectDesc) 
palloc0(sizeof(ExternalSelectDescData));
--- End diff --

I missed adding `pfree` for `desc`. I added it to the end of 
`ExternalNext()`


> Send Projection Info Data from HAWQ to PXF
> --
>
> Key: HAWQ-927
> URL: https://issues.apache.org/jira/browse/HAWQ-927
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: External Tables, PXF
>Reporter: Kavinder Dhaliwal
>Assignee: Kavinder Dhaliwal
> Fix For: backlog
>
>
> To achieve column projection at the level of PXF or the underlying readers we 
> need to first send this data as a Header/Param to PXF. Currently, PXF has no 
> knowledge whether a query requires all columns or a subset of columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-932) HAWQ fails to query external table defined with "localhost" in URL

2016-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386111#comment-15386111
 ] 

ASF GitHub Bot commented on HAWQ-932:
-

Github user kavinderd commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/803#discussion_r71555148
  
--- Diff: src/backend/access/external/libchurl.c ---
@@ -312,6 +312,14 @@ CHURL_HANDLE churl_init_upload(const char* url, 
CHURL_HEADERS headers)
context->upload = true;
clear_error_buffer(context);
 
+   /* needed to resolve pxf service address */
+   struct curl_slist *resolve_hosts = NULL;
+   char *pxf_host_entry = (char *) palloc0(strlen(pxf_service_address) + 
strlen(LocalhostIpV4Entry) + 1);
--- End diff --

Is `pxf_host_entry` pfree'd?


> HAWQ fails to query external table defined with "localhost" in URL
> --
>
> Key: HAWQ-932
> URL: https://issues.apache.org/jira/browse/HAWQ-932
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables, PXF
>Reporter: Goden Yao
>Assignee: Oleksandr Diachenko
> Fix For: 2.0.1.0-incubating
>
>
> Originally reported by [~jpatel] when he's making a docker image based on 
> HAWQ 2.0.0.0-incubating dev build. Investigated by [~odiachenko]
> There is workaround to define it with 127.0.0.1, but there is not a 
> workaround for querying tables using HCatalog integration.
> It used to work before.
> {code}
> template1=# CREATE EXTERNAL TABLE ext_table1 (t1text, t2text,
> num1  integer, dub1  double precision) LOCATION
> (E'pxf://localhost:51200/hive_small_data?PROFILE=Hive') FORMAT 'CUSTOM'
> (formatter='pxfwritable_import');*
> CREATE EXTERNAL TABLE
> template1=# select * from ext_table1;
> ERROR:  remote component error (0): (libchurl.c:898)*
> {code}
> When I turned on debug mode in curl, I found this error in logs - "*
> Closing connection 0".
> I found a workaround, to set CURLOPT_RESOLVE option in curl:
> {code}
> struct curl_slist *host = NULL;
> host = curl_slist_append(NULL, "localhost:51200:127.0.0.1");*
> set_curl_option(context, CURLOPT_RESOLVE, host);
> {code}
> It seems like an issue with DNS cache,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


<    1   2   3   4   5   6   7   8   9   10   >