[ https://issues.apache.org/jira/browse/HAWQ-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851828#comment-15851828 ]
ASF GitHub Bot commented on HAWQ-1304:
--------------------------------------

Github user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/94#discussion_r99383249

    --- Diff: markdown/pxf/PXFExternalTableandAPIReference.html.md.erb ---
    @@ -27,48 +27,66 @@ The PXF Java API lets you extend PXF functionality and add new services and form
     The Fragmenter produces a list of data fragments that can be read in parallel from the data source. The Accessor produces a list of records from a single fragment, and the Resolver both deserializes and serializes records.
    -Together, the Fragmenter, Accessor, and Resolver classes implement a connector. PXF includes plug-ins for tables in HDFS, HBase, and Hive.
    +Together, the Fragmenter, Accessor, and Resolver classes implement a connector. PXF includes plug-ins for HDFS and JSON files and tables in HBase and Hive.
     ## <a id="creatinganexternaltable"></a>Creating an External Table
    -The syntax for a readable `EXTERNAL TABLE` that uses the PXF protocol is as follows:
    +The syntax for an `EXTERNAL TABLE` that uses the PXF protocol is as follows:
     ``` sql
    -CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name
    -        ( column_name data_type [, ...] | LIKE other_table )
    -LOCATION('pxf://host[:port]/path-to-data<pxf parameters>[&custom-option=value...]')
    +CREATE [READABLE|WRITABLE] EXTERNAL TABLE <table_name>
    +        ( <column_name> <data_type> [, ...] | LIKE <other_table> )
    +LOCATION('pxf://<host>[:<port>]/<path-to-data>?<pxf-parameters>[&<custom-option>=<value>[...]]')
     FORMAT 'custom' (formatter='pxfwritable_import|pxfwritable_export');
     ```
    - where *<pxf parameters>* is:
    + where \<pxf\-parameters\> is:
     ``` pre
    -   ?FRAGMENTER=fragmenter_class&ACCESSOR=accessor_class&RESOLVER=resolver_class]
    - | ?PROFILE=profile-name
    +   [FRAGMENTER=<fragmenter_class>&ACCESSOR=<accessor_class>
    +    &RESOLVER=<resolver_class>] | ?PROFILE=profile-name
     ```
    +
    +T <caption><span class="tablecap">Table 1. Parameter values and description</span></caption>
     <a id="creatinganexternaltable__table_pfy_htz_4p"></a>
     | Parameter | Value and description |
     |-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
    -| host | The current host of the PXF service. |
    -| port | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200 by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
    -| *path\_to\_data* | A directory, file name, wildcard pattern, table name, etc. |
    -| FRAGMENTER | The plug-in (Java class) to use for fragmenting data. Used for READABLE external tables only. |
    -| ACCESSOR | The plug-in (Java class) to use for accessing the data. Used for READABLE and WRITABLE tables. |
    -| RESOLVER | The plug-in (Java class) to use for serializing and deserializing the data. Used for READABLE and WRITABLE tables. |
    -| *custom-option*=*value* | Additional values to pass to the plug-in class. The parameters are passed at runtime to the plug-ins indicated above. The plug-ins can lookup custom options with `org.apache.hawq.pxf.api.utilities.InputData`. |
    +| host | The HDFS NameNode. |
    +| port | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200, by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
    +| \<path\-to\-data\> | A directory, file name, wildcard pattern, table name, etc. |
    +| PROFILE | The profile PXF should use to access the data. PXF supports multiple plug-ins that currently expose profiles named `HBase`, `Hive`, `HiveRC`, `HiveText`, `HiveORC`, `HdfsTextSimple`, `HdfsTextMulti`, `Avro`, `SequenceWritable`, and `Json`. |
    +| FRAGMENTER | The Java class the plug-in uses for fragmenting data. Used for READABLE external tables only. |
    +| ACCESSOR | The Java class the plug-in uses for accessing the data. Used for READABLE and WRITABLE tables. |
    +| RESOLVER | The Java class the plug-in uses for serializing and deserializing the data. Used for READABLE and WRITABLE tables. |
    +| \<custom-option\> | Additional values to pass to the plug-in at runtime. A plug-in can parse custom options with the PXF helper class `org.apache.hawq.pxf.api.utilities.InputData`. |
     **Note:** When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification.
    -For more information about this example, see [About the Java Class Services and Formats](#aboutthejavaclassservicesandformats).
     ## <a id="aboutthejavaclassservicesandformats"></a>About the Java Class Services and Formats
    -The `LOCATION` string in a PXF `CREATE EXTERNAL TABLE` statement is a URI that specifies the host and port of an external data source and the path to the data in the external data source. The query portion of the URI, introduced by the question mark (?), must include the required parameters `FRAGMENTER` (readable tables only), `ACCESSOR`, and `RESOLVER`, which specify Java class names that extend the base PXF API plug-in classes. Alternatively, the required parameters can be replaced with a `PROFILE` parameter with the name of a profile defined in the `/etc/conf/pxf-profiles.xml` that defines the required classes.
    +The `LOCATION` string in a PXF `CREATE EXTERNAL TABLE` statement is a URI that specifies the host and port of an external data source and the path to the data in the external data source. The query portion of the URI, introduced by the question mark (?), must include the PXF profile name or the plug-in's `FRAGMENTER` (readable tables only), `ACCESSOR`, and `RESOLVER` class names.
    +
    +PXF profiles are defined in the `/etc/pxf/conf/pxf-profiles.xml` file. Profile definitions include plug-in class names. For example, the `HdfsTextSimple` profile definition follows:
    --- End diff --

    Change "follows" to "is"

> documentation changes for HAWQ-1228
> -----------------------------------
>
>                 Key: HAWQ-1304
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1304
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Documentation
>    Affects Versions: 2.1.0.0-incubating
>            Reporter: Lisa Owen
>            Assignee: David Yozie
>            Priority: Minor
>
> - new pxf-profiles.xml outputFormat parameter
> - hive table access via external table and hcatalog now uses optimal profile
>   for each fragment
> - others

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)