[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067457#comment-16067457 ]

ASF GitHub Bot commented on HAWQ-1491:
--

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-hawq-docs/pull/126

> docs - add usage info for HiveVectorizedORC profile
> ---
>
>                 Key: HAWQ-1491
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1491
>             Project: Apache HAWQ
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Lisa Owen
>            Assignee: David Yozie
>
> add usage info and an example for the new HiveVectorizedORC profile to the
> Hive plug-in page.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16066931#comment-16066931 ]

ASF GitHub Bot commented on HAWQ-1491:
--

Github user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/126#discussion_r124606922

    --- Diff: markdown/pxf/HivePXF.html.md.erb ---
    @@ -565,6 +577,44 @@ In the following example, you will create a Hive table stored in ORC format and
     Time: 425.416 ms
     ```

    +### Example: Using the HiveVectorizedORC Profile
    +
    +In the following example, you will use the `HiveVectorizedORC` profile to query the `sales_info_ORC` Hive table you created in the previous example.
    +
    +**Note**: The `HiveVectorizedORC` profile does not support the timestamp data type and complex types.
    --- End diff --

    Just to avoid any potential confusion, let's change this to "**or** complex types."
[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065669#comment-16065669 ]

ASF GitHub Bot commented on HAWQ-1491:
--

Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/126#discussion_r124423336

    --- Diff: markdown/pxf/ReadWritePXF.html.md.erb ---
    @@ -105,6 +105,18 @@ Note: The DELIMITER parameter is mandatory.
     org.apache.hawq.pxf.service.io.GPDBWritable
    +
    +HiveVectorizedORC
    +Optimized block read of a Hive table where each partition is stored as an ORC file.
    --- End diff --

    People might confuse "block" here with an HDFS block, so maybe we can say "bulk read" or "batch read" instead.
[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065667#comment-16065667 ]

ASF GitHub Bot commented on HAWQ-1491:
--

Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/126#discussion_r124423164

    --- Diff: markdown/pxf/HivePXF.html.md.erb ---
    @@ -565,6 +577,44 @@ In the following example, you will create a Hive table stored in ORC format and
     Time: 425.416 ms
     ```

    +### Example: Using the HiveVectorizedORC Profile
    +
    +In the following example, you will use the `HiveVectorizedORC` profile to query the `sales_info_ORC` Hive table you created in the previous example.
    +
    +**Note**: The `HiveVectorizedORC` profile does not support the timestamp data type and complex types.
    +
    +1. Start the `psql` subsystem:
    +
    +``` shell
    +$ psql -d postgres
    +```
    +
    +2. Use the PXF `HiveVectorizedORC` profile to create a queryable HAWQ external table from the Hive table named `sales_info_ORC` that you created in Step 1 of the previous example. The `FORMAT` clause must specify `'CUSTOM'`. The `HiveVectorizedORC` `CUSTOM` format supports only the built-in `'pxfwritable_import'` `formatter`.
    --- End diff --

    queryable - maybe readable?
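
For readers of this thread, Step 2 in the diff above can be sketched as follows. This is only an illustration, not text from the pull request: the helper name, the external table name, the column list, the `namenode` host, port 51200, and the `default.` database prefix are all assumptions, and the `pxf://` URI layout should be checked against the Hive plug-in page. Only the profile name, the `'CUSTOM'` format, and the `'pxfwritable_import'` formatter come from the quoted text. The helper only composes the statement as a string; it does not connect to HAWQ.

``` python
from typing import List, Tuple


def hive_vectorized_orc_ddl(
    table: str,
    columns: List[Tuple[str, str]],
    hive_table: str,
    host: str = "namenode",  # assumed PXF host
    port: int = 51200,       # assumed PXF port
) -> str:
    """Hypothetical helper: build a CREATE EXTERNAL TABLE statement as text."""
    cols = ", ".join(f"{name} {type_}" for name, type_ in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('pxf://{host}:{port}/{hive_table}?PROFILE=HiveVectorizedORC')\n"
        "FORMAT 'CUSTOM' (formatter='pxfwritable_import');"
    )


# Assumed table and column names, for illustration only.
ddl = hive_vectorized_orc_ddl(
    "salesinfo_hivevectorc",
    [("location", "text"), ("month", "text"),
     ("num_orders", "int"), ("total_sales", "float8")],
    "default.sales_info_ORC",
)
print(ddl)
```

The printed statement could then be pasted into the `psql` session started in Step 1, after replacing the assumed values with those of the actual cluster.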
[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065665#comment-16065665 ]

ASF GitHub Bot commented on HAWQ-1491:
--

Github user sansanichfb commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/126#discussion_r124423088

    --- Diff: markdown/pxf/HivePXF.html.md.erb ---
    @@ -495,9 +500,16 @@ Use the `HiveORC` profile to access ORC format data. The `HiveORC` profile provi
     - `=`, `>`, `<`, `>=`, `<=`, `IS NULL`, and `IS NOT NULL` operators and comparisons between the `float8` and `float4` types
     - `IN` operator on arrays of `int2`, `int4`, `int8`, `boolean`, and `text`

    -- Complex type support - You can access Hive tables composed of array, map, struct, and union data types. PXF serializes each of these complex types to `text`.
    +When choosing an ORC-supporting profile, consider the following:
    +
    +- The `HiveORC` profile supports complex types. You can access Hive tables composed of array, map, struct, and union data types. PXF serializes each of these complex types to `text`.
    +
    +The `HiveVectorizedORC` profile does not support complex types.
    +
    +- The `HiveVectorizedORC` profile reads 1024 rows of data, while the `HiveORC` profile reads only a single row at a time.
    --- End diff --

    profile reads 1024 rows of data at once
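
The distinction raised in this comment (1024 rows at once versus one row at a time) can be illustrated with a toy sketch. This is not PXF code: the function names and the 2500-row input are made up for the example, and only the batch size of 1024 comes from the quoted diff.

``` python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

BATCH_SIZE = 1024  # batch size quoted in the diff under review


def read_row_at_a_time(rows: Iterable[T]) -> Iterator[List[T]]:
    """Yield one row per read, as a single-element list."""
    for row in rows:
        yield [row]


def read_in_batches(rows: Iterable[T], batch_size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield up to batch_size rows per read; the last batch may be shorter."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical 2500-row input, used only to count reads.
single_reads = sum(1 for _ in read_row_at_a_time(range(2500)))
batch_sizes = [len(batch) for batch in read_in_batches(range(2500))]
print(single_reads, batch_sizes)
```

In this sketch the same input takes 2500 reads one row at a time but only three batched reads, which is the sense of "at once" that the comment asks the docs to spell out.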
[jira] [Commented] (HAWQ-1491) docs - add usage info for HiveVectorizedORC profile
[ https://issues.apache.org/jira/browse/HAWQ-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065576#comment-16065576 ]

ASF GitHub Bot commented on HAWQ-1491:
--

GitHub user lisakowen opened a pull request:

    https://github.com/apache/incubator-hawq-docs/pull/126

    HAWQ-1491 - create usage docs for HiveVectorizedORC profile

    update hawq docs for new HiveVectorizedORC profile.
    - add example to hive plug-in page
    - include the profile and accessor/fragmenter/resolver classes in the appropriate tables in the other docs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lisakowen/incubator-hawq-docs feature/pxf-hivevectorizedorc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq-docs/pull/126.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #126

commit dfd692cb355e9505e82669c311c413c06ae8518e
Author: Lisa Owen
Date:   2017-06-27T19:56:45Z

    HAWQ-1491 - create usage docs for HiveVectorizedORC profile