[
https://issues.apache.org/jira/browse/HAWQ-450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shivram Mani updated HAWQ-450:
------------------------------
Description:
File formats such as Avro and JSON carry schema information along with the data.
For other formats such as text/CSV, schema inference is more complex.
Introduce additional parameters in the PXF API, inferSchema and header, in order to
auto-discover the schema.
Spark offers a similar feature: for example, https://github.com/databricks/spark-csv
provides options for schema inference.
was:Automatically infer schema of HBase table and map it to HAWQ.
> Schema auto discovery on HDFS
> -----------------------------
>
> Key: HAWQ-450
> URL: https://issues.apache.org/jira/browse/HAWQ-450
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: PXF
> Reporter: Shivram Mani
> Assignee: Goden Yao
>
> File formats such as Avro and JSON carry schema information along with the
> data. For other formats such as text/CSV, schema inference is more complex.
> Introduce additional parameters in the PXF API, inferSchema and header, in
> order to auto-discover the schema.
> Spark offers a similar feature: for example,
> https://github.com/databricks/spark-csv provides options for schema inference.
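
To illustrate what header and inferSchema could mean for text/CSV data, here is a
minimal sketch in plain Python. It is not PXF code and does not reflect any
existing PXF API: the function names (infer_type, infer_schema), the sample_rows
parameter, the generated col0, col1, ... column names, and the type names int,
float, and text are all assumptions made for this example. The idea is to read
column names from the first row when header is set, then sample rows and widen
each column's type until every sampled value fits.

```python
import csv
import io
import itertools

# Widening order from narrowest to widest. These names are illustrative
# placeholders, not HAWQ or PXF type names.
TYPE_ORDER = ["int", "float", "text"]


def infer_type(value):
    """Return the narrowest type name that can represent one CSV field."""
    for name, cast in (("int", int), ("float", float)):
        try:
            cast(value)
            return name
        except ValueError:
            pass
    return "text"


def infer_schema(lines, header=True, sample_rows=100):
    """Infer (column name, type name) pairs from an iterable of CSV lines.

    header=True  : take column names from the first row.
    header=False : generate names col0, col1, ... and treat the first row as data.
    """
    reader = csv.reader(lines)
    first = next(reader, None)
    if first is None:
        return []  # empty input, nothing to infer

    if header:
        names = first
        rows = reader
    else:
        names = [f"col{i}" for i in range(len(first))]
        rows = itertools.chain([first], reader)

    types = [None] * len(names)
    for row in itertools.islice(rows, sample_rows):
        for i, value in enumerate(row[: len(names)]):
            if value == "":
                continue  # empty fields do not constrain the type
            candidate = infer_type(value)
            current = types[i]
            if current is None or TYPE_ORDER.index(candidate) > TYPE_ORDER.index(current):
                types[i] = candidate

    # Columns with no non-empty sampled values fall back to text.
    return [(name, t or "text") for name, t in zip(names, types)]


if __name__ == "__main__":
    sample = io.StringIO("id,price,name\n1,2.5,foo\n2,3,bar\n")
    for column, type_name in infer_schema(sample, header=True):
        print(column, type_name)
```

Sampling only the first rows keeps inference cheap on large HDFS files, at the
cost of possibly choosing too narrow a type; a real implementation would need to
decide how to handle rows that later violate the inferred schema.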