[
https://issues.apache.org/jira/browse/HCATALOG-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445126#comment-13445126
]
Travis Crawford commented on HCATALOG-425:
------------------------------------------
Ideally we can keep things simple, and avoid cases where HCatSchema/HCatRecord
differ. Let's walk through an example reading with Pig.
Initially, Pig asks HCat for the schema of the relation being
loaded. This means querying the metastore, converting the table schema into an
HCat schema, then converting the HCat schema into a Pig schema. If we implement
conversions in the Hive-->HCat schema layer, Pig always sees records in data
types it supports.
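
As a rough illustration of that schema-layer conversion, here is a minimal
sketch in Python. It is not the HCatalog API: the PROMOTIONS mapping, the
promote_schema function, and the (name, type) tuple representation are all
hypothetical, and it assumes TINYINT and SMALLINT are both widened to INT.

```python
# Hypothetical mapping, for illustration only: narrow Hive integer types
# are widened to a type Pig can represent.
PROMOTIONS = {"tinyint": "int", "smallint": "int"}


def promote_schema(hive_columns):
    """Return (name, type) pairs with narrow integer types widened to int."""
    return [
        (name, PROMOTIONS.get(col_type, col_type))
        for name, col_type in hive_columns
    ]


# Example table schema as it might come back from the metastore.
hive_columns = [
    ("id", "bigint"),
    ("flags", "tinyint"),
    ("count", "smallint"),
    ("name", "string"),
]
hcat_columns = promote_schema(hive_columns)
```

Because the promotion happens once, at the schema boundary, everything
downstream only ever sees the widened types.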
Now Pig reads a record through HCat. HCat reads a record from the Hive SerDe
and converts it to an HCat record using whatever conversion rules have been
enabled. This record is then converted to a Pig tuple that matches the expected
schema.
Now let's write something. Pig will provide a tuple that we need to write into
a table that might have a different schema. When converting the Pig tuple into
an HCat record, we apply conversion rules "on the way out" so that our HCat
record and HCat schema match.
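
A minimal sketch of the write-side conversion with a runtime range check,
again in Python and again not the HCatalog API. The names RANGES,
to_table_value, and convert_tuple are hypothetical; the bounds are the
standard signed 8-bit and 16-bit ranges, and passing nulls through unchanged
is an assumption.

```python
# Signed 8-bit and 16-bit bounds, keyed by a hypothetical type name.
RANGES = {"tinyint": (-128, 127), "smallint": (-32768, 32767)}


def to_table_value(value, table_type):
    """Check that a value fits the narrower table column before writing."""
    if table_type not in RANGES or value is None:
        # Other types, and nulls, pass through unchanged (an assumption).
        return value
    low, high = RANGES[table_type]
    if not low <= value <= high:
        raise ValueError(
            f"{value} is out of range for {table_type} [{low}, {high}]"
        )
    return value


def convert_tuple(pig_tuple, table_types):
    """Apply the per-column check to every field of the outgoing tuple."""
    return [to_table_value(v, t) for v, t in zip(pig_tuple, table_types)]


# 7 fits in a tinyint and 300 fits in a smallint, so this row is accepted.
row = convert_tuple((7, 300, "a"), ["tinyint", "smallint", "string"])
```

A value such as 200 destined for a tinyint column would raise ValueError
here, so it is rejected rather than silently truncated.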
I believe if we follow this approach, the schema and records will always match,
and we can avoid having to keep track of original data types, whether fields
have been converted, etc. I do agree that if we need lots of these, a
"conversion strategy impl" would start to make sense. I'm not sure we'll get to
that place though - there are just a handful of conversions I know about.
> Pig cannot read/write SMALLINT/TINYINT columns
> ----------------------------------------------
>
> Key: HCATALOG-425
> URL: https://issues.apache.org/jira/browse/HCATALOG-425
> Project: HCatalog
> Issue Type: Bug
> Components: pig
> Affects Versions: 0.4
> Reporter: Thejas M Nair
> Assignee: Travis Crawford
> Fix For: 0.5
>
> Attachments: HCATALOG-425_small_tiny_int.1.patch,
> HCATALOG-425_small_tiny_int.2.patch, HCATALOG-425_small_tiny_int.3.patch
>
>
> Currently this throws an exception. We can always allow reads, and on the
> write side we can do an out-of-bounds check at runtime.
> This issue, described in HCATALOG-168, has not been fixed. It still throws an
> exception.