[ 
https://issues.apache.org/jira/browse/HCATALOG-425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445126#comment-13445126
 ] 

Travis Crawford commented on HCATALOG-425:
------------------------------------------

Ideally we can keep things simple, and avoid cases where HCatSchema/HCatRecord 
differ. Let's walk through an example reading with Pig.

Initially, Pig is going to ask HCat for the schema of the relation being 
loaded. This means querying the metastore, converting the table schema into an 
hcat schema, then converting the hcat schema into a pig schema. If we implement 
conversions in the hive-->hcat schema layer, pig always sees records in data 
types it has support for.
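
As a rough illustration of the schema-layer idea (not taken from the attached 
patches), promotion could be applied while the table schema is converted. 
SchemaPromotion, Type, Field and the promotionEnabled flag are hypothetical 
stand-ins, not real HCatalog classes or settings:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the HCatalog classes.
final class SchemaPromotion {
    enum Type { TINYINT, SMALLINT, INT, BIGINT, STRING }

    static final class Field {
        final String name;
        final Type type;

        Field(String name, Type type) {
            this.name = name;
            this.type = type;
        }
    }

    // Map a Hive column type to the type exposed to readers such as Pig.
    // With promotion enabled, TINYINT and SMALLINT are widened to INT.
    static Type promote(Type hiveType, boolean promotionEnabled) {
        if (promotionEnabled
                && (hiveType == Type.TINYINT || hiveType == Type.SMALLINT)) {
            return Type.INT;
        }
        return hiveType;
    }

    // Convert a whole table schema, field by field, keeping names unchanged.
    static List<Field> toExposedSchema(List<Field> hiveSchema,
                                       boolean promotionEnabled) {
        List<Field> exposed = new ArrayList<>();
        for (Field f : hiveSchema) {
            exposed.add(new Field(f.name, promote(f.type, promotionEnabled)));
        }
        return exposed;
    }
}
```

Because the promotion happens once, at the schema boundary, everything 
downstream only ever sees the promoted types.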

Now pig reads a record through HCat. HCat reads a record from the hive serde 
and converts it to an hcat record using whatever conversion rules have been 
enabled. This record is then converted to a pig tuple that matches the expected 
schema.
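
A sketch of the matching read-side value conversion, assuming the serde hands 
back boxed Java values (Byte for tinyint, Short for smallint). ReadConversion 
and widen are hypothetical names used only for this example:

```java
// Hypothetical helper for illustration; not part of HCatalog.
final class ReadConversion {
    // Widen Byte/Short field values to Integer so the record matches a
    // schema in which TINYINT/SMALLINT were promoted to INT.
    // All other values (including null) are passed through unchanged.
    static Object widen(Object value) {
        if (value instanceof Byte || value instanceof Short) {
            return Integer.valueOf(((Number) value).intValue());
        }
        return value;
    }
}
```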

Now let's write something. Pig will provide a tuple that we need to write into 
a table that might have a different schema. When converting the pig tuple into 
an hcat record, we apply conversion rules "on the way out" so that our hcat 
record and hcat schema match.
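
For the write path, a minimal sketch of narrowing "on the way out", combined 
with the runtime out-of-bounds check suggested in the issue description. 
WriteConversion and its method names are assumptions made for this example, 
and throwing IllegalArgumentException is just one possible way to report the 
error:

```java
// Hypothetical helper for illustration; not part of HCatalog.
final class WriteConversion {
    // Narrow an int coming from Pig to a TINYINT value, rejecting values
    // that do not fit instead of silently truncating them.
    static byte toTinyInt(int value) {
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Value " + value + " is out of range for TINYINT");
        }
        return (byte) value;
    }

    // Same check for SMALLINT.
    static short toSmallInt(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Value " + value + " is out of range for SMALLINT");
        }
        return (short) value;
    }
}
```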

I believe if we follow this approach the schema and records will always match, 
and we can avoid having to keep track of original data types, whether fields 
have been converted, etc. I do agree that if we need lots of these, a 
"conversion strategy impl" would start to make sense. I'm not sure we'll get to 
that place though; there are just a handful of conversions I know about.
                
> Pig cannot read/write SMALLINT/TINYINT columns
> ----------------------------------------------
>
>                 Key: HCATALOG-425
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-425
>             Project: HCatalog
>          Issue Type: Bug
>          Components: pig
>    Affects Versions: 0.4
>            Reporter: Thejas M Nair
>            Assignee: Travis Crawford
>             Fix For: 0.5
>
>         Attachments: HCATALOG-425_small_tiny_int.1.patch, 
> HCATALOG-425_small_tiny_int.2.patch, HCATALOG-425_small_tiny_int.3.patch
>
>
> Currently this throws an exception. We can always allow reads, and on the 
> write side we can do an out-of-bounds check at runtime.
> This issue, described in HCATALOG-168, has not been fixed. It still throws an 
> exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira