[
https://issues.apache.org/jira/browse/HCATALOG-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083804#comment-13083804
]
Francis Liu commented on HCATALOG-64:
-------------------------------------
Thanks for the comments.
1) Yes, my IDE was doing this for me automatically. I will turn it off and fix
this.
2) When is 0.2 going to be branched?
3) HCatOutputFormat.getTableSchema() and HCatInputFormat.getTableSchema() return
different things: the former returns only dataColumns, while the latter returns
dataColumns+partitionColumns, hence the need to keep them separate. I also see
keeping them separate as a good thing, since nothing in a field schema
identifies it as a partition or data field. Adding such a marker to the field
schema could be another way to go?
4) HCatTableInfo is treated as an abstraction of Hive's Table class, while
InputJobInfo is a user-specified configuration/action object. If we change
setInput() and setOutput() to take explicit arguments for the database and
table name, we can remove the redundancy, though that doesn't look too good
either. I'm open to suggestions.
5) Agreed, will rename them to be consistent.
6) Noted, will change this. I'm not familiar with this indenting scheme. Any
pointers would be appreciated.
7) InputJobInfo.getProperties() and OutputJobInfo.getProperties() would enable
users to pass implementation-specific parameters.
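
To make points 3 and 7 concrete, here is a minimal sketch in Java. It is not
the code in the patch: TableInfo, InputJobInfo, their fields, and the property
key "example.driver.option" are hypothetical stand-ins, and columns are plain
strings rather than real field schemas. It only illustrates keeping data and
partition columns in separate lists, and exposing a properties bag on the job
object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch only; these are NOT the real HCatalog classes.
final class JobInfoSketch {

    /** Table abstraction that keeps data and partition columns in separate lists. */
    static final class TableInfo {
        final List<String> dataColumns;
        final List<String> partitionColumns;

        TableInfo(List<String> dataColumns, List<String> partitionColumns) {
            // Defensive copies so callers cannot mutate the table description.
            this.dataColumns = Collections.unmodifiableList(new ArrayList<>(dataColumns));
            this.partitionColumns = Collections.unmodifiableList(new ArrayList<>(partitionColumns));
        }

        /** Reader-side view: data columns followed by partition columns. */
        List<String> allColumns() {
            List<String> all = new ArrayList<>(dataColumns);
            all.addAll(partitionColumns);
            return all;
        }
    }

    /** Job-level object: what the user asked for, plus free-form properties. */
    static final class InputJobInfo {
        final String databaseName;
        final String tableName;
        private final Properties properties = new Properties();

        InputJobInfo(String databaseName, String tableName) {
            this.databaseName = databaseName;
            this.tableName = tableName;
        }

        /** Bag for implementation-specific settings. */
        Properties getProperties() {
            return properties;
        }
    }

    public static void main(String[] args) {
        TableInfo table = new TableInfo(Arrays.asList("user", "url"), Arrays.asList("dt"));
        InputJobInfo job = new InputJobInfo("default", "clicks");
        job.getProperties().setProperty("example.driver.option", "value");

        System.out.println(table.dataColumns);   // writer-side view
        System.out.println(table.allColumns());  // reader-side view
        System.out.println(job.getProperties().getProperty("example.driver.option"));
    }
}
```

With the two lists kept apart, the writer-side schema is just dataColumns and
the reader-side schema is the concatenation, so no per-field marker is needed.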
> Refactor HCatTableInfo, JobInfo and OutputJobInfo
> -------------------------------------------------
>
> Key: HCATALOG-64
> URL: https://issues.apache.org/jira/browse/HCATALOG-64
> Project: HCatalog
> Issue Type: Improvement
> Affects Versions: 0.1, 0.2
> Reporter: Francis Liu
> Assignee: Francis Liu
> Fix For: 0.2
>
> Attachments: HCatTableInfo_JobInfo_OutputJobInfo_3.patch
>
>
> These classes and their roles have become convoluted. HCatTableInfo should be
> an HCat abstraction of a table and thus not have any job-specific information,
> and should not contain different information depending on usage. *JobInfo
> classes should contain job-specific information (user provided, derived from
> metastore info, etc). Since *JobInfo contains such information, it should be
> the object which is passed to HCatInputFormat.setInput and
> HCatOutputFormat.setOutput. Also, JobInfo should be renamed to InputJobInfo for
> consistency and clarity. Finally, there needs to be a way to pass
> implementation-specific configuration information down to the actual storage
> driver.