[jira] [Commented] (HCATALOG-64) Refactor HCatTableInfo, JobInfo and OutputJobInfo

Alan Gates (JIRA) Thu, 11 Aug 2011 15:28:54 -0700

    [ 
https://issues.apache.org/jira/browse/HCATALOG-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083752#comment-13083752
 ]


Alan Gates commented on HCATALOG-64:
------------------------------------

1) This patch includes things that aren't related to the core issue.  For 
example, it's refactoring import lists of test classes.  A patch should address 
just issues related to its JIRA.  Please file a separate JIRA for import 
refactoring.  As a side note on this, don't refactor import statements to use 
'import x.*'.  Importing .* is usually bad.  We should import the classes we 
need.  Given that the patch also randomly moves around import statements I'm 
guessing your IDE may be doing this automatically.  If so, please turn that off 
for HCatalog patches.

2) This patch should not go in until we have branched for 0.2.  We don't want 
these interface changes in 0.2

3) In HCatTableInfo, why are there separate schemas for dataColumns and 
partitionColumns?

4) InputJobInfo and HCatTableInfo both contain dbName and tableName.  Shouldn't 
they only be in HCatTableInfo, since InputJobInfo has a reference to the former?

5) It's InputJobInfo.dbName but OutputJobInfo.databaseName.  We should pick one 
and stick with it.  Same question as above about why these are here at all.

6) In some places I notice you reformatted from 4 spaces to 2 for indention.  
Despite many of the files in HCatalog being indented 2, 4 is the official 
indention.  Except where you are editing an existing file that is already 2 and 
you want to be consistent with the formatting you should use 4 spaces.  Also, 
please don't reformat code when fixing bugs (see point 1 above).

7) One of your stated objectives is to "there needs to be a way to pass 
implementation specific configuration information down to the actual storage 
driver".  I missed where you made this change.

> Refactor HCatTableInfo, JobInfo and OutputJobInfo
> -------------------------------------------------
>
>                 Key: HCATALOG-64
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-64
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.1, 0.2
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.2
>
>         Attachments: HCatTableInfo_JobInfo_OutputJobInfo_3.patch
>
>
> These classes and their roles has become convoluted. HCatTableInfo should be 
> an HCat abstraction of table and thus not have any job specific information 
> and should not contain different information depending on usage. *JobInfo 
> classes should contain job specific information (user provided, derived from 
> metastore info, etc). Since *JobInfo contains such information it should be 
> the object which is passed to HCatInputFormat.setInput and 
> HCatInputFormat.setOutput. Also JobInfo should be renamed to InputJobInfo for 
> consistency and clarity. Also there needs to be a way to pass implementation 
> specific configuration information down to the actual storage driver.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HCATALOG-64) Refactor HCatTableInfo, JobInfo and OutputJobInfo

Reply via email to