[ 
https://issues.apache.org/jira/browse/HCATALOG-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083787#comment-13083787
 ] 

Sushanth Sowmyan commented on HCATALOG-64:
------------------------------------------

@Alan: Quick reply for a couple of your points, the rest hold:

#3 : We want to keep these separate because we strip and add partition 
columns as needed - these columns are not actually stored in the data 
itself and are not part of the Table/Partition schema. We add them in when 
we read the data, and strip them when we write it. So it's a useful separation.
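
To illustrate the strip-on-write / add-on-read idea from #3, here is a rough 
sketch. It is not HCatalog code: the class and method names are made up for 
this example, and records are modelled as plain maps rather than HCatRecord 
objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the HCatalog API.
final class PartitionColumnSketch {

    // Write path: drop the partition columns from a record, since their
    // values are implied by the partition and are not stored in the data.
    static Map<String, Object> stripPartitionColumns(
            Map<String, Object> record, Map<String, String> partitionValues) {
        Map<String, Object> stored = new LinkedHashMap<>(record);
        stored.keySet().removeAll(partitionValues.keySet());
        return stored;
    }

    // Read path: re-attach the partition column values to a stored record.
    static Map<String, Object> addPartitionColumns(
            Map<String, Object> stored, Map<String, String> partitionValues) {
        Map<String, Object> record = new LinkedHashMap<>(stored);
        record.putAll(partitionValues);
        return record;
    }

    public static void main(String[] args) {
        // "ds" is an assumed partition column name for this example.
        Map<String, String> partition = Map.of("ds", "2011-08-11");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user", "alice");
        record.put("ds", "2011-08-11");

        Map<String, Object> stored = stripPartitionColumns(record, partition);
        System.out.println("written: " + stored);
        System.out.println("read:    " + addPartitionColumns(stored, partition));
    }
}
```

Keeping the partition columns in their own structure means both directions 
are a simple set operation against the data columns.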

#7 : Was discussed on hcatalog-dev mailing list as not belonging here:

The relevant bit from my response there:
--
a) For configuration parameters - HCatTableInfo by itself is not
supposed to contain any parameters specific to any storage drivers.
The reason for this is that HCatTableInfo is how the M/R programmer
passes on info to HCatInputFormat, and thus, should not contain
anything specific to any storage driver implementation as you mention.
So, there is already a place for that, and that is stored in the
table (and partition) metadata, in the Table and Partition objects, as
Table.getStorageDescriptor.getParameters() and
Partition.getStorageDescriptor.getParameters(). This is read by
HCatInputFormat/HCatOutputFormat and passed on to the respective ISD/OSD
as part of the initialize() call and also in the getInputFormat()
and getOutputFormat() calls, and all properties have a hcat.* keyname.
Have a look at PigStorageInputDriver as an example - it reads a delim
parameter. Or the RCFileInput/OutputDriver.
--

There is already a place for the implementation specific configuration 
information in metadata, which is where it'd be necessary to store it for any 
manner of persistence of this information.
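
As a rough sketch of that flow (driver-specific parameters live in the 
table/partition metadata under hcat.* keys and are handed to the driver in 
initialize()), something like the following. This is not the actual HCatalog 
code: the class names, the "hcat.example.delim" key, and the choice to let 
partition-level values override table-level ones are all assumptions made 
for the example.

```java
import java.util.Map;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical sketch; names and key strings are not taken from HCatalog.
final class DriverParamsSketch {

    static final String HCAT_PREFIX = "hcat.";

    // Collect the hcat.* entries from the table and partition parameter maps.
    // Assumption: partition-level values override table-level values.
    static Properties driverProperties(
            Map<String, String> tableParams, Map<String, String> partitionParams) {
        Properties props = new Properties();
        copyHcatKeys(tableParams, props);
        copyHcatKeys(partitionParams, props);
        return props;
    }

    private static void copyHcatKeys(Map<String, String> source, Properties target) {
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (e.getKey().startsWith(HCAT_PREFIX)) {
                target.setProperty(e.getKey(), e.getValue());
            }
        }
    }

    // Stand-in for a storage driver that reads a delimiter parameter.
    static final class DelimitedInputDriver {
        private String delim = "\t";

        void initialize(Properties props) {
            // "hcat.example.delim" is an invented key name for this sketch.
            delim = props.getProperty("hcat.example.delim", delim);
        }

        String[] split(String line) {
            return line.split(Pattern.quote(delim), -1);
        }
    }

    public static void main(String[] args) {
        Map<String, String> tableParams =
                Map.of("hcat.example.delim", ",", "unrelated.key", "ignored");
        Properties props = driverProperties(tableParams, Map.of());

        DelimitedInputDriver driver = new DelimitedInputDriver();
        driver.initialize(props);
        System.out.println(driver.split("a,b,c").length + " fields");
    }
}
```

The M/R programmer never touches these keys; they come from the metastore, 
which is also why they persist across jobs.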



> Refactor HCatTableInfo, JobInfo and OutputJobInfo
> -------------------------------------------------
>
>                 Key: HCATALOG-64
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-64
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.1, 0.2
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.2
>
>         Attachments: HCatTableInfo_JobInfo_OutputJobInfo_3.patch
>
>
> These classes and their roles have become convoluted. HCatTableInfo should be 
> an HCat abstraction of table and thus not have any job specific information 
> and should not contain different information depending on usage. *JobInfo 
> classes should contain job specific information (user provided, derived from 
> metastore info, etc). Since *JobInfo contains such information it should be 
> the object which is passed to HCatInputFormat.setInput and 
> HCatOutputFormat.setOutput. Also JobInfo should be renamed to InputJobInfo for 
> consistency and clarity. Also there needs to be a way to pass implementation 
> specific configuration information down to the actual storage driver.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
