[ 
https://issues.apache.org/jira/browse/HCATALOG-64?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084300#comment-13084300
 ] 

Sushanth Sowmyan commented on HCATALOG-64:
------------------------------------------


I'm going to split my position into wants, likes and nice-to-haves:

--
Want:

+ I want one important requirement to hold: a user who has no idea what 
storage underlies a table should be able to use a trivial no-parameter read 
on HCat to get data from that table.


Like:

+ I would like it not to be possible to write jobs that work "only" on top of 
HBase (or any other storage driver, for that matter), because that 
compromises one of HCat's goals, namely portability/migratability of data. 
This might be a goal to take on later, once we know which features of a 
"table" we intend to mimic across HDFS/HBase/other stores as an HCatTable.

Nice-to-have:

+ I would further add that when a user submits a no-parameters job, I'd like 
it to read the latest version of the table as of job submission time, but I'm 
open to discussing that if it's not possible or if there are other aspects 
to consider.
--

I'm okay with having a Properties mechanism to pass in the parameters in the 
interim, provided we revisit it later to make sure it is not being misused.
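
To make the combination of a no-parameter read and an optional Properties 
mechanism concrete, here is a rough Java sketch. It is not the API from the 
patch: the class TableReadSketch, its methods and the property key are all 
made up for illustration. The only point it shows is that parameters stay 
optional and every lookup has a fallback, so a read with no parameters 
still works.

```java
import java.util.Properties;

// Hypothetical sketch: none of these names come from the HCatalog code base.
final class TableReadSketch {
    private final String database;
    private final String table;
    private final Properties hints; // storage-specific parameters, always optional

    private TableReadSketch(String database, String table, Properties hints) {
        this.database = database;
        this.table = table;
        // Defensive copy so later changes by the caller do not leak in.
        this.hints = new Properties();
        if (hints != null) {
            this.hints.putAll(hints);
        }
    }

    // The "want": a read that needs no storage-specific parameters at all.
    static TableReadSketch of(String database, String table) {
        return new TableReadSketch(database, table, null);
    }

    // The interim mechanism: the same read plus optional parameters.
    static TableReadSketch of(String database, String table, Properties hints) {
        return new TableReadSketch(database, table, hints);
    }

    // A storage driver looks a parameter up with a fallback,
    // so a missing parameter is never an error.
    String hint(String key, String defaultValue) {
        return hints.getProperty(key, defaultValue);
    }

    boolean hasHints() {
        return !hints.isEmpty();
    }

    String qualifiedName() {
        return database + "." + table;
    }

    public static void main(String[] args) {
        TableReadSketch plain = TableReadSketch.of("default", "events");

        Properties p = new Properties();
        p.setProperty("example.driver.version", "42"); // made-up key
        TableReadSketch hinted = TableReadSketch.of("default", "events", p);

        // Both reads name the same table; only the optional parameters differ.
        System.out.println(plain.qualifiedName() + " -> "
                + plain.hint("example.driver.version", "latest"));
        System.out.println(hinted.qualifiedName() + " -> "
                + hinted.hint("example.driver.version", "latest"));
    }
}
```

Keeping the parameters in a separate, optional bag also makes it easy to 
audit later which jobs depend on driver-specific keys.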

Regarding your replies:
a) As long as it works when people don't supply "hints", I'm okay with it for 
now.
b) I agree on the invasiveness, and it's beyond the scope of this patch, but I 
would like to take this on as an eventual design todo exercise.
c) Thanks!

And I look forward to your updated design doc. :)



> Refactor HCatTableInfo, JobInfo and OutputJobInfo
> -------------------------------------------------
>
>                 Key: HCATALOG-64
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-64
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.1, 0.2
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.2
>
>         Attachments: HCatTableInfo_JobInfo_OutputJobInfo_3.patch
>
>
> These classes and their roles have become convoluted. HCatTableInfo should be 
> an HCat abstraction of a table and thus should not hold any job-specific 
> information or contain different information depending on usage. *JobInfo 
> classes should contain job-specific information (user provided, derived from 
> metastore info, etc.). Since *JobInfo contains such information, it should be 
> the object which is passed to HCatInputFormat.setInput and 
> HCatOutputFormat.setOutput. JobInfo should also be renamed to InputJobInfo 
> for consistency and clarity. Finally, there needs to be a way to pass 
> implementation-specific configuration information down to the actual storage 
> driver.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
