[ 
https://issues.apache.org/jira/browse/HCATALOG-341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Travis Crawford updated HCATALOG-341:
-------------------------------------

    Attachment: HCATALOG-341.2.patch

Finally updated this patch against trunk.

I'm playing around with a workflow that uses CI to build my feature branches 
before posting the patches, which hopefully makes testing really easy and shows 
the reviewer that the patch works correctly. Here's the link to CI for this patch:

https://travis.ci.cloudbees.com/job/HCATALOG-341_initializeinput_improvements/5/

On CI, TestSemanticAnalysis kept failing, and while taking a look I converted 
it to extend HCatBaseTest to get the standard test setup. The only functional 
change is in testAddPartPass, where I changed the directory from /tmp to the 
test data dir so it references things inside the build root. This also has the 
benefit of making the test run in my IDE, which is super handy.
                
> InitializeInput improvements
> ----------------------------
>
>                 Key: HCATALOG-341
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-341
>             Project: HCatalog
>          Issue Type: Improvement
>            Reporter: Travis Crawford
>            Assignee: Travis Crawford
>         Attachments: HCATALOG-341.2.patch, HCATALOG-341.patch
>
>
> This came up in HCATALOG-328.
> {{InitializeInput}} is the HCatalog class that queries the HiveMetaStore and 
> stores the query result. It could be improved in the following ways:
> * The class has entirely static methods, so a private arg-less constructor 
> should be added to prevent people from accidentally creating instances.
> * Instead of querying the HiveMetaStore each time info is requested, the 
> results should be cached after the first query using a key of db+table+filter.
> * {{setInput}} and {{getSerializedHcatKeyJobInfo}} require an existing 
> {{InputJobInfo}} argument; however, the point of calling those methods is to 
> populate an {{InputJobInfo}} with info from the metastore. While this reduces 
> the number of arguments (instead of needing database name, table name, 
> partition filter), it confuses the user because it's not clear that only 
> db/table/filter should be set when passed as an argument.
> * {{getSerializedHcatKeyJobInfo}} should be renamed to {{getInputJobInfo}} and 
> return an unserialized {{InputJobInfo}}. This avoids unnecessary 
> serialization/deserialization in the front-end when it's not necessary to read 
> from the job configuration.
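
To make the first two bullets concrete, here is a rough sketch of what a static-only class with a private constructor and a db+table+filter cache might look like. This is not the code in the attached patch: the names InitializeInputSketch, CacheKey, TableInfoSketch and queryMetastore are all hypothetical, and the metastore call is stubbed out with a counter so the caching behaviour can be observed.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the metastore result; NOT the real InputJobInfo. */
final class TableInfoSketch {
    final String database;
    final String table;
    final String filter; // null when no partition filter is given

    TableInfoSketch(String database, String table, String filter) {
        this.database = database;
        this.table = table;
        this.filter = filter;
    }
}

/** Hypothetical cache key combining db + table + filter. */
final class CacheKey {
    private final String database;
    private final String table;
    private final String filter;

    CacheKey(String database, String table, String filter) {
        this.database = Objects.requireNonNull(database, "database");
        this.table = Objects.requireNonNull(table, "table");
        this.filter = filter;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CacheKey)) {
            return false;
        }
        CacheKey that = (CacheKey) other;
        return database.equals(that.database)
                && table.equals(that.table)
                && Objects.equals(filter, that.filter);
    }

    @Override
    public int hashCode() {
        return Objects.hash(database, table, filter);
    }
}

/** Sketch of a static-only utility class; all names here are assumptions. */
final class InitializeInputSketch {
    private static final ConcurrentMap<CacheKey, TableInfoSketch> CACHE =
            new ConcurrentHashMap<>();

    /** Counts stubbed metastore calls so the caching can be observed. */
    static final AtomicInteger METASTORE_QUERIES = new AtomicInteger();

    /** Private arg-less constructor: the class only has static methods. */
    private InitializeInputSketch() {
    }

    /** Returns the cached result, querying only on the first request per key. */
    static TableInfoSketch getInputJobInfo(String database, String table, String filter) {
        CacheKey key = new CacheKey(database, table, filter);
        return CACHE.computeIfAbsent(key, k -> queryMetastore(database, table, filter));
    }

    /** Stub standing in for the real HiveMetaStore query. */
    private static TableInfoSketch queryMetastore(String database, String table, String filter) {
        METASTORE_QUERIES.incrementAndGet();
        return new TableInfoSketch(database, table, filter);
    }
}
```

In this sketch the caller passes only database, table and filter and gets an unserialized object back, which is the shape the last two bullets ask for. A real version would also need to decide when cached entries become stale, which the sketch ignores.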


        
