[ 
https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Veena Basavaraj updated SQOOP-1803:
-----------------------------------
    Attachment: SQOOP-1803-POC-2.patch

Since we decided not to use the "distributed cache" for storing data, the idea of 
committing the config info to a file and then reading from it when the job 
finishes successfully is no longer an option.

This patch does a few things:
1. Introduces a config data object to be stored in the context. It is an object 
so that we can add more attributes to it in the future. It stores the data as an 
object with a corresponding type, so we don't store a map, a list, or any 
arbitrary input type.
2. Currently uses a convention to name the configs, but we could just as well 
ignore the key and have a name field in the config data that the user has to 
fill in when persisting.
3. There may be cases where not every config data entry stored in this "map" 
needs to be persisted. If there is no use case for that, I am happy to remove 
the "isPersistent" boolean.
4. Since the job config update APIs exist, this patch makes use of them rather 
than updating the entire job.
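
To make points 1 to 3 easier to discuss, here is a minimal sketch of what such a config data object could look like. It is not the code from the patch: the class names ConfigData and ConfigDataContext, the Type enum, and the key names are all hypothetical and chosen only for illustration. It shows a value stored together with its type, the "isPersistent" flag, and a name-to-data map from which only the persistent entries are selected.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical typed value holder; an illustration, not the class in the patch. */
final class ConfigData {
    /** Assumed set of supported types; the real patch may support others. */
    enum Type {
        STRING(String.class), LONG(Long.class), BOOLEAN(Boolean.class);

        private final Class<?> javaClass;

        Type(Class<?> javaClass) { this.javaClass = javaClass; }

        boolean accepts(Object value) { return javaClass.isInstance(value); }
    }

    private final Type type;
    private final Object value;
    private final boolean persistent;

    ConfigData(Type type, Object value, boolean persistent) {
        this.type = Objects.requireNonNull(type, "type");
        this.value = Objects.requireNonNull(value, "value");
        // Reject values that do not match the declared type, so maps, lists
        // and other arbitrary inputs cannot be stored.
        if (!type.accepts(value)) {
            throw new IllegalArgumentException("Value " + value + " is not of type " + type);
        }
        this.persistent = persistent;
    }

    Type getType() { return type; }
    Object getValue() { return value; }
    boolean isPersistent() { return persistent; }
}

/** Hypothetical name-to-data "map" kept in the context. */
final class ConfigDataContext {
    private final Map<String, ConfigData> entries = new LinkedHashMap<>();

    void put(String name, ConfigData data) {
        entries.put(Objects.requireNonNull(name, "name"), Objects.requireNonNull(data, "data"));
    }

    ConfigData get(String name) { return entries.get(name); }

    /** Returns only the entries flagged as persistent, in insertion order. */
    Map<String, ConfigData> persistentEntries() {
        Map<String, ConfigData> result = new LinkedHashMap<>();
        for (Map.Entry<String, ConfigData> e : entries.entrySet()) {
            if (e.getValue().isPersistent()) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(result);
    }
}

final class ConfigDataDemo {
    public static void main(String[] args) {
        ConfigDataContext ctx = new ConfigDataContext();
        // Key names follow a made-up naming convention (point 2).
        ctx.put("incremental.lastValue", new ConfigData(ConfigData.Type.LONG, 42L, true));
        ctx.put("scratch.note", new ConfigData(ConfigData.Type.STRING, "tmp", false));
        System.out.println(ctx.persistentEntries().keySet()); // prints [incremental.lastValue]
    }
}
```

In this sketch the persistent subset is what would be handed to the job config update APIs (point 4) once the job finishes. That call is left out because its exact signature is not given here.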

Feedback welcome

> JobManager and Execution Engine changes: Support for a injecting and pulling 
> out configs and job output in connectors 
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1803
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1803
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 2.0.0
>
>         Attachments: SQOOP-1803-POC-2.patch, SQOOP-1803-POC.patch
>
>
> The details are in the design wiki; as the implementation progresses, more 
> discussion can happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject an IncrementalConfig instance into the 
> FromJobConfiguration. The current MFromConfig and MToConfig can already hold 
> a list of configs, and a strong sentiment was expressed to keep it as a list, 
> so why not actually make use of it for the first time and group the 
> incremental-related configs in one config object?
> This task will prepare the FromJobConfiguration from the job config data, and 
> the ExtractorContext with the relevant values from the previous job run.
> This task will prepare the ToJobConfiguration from the job config data, and 
> the LoaderContext with the relevant values from the previous job run, if any.
> We will use DistributedCache to get state information out of the Extractor 
> and Loader and finally persist it into the Sqoop repository, depending on 
> SQOOP-1804, once the OutputCommitter commit is called.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
