[
https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Veena Basavaraj updated SQOOP-1803:
-----------------------------------
Attachment: SQOOP-1803-POC-2.patch
Since we decided not to use "distributed cache" for storing data, the idea of
committing the config info into a file and then reading from it when the job
finishes successfully is no longer an option.
This patch does a few things:
1. Introduces a config data object to be stored in the context. It is an object
so that we can add more attributes to it in the future. It stores the data as an
object with a corresponding type, so we don't store a map or list or any other
possible input type.
2. Currently uses a convention to name the configs, but we could just as well
ignore the key and have a name field in the config data that the user has to
fill in while persisting.
3. There may be cases where not every config data object stored in this "map"
needs to be persisted, but if there is no use case for that, I am happy to
remove the "isPersistent" boolean.
4. Since the job config update APIs exist, this patch makes use of them rather
than updating the entire job.
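To make points 1 to 3 easier to discuss, here is a minimal sketch of what such a
config data object and the context "map" could look like. It is not taken from
the patch: the names ConfigDataSketch, ConfigData, ValueType, persistentEntries
and the example keys are all hypothetical, and the type tag is assumed to be a
simple enum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names come from the patch.
class ConfigDataSketch {

    /** Assumed type tag stored next to the value so a reader need not guess it. */
    enum ValueType { STRING, LONG, BOOLEAN }

    /** One entry of the context "map": a value, its type, and a persistence flag. */
    static final class ConfigData {
        final Object value;
        final ValueType type;
        final boolean persistent;

        ConfigData(Object value, ValueType type, boolean persistent) {
            this.value = value;
            this.type = type;
            this.persistent = persistent;
        }
    }

    /** Returns only the entries flagged as persistent, keyed by config name. */
    static Map<String, ConfigData> persistentEntries(Map<String, ConfigData> context) {
        Map<String, ConfigData> result = new LinkedHashMap<>();
        for (Map.Entry<String, ConfigData> entry : context.entrySet()) {
            if (entry.getValue().persistent) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Keys follow an assumed naming convention: "<config name>.<input name>".
        Map<String, ConfigData> context = new LinkedHashMap<>();
        context.put("incrementalConfig.lastValue",
                new ConfigData(42L, ValueType.LONG, true));
        context.put("scratch.note",
                new ConfigData("temp", ValueType.STRING, false));

        // Only the persistent entries would be handed to the job config update APIs.
        for (Map.Entry<String, ConfigData> entry : persistentEntries(context).entrySet()) {
            ConfigData data = entry.getValue();
            System.out.println(entry.getKey() + " = " + data.value + " (" + data.type + ")");
        }
    }
}
```

If the "isPersistent" boolean is dropped, persistentEntries would simply return
every entry in the context.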
Feedback welcome
> JobManager and Execution Engine changes: Support for injecting and pulling
> out configs and job output in connectors
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: SQOOP-1803
> URL: https://issues.apache.org/jira/browse/SQOOP-1803
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
> Assignee: Veena Basavaraj
> Fix For: 2.0.0
>
> Attachments: SQOOP-1803-POC-2.patch, SQOOP-1803-POC.patch
>
>
> The details are in the design wiki; as the implementation progresses, more
> discussion can happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject an IncrementalConfig instance into the
> FromJobConfiguration. The current MFromConfig and MToConfig can already hold
> a list of configs, and a strong sentiment was expressed to keep it as a list,
> so why not, for the first time, actually make use of it and group the
> incremental-related configs in one config object?
> This task will prepare the FromJobConfiguration from the job config data, and
> the ExtractorContext with the relevant values from the previous job run.
> This task will prepare the ToJobConfiguration from the job config data, and
> the LoaderContext with the relevant values from the previous job run, if any.
> We will use DistributedCache to get state information out of the Extractor
> and Loader and finally persist it into the Sqoop repository, depending on
> SQOOP-1804, once the OutputCommitter commit is called.