Veena Basavaraj created SQOOP-1605:
--------------------------------------
Summary: Sqoop2: Misc From/to cleanups
Key: SQOOP-1605
URL: https://issues.apache.org/jira/browse/SQOOP-1605
Project: Sqoop
Issue Type: Bug
Reporter: Veena Basavaraj
Assignee: Qian Xu
The Destroyer API and its Javadoc:
{code}
/**
 * This allows connector to define work to complete execution, for example,
 * resource cleaning.
 */
public abstract class Destroyer<LinkConfiguration, JobConfiguration> {
  /**
   * Callback to clean up after job execution.
   *
   * @param context Destroyer context
   * @param linkConfiguration link configuration object
   * @param jobConfiguration job configuration object for the FROM and TO
   *        In case of the FROM initializer this will represent the FROM job
   *        configuration
   *        In case of the TO initializer this will represent the TO job
   *        configuration
   */
  public abstract void destroy(DestroyerContext context,
                               LinkConfiguration linkConfiguration,
                               JobConfiguration jobConfiguration);
}
{code}
This ticket was created while reviewing the Kite Connector use case where the
destroyer does the actual temp data set merge
https://reviews.apache.org/r/26963/diff/# [~stanleyxu2005]
{code}
public void destroy(DestroyerContext context, LinkConfiguration link,
    ToJobConfiguration job) {
  LOG.info("Running Kite connector destroyer");
  // Every loader instance creates a temporary dataset. If the MR job is
  // successful, all temporary dataset should be merged as one dataset,
  // otherwise they should be deleted all.
  String[] uris = KiteDatasetExecutor.listTemporaryDatasetUris(
      job.toDataset.uri);
  if (context.isSuccess()) {
    KiteDatasetExecutor executor = new KiteDatasetExecutor(job.toDataset.uri,
        context.getSchema(), link.link.fileFormat);
    for (String uri : uris) {
      executor.mergeDataset(uri);
      LOG.info(String.format("Temporary dataset %s merged", uri));
    }
  } else {
    for (String uri : uris) {
      KiteDatasetExecutor.deleteDataset(uri);
      LOG.info(String.format("Temporary dataset %s deleted", uri));
    }
  }
}
{code}
I am wondering whether such operations should be their own phase rather than
living in destroyers. To be more precise, the responsibility of the destroyer
is to clean up and close the FROM/TO data sources. Should operations that
modify, merge, or munge records be their own step?
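
To make the question concrete, here is a rough sketch of what such a split could look like. It is not existing Sqoop API: the `Committer` interface, the simplified `Context` and `Destroyer` types, and the `finishJob` driver are all hypothetical names used only for illustration. The idea is that the data-modifying step (the merge) runs only when the job succeeded, while the destroyer always runs and only cleans up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: separates "finalize output" from "release resources". */
public class PhaseSketch {

  /** Simplified stand-in for DestroyerContext; only the success flag is modeled. */
  static class Context {
    private final boolean success;
    Context(boolean success) { this.success = success; }
    boolean isSuccess() { return success; }
  }

  /** Hypothetical new phase: modifies target data, e.g. merges temporary datasets. */
  interface Committer {
    void commit(Context context, List<String> log);
  }

  /** Destroyer restricted to cleanup; simplified, not the real Sqoop signature. */
  interface Destroyer {
    void destroy(Context context, List<String> log);
  }

  /** Runs the commit phase only on success, then always runs the destroyer. */
  static List<String> finishJob(Context context, Committer committer,
      Destroyer destroyer) {
    List<String> log = new ArrayList<>();
    try {
      if (context.isSuccess()) {
        committer.commit(context, log);
      }
    } finally {
      // Cleanup happens even if the job failed or the commit phase threw.
      destroyer.destroy(context, log);
    }
    return log;
  }

  public static void main(String[] args) {
    Committer merge = (ctx, log) -> log.add("merge temporary datasets");
    Destroyer cleanup = (ctx, log) -> log.add("remove leftover temporary datasets");
    System.out.println(finishJob(new Context(true), merge, cleanup));
    System.out.println(finishJob(new Context(false), merge, cleanup));
  }
}
```

With this shape the Kite connector's merge would move into the commit step, and its destroyer would be left with deleting whatever temporary datasets remain. Whether the framework should own that ordering, or each connector, is part of what this ticket is asking.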