[
https://issues.apache.org/jira/browse/SQOOP-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284489#comment-14284489
]
Veena Basavaraj commented on SQOOP-1938:
----------------------------------------
[~hshreedharan] To clarify the point I raised earlier ("The only con I can see
is if we want to do some form of merging across the.."): I think I
misunderstood it. Even if we do the merge in the reducer/destroyer, the
parallelism will still benefit the writes to the temporary datasets, as in the
KiteConnector. Essentially, in use cases such as the KiteConnector we do two
levels of writing: first to temporary datasets, and then a merge into the final
destination in HDFS/Hive. The merge step itself will not benefit from this
parallelism today, but the temporary-dataset writes done as part of the
KiteLoader still will.
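To illustrate the two-level write described above, here is a minimal, self-contained sketch. This is not actual Sqoop or Kite code; the class, the method names, and the file layout are invented for illustration. It simulates several loaders writing their partitions to temporary datasets in parallel, followed by a single sequential merge into the final destination, which is the step that does not benefit from the loader parallelism.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TwoPhaseWriteSketch {
  // Returns the number of records that reached the final destination.
  static int runTwoPhaseWrite(int numLoaders) throws Exception {
    Path tempDir = Files.createTempDirectory("sqoop-temp-datasets");
    Path finalFile = Files.createTempFile("final-destination", ".txt");

    // Phase 1: each "loader" writes its partition to a temporary
    // dataset in parallel (this is where parallelism still helps).
    ExecutorService pool = Executors.newFixedThreadPool(numLoaders);
    for (int i = 0; i < numLoaders; i++) {
      final int id = i;
      pool.submit(() -> {
        Path part = tempDir.resolve("part-" + id + ".txt");
        Files.write(part, List.of("record from loader " + id));
        return null;
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);

    // Phase 2: a single sequential merge into the final destination;
    // this step does not benefit from the loader parallelism.
    try (Stream<Path> parts = Files.list(tempDir).sorted()) {
      List<String> merged = parts.flatMap(p -> {
        try {
          return Files.lines(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }).collect(Collectors.toList());
      Files.write(finalFile, merged);
    }
    return Files.readAllLines(finalFile).size();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runTwoPhaseWrite(3)); // prints 3
  }
}
```

The point of the sketch: phase 1 scales with the number of loaders, while phase 2 is a fixed serial cost, matching the observation that only the temporary-dataset writes benefit.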
> DOC:update the sqoop MR engine implementation details
> -----------------------------------------------------
>
> Key: SQOOP-1938
> URL: https://issues.apache.org/jira/browse/SQOOP-1938
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
> Assignee: Veena Basavaraj
> Fix For: 1.99.5
>
>
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+MR+Execution+Engine
> 1. Why do we need SqoopWritable, and what can be done in the future?
> 2. Even though we call Sqoop a map-only job, is that how it always works? What
> happens when numLoaders is non-zero?
> {code}
> // Set number of reducers as number of configured loaders or suppress
> // reduce phase entirely if loaders are not set at all.
> if (request.getLoaders() != null) {
>   job.setNumReduceTasks(request.getLoaders());
> } else {
>   job.setNumReduceTasks(0);
> }
> {code}
> 3. Internals of SqoopNullOutputFormat and how SqoopWritable is used in it
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)