[ 
https://issues.apache.org/jira/browse/SQOOP-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284489#comment-14284489
 ] 

Veena Basavaraj commented on SQOOP-1938:
----------------------------------------

[~hshreedharan] To clarify the aspect I mentioned, "The only con I can see is 
if we want to do some form of merging across the..": I think I misunderstood 
it. Even if we do the merge in the reducer/destroyer, this parallelism will 
still benefit the writes to the temporary datasets, as in the KiteConnector. 
Essentially, in use cases such as the KiteConnector we do two levels of 
writing: first to temporary datasets, and then a merge into the final 
destination in HDFS/Hive. The merge step today will not benefit from this 
parallelism, but the temporary dataset writing done by the KiteLoader will.
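To illustrate the two-level write described above, here is a minimal, self-contained sketch (not Sqoop's actual Loader/Destroyer API; class and method names are hypothetical): several loaders write in parallel to their own temporary datasets, and a single destroyer-style step then merges them into the final destination.

```java
import java.util.*;
import java.util.stream.*;

public class TwoLevelWriteSketch {

    // Phase 1 (parallel in real Sqoop: one reducer per configured loader):
    // each loader writes its share of the records to its own temp dataset.
    // Here we just hash-partition records across loaders to model that.
    static Map<Integer, List<String>> loadToTempDatasets(List<String> records,
                                                         int numLoaders) {
        Map<Integer, List<String>> temp = new HashMap<>();
        for (int i = 0; i < records.size(); i++) {
            int loader = i % numLoaders;
            temp.computeIfAbsent(loader, k -> new ArrayList<>())
                .add(records.get(i));
        }
        return temp;
    }

    // Phase 2 (serial today, per the comment above): the destroyer merges
    // all temp datasets into the final destination dataset.
    static List<String> mergeToFinal(Map<Integer, List<String>> temp) {
        return temp.values().stream()
                   .flatMap(List::stream)
                   .sorted()
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        Map<Integer, List<String>> temp = loadToTempDatasets(records, 2);
        System.out.println(temp.size());        // number of temp datasets
        System.out.println(mergeToFinal(temp)); // merged final dataset
    }
}
```

The point of the sketch: raising numLoaders speeds up phase 1 regardless of phase 2, which is why the parallelism still pays off for the KiteLoader's temp writes even while the merge stays serial.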

> DOC:update the sqoop MR engine implementation details
> -----------------------------------------------------
>
>                 Key: SQOOP-1938
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1938
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.5
>
>
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+MR+Execution+Engine
> 1. Why we need SqoopWritable, what can be done in future?
> 2. Even though we call sqoop as a map only, is that how it always works? what 
> happend when numLoaders is non zero
> {code}
>       // Set number of reducers as number of configured loaders  or suppress
>       // reduce phase entirely if loaders are not set at all.
>       if(request.getLoaders() != null) {
>         job.setNumReduceTasks(request.getLoaders());
>       } else {
>         job.setNumReduceTasks(0);
>       }
> {code}
> 3. Internals of SqoopNullOutputFormat and how SqoopWritable is used in it



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
