[
https://issues.apache.org/jira/browse/SQOOP-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256719#comment-14256719
]
Jarek Jarcec Cecho commented on SQOOP-1938:
-------------------------------------------
I'll take a stab at answering those questions. Everyone please feel free to
jump in as well!
{quote}
1. Why do we need SqoopWritable, and what can be done in the future?
{quote}
The Hadoop framework requires a {{Writable}} class. We are using the current
one ({{SqoopWritable}}) as a wrapper around {{IntermediateDataFormat}}, which we
can't use directly in MR because Hadoop doesn't support that (to the best of my
knowledge). We're not using a concrete implementation such as {{Text}}, so that
we don't have to convert all records to {{String}} to transfer data between
mappers and reducers.
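To make the wrapper idea concrete, here is a minimal sketch. It is not the actual {{SqoopWritable}} code: {{WritableLike}} is a stand-in that only mirrors the two methods of Hadoop's {{Writable}} interface so the example compiles without Hadoop, and {{ToyDataFormat}}, {{ToyWritable}} and {{WritableWrapperDemo}} are hypothetical names made up for illustration. The point is that the wrapper delegates serialization to the wrapped format, so typed fields travel as binary and nothing is converted to {{String}}.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in with the same two methods as org.apache.hadoop.io.Writable,
// declared here only so the sketch compiles without Hadoop on the classpath.
interface WritableLike {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical toy format holding typed fields; NOT Sqoop's IntermediateDataFormat.
class ToyDataFormat {
  long id;
  double amount;

  ToyDataFormat(long id, double amount) {
    this.id = id;
    this.amount = amount;
  }

  // Fields are written in binary form, with no String conversion.
  void write(DataOutput out) throws IOException {
    out.writeLong(id);
    out.writeDouble(amount);
  }

  // Fields must be read back in the same order they were written.
  void read(DataInput in) throws IOException {
    id = in.readLong();
    amount = in.readDouble();
  }
}

// Hypothetical wrapper: satisfies the Writable-style contract by delegating
// all serialization work to the wrapped data format.
class ToyWritable implements WritableLike {
  private final ToyDataFormat format;

  ToyWritable(ToyDataFormat format) {
    this.format = format;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    format.write(out);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    format.read(in);
  }
}

class WritableWrapperDemo {
  // Serializes the source through the wrapper and reads it into a fresh instance,
  // roughly what happens when a record moves from a mapper to a reducer.
  static ToyDataFormat roundTrip(ToyDataFormat source) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      new ToyWritable(source).write(new DataOutputStream(buffer));

      ToyDataFormat target = new ToyDataFormat(0L, 0.0);
      new ToyWritable(target).readFields(
          new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
      return target;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    ToyDataFormat copy = roundTrip(new ToyDataFormat(42L, 3.5));
    System.out.println(copy.id + " " + copy.amount);
  }
}
```

With a concrete type such as {{Text}}, both sides of the round trip would have to go through a string representation instead.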
{quote}
2. Even though we call Sqoop map-only, is that how it always works? What
happens when numLoaders is non-zero?
{quote}
The current semantics are:
|| \# Extractors|| \# Loaders||Outcome||
|Default|Default|Map only job with 10 map tasks|
|Number X|Default|Map only job with X map tasks|
|Number X|Number Y|Map-reduce job with X map tasks and Y reduce tasks|
|Default|Number Y|Map-reduce job with 10 map tasks and Y reduce tasks|
The purpose has been to let the user throttle the number of loaders and
extractors independently (e.g. have a different number of loaders than
extractors) and to have default values that won't run the reduce phase if it is
not necessary.
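The table can be restated as a small, self-contained sketch. This is not code from the Sqoop source tree: {{JobShape}}, {{plan}} and {{describe}} are hypothetical names, {{null}} is assumed to stand for "Default", and the default of 10 map tasks is taken from the table above.

```java
// Hypothetical helper that restates the extractor/loader table; not Sqoop code.
final class JobShape {
  // Default number of map tasks, taken from the table above.
  static final int DEFAULT_MAP_TASKS = 10;

  final int mapTasks;
  final int reduceTasks;

  private JobShape(int mapTasks, int reduceTasks) {
    this.mapTasks = mapTasks;
    this.reduceTasks = reduceTasks;
  }

  // A null argument is assumed to mean the user left the value at "Default".
  static JobShape plan(Integer extractors, Integer loaders) {
    int maps = (extractors != null) ? extractors : DEFAULT_MAP_TASKS;
    // No configured loaders means zero reducers, which suppresses the reduce phase.
    int reduces = (loaders != null) ? loaders : 0;
    return new JobShape(maps, reduces);
  }

  boolean isMapOnly() {
    return reduceTasks == 0;
  }

  String describe() {
    if (isMapOnly()) {
      return "Map only job with " + mapTasks + " map tasks";
    }
    return "Map-reduce job with " + mapTasks + " map tasks and "
        + reduceTasks + " reduce tasks";
  }

  public static void main(String[] args) {
    // One call per row of the table.
    System.out.println(plan(null, null).describe());
    System.out.println(plan(4, null).describe());
    System.out.println(plan(4, 2).describe());
    System.out.println(plan(null, 2).describe());
  }
}
```

Keeping the two values independent is what allows, for example, many extractors reading in parallel while only a few loaders write to a target that cannot handle many concurrent connections.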
> DOC:update the sqoop MR engine implementation details
> -----------------------------------------------------
>
> Key: SQOOP-1938
> URL: https://issues.apache.org/jira/browse/SQOOP-1938
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Veena Basavaraj
>
> 1. Why do we need SqoopWritable, and what can be done in the future?
> 2. Even though we call Sqoop map-only, is that how it always works? What
> happens when numLoaders is non-zero?
> {code}
> // Set number of reducers as number of configured loaders or suppress
> // reduce phase entirely if loaders are not set at all.
> if (request.getLoaders() != null) {
>   job.setNumReduceTasks(request.getLoaders());
> } else {
>   job.setNumReduceTasks(0);
> }
> {code}
> 3. Internals of SqoopNullOutputFormat and how SqoopWritable is used in it