[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism
[ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:

    Attachment: MAPREDUCE-1473.2.patch

New patch; the bugfix in MAPREDUCE-1480 for CombineFileRecordReader eliminates some hackery in the CombineShimRecordReader / LineRecordReader interaction (e.g., the dummy MapContext).

Sqoop should allow users to control export parallelism

    Key: MAPREDUCE-1473
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: contrib/sqoop
    Reporter: Aaron Kimball
    Assignee: Aaron Kimball
    Attachments: MAPREDUCE-1473.2.patch, MAPREDUCE-1473.patch

Sqoop uses MapReduce jobs to export files back to a table in the database. The degree of parallelism is controlled by the number of splits; i.e., the number of input files used. The bottleneck in the system, though, is likely to be the database itself. Users should have the ability to tune the number of parallel exporters being used to a degree appropriate to their database deployment.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism
[ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:

    Status: Open (was: Patch Available)

Sqoop should allow users to control export parallelism

    Key: MAPREDUCE-1473
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: contrib/sqoop
    Reporter: Aaron Kimball
    Assignee: Aaron Kimball
    Attachments: MAPREDUCE-1473.2.patch, MAPREDUCE-1473.patch
[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism
[ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:

    Attachment: MAPREDUCE-1473.patch

Attaching a patch which provides this functionality. This uses CombineFileInputFormat to batch up Sqoop's input files into a user-defined number of splits. As in importing, the degree of parallelism is controlled with the -m / --num-mappers parameters.

Sqoop should allow users to control export parallelism

    Key: MAPREDUCE-1473
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: contrib/sqoop
    Reporter: Aaron Kimball
    Assignee: Aaron Kimball
    Attachments: MAPREDUCE-1473.patch
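The batching idea in the patch comment above (combining many input files into a user-chosen number of splits, so that the number of parallel exporters no longer equals the number of files) can be illustrated with a small stand-alone sketch. This is not code from the attached patch and it does not use Hadoop's CombineFileInputFormat; the class ExportSplitSketch, its nested types, and the greedy "largest file into the lightest split" strategy are all assumptions made for illustration. The numMappers argument plays the role of the value given with -m / --num-mappers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not taken from the MAPREDUCE-1473 patch.
class ExportSplitSketch {

    /** One input file: a path and its length in bytes. */
    static final class InputFile {
        final String path;
        final long length;

        InputFile(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** A group of files that a single map task would export. */
    static final class Split {
        final List<String> paths = new ArrayList<>();
        long totalLength = 0;
    }

    /**
     * Packs files into at most numMappers splits. Files are visited from
     * largest to smallest and each one goes to the currently lightest split.
     */
    static List<Split> pack(List<InputFile> files, int numMappers) {
        if (numMappers < 1) {
            throw new IllegalArgumentException("numMappers must be >= 1");
        }
        // Never create more splits than there are files.
        int numSplits = Math.min(numMappers, files.size());

        List<Split> splits = new ArrayList<>();
        PriorityQueue<Split> lightest =
            new PriorityQueue<>(Comparator.comparingLong((Split s) -> s.totalLength));
        for (int i = 0; i < numSplits; i++) {
            Split s = new Split();
            splits.add(s);
            lightest.add(s);
        }

        List<InputFile> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((InputFile f) -> f.length).reversed());

        for (InputFile f : sorted) {
            Split s = lightest.poll();   // split with the least data so far
            s.paths.add(f.path);
            s.totalLength += f.length;
            lightest.add(s);             // re-insert with its updated weight
        }
        return splits;
    }

    public static void main(String[] args) {
        List<InputFile> files = new ArrayList<>();
        long[] sizes = {50, 40, 30, 20, 10};
        for (int i = 0; i < sizes.length; i++) {
            files.add(new InputFile("part-0000" + i, sizes[i]));
        }
        // Five input files, but only two parallel exporters.
        for (Split s : pack(files, 2)) {
            System.out.println(s.totalLength + " bytes: " + s.paths);
        }
    }
}
```

With five input files and numMappers set to 2, only two tasks would write to the database at once, which is the kind of tuning the issue asks for.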
[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism
[ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:

    Status: Patch Available (was: Open)

Sqoop should allow users to control export parallelism

    Key: MAPREDUCE-1473
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
    Project: Hadoop Map/Reduce
    Issue Type: Improvement
    Components: contrib/sqoop
    Reporter: Aaron Kimball
    Assignee: Aaron Kimball
    Attachments: MAPREDUCE-1473.patch