[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism

2010-02-10 Thread Aaron Kimball (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:
-------------------------------------

Attachment: MAPREDUCE-1473.2.patch

New patch. With the CombineFileRecordReader bugfix from MAPREDUCE-1480 in 
place, this version drops some hackery in the CombineShimRecordReader / 
LineRecordReader interaction (e.g., the dummy MapContext).

 Sqoop should allow users to control export parallelism
 ------------------------------------------------------

         Key: MAPREDUCE-1473
         URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
     Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
    Reporter: Aaron Kimball
    Assignee: Aaron Kimball
 Attachments: MAPREDUCE-1473.2.patch, MAPREDUCE-1473.patch


 Sqoop uses MapReduce jobs to export files back to a table in the database. 
 The degree of parallelism is controlled by the number of splits; i.e., the 
 number of input files used. The bottleneck in the system, though, is likely 
 to be the database itself.
 Users should have the ability to tune the number of parallel exporters being 
 used to a degree appropriate to their database deployment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism

2010-02-10 Thread Aaron Kimball (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:
-------------------------------------

Status: Open  (was: Patch Available)




[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism

2010-02-09 Thread Aaron Kimball (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:
-------------------------------------

Attachment: MAPREDUCE-1473.patch

Attaching a patch which provides this functionality. This uses 
CombineFileInputFormat to batch up Sqoop's input files into a user-defined 
number of splits.

As in importing, the degree of parallelism is controlled with the -m / 
--num-mappers parameters.
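
To illustrate the batching idea described above, here is a minimal, self-contained sketch that groups whole input files into at most a requested number of splits. It is not the code in the attached patch and not the algorithm of Hadoop's CombineFileInputFormat (which also takes node and rack locality into account). The class ExportSplitPlanner and the method plan are hypothetical names made up for this example, and file sizes are passed in directly rather than read from HDFS.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Sqoop or Hadoop. */
final class ExportSplitPlanner {

  /**
   * Groups input files (identified by their index in fileSizes) into at most
   * numMappers batches of roughly equal total size. Files are never divided,
   * so fewer batches are returned when there are fewer files than mappers.
   */
  static List<List<Integer>> plan(long[] fileSizes, int numMappers) {
    if (numMappers < 1) {
      throw new IllegalArgumentException("numMappers must be >= 1");
    }

    long totalBytes = 0;
    for (long size : fileSizes) {
      totalBytes += size;
    }

    // Target bytes per batch: ceiling of total / numMappers, never below 1.
    long targetBytes = Math.max(1, (totalBytes + numMappers - 1) / numMappers);

    List<List<Integer>> batches = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    long currentBytes = 0;

    for (int i = 0; i < fileSizes.length; i++) {
      current.add(i);
      currentBytes += fileSizes[i];

      // Close the batch once it reaches the target, but always leave room
      // for one final batch so the total never exceeds numMappers.
      if (currentBytes >= targetBytes && batches.size() < numMappers - 1) {
        batches.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
    }

    // Whatever is left over becomes the last batch.
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }
}
```

With six equally sized files and a requested parallelism of 2, this sketch yields two batches of three files each, so only two export tasks would write to the database at once instead of six.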




[jira] Updated: (MAPREDUCE-1473) Sqoop should allow users to control export parallelism

2010-02-09 Thread Aaron Kimball (JIRA)

 [ https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1473:
-------------------------------------

Status: Patch Available  (was: Open)
