[
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489955#comment-13489955
]
Hudson commented on SQOOP-604:
------------------------------
Integrated in Sqoop-ant-jdk-1.6-hadoop23 #414 (See
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/414/])
SQOOP-604: Easy throttling feature for MySQL exports (Revision
c499f49097ebf04f9fac34f1df768a319e679cea)
Result = SUCCESS
abhijeet :
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files :
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java
> Easy throttling feature for MySQL exports
> -----------------------------------------
>
> Key: SQOOP-604
> URL: https://issues.apache.org/jira/browse/SQOOP-604
> Project: Sqoop
> Issue Type: Improvement
> Components: connectors/mysql
> Affects Versions: 1.4.2
> Reporter: Zoltan Toth-Czifra
> Priority: Minor
> Fix For: 1.4.3
>
> Attachments: SQOOP-604_v6.patch
>
>
> Sqoop always tries to achieve the best possible throughput with exports,
> which might not be desirable in all cases. Sometimes we need to export large
> data sets with Sqoop to a live relational database (MySQL in our case), that
> is, a database that is under high load serving random queries from the users
> of our product.
> While data consistency issues during the export can be easily solved with a
> staging table, there is still a problem: the performance impact caused by the
> heavy export.
> First off, the resources of MySQL dedicated to the import process can affect
> the performance of the live product, both on the master and on the slaves.
> Second, even if the servers can handle the import with no significant
> performance impact (mysqlimport should be relatively "cheap"), importing big
> tables (GB+) can cause serious replication lag in the cluster, putting data
> consistency at risk.
> My suggestion is quite simple: take the already existing "checkpoint"
> feature of the MySQL exports (the export process is restarted every X bytes
> written) and extend it with a new config value that simply makes the
> thread sleep for Y milliseconds at each checkpoint. With a low enough byte
> count limit, this can be a simple yet powerful throttling mechanism.
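
The checkpoint-and-sleep idea described above could look roughly like the
following. This is only a minimal sketch, not the code committed to
MySQLExportMapper.java: the class name ThrottledWriter, its fields, and the
use of a plain OutputStream in place of the mysqlimport pipe are all
assumptions made for illustration, and a flush stands in for restarting the
export process at a checkpoint.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/**
 * Hypothetical illustration of checkpoint-based throttling.
 * Not the actual Sqoop implementation.
 */
class ThrottledWriter {
    private final OutputStream sink;     // stand-in for the mysqlimport pipe (assumption)
    private final long checkpointBytes;  // bytes to write between checkpoints
    private final long sleepMillis;      // pause taken at every checkpoint
    private long bytesSinceCheckpoint = 0;
    private int checkpoints = 0;

    ThrottledWriter(OutputStream sink, long checkpointBytes, long sleepMillis) {
        this.sink = sink;
        this.checkpointBytes = checkpointBytes;
        this.sleepMillis = sleepMillis;
    }

    /** Writes one record and triggers a checkpoint once the byte limit is reached. */
    void write(byte[] record) {
        try {
            sink.write(record);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        bytesSinceCheckpoint += record.length;
        // A limit of zero or less disables checkpointing in this sketch.
        if (checkpointBytes > 0 && bytesSinceCheckpoint >= checkpointBytes) {
            checkpoint();
        }
    }

    /** Flushes pending output, resets the byte counter, then sleeps to throttle. */
    private void checkpoint() {
        try {
            sink.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        bytesSinceCheckpoint = 0;
        checkpoints++;
        if (sleepMillis > 0) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status for the caller.
                Thread.currentThread().interrupt();
            }
        }
    }

    int checkpointCount() {
        return checkpoints;
    }
}
```

With purely illustrative numbers, a checkpoint every 1 MB combined with a
100 ms sleep keeps the export below 10 MB/s, since each megabyte costs at
least 100 ms plus the time needed to write it. Lowering the byte limit or
raising the sleep time tightens the cap further.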