[ https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464829#comment-13464829 ]
Zoltan Toth-Czifra commented on SQOOP-604:
------------------------------------------
Sorry, I have been busy with this so I did not have time to test.
Here is the review:
https://reviews.apache.org/r/7135/
Also, here are the results of runs with different settings of
sqoop.mysql.export.checkpoint.bytes and sqoop.mysql.export.sleep.ms
(shown as checkpoint bytes / sleep time):
{code}
33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
{code}
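As a rough sanity check on the numbers above, the slowdown should be close to the unthrottled time plus one sleep per checkpoint. The sketch below works that out under some assumptions that are not stated in the results: the sleeps happen sequentially in a single writer, 1 MB is 1024 * 1024 bytes, and the first row (8.7175 seconds with no sleep) is the baseline. The class and method names are illustrative only.

```java
/** Back-of-the-envelope model of the throttled export time; names are illustrative. */
final class ThrottleEstimate {

    /** Number of complete checkpoints reached while writing totalBytes. */
    static long checkpoints(long totalBytes, long checkpointBytes) {
        return totalBytes / checkpointBytes;
    }

    /** Baseline transfer time plus one sleep per checkpoint (assumes sequential sleeps). */
    static double estimatedSeconds(long totalBytes, long checkpointBytes,
                                   long sleepMs, double baselineSeconds) {
        return baselineSeconds + checkpoints(totalBytes, checkpointBytes) * (sleepMs / 1000.0);
    }

    public static void main(String[] args) {
        long totalBytes = Math.round(4.7579 * 1024 * 1024); // size reported in the results
        double baselineSeconds = 8.7175;                     // the 0ms run
        long[][] settings = { {102400, 500}, {51200, 500}, {51200, 250} };
        for (long[] s : settings) {
            System.out.printf("%dB / %dms: about %.1f seconds%n",
                    s[0], s[1], estimatedSeconds(totalBytes, s[0], s[1], baselineSeconds));
        }
    }
}
```

Under these assumptions the model gives roughly 32.7, 57.2 and 33.0 seconds, against the measured 35.8, 57.9 and 35.0 seconds, so most of the added time is accounted for by the sleeps themselves.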
> Easy throttling feature for MySQL exports
> -----------------------------------------
>
> Key: SQOOP-604
> URL: https://issues.apache.org/jira/browse/SQOOP-604
> Project: Sqoop
> Issue Type: Improvement
> Components: connectors/mysql
> Affects Versions: 1.4.3
> Reporter: Zoltan Toth-Czifra
> Priority: Minor
> Fix For: 1.4.3
>
>
> Sqoop always tries to achieve the best possible throughput with exports,
> which might not be desirable in all cases. Sometimes we need to export large
> data sets with Sqoop to a live relational database (MySQL in our case), that is, a
> database that is under a high load serving random queries from the users of
> our product.
> While data consistency issues during the export can be easily solved with a
> staging table, there is still a problem: the performance impact caused by the
> heavy export.
> First off, the resources of MySQL dedicated to the import process can affect
> the performance of the live product, both on the master and on the slaves.
> Second, even if the servers can handle the import with no significant
> performance impact (mysqlimport should be relatively "cheap"), importing big
> tables (GB+) can cause serious replication lag in the cluster, risking data
> consistency.
> My suggestion is quite simple: use the already existing "checkpoint"
> feature of the MySQL exports (the export process is restarted every X bytes
> written) and extend it with a new config value that simply makes the
> thread sleep for Y milliseconds at the checkpoints. With a low enough byte
> count limit, this can be a simple yet powerful throttling mechanism.
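The idea can be sketched as a stream wrapper that counts bytes and pauses whenever the checkpoint size is reached. This is only an illustration, not the code in the review: the class name ThrottledOutputStream and its methods are hypothetical, and the real export restarts the mysqlimport process at a checkpoint, which is reduced here to a flush.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical sketch of checkpoint-based throttling; not Sqoop's actual implementation. */
final class ThrottledOutputStream extends OutputStream {
    private final OutputStream out;
    private final long checkpointBytes; // bytes to write between checkpoints
    private final long sleepMs;         // pause at each checkpoint; 0 disables throttling
    private long bytesSinceCheckpoint = 0;
    private int checkpointCount = 0;

    ThrottledOutputStream(OutputStream out, long checkpointBytes, long sleepMs) {
        this.out = out;
        this.checkpointBytes = checkpointBytes;
        this.sleepMs = sleepMs;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        bytesSinceCheckpoint++;
        if (checkpointBytes > 0 && bytesSinceCheckpoint >= checkpointBytes) {
            checkpoint();
        }
    }

    /** Stand-in for the real checkpoint: flush, reset the counter, then sleep. */
    private void checkpoint() throws IOException {
        out.flush();
        bytesSinceCheckpoint = 0;
        checkpointCount++;
        if (sleepMs > 0) {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IOException("Interrupted during throttle sleep", e);
            }
        }
    }

    int getCheckpointCount() {
        return checkpointCount;
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

With a checkpoint size of 100 bytes, writing 250 bytes through this wrapper pauses twice; the smaller the checkpoint size, the more evenly the pauses are spread over the export.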