Zoltan Toth-Czifra created SQOOP-604:
----------------------------------------

             Summary: Easy throttling feature for MySQL exports
                 Key: SQOOP-604
                 URL: https://issues.apache.org/jira/browse/SQOOP-604
             Project: Sqoop
          Issue Type: Improvement
          Components: connectors/mysql
    Affects Versions: 1.4.3
            Reporter: Zoltan Toth-Czifra
            Priority: Minor
             Fix For: 1.4.3


Sqoop always tries to achieve the best possible throughput with exports, which 
might not be desirable in all cases. Sometimes we need to export large amounts 
of data with Sqoop to a live relational database (MySQL in our case), that is, 
a database that is under high load serving random queries from the users of 
our product.

While data consistency issues during the export can be easily solved with a 
staging table, there is still a problem: the performance impact caused by the 
heavy export. 

First off, the MySQL resources consumed by the import process can affect 
the performance of the live product, both on the master and on the slaves. 
Second, even if the servers can handle the import with no significant 
performance impact (mysqlimport should be relatively "cheap"), importing big 
tables (GB+) can cause serious replication lag in the cluster, putting data 
consistency at risk.

My suggestion is quite simple: use the already existing "checkpoint" feature 
of the MySQL exports (the export process is restarted every X bytes written) 
and extend it with a new config value that simply makes the thread sleep 
for Y milliseconds at each checkpoint. With a low enough byte count limit, this 
can be a simple yet powerful throttling mechanism.
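
To illustrate the idea, here is a minimal sketch in Python (Sqoop itself is 
written in Java, so this is not the actual patch). The class name 
ThrottledExporter and the parameters checkpoint_bytes and sleep_ms are 
hypothetical, chosen only for this example. The sleep function is injectable 
so the behaviour can be checked without really waiting.

```python
import time
from typing import Callable


class ThrottledExporter:
    """Hypothetical sketch: pause the writer thread at every checkpoint."""

    def __init__(
        self,
        sink: Callable[[bytes], None],
        checkpoint_bytes: int,
        sleep_ms: int,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.sink = sink                          # stands in for the mysqlimport pipe
        self.checkpoint_bytes = checkpoint_bytes  # checkpoint every N bytes (0 = never)
        self.sleep_ms = sleep_ms                  # pause length at each checkpoint (0 = none)
        self._sleep = sleep
        self.bytes_since_checkpoint = 0
        self.checkpoints = 0

    def write(self, record: bytes) -> None:
        """Write one record, then checkpoint if the byte limit was reached."""
        self.sink(record)
        self.bytes_since_checkpoint += len(record)
        if 0 < self.checkpoint_bytes <= self.bytes_since_checkpoint:
            self._checkpoint()

    def _checkpoint(self) -> None:
        # A real exporter would restart the export process here.
        # This sketch only counts the checkpoint and resets the byte counter.
        self.checkpoints += 1
        self.bytes_since_checkpoint = 0
        if self.sleep_ms > 0:
            # The proposed throttle: sleep before continuing the export.
            self._sleep(self.sleep_ms / 1000.0)
```

Under these assumptions, the export rate can never exceed roughly 
checkpoint_bytes / (sleep_ms / 1000) bytes per second. For example, a 10 MB 
checkpoint with a 250 ms sleep caps the export at about 40 MB/s, and the real 
rate is lower once the time spent writing is included.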

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
