You are using a very high number of mappers for a very small amount of data (50 MB or less) and getting very low throughput (less than 1 MB/s).
Can you try the same jobs with just 1 mapper? And 4 mappers?

Gwen

On Mon, Jun 23, 2014 at 2:32 PM, Buntu Dev <[email protected]> wrote:
> Hi,
>
> We are using sqoop (v1.4.4) export for exporting the uniques per user_id
> into the mysql table with 2 integer columns and with 'user_id' as the
> unique key with these options:
>
> sqoop export \
>   -Dsqoop.export.records.per.statement=1000 \
>   -Dsqoop.export.statements.per.transaction=1 \
>   --connect "jdbc:mysql://host/db" \
>   --username user \
>   --password pwd \
>   --table tbl \
>   --batch \
>   --relaxed-isolation \
>   --update-mode allowinsert \
>   --update-key user_id \
>   --export-dir output/dir/ \
>   --input-fields-terminated-by '\t' \
>   --input-lines-terminated-by '\n' \
>   --num-mappers=200
>
> Are the options such as batching, records per statement, or statements per
> transaction applicable in the case of MySQL upserts?
>
> Also, we are noticing that for smaller jobs the throughput of the export
> job is higher compared to the larger jobs:
>
> large job:
> [ExportJobBase] - Transferred 37.3672 MB in 838.2908 seconds (45.6452 KB/sec)
> [ExportJobBase] - Exported 3025677 records.
>
> small job:
> [ExportJobBase] - Transferred 12.0951 MB in 40.9846 seconds (302.1965 KB/sec)
> [ExportJobBase] - Exported 88042 records.
>
> Bumping up the mappers to 400 shows similar behavior as well:
>
> large job:
> [ExportJobBase] - Transferred 49.6578 MB in 638.6147 seconds (79.6249 KB/sec)
> [ExportJobBase] - Exported 3243995 records.
>
> small job:
> [ExportJobBase] - Transferred 24.4653 MB in 59.1785 seconds (423.3366 KB/sec)
> [ExportJobBase] - Exported 139181 records.
>
> Attempting to remove the batch option or increasing the number of statements
> per transaction causes lock wait timeout exceeded exceptions.
>
> Please let me know if there is anything obvious we might be missing.
>
> Thanks!
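A minimal sketch of the comparison Gwen suggests, reusing the exact command and placeholder values from the message above (host/db, user, pwd, tbl, output/dir/ are the thread's placeholders, not real values) and varying only --num-mappers between runs:

  # Re-run the same export with 1 mapper and then 4 mappers, and compare the
  # "[ExportJobBase] - Transferred ..." throughput lines between the two runs.
  for m in 1 4; do
    sqoop export \
      -Dsqoop.export.records.per.statement=1000 \
      -Dsqoop.export.statements.per.transaction=1 \
      --connect "jdbc:mysql://host/db" \
      --username user \
      --password pwd \
      --table tbl \
      --batch \
      --relaxed-isolation \
      --update-mode allowinsert \
      --update-key user_id \
      --export-dir output/dir/ \
      --input-fields-terminated-by '\t' \
      --input-lines-terminated-by '\n' \
      --num-mappers "$m"
  done

With only tens of MB of input, fewer mappers mean fewer concurrent connections competing for locks on the same MySQL table, which is the contention the lock wait timeout errors point to.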
