Hey Syed, Sqoop has to boot a MR job in order to do the data transfer. This takes some time. As such, would the following work?
#!/bin/bash [[ $( mysql test -e "SELECT COUNT(*) FROM test" | tail -1 ) -gt 0 ]] && sqoop import ... The COUNT statement should be lightning fast if you're using MyISAM as your storage engine. -Abe On Thu, Mar 12, 2015 at 5:57 AM, Syed Akram <[email protected]> wrote: > > Hi, > > I am using Sqoop 1.4.5 and i'm doing import from MySQL to Hive > > I'm having a MySQL DBCluster of 200GB data, in which it have 200 db's and > in each db it has at least 600 tables(mixture of big and small/empty > tables). > > When I'm importing big tables, The performance is quite good. > > But When i'm trying to do sqoop import small tables ( i say empty tables > with 0 records) is taking at least 20 secs of time for each table. > > *1.How can i reduce this time for small tables?* > > *my sqoop import query looks like this:* > > > > > > > > > > > * sqoop "import", "--connect", > uri, "--query", sqlText, "--map-column-java", "oprtype=Integer", > "--target-dir", targetDir, "--hive-import", "--hive-table", hiveTable, > "--username", userName, "--password", password, "--split-by", > primaryKey, "--num-mappers","2", "--boundary-query",boundaryQry, > "--hive-overwrite", "--class-name",tableName, "--outdir", > "tmp_sqoop/"+tableNamewhere "--query" is "select tableName.*, oprtype as 0, > modified_time as 0 where $CONDITIONS""--split-by" > primarykey"--boundary-query" select min(primarykey), max(primarykey) from > table;This runs fine for big table having even billions of rows.But for > small table, iam noticing constant time taking to do sqoop import.How do i > optimize the things for small tables or tables with 0 records. I want to > reduce the latency for small tables.Please suggest me in this > area,Cheers!!!!* > > >
