What do the Hadoop job/attempt logs say? Kill the job by running

mapred job -kill <job_id>

and get the job *attempt* logs by running

mapred job -logs <job_id> <attempt_id>
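For the job in your console output, that would look something like this (the attempt id here is illustrative - copy the real one for the failed task from the ResourceManager UI or the job history):

mapred job -kill job_1458150283440_0028
mapred job -logs job_1458150283440_0028 attempt_1458150283440_0028_m_000000_0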
Whenever Sqoop fails because of SQL errors - e.g. if a decimal value being inserted is too big for the column, or a varchar field would be truncated - it'll appear to hang like this, and you'll find the actual SQL error in the task attempt logs.

IMO this is a huge flaw in Sqoop's export feature, and I hate to use Sqoop for exports because of it: my time-sensitive Hadoop job queues get jammed by Sqoop exports choking on one bad value in one row.

Unless you really need a clustered export to SQL - i.e. you're exporting so many rows that you're exhausting a single *client* machine's CPU/RAM - you might want to avoid the export feature and run the load from one client instead (see the sketch below). I've found it too immature to be useful, at least in Sqoop1.

Good luck.
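To make "single-client export" concrete, here's a minimal sketch: plain JDBC batch inserts over the delimited file, so a bad row fails fast on the client with the driver's actual SQL error instead of wedging a job queue. The connection string, credentials, and file path are the placeholders from your mail, the chunk size is arbitrary, and it assumes Microsoft's SQL Server JDBC driver is on the classpath:

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SingleClientExport {
    public static void main(String[] args) throws Exception {
        // Placeholders, exactly as in your sqoop command
        String url = "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx";
        try (Connection conn = DriverManager.getConnection(url, "xxxxxx", "yyyyyy");
             BufferedReader in = new BufferedReader(new FileReader("Out2/part-m-00000"));
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO tbldummy2 VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?)")) {
            String line;
            int pending = 0;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split("\\|", -1); // -1 keeps trailing empty fields
                for (int i = 0; i < 14; i++) {
                    ps.setString(i + 1, fields[i]);
                }
                ps.addBatch();
                if (++pending % 1000 == 0) {
                    ps.executeBatch(); // flush in chunks to bound memory
                }
            }
            ps.executeBatch(); // a bad value fails here with the real SQL error
        }
    }
}

If a value doesn't fit the column, executeBatch() throws a BatchUpdateException carrying SQL Server's message right on the client, rather than burying it in a task attempt log.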
On Thu, Mar 17, 2016 at 12:25 AM, Rajkumar <[email protected]> wrote:

> Hi folks,
>
> I am trying to export data from HDFS to SQL Server. The original table has
> over 500 columns, and every time I execute the Sqoop export job it gets
> stuck showing map completed at 100%. I created two dummy tables as shown
> below to find out where the exact problem persists. The only difference
> between table1 and table2 is that the latter has one additional column
> [col14 varchar(5)].
>
> First, I ran the export job for Table1, which has 13 columns [datatype
> varchar(5)]. The job completed successfully and exported all 3 records to
> the SQL Server.
>
> Next, I executed the export job for Table2 with 14 columns. When I ran
> this job, I didn't see any error messages/exceptions, but it hangs forever
> after map completed at 100%. The SQL Server Activity Monitor shows a
> process is being created; however, it's not receiving any data/prepared
> statement from Hadoop.
>
> Does this problem exist only with SQL Server? Is there any limitation on
> the number of columns exported to SQL Server? Please advise.
>
> *Configuration*
>
> Hadoop Version - Cloudera 2.6.0-CDH-5.5.2
> Sqoop Version - 1.4.6
> SQL Server Version - 2008 R2
>
> *Table 1*
>
> CREATE TABLE [dbo].[tbldummy1](
> [col1] [varchar](5) NOT NULL,
> [col2] [varchar](5) NULL,
> [col3] [varchar](5) NULL,
> [col4] [varchar](5) NULL,
> [col5] [varchar](5) NULL,
> [col6] [varchar](5) NULL,
> [col7] [varchar](5) NULL,
> [col8] [varchar](5) NULL,
> [col9] [varchar](5) NULL,
> [col10] [varchar](5) NULL,
> [col11] [varchar](5) NULL,
> [col12] [varchar](5) NULL,
> [col13] [varchar](5) NULL,
> CONSTRAINT [PK_dummy1] PRIMARY KEY ([col1] ASC))
>
> *Sqoop Command for Table 1*
>
> sqoop export \
> --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
> --username xxxxxx --password yyyyyy \
> --table tbldummy1 \
> --export-dir /user/hue/Out1 \
> --input-fields-terminated-by '|' \
> -m 1 \
> --verbose
>
> *Input data for Out1*
>
> aa|01|02|03|04|05|06|07|08|09|10|11|12
> bb|01|02|03|04|05|06|07|08|09|10|11|12
> cc|01|02|03|04|05|06|07|08|09|10|11|12
>
> *Table 2*
>
> CREATE TABLE [dbo].[tbldummy2](
> [col1] [varchar](5) NOT NULL,
> [col2] [varchar](5) NULL,
> [col3] [varchar](5) NULL,
> [col4] [varchar](5) NULL,
> [col5] [varchar](5) NULL,
> [col6] [varchar](5) NULL,
> [col7] [varchar](5) NULL,
> [col8] [varchar](5) NULL,
> [col9] [varchar](5) NULL,
> [col10] [varchar](5) NULL,
> [col11] [varchar](5) NULL,
> [col12] [varchar](5) NULL,
> [col13] [varchar](5) NULL,
> [col14] [varchar](5) NULL,
> CONSTRAINT [PK_dummy2] PRIMARY KEY ([col1] ASC))
>
> *Sqoop Command for Table 2*
>
> sqoop export \
> --connect "jdbc:sqlserver://x.x.x.x:port;database=xxxxxxx" \
> --username xxxxxx --password yyyyyy \
> --table tbldummy2 \
> --export-dir /user/hue/Out2 \
> --input-fields-terminated-by '|' \
> -m 1 \
> --verbose
>
> *Input data for Table 2*
>
> aa|01|02|03|04|05|06|07|08|09|10|11|12|13
> bb|01|02|03|04|05|06|07|08|09|10|11|12|13
> cc|01|02|03|04|05|06|07|08|09|10|11|12|13
>
> *Console logs for Table 2*
>
> 16/03/16 23:35:01 INFO mapreduce.Job: Running job: job_1458150283440_0028
> 16/03/16 23:35:07 INFO mapreduce.Job: Job job_1458150283440_0028 running in uber mode : false
> 16/03/16 23:35:07 INFO mapreduce.Job: map 0% reduce 0%
> 16/03/16 23:35:18 INFO mapreduce.Job: map 100% reduce 0%
>
> Thanks!
