Can you get the logs from the individual map tasks? It looks like some map tasks 
are failing; by default Hadoop will retry each failed task (up to 
mapreduce.map.maxattempts, 4 attempts in total) before failing the whole job, 
which would explain the progress jumping from 100% back to 75%.
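
If the JobHistory web UI is not handy, the aggregated task logs can usually be
pulled from the shell like this (assuming YARN log aggregation is enabled; the
application ID below is just a placeholder, use the one your job printed at
submission time):

yarn logs -applicationId application_1460000000000_0001 | less

While debugging it can also help to fail fast instead of cycling through
retries, e.g. by passing a generic Hadoop option to Sqoop (the -D has to come
right after "import", before the tool-specific arguments):

sqoop import -D mapreduce.map.maxattempts=1 --direct --table ads --connect ...

(That property name assumes MRv2; on older MRv1 setups the equivalent is
mapred.map.max.attempts.)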

Jarcec

> On Apr 9, 2016, at 2:33 PM, Tobias Feldhaus <[email protected]> wrote:
> 
> Hi,
> 
> I'm running Apache Sqoop (v1.4.6 from the Cloudera distribution) on an AWS
> cluster with one master node, one name node, and five worker nodes
> (m4.4xlarge), and I'm trying to import a MySQL (v5.6) table with the
> following command:
> 
> sqoop import --direct --table ads --connect jdbc:mysql://10.0.0.125:8500/db
> --password XXX --username XXX
> 
> The command executes, but the job never completes; for the past 24 hours
> the output has been:
> 
> (...)
> 16/04/09 10:52:54 INFO mapreduce.Job:  map 100% reduce 0%
> 16/04/09 13:06:44 INFO mapreduce.Job:  map 75% reduce 0%
> 16/04/09 13:06:54 INFO mapreduce.Job:  map 100% reduce 0%
> 16/04/09 15:24:43 INFO mapreduce.Job:  map 75% reduce 0%
> (it keeps alternating between the 100% and 75% lines from here on)
> 
> The nodes are almost idle, the disks are not full, and network traffic is at
> about 3-5 MByte/s (in/out).
> 
> Question: Is this normal behavior, or do I have a deadlock here? What
> would be the next step for investigation? The table size is estimated at
> around 60 GiB; load average, disk usage, and physical memory
> utilization are all low (see http://puu.sh/ocfDq/f4c3592530.png).
> The full log can be found here:
> https://gist.github.com/james-woods/b0745c96e0ef31e954d038de256a5b83
> 
> Thanks for any advice,
> Tobi
