Hey,

We are running MapReduce jobs against a 12-machine HBase cluster. For a
long time they took approximately 30 minutes to return a result against
~95 million rows. Without any major changes to the data or any upgrade of
HBase/Hadoop, they now seem to take about 4 hours, and the logs are
full of lines like these:

2012-12-04 13:33:15,602 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 72 6f 75
67 68 74
...
2012-12-04 13:45:17,134 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 75 72 70
6c 65 64 65 73 69 67 6e 73 65 72 76 69 63 65 73
...
2012-12-04 13:46:11,515 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 75 73 68
74 6f 74 61 6c 6b 2d 6f 6e 6c 69 6e 65

I presume the 0.0% is the percent complete, but I'm not sure why the time
to complete has jumped so massively. Ganglia shows no major load on the
nodes in question, so I don't think load is the cause.
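
In case it helps when reading the excerpt above: the "row:" values look
like hex-encoded bytes of the row key. Below is a minimal sketch for
decoding them, assuming the keys are plain ASCII (decode_row is a made-up
helper name for illustration, not part of HBase or Hadoop):

```python
def decode_row(hex_dump: str) -> str:
    """Decode a whitespace-separated hex dump (as printed after 'row:') to text.

    Assumption: the row key bytes are ASCII; anything else is replaced
    with a placeholder character rather than raising an error.
    """
    # Drop spaces and line breaks so wrapped log lines can be pasted as-is.
    compact = "".join(hex_dump.split())
    return bytes.fromhex(compact).decode("ascii", errors="replace")


if __name__ == "__main__":
    # The three row values from the log excerpt, in order.
    rows = [
        "63 6f 6d 2e 70 72 6f 75 67 68 74",
        "63 6f 6d 2e 70 75 72 70 6c 65 64 65 73 69 67 6e 73 65 72 76 69 63 65 73",
        "63 6f 6d 2e 70 75 73 68 74 6f 74 61 6c 6b 2d 6f 6e 6c 69 6e 65",
    ]
    for row in rows:
        print(decode_row(row))
```

Decoded this way, the three rows read as com.prought,
com.purpledesignservices and com.pushtotalk-online, so the task does seem
to be moving through the key range even though the reported progress
stays at 0.0%.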

What steps should I take to troubleshoot the problem?

Regards,

Jay
