You can introduce a second thread in the reducer which periodically
reports status to Hadoop.
At the same time, you can record the longest put operation to see how
long it takes.
Lowering the number of cells in a put to some value under 1000 may
help as well.
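
(For illustration, a rough sketch of such a heartbeat thread with the
Hadoop 0.20 mapreduce API; the class name, column family, qualifiers
and the 30-second interval below are made up, not code from this thread:)

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;

public class HeartbeatImportReducer
    extends TableReducer<Text, Text, ImmutableBytesWritable> {

  private long longestPutMs = 0;  // longest single write call, for diagnosis
  private Thread heartbeat;

  @Override
  protected void setup(final Context context) {
    // Second thread: ping the framework every 30 seconds so the attempt
    // is not killed for failing to report status while a big put is in flight.
    heartbeat = new Thread(new Runnable() {
      public void run() {
        while (!Thread.currentThread().isInterrupted()) {
          context.progress();
          try {
            Thread.sleep(30 * 1000L);
          } catch (InterruptedException e) {
            return;
          }
        }
      }
    });
    heartbeat.setDaemon(true);
    heartbeat.start();
  }

  @Override
  protected void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    byte[] row = Bytes.toBytes(key.toString());
    Put put = new Put(row);
    int i = 0;
    for (Text v : values) {
      // "f" and the numeric qualifier are placeholders for the job's real columns
      put.add(Bytes.toBytes("f"), Bytes.toBytes(i++), Bytes.toBytes(v.toString()));
    }
    long start = System.currentTimeMillis();
    context.write(new ImmutableBytesWritable(row), put);
    longestPutMs = Math.max(longestPutMs, System.currentTimeMillis() - start);
  }

  @Override
  protected void cleanup(Context context) {
    heartbeat.interrupt();
    System.err.println("longest write call took " + longestPutMs + " ms");
  }
}

The heartbeat is a daemon thread and is interrupted in cleanup(), so it
will not outlive the task, and the timing around context.write() records
the slowest write call to compare against the task timeout.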
On Saturday, March 6, 2010,
Thanks, stack.
I think the timeout may be caused by: 1. HDFS is slow; 2. the single
row is way too big (millions of cells, around 50-100MB). But I don't
know clearly how it happened.
I have checked the regionserver log; there were lots of WARN messages
like:
"2010-03-05 01:44:22,881 WARN org.apache.hadoop
On Fri, Mar 5, 2010 at 1:12 AM, steven zhuang
wrote:
> when I import data into the HTable with a Map/Reduce job, the job runs
> smoothly until the last reducer fails 6 times because it cannot report
> its status in time.
How many reducers? All completed except this last one, and it failed
in spite of 6 attempts?
Pe
Thanks, J-D, again.
For my table some rows may have as much as several million cells, which
is why we chose HBase over any RDB.
And sorry, I didn't notice that I pasted the tasktracker log in my
previous mail. I am checking the regionserver log, but I don't know
which regionserver will ge
How many cells do you output normally per row? It looks like a lot, the
way you are doing it, and that might cause problems.
Also, did you take a look at the region server logs? Any distress
messages in there?
J-D
On Fri, Mar 5, 2010 at 1:12 AM, steven zhuang
wrote:
> hi, all,
>
> when I impo
hi, all,
when I import data into the HTable with a Map/Reduce job, the job runs
smoothly until the last reducer fails 6 times because it cannot report
its status in time.
In my program I use BatchUpdate to collect every 1000 cells and then
update the status. I don't think normal inserting will take 10 minutes.
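
(For what it's worth, the pattern is roughly as follows; this is only a
sketch against the old mapred API with the deprecated BatchUpdate class,
and the class name and "data:" columns are placeholders, not the real job:)

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class BatchingImportReducer extends MapReduceBase
    implements Reducer<Text, Text, ImmutableBytesWritable, BatchUpdate> {

  private static final int FLUSH_EVERY = 1000;  // cells collected per BatchUpdate

  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
      Reporter reporter) throws IOException {

    byte[] row = Bytes.toBytes(key.toString());
    BatchUpdate update = new BatchUpdate(row);
    int cells = 0;

    while (values.hasNext()) {
      // "data:" + counter is a placeholder column; the real job writes its own columns
      update.put("data:" + cells, Bytes.toBytes(values.next().toString()));
      if (++cells % FLUSH_EVERY == 0) {
        output.collect(new ImmutableBytesWritable(row), update);  // flush this batch
        reporter.setStatus("wrote " + cells + " cells for current row");
        reporter.progress();                                      // update the status
        update = new BatchUpdate(row);                            // new batch, same row
      }
    }
    if (cells % FLUSH_EVERY != 0) {                               // flush the tail
      output.collect(new ImmutableBytesWritable(row), update);
      reporter.progress();
    }
  }
}

Each batch of 1000 cells for the row goes through the output collector,
with a progress/status call after every flush.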