Re: Task attempt failed to report status

2010-03-06 Thread Ted Yu
You can introduce a second thread in the reducer that periodically reports status to Hadoop. At the same time, you can record the longest put operation to see how much time it takes. Lowering the number of cells in a put to some value under 1000 may help as well. On Saturday, March 6, 2010,
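The two suggestions above (a background status thread and timing the longest put) might be sketched roughly as follows. This is only an illustration: `ProgressSink`, `Heartbeat`, and `SlowestOpTimer` are hypothetical names, not Hadoop or HBase classes. In a real reducer the sink would presumably wrap the task's own progress call, such as `Reporter.progress()`, and the timed operation would be the actual put.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the task's progress callback; only progress() is modeled. */
interface ProgressSink {
    void progress();
}

/** Pings the sink from a background daemon thread until closed. */
class Heartbeat implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    Heartbeat(ProgressSink sink, long periodMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "status-heartbeat");
            t.setDaemon(true); // the heartbeat alone never keeps the JVM alive
            return t;
        });
        // First ping fires immediately, then once per period.
        scheduler.scheduleAtFixedRate(sink::progress, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}

/** Remembers the duration of the slowest timed operation, e.g. a single put. */
class SlowestOpTimer {
    private final AtomicLong maxNanos = new AtomicLong();

    void time(Runnable op) {
        long start = System.nanoTime();
        try {
            op.run();
        } finally {
            long elapsed = System.nanoTime() - start;
            maxNanos.accumulateAndGet(elapsed, Math::max);
        }
    }

    long maxMillis() {
        return TimeUnit.NANOSECONDS.toMillis(maxNanos.get());
    }
}
```

Opening the `Heartbeat` in a try-with-resources block around the reduce work stops the pings when the work finishes. Note that an unconditional heartbeat also hides a genuinely hung task, so logging `maxMillis()` alongside it helps show whether the puts themselves are the slow part.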

Re: Task attempt failed to report status

2010-03-06 Thread steven zhuang
Thanks, Stack. I think the timeout may be caused by: 1. HDFS is slow; 2. a single row is way too big (millions of cells, around 50-100 MB), but I don't know exactly how it happened. I have checked the regionserver log, and there were lots of WARN messages like: "2010-03-05 01:44:22,881 WARN org.apache.hadoop

Re: Task attempt failed to report status

2010-03-06 Thread Stack
On Fri, Mar 5, 2010 at 1:12 AM, steven zhuang wrote: > when I import data into the HTable with a Map/Reduce job, the task runs > smoothly until the last reducer failed 6 times to report its status. How many reducers? All completed except this last one, and it failed in spite of 6 attempts? Pe

Re: Task attempt failed to report status

2010-03-05 Thread steven zhuang
Thanks again, J-D. In my table some rows may have as many as several million cells, which is why we chose HBase over any RDB. And sorry, I didn't notice that I pasted the tasktracker log in the previous message. I am checking the regionserver log, but I don't know which regionserver will ge

Re: Task attempt failed to report status

2010-03-05 Thread Jean-Daniel Cryans
How many cells do you normally output per row? It looks like a lot the way you are doing it, and it might cause problems. Also, did you take a look at the region server logs? Any distress messages in there? J-D On Fri, Mar 5, 2010 at 1:12 AM, steven zhuang wrote: > hi, all, > > when I impo

Task attempt failed to report status

2010-03-05 Thread steven zhuang
Hi all, when I import data into the HTable with a Map/Reduce job, the task runs smoothly until the last reducer failed 6 times to report its status. In my program I use batchupdate to collect every 1000 cells and then update the status. I don't think normal inserting will cost 10 mi
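The pattern described here (collect cells, write them out every 1000, then update the status) could look roughly like the sketch below. It is not the poster's code: `CellBatcher`, its `flusher` callback (which would wrap the actual HBase write), and its `onFlush` callback (which would wrap the status update) are all hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers cells and hands them to a flusher once maxCells have accumulated. */
class CellBatcher<T> {
    private final int maxCells;
    private final Consumer<List<T>> flusher; // hypothetical: would perform the table write
    private final Runnable onFlush;          // hypothetical: would report task status
    private final List<T> buffer = new ArrayList<>();

    CellBatcher(int maxCells, Consumer<List<T>> flusher, Runnable onFlush) {
        if (maxCells <= 0) {
            throw new IllegalArgumentException("maxCells must be positive");
        }
        this.maxCells = maxCells;
        this.flusher = flusher;
        this.onFlush = onFlush;
    }

    /** Adds one cell and flushes automatically when the batch is full. */
    void add(T cell) {
        buffer.add(cell);
        if (buffer.size() >= maxCells) {
            flush();
        }
    }

    /** Writes out whatever is buffered; call once more at the end for the remainder. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flusher.accept(new ArrayList<>(buffer)); // pass a copy so the buffer can be reused
        buffer.clear();
        onFlush.run(); // status is only updated after a write has returned
    }
}
```

In this arrangement the status update happens only after a write returns, so a single slow write can still exceed the task timeout no matter how small the batch is.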