Hello everyone,

I am running a MapReduce job in which each map task executes one HBase GET for 
each key/value pair it processes.

The map tasks that run first complete quickly (in about 2 minutes, for example), 
but the next ones take much longer (about 4 minutes), and later tasks need more 
than 15 minutes to complete.

It seems that HBase becomes overloaded and cannot respond fast enough.

While the MR job is running I have noticed the following:

1) The CPU usage of the map tasks is high at the beginning and then drops to 
4-5%. I think this means that the tasks spend most of their time waiting for 
the GET results to be returned.

2) The used heap of the RegionServers (as shown in the web GUI) increases and 
does not decrease even after the job has completed.

3) Using the "top" command, I see that the memory used by the RegionServer 
process increases up to the heap limit I have configured (2GB) and does not go 
down even after the job has completed.

Has anyone seen similar behavior?

Can you please help me?

Thank you for your help,
Panagiotis

                                          
