Hello everyone,
I am running a MapReduce job where the map task executes one GET for each
key/value pair it processes.
The map tasks that run first complete quickly (in about 2 minutes, for example),
but the next map tasks need much more time (about 4 minutes), and the ones after
that need more than 15 minutes to complete.
It seems that HBase becomes overloaded and cannot respond fast enough.
While the MR job is running I have noticed the following:
1) The CPU usage of the map tasks is high at the beginning and then drops
to 4-5%. I think this means the map tasks spend most of their time waiting for
the results of the GET requests to be returned.
2) The used heap of the RegionServers (as shown in the web GUI) increases and
does not decrease even after the job has completed.
3) Using the "top" command, I see that the memory used by the RegionServer
process increases up to the heap limit I have configured (2GB) and does not go
down even after the job has completed.
Has anyone seen this behavior before?
Can you please help me?
Thank you for your help,
Panagiotis