Hi, I am doing bulk insertion into HBase using MapReduce, reading from a lot of small (~10 MB) files, so the number of mappers equals the number of files. I am also monitoring performance using Ganglia. The machines are c1.xlarge for processing the files (task trackers + data nodes) and m1.xlarge for the HBase cluster (region servers + data nodes). CPU usage stays at 75%-100% on almost all of the servers, and RAM usage stays below 5 GB. However, the job fails because a lot of map tasks get killed. If I run the same job without the insertion step, processing completes in 9-10 minutes. So the question is: why are so many maps being killed? Any clue?
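For reference, these are the standard Hadoop (0.20-era) job properties I have been wondering about, since they control when the JobTracker kills tasks; the values shown are the stock defaults, not something I have tuned:

```xml
<!-- mapred-site.xml (illustrative; stock Hadoop property names and defaults) -->
<property>
  <name>mapred.task.timeout</name>
  <!-- A task that reports no progress for this many milliseconds is killed.
       Default is 600000 (10 minutes). -->
  <value>600000</value>
</property>
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <!-- When true (the default), duplicate speculative attempts may be
       launched for slow maps, and the losing attempts are killed. -->
  <value>true</value>
</property>
```

I mention these only as a guess at where the kills might come from; I have not confirmed either is the cause in my job.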
-- Regards Shuja-ur-Rehman Baig <http://pk.linkedin.com/in/shujamughal>
