Hi all, I have a small question about how MapReduce jobs behave with HBase.
I have an HBase test table with only 8 rows. I split the table into 2 splits with the hbase shell split command, so there are now 4 rows in each split. I created a MapReduce job that only prints the row key to the log files.

When I run the MapReduce job, every row is processed by 1 mapper. But the mappers in the same split are executed sequentially (inside the same container). That means the first four rows are processed sequentially by 4 mappers.

The system has free cores, so is it possible to process rows in parallel if they are located in the same split? The only way I have found to get 8 mappers running in parallel is to split the table into 8 splits (1 split per row). But obviously this is not a good solution for big tables ...

Thanks,
Ivan.
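To make the setup above concrete, here is a toy model in plain Java. It is not HBase or Hadoop API code: the class `SplitSketch`, its methods `toSplits` and `run`, and the row keys `row1` to `row8` are all hypothetical names chosen for illustration. It assumes that each split is handled by one task which walks through its rows one after another, while different splits can run on separate threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: plain threads stand in for map tasks, lists stand in for splits.
public class SplitSketch {

    // Groups sorted row keys into splits; each split point starts a new split.
    static List<List<String>> toSplits(List<String> sortedRowKeys, List<String> splitPoints) {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int next = 0;
        for (String key : sortedRowKeys) {
            // Close the current split whenever the key reaches the next split point.
            while (next < splitPoints.size() && key.compareTo(splitPoints.get(next)) >= 0) {
                splits.add(current);
                current = new ArrayList<>();
                next++;
            }
            current.add(key);
        }
        splits.add(current);
        return splits;
    }

    // Runs one task per split. Rows inside a split are visited sequentially by
    // that task; separate splits may run at the same time on different threads.
    // Returns a map from row key to the index of the task that handled it.
    static Map<String, Integer> run(List<List<String>> splits) {
        Map<String, Integer> rowToTask = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, splits.size()));
        try {
            List<Future<?>> tasks = new ArrayList<>();
            for (int i = 0; i < splits.size(); i++) {
                final int taskId = i;
                final List<String> rows = splits.get(i);
                tasks.add(pool.submit(() -> {
                    for (String row : rows) {
                        rowToTask.put(row, taskId); // stands in for "print the row key"
                    }
                }));
            }
            for (Future<?> task : tasks) {
                task.get(); // wait for every task to finish
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return rowToTask;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "row1", "row2", "row3", "row4", "row5", "row6", "row7", "row8");
        // One split point gives 2 splits of 4 rows, as in the test table above.
        List<List<String>> splits = toSplits(rows, Arrays.asList("row5"));
        System.out.println("splits: " + splits);
        System.out.println("row -> task: " + new TreeMap<>(run(splits)));
    }
}
```

In this model the amount of parallelism equals the number of splits, which is why 8 splits give 8 concurrent tasks and 2 splits give only 2. Whether a real job can get more parallelism without further splitting the table depends on the input format and mapper implementation it uses, which this sketch does not cover.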