Hi all, I have a small question. Please answer it even if it sounds silly.
I am using MapReduce with HBase in distributed mode. I have a table that spans 5 region servers, and I am using TableInputFormat to read the data from the table in the map phase. When I run the program, how many map tasks are created by default? Is it one per region server, or more? Also, after the map tasks are over, the reduce phase takes a bit more time. Is that due to moving the map output across the region servers, i.e. moving all the values for the same key to a particular reducer before it can start? Is there any way I can optimize the code (e.g. by storing the data for the same reducer close together)? Thanks :)