Hi all, I'm writing an MR job that reads data using HCatInputFormat. However, the job is generating too many splits. I don't have this problem when running queries in Hive, since it combines splits by default.
Is there an equivalent for MR jobs, so that I don't end up with thousands of mappers? Thanks, Pradeep