Hi, I am running Hadoop over a collection of several million small files using CombineFileInputFormat.
However, the job fails while generating splits with a "GC overhead limit exceeded" error. I disabled that check with -server -XX:-UseGCOverheadLimit; I then get a java.lang.OutOfMemoryError: Java heap space, even with -Xmx8192m -server. Is there any way to avoid this limit when splitting the input?
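
For reference, here is a minimal sketch of the kind of driver I am talking about. It assumes the Hadoop 2 mapreduce API; SmallFilesDriver, the use of CombineTextInputFormat (standing in for a custom CombineFileInputFormat subclass), and the 256 MB split cap are all illustrative, not my exact setup:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SmallFilesDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "combine-small-files");
        job.setJarByClass(SmallFilesDriver.class);

        // Pack many small files into each split instead of one split per file.
        // (Illustrative: my real job uses its own CombineFileInputFormat subclass.)
        job.setInputFormatClass(CombineTextInputFormat.class);

        // Cap the size of a combined split (256 MB is an arbitrary example value);
        // this sets mapreduce.input.fileinputformat.split.maxsize, which
        // CombineFileInputFormat reads when building splits.
        FileInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);

        // Identity mapper; the record reader yields LongWritable/Text pairs.
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The failure happens on the client side, in getSplits(), before any task runs.

Regards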