Gzip files (unlike uncompressed files) are not splittable, so each gzipped
file is read by a single mapper from start to finish. That may be causing
the behavior that you described.
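
A minimal Python sketch of why a gzip stream cannot be split, using only the
standard library. The sample data and the helper name try_decompress_from are
made up for illustration; this is not Hadoop code, just a demonstration that a
reader starting at an arbitrary byte offset cannot decode the stream.

```python
import gzip
import zlib

# Hypothetical sample data standing in for a large text input file.
data = b"".join(b"record %d\n" % i for i in range(100000))
compressed = gzip.compress(data)


def try_decompress_from(offset):
    """Try to decode the gzip stream starting at the given byte offset.

    Returns the decompressed bytes, or None if decoding fails.
    """
    try:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        return zlib.decompress(compressed[offset:], wbits=31)
    except zlib.error:
        return None


# Reading from the beginning works as expected.
from_start = try_decompress_from(0)

# Reading from the middle fails: there is no gzip header at that offset,
# and the decoder lacks the state built up from the earlier bytes.
from_middle = try_decompress_from(len(compressed) // 2)

print("from start :", "ok" if from_start == data else "failed")
print("from middle:", "ok" if from_middle is not None else "failed")
```

Because a second reader cannot start partway through the file, the whole file
has to go to one map task.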
On Dec 24, 2011 6:24 AM, "Niels Basjes" <ni...@basjes.nl> wrote:

> Hi,
>
> I noticed that the mapper progress indication in the Hadoop CDH3
> distribution jumps from 0% to 100% for each gzipped input file. So when
> running with big gzipped input files, the job appears to be stuck.
>
> I was unable to find a JIRA issue that describes this effect.
> Before I dive into this, I have a few questions for you:
> 1) Is this a known effect for the 0.20 version? If so, what is the JIRA
> issue?
> 2) Is this specific to gzip?
> 3) Is this effect still present in the MRv2/YARN version of Hadoop?
>
> Thanks.
> --
> Kind regards,
> Niels Basjes
> (Sent from mobile)
>
