Thanks Douglas,

The details you asked for are:

yarn.scheduler.minimum-allocation-mb = 2 GB
yarn.scheduler.maximum-allocation-mb = 128 GB
increment = 512 MB
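As a rough illustration (an assumption on my part that the Fair Scheduler rounds each request up to the 512 MB increment, not something taken from the cluster logs), the container sizes would work out as:

    # hypothetical illustration of container sizing under these settings
    # requested = mapreduce.map.memory.mb             = 2048 MB
    # container = roundUp(max(2048, min-alloc), 512)  = 2048 MB  -> killed once usage reaches ~2.3 GB
    # requested = 3072 MB
    # container = roundUp(max(3072, min-alloc), 512)  = 3072 MB  -> survives the observed ~2.3 GB usage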
Please help with design considerations for how many mappers should be used for Sqoop. I believe each mapper's memory is capped, so does this mean that the data fetched by 6 mappers using 2 GB each is capped at around 12 GB? The cluster is launching exactly the number of mappers specified and is not exceeding that task count.

Regards
Harpreet Singh

On Aug 2, 2017 7:19 PM, "Douglas Spadotto" <[email protected]> wrote:

Hello Harpreet,

It seems that your job is going beyond the limits established. What are the values for yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb on your cluster?

Some background on the meaning of these configurations can be found here:
https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters

Regards,
Douglas

On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <[email protected]> wrote:
> Hi All,
> I have a Sqoop job running in production that fails sometimes.
> Restarting the job completes successfully.
> The logs show the failure happens with the error that the container is
> running beyond physical memory limits: "Current usage 2.3 GB of 2 GB
> physical memory used; 4.0 GB of 4.2 GB virtual memory used. Killing
> container."
> Environment:
> CDH 5.8.3
> Sqoop 1 client
> mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
> mapreduce.map.memory.mb = 2 GB
>
> Sqoop job details: pulling data from Netezza using 6 mappers and writing
> it in Parquet format on HDFS. The data processed is 14 GB. The splits
> appear to be even.
> Please provide your insights.
>
> Regards
> Harpreet Singh
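A minimal sketch of how the per-mapper memory could be raised directly on the Sqoop command line, assuming a generic JDBC connection to Netezza; the host, database, user, table, target directory, and the 3072 MB / -Xmx2400m figures below are placeholders for illustration, not values from the actual job:

    sqoop import \
      -D mapreduce.map.memory.mb=3072 \
      -D "mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx2400m" \
      --connect jdbc:netezza://nz-host:5480/PRODDB \
      --username etl_user -P \
      --table SOURCE_TABLE \
      --as-parquetfile \
      --target-dir /data/landing/source_table \
      -m 6

The -D generic options are passed before the tool-specific arguments so they reach the launched MapReduce job; keeping -Xmx at roughly 75-80% of mapreduce.map.memory.mb leaves room for non-heap usage inside the container.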
