Hi Koji,

That did help, thank you.  Now, can I specify this in the PIG_OPTS environment 
variable instead of in the Pig script?
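
For example, something like the following is what I have in mind (just a
sketch; I have not verified that the AM settings are actually picked up this
way, and myscript.pig is only a placeholder for my actual script):

export PIG_OPTS="-Dyarn.app.mapreduce.am.resource.mb=3584 -Dyarn.app.mapreduce.am.command-opts=-Xmx3096m"
pig myscript.pig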

Best regards,
Alex Soto




> On Apr 27, 2018, at 2:21 PM, Koji Noguchi <knogu...@oath.com.INVALID> wrote:
> 
> Hi Alex,
> 
> Can you try increasing the heapsize of the ApplicationMaster?
> 
> yarn.app.mapreduce.am.resource.mb=3584
> yarn.app.mapreduce.am.command-opts=-Xmx3096m
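> 
> For example, at the top of the Pig script (just a sketch; same SET form you
> used for the mapper/reducer settings):
> 
> SET yarn.app.mapreduce.am.resource.mb 3584;
> SET yarn.app.mapreduce.am.command-opts '-Xmx3096m';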
> 
> Koji
> 
> 
> 
> On Fri, Apr 27, 2018 at 1:49 PM, Alex Soto <alex.s...@envieta.com> wrote:
> 
>> Hello,
>> 
>> I am using Pig version 0.17.0.  When I attempt to run my Pig script from
>> the command line on a YARN cluster, I get out-of-memory errors.  From the
>> YARN application logs, I see this stack trace:
>> 
>> 2018-04-27 13:22:10,543 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
>> java.lang.OutOfMemoryError: Java heap space
>>        at java.util.Arrays.copyOfRange(Arrays.java:3664)
>>        at java.lang.String.<init>(String.java:207)
>>        at java.lang.StringBuilder.toString(StringBuilder.java:407)
>>        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2992)
>>        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>>        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1326)
>>        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1298)
>>        at org.apache.pig.backend.hadoop.datastorage.ConfigurationUtil.mergeConf(ConfigurationUtil.java:70)
>>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:185)
>>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:115)
>>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:89)
>>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
>>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:297)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:550)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:532)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1779)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:532)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:309)
>>        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1737)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:422)
>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1734)
>>        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1668)
>> 
>> 
>> Now, in trying to increase the heap size, I added this to the beginning of
>> the script:
>> 
>> 
>> SET mapreduce.map.java.opts '-Xmx2048m';
>> SET mapreduce.reduce.java.opts '-Xmx2048m';
>> SET mapreduce.map.memory.mb 2536;
>> SET mapreduce.reduce.memory.mb 2536;
>> 
>> But this has no effect; the settings appear to be ignored.  From the YARN
>> logs, I see the container being launched with a 1024m heap size:
>> 
>> echo "Launching container"
>> exec /bin/bash -c "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp
>> -Dlog4j.configuration=container-log4j.properties
>> -Dyarn.app.container.log.dir=/opt/hadoop/lo
>> gs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001
>> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>> -Dhadoop.
>> root.logfile=syslog  -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster
>> 1>/opt/hadoop/logs/userlogs/application_1523452171521_
>> 0223/container_1523452171
>> 521_0223_01_000001/stdout 2>/opt/hadoop/logs/userlogs/
>> application_1523452171521_0223/container_1523452171521_0223_01_000001/stderr
>> “
>> 
>> I also tried setting the memory requirements with the PIG_OPTS environment
>> variable:
>> 
>> export PIG_OPTS="-Dmapreduce.reduce.memory.mb=5000 -Dmapreduce.map.memory.mb=5000 -Dmapreduce.map.java.opts=-Xmx5000m"
>> 
>> No matter what I do, the container is always launched with -Xmx1024m and
>> the same OOM error occurs.
>> The question is, what is the proper way to specify the heap sizes for my
>> Pig mappers and reducers?
>> 
>> Best regards,
>> Alex Soto
>> 
>> 
>> 
