[ https://issues.apache.org/jira/browse/PIG-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981042#action_12981042 ]
Olga Natkovich commented on PIG-1803: ------------------------------------- There are two things: (1) Have you tried Pig 0.8 as we have made quite a bit of progress on memory utilization (2) There is a bug in Hadoop that causes memory overuse when combiner is used. I don't believe it has been addressed. Thejas, do you remember what JIRA number is for MR? > Maps are failing if combiner is enabled > --------------------------------------- > > Key: PIG-1803 > URL: https://issues.apache.org/jira/browse/PIG-1803 > Project: Pig > Issue Type: Bug > Reporter: Alex Rovner > Fix For: 0.7.0 > > > We are constantly hitting the java heap space memory issue if the combiner is > enabled on our jobs. > Configs: > pig.cachedbag.memusage=20 > io.sort.mb=300 > pig.exec.nocombiner=false > mapred.child.java.opts=-Xmx750m > Sample job: > {noformat} > A = LOAD '$INPUT' USING > com.contextweb.pig.CWHeaderLoader('$WORK_DIR/schema/rpt.xml'); > AA = foreach A GENERATE checkPointStart, PublisherId, TagId, > ContextCategoryId,Impressions, Clicks, Actions; > DESCRIBE AA; > B = GROUP AA BY (checkPointStart, PublisherId, TagId, > ContextCategoryId); > result = FOREACH B GENERATE group, SUM(AA.Impressions) as Impressions, > SUM(AA.Clicks) as Clicks, SUM(AA.Actions) as Actions; > DESCRIBE result; > STORE result INTO '$OUTPUT' USING com.contextweb.pig.CWHeaderStore(); > {noformat} > Mapper Error Log: > 2011-01-12 18:43:22,084 FATAL org.apache.hadoop.mapred.Child: Error running > child : java.lang.OutOfMemoryError: Java heap space > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:799) > at > org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:549) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315) > at org.apache.hadoop.mapred.Child$4.run(Child.java:217) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063) > at org.apache.hadoop.mapred.Child.main(Child.java:211) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.