[ https://issues.apache.org/jira/browse/PIG-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-5384:
------------------------------
    Attachment: pig-5384-v01-halfway.patch

Attaching {{pig-5384-v01-halfway.patch}} to give you an idea. I have only changed DefaultDataBag so far; if I take this route, I will need to make similar changes to the other bags.

For handling spill failures, calling System.exit() is the reliable way, but I think setting mContents to null would let the reader fail reliably (unless users have a custom Bag that is doing something very unusual).

> OOM while spilling large bag
> -----------------------------
>
>                 Key: PIG-5384
>                 URL: https://issues.apache.org/jira/browse/PIG-5384
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>         Attachments: pig-5384-v01-halfway.patch
>
>
> One of the common OOM issues in Pig is hitting OOM while trying to spill
> a large bag. The current solution is to increase the heap size or tweak
> {noformat}
> pig.spill.memory.usage.threshold.fraction
> pig.spill.collection.threshold.fraction
> pig.spill.unused.memory.threshold.size
> {noformat}
> and make sure spilling starts early enough. These params are still critical,
> but I am wondering if any improvement can be made to increase the chances of
> avoiding OOM while spilling a single large bag.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
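
For illustration only, the "set mContents to null on spill failure" idea from the comment above could look roughly like the sketch below. This is not the attached patch and not Pig's real DefaultDataBag: the class names FailFastBagSketch, Bag, and FailingWriter, the String "tuples", and the explicit IllegalStateException are all assumptions made for the example. It only handles an IOException during the write, and reading spilled data back is left out.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a Pig bag; NOT the real DefaultDataBag.
public class FailFastBagSketch {

    static class Bag implements Iterable<String> {
        // Plays the role of mContents: the tuples currently held in memory.
        private List<String> contents = new ArrayList<>();

        synchronized void add(String tuple) {
            contents.add(tuple);
        }

        // Writes the in-memory tuples to `out` and returns how many were spilled.
        // On a write failure the contents are dropped (set to null) and 0 is returned.
        synchronized long spill(Writer out) {
            if (contents == null) {
                return 0; // an earlier spill already failed
            }
            long spilled = 0;
            try {
                for (String t : contents) {
                    out.write(t);
                    out.write('\n');
                    spilled++;
                }
                out.flush();
                contents.clear(); // memory released; reading the spill back is omitted here
                return spilled;
            } catch (IOException e) {
                // The spill stopped part-way, so the bag is no longer trustworthy.
                // Nulling the contents makes every later read fail instead of
                // silently returning partial data.
                contents = null;
                return 0;
            }
        }

        @Override
        public synchronized Iterator<String> iterator() {
            if (contents == null) {
                // Assumption: an explicit error; a bare null would surface as a
                // NullPointerException in the reader instead.
                throw new IllegalStateException("bag is unusable: an earlier spill failed");
            }
            return new ArrayList<>(contents).iterator(); // iterate over a snapshot
        }
    }

    // A writer that always fails, used to simulate a broken spill target.
    static class FailingWriter extends Writer {
        @Override public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated disk failure");
        }
        @Override public void flush() {}
        @Override public void close() {}
    }

    public static void main(String[] args) {
        Bag bag = new Bag();
        bag.add("a");
        bag.add("b");
        bag.spill(new FailingWriter());
        try {
            bag.iterator();
            System.out.println("reader did not notice the failed spill");
        } catch (IllegalStateException e) {
            System.out.println("reader failed as intended: " + e.getMessage());
        }
    }
}
```

The trade-off compared with System.exit() is that the process keeps running and the failure only surfaces when something next reads the bag, which is why a custom Bag with unusual read paths might not notice it.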