[ https://issues.apache.org/jira/browse/PIG-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899221#action_12899221 ]
Thejas M Nair commented on PIG-1544:
------------------------------------

Note that the number of bags that will be live at one time during query execution cannot, in all cases, be determined at query plan generation time. For example, a UDF could collect several bags. But that use case is likely to be rare, so I don't think it needs to be considered in the memory size limit estimate. It should be sufficient to count the number of places in the query plan where bags are created.

> proactive-spill bags should share the memory alloted for it
> -----------------------------------------------------------
>
>                 Key: PIG-1544
>                 URL: https://issues.apache.org/jira/browse/PIG-1544
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Thejas M Nair
>
> Initially, proactive-spill bags were designed for use in (co)group
> (InternalCacheBag); they knew the total number of proactive bags that were
> present, and shared the memory limit specified using the property
> pig.cachedbag.memusage .
> But the two proactive bag implementations that were added later,
> InternalDistinctBag and InternalSortedBag, are not aware of the actual number
> of bags being used; their users always assume total-numbags = 3.
> This needs to be fixed, and all proactive-spill bags should share the
> memory limit.
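
To make the proposal concrete, here is a minimal sketch in Java of splitting one shared memory budget across the bag creation sites counted in a query plan. It is not Pig's implementation: the class BagMemoryShare, the method perBagLimit, and its parameters are hypothetical names for illustration. The fraction argument stands in for whatever value pig.cachedbag.memusage supplies, and the site count is assumed to come from a pass over the plan.

```java
// Hypothetical illustration only; none of these names exist in Pig.
final class BagMemoryShare {

    /**
     * Splits the memory set aside for proactive-spill bags evenly across
     * the bag creation sites counted in the query plan.
     *
     * @param maxHeapBytes     heap available to the task, in bytes
     * @param memUsageFraction fraction of the heap reserved for these bags
     *                         (stand-in for the pig.cachedbag.memusage value)
     * @param bagCreationSites number of places in the plan that create bags
     * @return the byte budget for each bag before it should spill
     */
    static long perBagLimit(long maxHeapBytes, double memUsageFraction, int bagCreationSites) {
        if (maxHeapBytes < 0) {
            throw new IllegalArgumentException("maxHeapBytes must be non-negative");
        }
        if (memUsageFraction <= 0.0 || memUsageFraction > 1.0) {
            throw new IllegalArgumentException("memUsageFraction must be in (0, 1]");
        }
        // Treat a plan with no counted sites as having one, to avoid dividing by zero.
        int sites = Math.max(1, bagCreationSites);
        return (long) (maxHeapBytes * memUsageFraction / sites);
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        // Fixed assumption described in the issue (total-numbags = 3)
        // versus a count taken from the plan (5 here, chosen arbitrarily).
        System.out.println("assumed 3 bags: " + perBagLimit(heap, 0.25, 3));
        System.out.println("counted 5 bags: " + perBagLimit(heap, 0.25, 5));
    }
}
```

With a fixed divisor of 3, a plan that actually creates more bags can exceed the intended budget in total; dividing by the counted number of creation sites keeps the sum within it, except in the rare case noted above where a UDF collects additional bags at run time.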