[ 
https://issues.apache.org/jira/browse/PIG-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899221#action_12899221
 ] 

Thejas M Nair commented on PIG-1544:
------------------------------------

Note that it will not always be possible to determine, at query plan generation 
time, the number of bags that will be present at any one time during query 
execution. For example, a UDF could collect several bags. But that use case is 
likely to be rare, so I don't think it needs to be considered for the memory 
size limit estimate. It should be sufficient to count the places where bags are 
created in the query plan.
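
A rough sketch of that estimate, assuming the share is split evenly across the 
bag creation sites counted in the plan. The class and method names below are 
hypothetical and are not Pig's actual API; only the property name 
pig.cachedbag.memusage comes from the issue.

```java
// Hypothetical helper, not part of Pig: illustrates dividing the memory
// allotted to proactive-spill bags evenly among the bags expected in a plan.
final class BagMemoryShare {
    private BagMemoryShare() {}

    /**
     * @param heapBytes        total heap available, e.g. Runtime.getRuntime().maxMemory()
     * @param memUsageFraction fraction of the heap allotted to all proactive-spill
     *                         bags together (the value of pig.cachedbag.memusage)
     * @param bagCreationSites number of places in the query plan that create bags
     * @return the memory limit, in bytes, for each individual bag
     */
    static long perBagLimitBytes(long heapBytes, double memUsageFraction, int bagCreationSites) {
        if (heapBytes < 0) {
            throw new IllegalArgumentException("heapBytes must be non-negative");
        }
        if (memUsageFraction < 0.0 || memUsageFraction > 1.0) {
            throw new IllegalArgumentException("memUsageFraction must be in [0, 1]");
        }
        // Treat a plan with no counted sites as a single bag to avoid dividing by zero.
        int sites = Math.max(1, bagCreationSites);
        long sharedBytes = (long) (heapBytes * memUsageFraction);
        return sharedBytes / sites;
    }
}
```

A caller would pass the count obtained from the plan, for example 
BagMemoryShare.perBagLimitBytes(Runtime.getRuntime().maxMemory(), fraction, sites). 
A UDF that collects several bags at once would exceed this estimate, which is 
the rare case mentioned above.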




> proactive-spill bags should share the memory alloted for it
> -----------------------------------------------------------
>
>                 Key: PIG-1544
>                 URL: https://issues.apache.org/jira/browse/PIG-1544
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Thejas M Nair
>
> Initially, proactive-spill bags were designed for use in (co)group 
> (InternalCacheBag). They knew the total number of proactive bags that were 
> present and shared the memory limit specified by the property 
> pig.cachedbag.memusage.
> But the two proactive bag implementations that were added later, 
> InternalDistinctBag and InternalSortedBag, are not aware of the actual number 
> of bags being used; their users always assume total-numbags = 3. 
> This needs to be fixed so that all proactive-spill bags share the 
> memory limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
