[ 
https://issues.apache.org/jira/browse/PIG-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547283
 ] 

Benjamin Reed commented on PIG-40:
----------------------------------

We aren't really doing memory management. We just need to decide when to spill 
a bag to disk. We can't just count the elements of a bag since elements can be 
of different sizes. We also need to spill earlier if memory is constrained.

Using freeMemory may cause us to spill before we need to, but the important 
thing is that we make sure to spill when memory is constrained. There doesn't 
seem to be a better way to do it.
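
To make the trade-off concrete, here is a minimal sketch of a spill check driven by a free-memory estimate. It is not the BigDataBag code: the class name SpillPolicy, the forJvm factory, and the low-water-mark threshold are all hypothetical names chosen for illustration. Only Runtime.freeMemory() is the real JVM call under discussion.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not part of Pig. */
final class SpillPolicy {
    // Source of the free-memory estimate, injectable so the policy can be tested.
    private final LongSupplier freeBytes;
    // Assumed threshold: spill once the estimate drops below this many bytes.
    private final long lowWaterMarkBytes;

    SpillPolicy(LongSupplier freeBytes, long lowWaterMarkBytes) {
        this.freeBytes = freeBytes;
        this.lowWaterMarkBytes = lowWaterMarkBytes;
    }

    /** Builds a policy backed by Runtime.freeMemory(), as discussed above. */
    static SpillPolicy forJvm(long lowWaterMarkBytes) {
        Runtime rt = Runtime.getRuntime();
        return new SpillPolicy(rt::freeMemory, lowWaterMarkBytes);
    }

    /**
     * True when the estimate is below the threshold. Because the estimate
     * also counts garbage that has not been collected yet as used, this can
     * fire early, but it should not miss a genuinely constrained heap.
     */
    boolean shouldSpill() {
        return freeBytes.getAsLong() < lowWaterMarkBytes;
    }
}
```

A bag would call shouldSpill() as it adds tuples and write itself to disk when the answer is true. Note that freeMemory() only reports free space within the heap allocated so far; a variant that also accounts for room the heap can still grow into could use maxMemory() - totalMemory() + freeMemory() as the supplier instead.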

> Memory management in BigDataBag is probably wrong
> -------------------------------------------------
>
>                 Key: PIG-40
>                 URL: https://issues.apache.org/jira/browse/PIG-40
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Sam Pullara
>
> src/org/apache/pig/data/BigDataBag.java
> 1) You should not use finalizers for things other than external resources -- 
> using them here is very dangerous and could inadvertently lead to deadlocks 
> and object resurrection and just decreases performance without any advantage.
> 2) Using .freeMemory() the way it is used in this class is broken.  
> freeMemory() is going to return a mostly random number between 0 and the real 
> amount.  Adding gc() in here is a terrible performance burden.  If you really 
> want to do something like this you should be using soft references and 
> finalization queues.
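
For comparison, the soft-reference approach suggested in the issue description might look roughly like the sketch below. This is an assumption about what the reporter had in mind, not code from Pig: the class SoftBagHolder and its methods are hypothetical, and the sketch assumes the contents are already recoverable from disk, since a cleared soft reference cannot be read back. java.lang.ref.SoftReference and java.lang.ref.ReferenceQueue are the standard JDK classes.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.List;

/** Hypothetical holder for illustration; not Pig's BigDataBag. */
final class SoftBagHolder<T> {
    // The collector enqueues our reference here after clearing it.
    private final ReferenceQueue<List<T>> cleared = new ReferenceQueue<>();
    // Soft reference: the collector may clear it when memory runs low.
    private final SoftReference<List<T>> contents;
    private boolean reclaimed;

    SoftBagHolder(List<T> initial) {
        contents = new SoftReference<>(initial, cleared);
    }

    /** Returns the in-memory contents, or null if the collector reclaimed them. */
    List<T> inMemory() {
        return contents.get();
    }

    /** True once the cleared reference has been seen on the queue. */
    boolean wasReclaimed() {
        if (cleared.poll() != null) {
            reclaimed = true;
        }
        return reclaimed;
    }
}
```

With this shape the JVM, rather than a freeMemory() poll, decides when memory is tight. The cost is that the data must be on disk before it is held softly, so the decision of when to write it out does not go away.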

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
