[ 
https://issues.apache.org/jira/browse/PIG-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547269
 ] 

Sam Pullara commented on PIG-40:
--------------------------------

In BigDataBag.java:

    private boolean isMemoryAvailable(long memLimit){
        long freeMemory = Runtime.getRuntime().freeMemory();
        long usedMemory = Runtime.getRuntime().totalMemory() - freeMemory;
        return MAX_MEMORY-usedMemory > memLimit;
    }

The result of that call is then used to decide whether to write to disk:

                    if (!isMemoryAvailable(FREE_MEMORY_TO_MAINTAIN) && trueCount > 10) {
                        writeContentToDisk();
                    }

isMemoryAvailable isn't quite a random boolean, but it is close in all the JVM 
implementations I am aware of.
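
To illustrate why the figure is unreliable, here is a minimal sketch (not code from Pig; the class and method names are made up for this example). It computes "used" memory the same way isMemoryAvailable does, before and after allocating short-lived arrays that the program never retains. Because totalMemory() - freeMemory() also counts garbage that has not been collected yet, the reported change depends on when the collector last ran rather than on how much live data the program holds. The printed numbers vary by JVM, heap settings, and run, and a JIT compiler may optimize some of the allocations away.

```java
class HeapNumbers {
    /** The same figure BigDataBag computes: committed heap minus its free part. */
    static long usedAsMeasured(long totalMemory, long freeMemory) {
        return totalMemory - freeMemory;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = usedAsMeasured(rt.totalMemory(), rt.freeMemory());

        // Allocate about 64 MB in total, but never keep more than one
        // 64 KB array reachable at a time.
        long checksum = 0;
        for (int i = 0; i < 1024; i++) {
            byte[] scratch = new byte[64 * 1024];
            scratch[0] = (byte) i;
            checksum += scratch[0];
        }

        long after = usedAsMeasured(rt.totalMemory(), rt.freeMemory());
        System.out.println("arrays still reachable: none (checksum " + checksum + ")");
        System.out.println("change in measured 'used' memory: " + (after - before) + " bytes");
    }
}
```

Run it a few times, or with different -Xmx and collector settings, and the second line will typically differ from run to run even though the live data set is the same each time.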

> Memory management in BigDataBag is probably wrong
> -------------------------------------------------
>
>                 Key: PIG-40
>                 URL: https://issues.apache.org/jira/browse/PIG-40
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Sam Pullara
>
> src/org/apache/pig/data/BigDataBag.java
> 1) You should not use finalizers for things other than external resources -- 
> using them here is very dangerous and could inadvertently lead to deadlocks 
> and object resurrection, and it just decreases performance without any advantage.
> 2) Using .freeMemory() the way it is used in this class is broken.  
> freeMemory() is going to return a mostly random number between 0 and the real 
> amount.  Adding gc() in here is a terrible performance burden.  If you really 
> want to do something like this you should be using soft references and 
> reference queues.
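
The soft-reference idea in the quoted description could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed patch: SoftChunk, its reloader, and evictForDemo are hypothetical names, the data is assumed to be recoverable (for example, already written to a spill file) before it is held softly, and java.util.function.Supplier requires Java 8 or later. The garbage collector, rather than a freeMemory() check, decides when the in-memory copy goes away, and the ReferenceQueue reports which references were cleared.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical chunk of a bag whose contents can always be reloaded. */
class SoftChunk<T> {
    // Stands in for "read this chunk back from its spill file".
    private final Supplier<List<T>> reloader;
    // Receives the reference object once it has been cleared and enqueued.
    private final ReferenceQueue<List<T>> queue;
    private SoftReference<List<T>> ref;

    SoftChunk(List<T> data, Supplier<List<T>> reloader, ReferenceQueue<List<T>> queue) {
        this.reloader = reloader;
        this.queue = queue;
        this.ref = new SoftReference<>(data, queue);
    }

    /** Returns the in-memory copy if it is still there, otherwise reloads it. */
    synchronized List<T> get() {
        List<T> data = ref.get();
        if (data == null) {
            data = reloader.get();
            ref = new SoftReference<>(data, queue);
        }
        return data;
    }

    /** True while the soft reference still points at the data. */
    synchronized boolean isResident() {
        return ref.get() != null;
    }

    /** Mimics the collector clearing the reference; for demonstration only. */
    synchronized void evictForDemo() {
        ref.clear();
        ref.enqueue();
    }
}
```

A caller would poll the queue (queue.poll()) occasionally to learn which chunks were dropped and update its bookkeeping. The trade-off is that each chunk must be recoverable before it is held only softly, since the collector gives no chance to save the data at the moment it clears the reference.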

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
