[ https://issues.apache.org/jira/browse/PIG-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547269 ]
Sam Pullara commented on PIG-40:
--------------------------------
In BigDataBag.java:
    private boolean isMemoryAvailable(long memLimit) {
        long freeMemory = Runtime.getRuntime().freeMemory();
        long usedMemory = Runtime.getRuntime().totalMemory() - freeMemory;
        return MAX_MEMORY - usedMemory > memLimit;
    }
The result of that call is then used to decide whether to write to disk:
    if (!isMemoryAvailable(FREE_MEMORY_TO_MAINTAIN) &&
            trueCount > 10) {
        writeContentToDisk();
    }
isMemoryAvailable isn't quite a random boolean, but it is close in all the JVM
implementations I am aware of.
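
To illustrate why the instantaneous number is unreliable: totalMemory() - freeMemory()
counts garbage that has not been collected yet, so it swings with GC timing. A minimal
sketch of one alternative follows, using the standard java.lang.management API to read
heap usage as of the last collection. The class name HeapHeadroom and its methods are
hypothetical and not part of Pig. Note that before the first collection the "after GC"
figure may be 0, and some pools may not report it at all, so this is an approximation
rather than a drop-in fix.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Pig: compares two ways of estimating heap use.
final class HeapHeadroom {

    /** Bytes that survived the most recent GC, summed over heap pools that report it. */
    static long liveBytesAfterLastGc() {
        long live = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) {
                live += afterGc.getUsed();
            }
        }
        return live;
    }

    /** The instantaneous figure the quoted check relies on; includes uncollected garbage. */
    static long usedBytesNow() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Same shape as the quoted method, but based on post-GC usage. */
    static boolean isMemoryAvailable(long memLimit) {
        long max = Runtime.getRuntime().maxMemory();
        return max - liveBytesAfterLastGc() > memLimit;
    }

    public static void main(String[] args) {
        System.out.println("used right now:     " + usedBytesNow());
        System.out.println("live after last GC: " + liveBytesAfterLastGc());
        System.out.println("64 MB available?    " + isMemoryAvailable(64L * 1024 * 1024));
    }
}
```

The two printed figures can differ widely on the same heap, which is the point of the
comment above: only the post-GC one says anything about data that is actually still
reachable.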
> Memory management in BigDataBag is probably wrong
> -------------------------------------------------
>
> Key: PIG-40
> URL: https://issues.apache.org/jira/browse/PIG-40
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Sam Pullara
>
> src/org/apache/pig/data/BigDataBag.java
> 1) You should not use finalizers for things other than external resources --
> using them here is very dangerous and could inadvertently lead to deadlocks
> and object resurrection, and it just decreases performance without any advantage.
> 2) Using .freeMemory() the way it is used in this class is broken.
> freeMemory() is going to return a mostly random number between 0 and the real
> amount. Adding gc() in here is a terrible performance burden. If you really
> want to do something like this you should be using soft references and
> finalization queues.
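
A rough sketch of the soft-reference idea from point 2, assuming each chunk of a bag
already has a copy on disk so that the in-memory copy is only a cache. The class
SoftChunk, its reload supplier (a stand-in for reading a spill file), and evictForDemo
are all hypothetical names, not anything in BigDataBag. The collector clears soft
references only under memory pressure and enqueues them on the ReferenceQueue, so no
finalizer and no explicit gc() call is needed.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: an in-memory chunk that the GC may reclaim under memory pressure.
final class SoftChunk<T> {
    private final Supplier<List<T>> reload;        // stand-in for re-reading the spill file
    private final ReferenceQueue<List<T>> queue;   // receives references the GC has cleared
    private SoftReference<List<T>> cached;
    private int reloads = 0;

    SoftChunk(List<T> initial, Supplier<List<T>> reload, ReferenceQueue<List<T>> queue) {
        this.reload = reload;
        this.queue = queue;
        this.cached = new SoftReference<>(initial, queue);
    }

    /** Returns the data, reloading it if the collector reclaimed the cached copy. */
    synchronized List<T> get() {
        List<T> data = cached.get();
        if (data == null) {
            data = reload.get();
            cached = new SoftReference<>(data, queue);
            reloads++;
        }
        return data;
    }

    synchronized int reloadCount() {
        return reloads;
    }

    /** Demo only: mimics what the collector does when it reclaims the chunk. */
    synchronized void evictForDemo() {
        cached.clear();
        cached.enqueue();
    }

    public static void main(String[] args) {
        List<Integer> onDisk = Arrays.asList(1, 2, 3); // pretend this lives in a spill file
        ReferenceQueue<List<Integer>> queue = new ReferenceQueue<>();
        SoftChunk<Integer> chunk =
                new SoftChunk<>(new ArrayList<>(onDisk), () -> new ArrayList<>(onDisk), queue);

        System.out.println(chunk.get());
        chunk.evictForDemo();
        System.out.println("cleared reference queued: " + (queue.poll() != null));
        System.out.println(chunk.get() + " after " + chunk.reloadCount() + " reload(s)");
    }
}
```

The design choice here is that the JVM, not the application, decides when memory is
tight. The application only has to make sure the data is recoverable before it hands
the in-memory copy to a soft reference.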