[
https://issues.apache.org/jira/browse/PIG-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854608#action_12854608
]
Richard Ding commented on PIG-1348:
-----------------------------------
Thanks, Ashutosh. I changed the signature of write() to take values of type Tuple
instead of type Object.
On 1) and 3): Hadoop's LineRecordWriter#write() is a synchronized method, and I
think that the JVM is optimized for the 'instanceof' construct and also for
uncontended synchronization. I would prefer that we have some performance
numbers before adding optimizations.
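
One rough way to get such numbers is a small timing harness like the sketch below. It is not part of the patch: the class SyncCost and its methods are hypothetical names made up for illustration, and a naive loop like this is easily distorted by JIT warm-up and dead-code elimination, so a proper harness such as JMH would be more trustworthy for a real decision.

```java
// Hypothetical harness (not Pig or Hadoop code): compares a plain method
// with a synchronized one when only a single thread ever takes the lock.
public class SyncCost {
    long plain = 0;
    long locked = 0;

    void addPlain(long v) { plain += v; }

    // Uncontended here: no other thread competes for the monitor.
    synchronized void addLocked(long v) { locked += v; }

    // Returns elapsed nanoseconds for n calls to the plain method.
    static long timePlain(SyncCost c, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            c.addPlain(i);
        }
        return System.nanoTime() - start;
    }

    // Returns elapsed nanoseconds for n calls to the synchronized method.
    static long timeLocked(SyncCost c, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            c.addLocked(i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        SyncCost c = new SyncCost();
        int n = 10_000_000;
        // Warm-up rounds so the JIT has a chance to compile both paths.
        for (int round = 0; round < 5; round++) {
            timePlain(c, n);
            timeLocked(c, n);
        }
        System.out.println("plain ns:  " + timePlain(c, n));
        System.out.println("locked ns: " + timeLocked(c, n));
        // Print the sums so the loops cannot be optimized away entirely.
        System.out.println(c.plain + " " + c.locked);
    }
}
```

The absolute numbers depend on the JVM version and hardware, so they are only meaningful when compared on the same machine.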
> PigStorage making unnecessary byte array copy when storing data
> ---------------------------------------------------------------
>
> Key: PIG-1348
> URL: https://issues.apache.org/jira/browse/PIG-1348
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0
> Reporter: Ashutosh Chauhan
> Assignee: Richard Ding
> Fix For: 0.7.0
>
> Attachments: PIG-1348.patch, PIG-1348_2.patch
>
>
> InternalCachedBag estimates the memory available to the VM by using
> Runtime.getRuntime().maxMemory(). It then takes 10% (by default, though
> configurable) of this memory and divides it among the number of bags. It
> keeps track of the memory used by the bags and proactively spills if the bags'
> memory usage comes close to these limits. Given all this, in theory
> InternalCachedBag should not run out of memory when presented with more data
> than it can handle. But in practice we find OOM happening.
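
The budgeting arithmetic described above can be sketched as follows. This is an illustration only, not the actual InternalCachedBag code: the class BagMemoryBudget, its method names, and the even split across bags are assumptions made for the example.

```java
// Hypothetical sketch of the per-bag budget described in the issue text;
// names and the even split are assumptions, not Pig's implementation.
public class BagMemoryBudget {

    // Share of the heap given to all cached bags together, divided evenly.
    static long perBagLimit(long maxMemory, double fraction, int numBags) {
        if (numBags <= 0) {
            throw new IllegalArgumentException("numBags must be positive");
        }
        return (long) (maxMemory * fraction) / numBags;
    }

    // A bag spills once its tracked usage reaches its limit.
    static boolean shouldSpill(long bytesUsed, long limit) {
        return bytesUsed >= limit;
    }

    public static void main(String[] args) {
        long maxMemory = Runtime.getRuntime().maxMemory();
        long limit = perBagLimit(maxMemory, 0.10, 4);
        System.out.println("max heap bytes:      " + maxMemory);
        System.out.println("per-bag limit bytes: " + limit);
    }
}
```

Note that a budget of this kind only bounds the memory the bags account for themselves; any usage that is estimated too low or not tracked at all falls outside it.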