[ 
https://issues.apache.org/jira/browse/PIG-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854611#action_12854611
 ] 

Alan Gates commented on PIG-1348:
---------------------------------

bq. In StorageUtil.putField(), is it possible to get rid of 
DataType.findType(), possibly by getting hold of the schema and getting type 
information from there? If not, then maybe we cache the type info the first 
time, instead of finding it on every call. At the very least, we should get rid 
of casts for simple types, as that's unnecessary. DataType.isComplex() can be 
used to determine that. 

We have to be careful here.  In the case where a schema is given, it's ok to 
use that to cast types.  In cases without schema we cannot assume that all 
records match the first, because Pig does not impose that as a requirement on 
the data.  So looking at the first record and caching results is not ok.
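
To illustrate the concern, here is a minimal sketch in plain Java. It does not use Pig's classes: FieldType, findType() and resolveType() below are hypothetical stand-ins, only loosely modelled on the idea behind DataType.findType(). It assumes a declared schema can be represented as an array of type tags and shows why a type cached from the first record can be wrong for a later one when no schema is given.

```java
import java.util.Arrays;
import java.util.List;

class PerRecordTypeSketch {
    // Hypothetical type tags; stand-ins for illustration, not Pig's DataType constants.
    enum FieldType { INTEGER, LONG, DOUBLE, CHARARRAY, BYTEARRAY, UNKNOWN }

    // Inspect the runtime class of a single value.
    static FieldType findType(Object value) {
        if (value instanceof Integer) return FieldType.INTEGER;
        if (value instanceof Long) return FieldType.LONG;
        if (value instanceof Double) return FieldType.DOUBLE;
        if (value instanceof String) return FieldType.CHARARRAY;
        if (value instanceof byte[]) return FieldType.BYTEARRAY;
        return FieldType.UNKNOWN;
    }

    // With a declared schema, trust it; without one, inspect every value,
    // because nothing guarantees that later records match earlier ones.
    static FieldType resolveType(FieldType[] schema, int index, Object value) {
        if (schema != null && index < schema.length) {
            return schema[index];
        }
        return findType(value);
    }

    public static void main(String[] args) {
        // Two records from the same schemaless input; field 0 changes type.
        List<Object> first = Arrays.<Object>asList(42, "alice");
        List<Object> second = Arrays.<Object>asList("n/a", "bob");

        // What a "cache on first record" strategy would remember for field 0.
        FieldType cached = findType(first.get(0));
        // What field 0 of the second record actually is.
        FieldType actual = resolveType(null, 0, second.get(0));

        System.out.println("cached from first record: " + cached);
        System.out.println("actual for second record: " + actual);
    }
}
```

In this sketch the cached type for field 0 is INTEGER while the second record holds a CHARARRAY there, so a cast based on the cached value would fail. When a schema is passed to resolveType(), the per-value inspection is skipped entirely.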

> PigStorage making unnecessary byte array copy when storing data
> ---------------------------------------------------------------
>
>                 Key: PIG-1348
>                 URL: https://issues.apache.org/jira/browse/PIG-1348
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.7.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Richard Ding
>             Fix For: 0.7.0
>
>         Attachments: PIG-1348.patch, PIG-1348_2.patch
>
>
> InternalCachedBag estimates the memory available to the VM by using 
> Runtime.getRuntime().maxMemory(). It then takes 10% of this memory (by 
> default, though configurable) and divides it among the bags. It keeps track 
> of the memory used by the bags and proactively spills when their memory 
> usage gets close to these limits. Given all this, in theory 
> InternalCachedBag should not run out of memory when presented with more data 
> than it can handle. But in practice we find OOM happening. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
