[ 
https://issues.apache.org/jira/browse/PIG-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754491#action_12754491
 ] 

Ashutosh Chauhan commented on PIG-793:
--------------------------------------

In addition to String vs. Text, Alan also mentioned using an array instead of 
ArrayList<Object>. Did anyone take a look at that? I think that change should 
also help. When I benchmarked merge join, roughly 20-30% of CPU time was spent 
in ArrayList operations, and that overhead should shrink considerably if an 
array is used instead. So, changing to arrays should help both memory usage and 
CPU time, at the cost of more expensive appends.
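
To make the trade-off concrete, here is a minimal sketch of an array-backed 
tuple. The class name ArrayTupleSketch and its methods are hypothetical and are 
not Pig's actual Tuple API; the sketch only assumes that fields are stored in a 
plain Object[] sized to the schema. Reads and writes become direct array 
accesses, while append has to allocate and copy a new array on every call.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Pig's real Tuple implementation.
final class ArrayTupleSketch {
    // Fields live in a plain array sized exactly to the known schema,
    // so there is no spare capacity and no ArrayList wrapper object.
    private Object[] fields;

    public ArrayTupleSketch(int size) {
        fields = new Object[size];
    }

    // Direct array read: no range-check helper or modCount bookkeeping
    // beyond the JVM's own bounds check.
    public Object get(int i) {
        return fields[i];
    }

    // Direct array write into an existing slot.
    public void set(int i, Object val) {
        fields[i] = val;
    }

    public int size() {
        return fields.length;
    }

    // The expensive path: every append allocates a new array one slot
    // larger and copies all existing fields into it (O(n) per call).
    public void append(Object val) {
        fields = Arrays.copyOf(fields, fields.length + 1);
        fields[fields.length - 1] = val;
    }

    public static void main(String[] args) {
        ArrayTupleSketch t = new ArrayTupleSketch(2);
        t.set(0, "key");
        t.set(1, 42);
        t.append(3.14); // grows the backing array from 2 to 3 slots
        System.out.println(t.size() + " fields, last = " + t.get(2));
    }
}
```

If appends turn out to be common in some operators, a variant that grows the 
array geometrically would amortize the copy, but it would also give back part 
of the memory saving.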

Also, some small benefits can be gained by very simple changes introduced in 
https://issues.apache.org/jira/browse/PIG-513

> Improving memory efficiency of Tuple implementation
> ---------------------------------------------------
>
>                 Key: PIG-793
>                 URL: https://issues.apache.org/jira/browse/PIG-793
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Alan Gates
>
> Currently, our tuple is a real pig and uses a lot of extra memory. 
> There are several places where we can improve memory efficiency:
> (1) Laying out memory for the fields rather than using Java objects, since 
> each object for a numeric field takes 16 bytes
> (2) For the cases where we know the schema, using Java arrays rather than 
> ArrayList.
> There might be more.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
