[ 
https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-3255:
------------------------------------

    Attachment: PIG-3255-2.patch

Attaching a patch that gets rid of the byte array copy entirely. However, this 
requires an interface change and will break backward compatibility. If it were 
an abstract class, I could have added a default implementation that retains the 
old method. But since it is an interface, adding a new method will give a 
runtime error for existing implementations anyway. So I removed the 
deserialize(byte[]) method.
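
To illustrate the trade-off, here is a minimal sketch, not the code in the attached patch. All names in it (RangeDeserializer, RangeDeserializerBase, DelimitedRangeDeserializer) are hypothetical, and it returns a list of strings as a stand-in for a Pig Tuple. It assumes the new method takes an offset and a length so the caller can pass its buffer without copying it first, and it shows how an abstract base class could have kept the old single-argument method as a delegating default.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the streaming deserializer interface; not Pig's actual API.
interface RangeDeserializer {
    // Reads fields from bytes[offset, offset + length) without copying the buffer first.
    List<String> deserialize(byte[] bytes, int offset, int length);
}

// With an abstract class, the old single-argument method can be retained
// as a default that delegates to the new one.
abstract class RangeDeserializerBase implements RangeDeserializer {
    public List<String> deserialize(byte[] bytes) {
        return deserialize(bytes, 0, bytes.length);
    }
}

// Splits the given byte range on a single-byte field delimiter.
final class DelimitedRangeDeserializer extends RangeDeserializerBase {
    private final byte fieldDel;

    DelimitedRangeDeserializer(byte fieldDel) {
        this.fieldDel = fieldDel;
    }

    @Override
    public List<String> deserialize(byte[] bytes, int offset, int length) {
        List<String> fields = new ArrayList<>();
        int end = offset + length;
        int start = offset;
        for (int i = offset; i < end; i++) {
            if (bytes[i] == fieldDel) {
                // Decode only this field's slice of the shared buffer.
                fields.add(new String(bytes, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // The last field runs to the end of the range.
        fields.add(new String(bytes, start, end - start, StandardCharsets.UTF_8));
        return fields;
    }
}
```

In this sketch each field is still decoded into its own String, but the whole buffer is never copied into an intermediate object before parsing.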
 
I don't see any documentation on implementing StreamToPig, and I am not sure 
how many users actually implement this interface. Is this change acceptable 
for the sake of performance? Thoughts?
                
> Avoid extra byte array copy in streaming deserialize
> ----------------------------------------------------
>
>                 Key: PIG-3255
>                 URL: https://issues.apache.org/jira/browse/PIG-3255
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.12
>
>         Attachments: PIG-3255-1.patch, PIG-3255-2.patch
>
>
> PigStreaming.java:
>     public Tuple deserialize(byte[] bytes) throws IOException {
>         Text val = new Text(bytes);
>         return StorageUtil.textToTuple(val, fieldDel);
>     }
> Should remove the new Text(bytes) copy and construct the tuple directly from 
> the bytes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
