[ https://issues.apache.org/jira/browse/PIG-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-3255:
------------------------------------
Attachment: PIG-3255-2.patch
Attaching a patch that gets rid of the byte array copy entirely. However, this
requires an interface change and will break backward compatibility. If it were
an abstract class, I could have added a default implementation that retains the
old method. But since it is an interface, adding a new method will cause a
runtime error for existing implementations anyway, so I removed the
deserialize(byte[]) method.
I don't see any documentation on implementing StreamToPig, so I am not sure how
many users actually implement this interface. Is this change acceptable for
the sake of performance? Thoughts?
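
To make the compatibility trade-off concrete, here is a minimal sketch of the abstract-class alternative mentioned above. All names (CompatSketch, AbstractDeserializer, LegacyImpl) and the offset/length signature are hypothetical and chosen for illustration; they are not the types or signatures in the patch. With an abstract class, the new method can fall back to the old one, so existing subclasses keep working, at the cost of a copy on that fallback path. An interface without default methods offers no such fallback.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical names throughout; these are not Pig's actual types.
class CompatSketch {

    abstract static class AbstractDeserializer {
        // Old entry point that existing subclasses already implement.
        abstract String deserialize(byte[] bytes) throws IOException;

        // New entry point. The default body copies the range and delegates to
        // the old method, so subclasses written earlier still work unchanged.
        String deserialize(byte[] bytes, int offset, int length) throws IOException {
            return deserialize(Arrays.copyOfRange(bytes, offset, offset + length));
        }
    }

    // An implementation written before the new method existed.
    static class LegacyImpl extends AbstractDeserializer {
        @Override
        String deserialize(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] buf = "xxhello".getBytes(StandardCharsets.UTF_8);
        // LegacyImpl never overrides the three-argument method, yet this works.
        System.out.println(new LegacyImpl().deserialize(buf, 2, 5)); // prints "hello"
    }
}
```

Implementations that care about performance could override the three-argument method and skip the copy; the rest would continue to run unmodified.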
> Avoid extra byte array copy in streaming deserialize
> ----------------------------------------------------
>
> Key: PIG-3255
> URL: https://issues.apache.org/jira/browse/PIG-3255
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.12
>
> Attachments: PIG-3255-1.patch, PIG-3255-2.patch
>
>
> PigStreaming.java:
>     public Tuple deserialize(byte[] bytes) throws IOException {
>         Text val = new Text(bytes);
>         return StorageUtil.textToTuple(val, fieldDel);
>     }
> We should remove the new Text(bytes) copy and construct the tuple directly
> from the bytes.
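
As a rough illustration of what "construct the tuple directly from the bytes" could look like, the sketch below splits a region of a byte array on a delimiter without first copying the whole region into an intermediate object. It is a standalone example under stated assumptions: FieldSplitSketch and splitFields are hypothetical names, fields are decoded as UTF-8 strings rather than Pig data types, and this is not the code in StorageUtil or in the attached patches.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Pig.
class FieldSplitSketch {

    // Splits buf[offset, offset + length) on delim. Only the individual
    // fields are materialized; the enclosing range is never copied as a whole.
    static List<String> splitFields(byte[] buf, int offset, int length, byte delim) {
        List<String> fields = new ArrayList<>();
        int end = offset + length;
        int start = offset;
        for (int i = offset; i < end; i++) {
            if (buf[i] == delim) {
                fields.add(new String(buf, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        // Trailing field; empty if the range ends with a delimiter or is empty.
        fields.add(new String(buf, start, end - start, StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) {
        // The record occupies bytes 2..7 of a larger, reused buffer.
        byte[] buf = "??a\tbc\td!!".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitFields(buf, 2, 6, (byte) '\t')); // prints [a, bc, d]
    }
}
```

Passing an offset and length lets the caller hand over a slice of a reused buffer, which is what makes skipping the up-front copy possible.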
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira