[ https://issues.apache.org/jira/browse/CRUNCH-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929315#comment-15929315 ]
Micah Whitacre commented on CRUNCH-639:
---------------------------------------
The proposal is not functionally correct. ByteBuffer.array() returns the
entire backing byte[], and the relevant bytes may occupy only a portion of
that array. Here's a good write-up[1] that explains the difference, why
copying is necessary, and specifically why using input.array() on its own is
incorrect.
[1] https://worldmodscode.wordpress.com/2012/12/14/the-java-bytebuffer-a-crash-course/
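
To make the point concrete, here is a minimal sketch using only java.nio (no
Hadoop classes). The class name ByteBufferArrayDemo and the helpers
relevantBytes() and sampleView() are made up for illustration and are not part
of Crunch. It builds a ByteBuffer that is a view over the middle of a larger
array and shows that array() hands back the whole backing array, while the
bytes the buffer actually exposes are only a slice of it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; not part of Crunch or Hadoop.
class ByteBufferArrayDemo {

    // Copies only the bytes between position and limit.
    // duplicate() is used so the caller's position is left untouched.
    static byte[] relevantBytes(ByteBuffer input) {
        byte[] out = new byte[input.remaining()];
        input.duplicate().get(out);
        return out;
    }

    // Builds a 3-byte view over the middle of a 6-byte backing array.
    static ByteBuffer sampleView() {
        byte[] backing = {10, 20, 30, 40, 50, 60};
        return ByteBuffer.wrap(backing, 2, 3).slice();
    }

    public static void main(String[] args) {
        ByteBuffer view = sampleView();

        // The whole backing array, including bytes outside the view.
        System.out.println(Arrays.toString(view.array()));         // [10, 20, 30, 40, 50, 60]

        // Where the view starts inside the backing array.
        System.out.println(view.arrayOffset());                    // 2

        // Only the bytes the buffer actually exposes.
        System.out.println(Arrays.toString(relevantBytes(view)));  // [30, 40, 50]
    }
}
```

Passing view.array() straight to a consumer here would hand over all six
bytes rather than the three that belong to the buffer, which is the failure
mode the proposed change would introduce.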
> Writable Bytes does an unnecessary copy
> ---------------------------------------
>
> Key: CRUNCH-639
> URL: https://issues.apache.org/jira/browse/CRUNCH-639
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Reporter: Stephen Patel
> Assignee: Josh Wills
> Priority: Minor
>
> In the Writable.bytes() Output MapFn, an unnecessary (I believe) copy of the
> incoming ByteBuffer occurs[0].
> Current:
> {code}
> BytesWritable bw = new BytesWritable();
> bw.set(input.array(), input.arrayOffset(), input.limit()); // copies the array
> {code}
> Proposed:
> {code}
> BytesWritable bw = new BytesWritable(input.array());
> {code}
> [0]:
> https://github.com/apache/crunch/blob/apache-crunch-0.15.0/crunch-core/src/main/java/org/apache/crunch/types/writable/Writables.java#L271