[ https://issues.apache.org/jira/browse/HADOOP-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971034#action_12971034 ]
David Rosenstrauch commented on HADOOP-6298:
--------------------------------------------

Yeesh. I just got bitten by this same bug, but from a different direction.

Calling BytesWritable.getBytes() returns a reference to the BytesWritable's internal byte array. I was calling that, and then using that byte array in subsequent processing. The problem is that the BytesWritable was still holding onto that same array and later modified it, thus modifying my "copy" as well. This was a really subtle bug that was hard to find, and I wasted a lot of time on it.

I realize there's a need to get access to a BytesWritable's internal byte storage without performing an array copy. But again, I think there needs to be some additional *safe* method to retrieve a byte array that's a *copy* of a BytesWritable's contents. There are just too many potential pitfalls for developers if the situation is left as is.

> BytesWritable#getBytes is a bad name that leads to programming mistakes
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-6298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6298
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.20.1
>            Reporter: Nathan Marz
>
> Pretty much everyone at Rapleaf who has worked with Hadoop has misused
> BytesWritable#getBytes at some point, not expecting the byte array to be
> padded. I think we can completely alleviate these programming mistakes by
> deprecating and renaming this method (again) to be more descriptive. I
> propose "getPaddedBytes()" or "getPaddedValue()". It would also be helpful to
> have a helper method "getNonPaddedValue()" that makes a copy into a
> non-padded byte array.
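
To make the two pitfalls in this thread concrete (the aliased backing array and the padding), here is a minimal Java sketch. So that it runs without Hadoop on the classpath, it uses a hypothetical stand-in class, FakeBytesWritable, which only imitates the behaviors described above; its growth policy is an assumption, not Hadoop's actual code. The helper name copyOfContents is likewise made up for illustration. Against the real class, the same idea would be Arrays.copyOf(bw.getBytes(), bw.getLength()).

```java
import java.util.Arrays;

class SafeCopyDemo {

    // Hypothetical stand-in for BytesWritable: a reusable, growable buffer
    // whose getBytes() hands out the backing array itself.
    static class FakeBytesWritable {
        private byte[] bytes = new byte[0];
        private int length = 0;

        void set(byte[] src) {
            if (src.length > bytes.length) {
                // Assumed policy: grow with headroom, so the array is padded.
                bytes = Arrays.copyOf(bytes, src.length * 3 / 2);
            }
            System.arraycopy(src, 0, bytes, 0, src.length);
            length = src.length;
        }

        byte[] getBytes() { return bytes; }  // backing array, no copy
        int getLength() { return length; }   // number of valid bytes
    }

    // Defensive copy: exactly getLength() bytes, detached from the buffer.
    static byte[] copyOfContents(FakeBytesWritable w) {
        return Arrays.copyOf(w.getBytes(), w.getLength());
    }

    public static void main(String[] args) {
        FakeBytesWritable w = new FakeBytesWritable();
        w.set(new byte[] {1, 2, 3});

        byte[] alias = w.getBytes();      // shares storage with w
        byte[] copy = copyOfContents(w);  // independent snapshot

        w.set(new byte[] {9, 9});         // the buffer is reused later

        System.out.println(alias.length + " " + alias[0]); // prints "4 9"
        System.out.println(copy.length + " " + copy[0]);   // prints "3 1"
    }
}
```

The alias is both padded (length 4 for 3 bytes of data) and overwritten by the later set() call, while the copy keeps the original three bytes. The cost is one array allocation per call, which is why a copying accessor makes sense as an addition next to the zero-copy one rather than a replacement for it.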