I am trying to store a protocol buffer as a BytesWritable in a sequence file <Text, BytesWritable>. Each record is written with SequenceFile.Writer as (new Text(key), new BytesWritable(protobuf.convertToBytes())). When I read the key/value pairs back, value.get() returns more bytes than what was stored, while value.getSize() returns the correct number. This means that in order to convert the byte[] back to a protocol buffer, I have to do Arrays.copyOf(value.get(), value.getSize()). This happens on both version 0.17.2 and 0.18.3. Does anyone know why this happens? The extra bytes in value.get() all have values of zero. Sample sizes for a few entries in the sequence file are below.
value.getSize(): 7066   value.get().length: 10599
value.getSize(): 36456  value.get().length: 54684
value.getSize(): 32275  value.get().length: 54684
value.getSize(): 40561  value.get().length: 54684
value.getSize(): 16855  value.get().length: 54684
value.getSize(): 66304  value.get().length: 99456
value.getSize(): 26488  value.get().length: 99456
value.getSize(): 59327  value.get().length: 99456
value.getSize(): 36865  value.get().length: 99456

--
View this message in context: http://www.nabble.com/BytesWritable-get%28%29-returns-more-bytes-then-what%27s-stored-tp22962146p22962146.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
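
One pattern that would be consistent with the sizes listed, offered as an assumption rather than a confirmed explanation: the value object is reused between records and keeps a backing array that grows to 1.5 times the requested size whenever a record does not fit (7066 * 3 / 2 = 10599, 36456 * 3 / 2 = 54684, 66304 * 3 / 2 = 99456), so get() exposes the whole array and getSize() reports only the valid prefix. The sketch below reproduces that behaviour with a hypothetical stand-in class, ReusableBytes, which is not Hadoop's BytesWritable, and shows the Arrays.copyOf trimming mentioned in the question. The class name, the growth policy, and the helper validBytes are all illustrative.

```java
import java.util.Arrays;

public class TrimValidBytes {

    /** Hypothetical stand-in for a reusable value object. NOT Hadoop's BytesWritable. */
    static final class ReusableBytes {
        private byte[] buffer = new byte[0];
        private int size = 0;

        /** Copies data in; grows the backing array to 1.5x only when it is too small (assumed policy). */
        void set(byte[] data) {
            if (data.length > buffer.length) {
                buffer = new byte[data.length * 3 / 2];
            }
            System.arraycopy(data, 0, buffer, 0, data.length);
            size = data.length;
        }

        /** Returns the whole backing array, including any unused tail. */
        byte[] get() {
            return buffer;
        }

        /** Returns the number of valid bytes at the start of the backing array. */
        int getSize() {
            return size;
        }
    }

    /** Copies out only the valid prefix, as in Arrays.copyOf(value.get(), value.getSize()). */
    static byte[] validBytes(ReusableBytes value) {
        return Arrays.copyOf(value.get(), value.getSize());
    }

    public static void main(String[] args) {
        ReusableBytes value = new ReusableBytes();
        int[] recordSizes = {7066, 36456, 32275};
        for (int recordSize : recordSizes) {
            value.set(new byte[recordSize]);
            // Prints: 7066 10599, then 36456 54684, then 32275 54684
            System.out.println(value.getSize() + " " + value.get().length);
        }
        // The trimmed copy has exactly getSize() bytes.
        System.out.println(validBytes(value).length); // prints 32275
    }
}
```

If the protocol buffer parser in use accepts an offset and a length along with the array, passing value.get(), 0 and value.getSize() would avoid the extra copy; whether such an overload exists depends on the protobuf version, so check before relying on it.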