Hi,

I think there is a bug in CompressedWritable, more exactly in the write() method, which makes subclasses of CompressedWritable not "reusable": the data written to the DataOutput is always the same, because it is cached in the "compressed" field.

 public final void write(DataOutput out) throws IOException {
   if (compressed == null) {     // <-- the culprit: "compressed" is never reset
     ByteArrayOutputStream deflated = new ByteArrayOutputStream();
     Deflater deflater = new Deflater(Deflater.BEST_SPEED);
     DataOutputStream dout =
       new DataOutputStream(new DeflaterOutputStream(deflated, deflater));
     writeCompressed(dout);
     dout.close();
     compressed = deflated.toByteArray();
   }
   out.writeInt(compressed.length);
   out.write(compressed);
 }

(code from hadoop 0.12.1)


This makes subclasses not "reusable": no matter how the fields change, the data written will be the same, because the cached byte array is reused on every subsequent write().
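To illustrate, here is a minimal self-contained sketch of the same caching pattern, plus one possible workaround: having the subclass invalidate the cache whenever a field changes. The class and method names (CachingWritable, IntRecord, invalidateCache) are my own for the example, not Hadoop's API.

```java
import java.io.*;
import java.util.zip.*;

// Sketch mirroring CompressedWritable's write() caching (hadoop 0.12.1).
abstract class CachingWritable {
    private byte[] compressed;

    // Workaround: subclasses call this whenever a field changes;
    // otherwise write() keeps emitting the stale cached bytes.
    protected void invalidateCache() {
        compressed = null;
    }

    protected abstract void writeCompressed(DataOutput out) throws IOException;

    public final void write(DataOutput out) throws IOException {
        if (compressed == null) {
            ByteArrayOutputStream deflated = new ByteArrayOutputStream();
            Deflater deflater = new Deflater(Deflater.BEST_SPEED);
            DataOutputStream dout =
                new DataOutputStream(new DeflaterOutputStream(deflated, deflater));
            writeCompressed(dout);
            dout.close();
            compressed = deflated.toByteArray();
        }
        out.writeInt(compressed.length);
        out.write(compressed);
    }
}

class IntRecord extends CachingWritable {
    private int value;

    public void setValue(int value) {
        this.value = value;
        invalidateCache(); // without this call, both writes below emit identical bytes
    }

    protected void writeCompressed(DataOutput out) throws IOException {
        out.writeInt(value);
    }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        IntRecord r = new IntRecord();

        r.setValue(1);
        ByteArrayOutputStream a = new ByteArrayOutputStream();
        r.write(new DataOutputStream(a));

        r.setValue(2);
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        r.write(new DataOutputStream(b));

        // With invalidateCache() the payloads differ; without it they would be equal.
        System.out.println(java.util.Arrays.equals(a.toByteArray(), b.toByteArray()));
    }
}
```

Of course this only works if every mutator remembers to invalidate; resetting the cache unconditionally inside write() would also fix reuse, at the cost of recompressing on every call.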

John
