On Tue, 2011-01-25 at 09:37 +0100, Patrik Modesto wrote:
> While developing a really simple MR task, I've found that a
> combination of a Hadoop optimization and the Cassandra
> ColumnFamilyRecordWriter queue creates wrong keys to send to
> batch_mutate().
I've seen similar behaviour (junk rows being written), although my keys
are always the result of LongSerializer.get().toByteBuffer(key).

I'm interested in looking into it - but can you provide a code example?

From what I can see, TextOutputFormat.LineRecordWriter.write(..) doesn't
clone anything, but it does write the record out immediately. While
ColumnFamilyRecordWriter does batch the mutations up as you say, it
takes a ByteBuffer as the key, so why/how are you re-using this
client-side (aren't you creating a new ByteBuffer on each call to
write(..))?

~mck

-- 
"Never let your sense of morals get in the way of doing what's right."
Isaac Asimov
| http://semb.wever.org | http://sesat.no | http://finn.no | Java XSS Filter
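To make the suspected failure mode concrete, here is a minimal stand-alone sketch (hypothetical class and variable names, not the actual Hadoop or Cassandra code) of what happens if the caller reuses one ByteBuffer across write(..) calls while the writer only queues a reference, versus copying the bytes before queueing:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class KeyReuseDemo {
    public static void main(String[] args) {
        // Stands in for the writer's batched mutation queue.
        List<ByteBuffer> queuedRefs = new ArrayList<>();
        List<ByteBuffer> queuedCopies = new ArrayList<>();

        // One buffer reused by the caller for every key, as a
        // Hadoop-style object-reuse optimization would do.
        ByteBuffer reused = ByteBuffer.allocate(8);

        for (long key = 1; key <= 3; key++) {
            reused.clear();
            reused.putLong(key);
            reused.flip();

            // BUG: queue the shared reference; later writes mutate it.
            queuedRefs.add(reused);

            // FIX: copy the bytes before queueing.
            ByteBuffer copy = ByteBuffer.allocate(reused.remaining());
            copy.put(reused.duplicate());
            copy.flip();
            queuedCopies.add(copy);
        }

        // Every queued reference now reads back the *last* key written.
        StringBuilder refs = new StringBuilder();
        for (ByteBuffer b : queuedRefs) refs.append(b.getLong(0)).append(" ");
        System.out.println(refs.toString().trim());   // 3 3 3

        // The copies keep the key each one was queued with.
        StringBuilder copies = new StringBuilder();
        for (ByteBuffer b : queuedCopies) copies.append(b.getLong(0)).append(" ");
        System.out.println(copies.toString().trim()); // 1 2 3
    }
}
```

If the MR framework really is recycling the key object between write(..) calls, something like the copy above (e.g. ByteBufferUtil.clone in Cassandra's utils) before the mutation is queued would explain and fix the junk rows.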