Hi Micael,

Do you know if the invalid tx list inside the Transaction object is large?
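Something like the following should show it (just a sketch against the org.apache.tephra API as I recall it; getInvalids() is the field I suspect is blowing up the encoded size):

    import java.io.IOException;
    import org.apache.tephra.Transaction;
    import org.apache.tephra.TransactionCodec;

    // Prints the invalid-list length and the encoded size of the transaction.
    public final class TxSizeCheck {
      public static void report(Transaction tx) throws IOException {
        System.out.println("invalid entries: " + tx.getInvalids().length);
        System.out.println("encoded bytes:   " + new TransactionCodec().encode(tx).length);
      }
    }

Each entry in the invalid list is a long, so an encoded size of ~104 KB would be consistent with an invalid list on the order of ten thousand entries.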
Terence

> On May 31, 2017, at 1:49 AM, Micael Capitão <[email protected]> wrote:
>
> Hi all,
>
> I've been testing Tephra 0.11.0 for a project that may need transactions on
> top of HBase and I find its performance, for instance for a bulk load, very
> poor. Let's not discuss why I am doing a bulk load with transactions.
>
> In my use case I am generating batches of ~10000 elements and inserting them
> with the *put(List<Put> puts)* method. There are no concurrent writers or
> readers.
> If I do the put without transactions it takes ~0.5s. If I use the
> *TransactionAwareHTable* it takes ~12s.
> I've tracked down the performance killer to be
> *addToOperation(OperationWithAttributes op, Transaction tx)*, more
> specifically the *txCodec.encode(tx)* call.
>
> I've created a TransactionAwareHTableFix with the *addToOperation(txPut, tx)*
> call commented out, used it in my code, and each batch started to take ~0.5s.
>
> I've noticed that inside the *TransactionCodec* you were instantiating a new
> TSerializer and TDeserializer on each call to encode/decode. I tried
> instantiating the ser/deser in the constructor, but even that way each of my
> batches would take the same ~12s.
>
> Further investigation has shown me that the Transaction instance, after being
> encoded by the TransactionCodec, is 104171 bytes long. So in my 10000-element
> batch, ~970MB is metadata. Is that supposed to happen?
>
>
> Regards,
>
> Micael Capitão
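For reference, the write path described above is roughly the following (a sketch only; the package and constructor of TransactionAwareHTable are assumed from the Tephra 0.11.0 HBase compat modules and may differ slightly depending on which one is in use):

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.tephra.TransactionContext;
    import org.apache.tephra.TransactionFailureException;
    import org.apache.tephra.TransactionSystemClient;
    import org.apache.tephra.hbase.TransactionAwareHTable;

    public final class BulkLoadSketch {
      // Writes one ~10000-element batch inside a single transaction. The encoded
      // Transaction is attached to each Put via addToOperation(), which is where
      // a ~100 KB encoded tx multiplies into hundreds of MB per batch.
      static void writeBatch(TransactionSystemClient txClient, HTable hTable, List<Put> puts)
          throws IOException, TransactionFailureException {
        TransactionAwareHTable txTable = new TransactionAwareHTable(hTable);
        TransactionContext txContext = new TransactionContext(txClient, txTable);
        txContext.start();   // begin transaction
        txTable.put(puts);   // the bulk put being timed (~0.5s raw vs ~12s transactional)
        txContext.finish();  // commit
      }
    }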
