Just clarified with Keith in IRC (because I wasn't positive).
This approach will work if you want Accumulo to assign timestamps (i.e.
you do not specify them at all in the client). If you can manage timestamps
yourself, you can try what I suggested in the other message.
Keith Turner wrote:
There are no ordering guarantees for two mutations added before flush is
called. One possible solution is to have two batch writers: use one only
for deletes and flush it first.
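
To make the two-writer idea concrete, here is a minimal, self-contained sketch. It does not use the Accumulo client library: SketchWriter is a hypothetical stand-in for a BatchWriter (its add and flush play the role of addMutation and flush), the "table" is just an in-memory set of row ids, and handleRequest is an invented name for the per-request code path. It only illustrates the flush ordering, under the assumption that Accumulo assigns the timestamps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TwoWriterSketch {

    /** Hypothetical stand-in for a BatchWriter: buffers operations until flush(). */
    static final class SketchWriter {
        private final List<Runnable> pending = new ArrayList<>();

        void add(Runnable op) {
            pending.add(op);
        }

        /** Applies everything buffered so far, then clears the buffer. */
        void flush() {
            pending.forEach(Runnable::run);
            pending.clear();
        }
    }

    /** Handles one request, routing deletes and adds to separate writers. */
    static Set<String> handleRequest(Set<String> table, List<String[]> ops) {
        SketchWriter deleteWriter = new SketchWriter();
        SketchWriter putWriter = new SketchWriter();

        for (String[] op : ops) {
            String kind = op[0];
            String row = op[1];
            if (kind.equals("delete")) {
                deleteWriter.add(() -> table.remove(row));
            } else {
                putWriter.add(() -> table.add(row));
            }
        }

        // Still only one flush per writer per request; the deletes go first.
        deleteWriter.flush();
        putWriter.flush();
        return table;
    }

    public static void main(String[] args) {
        // Row A already exists; the request is: delete A, add A, add B.
        Set<String> table = new TreeSet<>(Arrays.asList("A"));
        List<String[]> request = Arrays.asList(
            new String[] {"delete", "A"},
            new String[] {"add", "A"},
            new String[] {"add", "B"});
        System.out.println(handleRequest(table, request));
    }
}
```

One consequence of this design worth keeping in mind: within a single request every delete is applied before every add, regardless of the order the caller issued them. A request that adds A and then deletes A would therefore leave A in place in this sketch.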
On Wed, Mar 16, 2016 at 4:33 PM, z11373 wrote:
Hi,
I have an object abstraction class whose delete/add operations eventually
translate to calls to Accumulo's writer.putDelete and writer.put.
To achieve higher throughput, the code calls writer.flush only once per
request (my implementation knows when a request ends), instead of
flushing after each delete or add operation.
In this case we have a client request calling my service, which for example
would be:
1. delete A
2. add A
3. add B
I'd expect the end result to be that both row IDs A and B exist in the table,
but apparently only B does. I already checked the log: the code executes the
delete before the add operations. However, I guess that since I call flush
after all the putDelete and put calls have been made, Accumulo somehow makes
putDelete 'win' (in the same flush cycle). Is that correct? If yes, how can I
work around this without sacrificing performance?
Thanks,
Z