In your case, there shouldn't be a delete marker unless you're explicitly
writing one.

The tricky thing about deletes in a summing combiner is that sums and
deletes together are not commutative, and combiners require associativity
and commutativity. If I have three operations: add 1 to x, delete x, and
then add 1 to x, I might reasonably expect the result of performing these
operations in order to be x = 1. However, if I reorder the first add and
the delete, I could instead get x = 2. When using a
combiner this could happen when the first and last entries are included in
two files that go through a non-full major compaction, and the second entry
is in a third file that is not included. For this reason, we shouldn't have
general support for deletes in a SummingCombiner (but maybe we should have
better documentation).
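
To make the reordering concrete, here is a minimal sketch in plain Java. It
is a simplified model, not Accumulo code: the Entry class, read(), and
compact() are hypothetical stand-ins. It assumes a read sums versions
newest-first until a delete marker hides the older ones, and that a partial
compaction emits one summed entry at the newest timestamp it saw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SummingDeleteDemo {
    // One version of key x: a timestamp plus either a delta or a delete marker.
    static final class Entry {
        final long ts;
        final long value;
        final boolean delete;

        Entry(long ts, long value, boolean delete) {
            this.ts = ts;
            this.value = value;
            this.delete = delete;
        }
    }

    // Read semantics: walk versions newest-first and sum until a delete
    // marker hides everything older than it.
    static long read(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((Entry e) -> e.ts).reversed());
        long sum = 0;
        for (Entry e : sorted) {
            if (e.delete) {
                break;
            }
            sum += e.value;
        }
        return sum;
    }

    // Non-full compaction: the combiner sees only this subset of files and
    // emits a single summed entry carrying the newest timestamp in the subset.
    static Entry compact(List<Entry> subset) {
        long newest = Long.MIN_VALUE;
        for (Entry e : subset) {
            newest = Math.max(newest, e.ts);
        }
        return new Entry(newest, read(subset), false);
    }

    // All three operations are visible together, in timestamp order.
    static long inOrder() {
        Entry add1 = new Entry(1, 1, false);
        Entry del = new Entry(2, 0, true);
        Entry add2 = new Entry(3, 1, false);
        return read(Arrays.asList(add1, del, add2));
    }

    // The two adds are compacted first; the delete sits in a file left out.
    static long afterPartialCompaction() {
        Entry add1 = new Entry(1, 1, false);
        Entry del = new Entry(2, 0, true);
        Entry add2 = new Entry(3, 1, false);
        Entry merged = compact(Arrays.asList(add1, add2));
        return read(Arrays.asList(merged, del));
    }

    public static void main(String[] args) {
        System.out.println("in order: x = " + inOrder());
        System.out.println("after partial compaction: x = " + afterPartialCompaction());
    }
}
```

In this model the in-order read gives x = 1, while compacting the two adds
first gives x = 2, because the merged entry is newer than the delete marker.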

There are a couple of alternative implementations to get delete
functionality:
1. Use a read-write loop to negate the current value of a key: read the
current value and write back the same key with the negated value. Make
sure to batch this for performance.
2. Write a different iterator that supports deletes, but only operates on
minor compaction and full major compaction scopes.
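
Option 1 can be sketched roughly as follows. This is a plain-Java model
under stated assumptions, not Accumulo API code: SummedTable and resetAll
are hypothetical names, and the in-memory map stands in for a table with a
summing combiner. In a real client the read would presumably go through a
Scanner and the batched writes through a BatchWriter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NegateResetDemo {
    // Hypothetical stand-in for a table with a summing combiner:
    // every write is a delta, and a read returns the sum of all deltas.
    static final class SummedTable {
        private final Map<String, List<Long>> deltas = new HashMap<>();

        void write(String key, long delta) {
            deltas.computeIfAbsent(key, k -> new ArrayList<>()).add(delta);
        }

        long read(String key) {
            long sum = 0;
            for (long d : deltas.getOrDefault(key, new ArrayList<>())) {
                sum += d;
            }
            return sum;
        }
    }

    // Read each key's current sum, queue its negation, then apply the
    // queued writes together as one batch.
    static void resetAll(SummedTable table, List<String> keys) {
        Map<String, Long> batch = new HashMap<>();
        for (String key : keys) {
            long current = table.read(key);
            if (current != 0) {
                batch.put(key, -current);
            }
        }
        for (Map.Entry<String, Long> e : batch.entrySet()) {
            table.write(e.getKey(), e.getValue());
        }
    }

    // add 1, add 1, reset ("delete"), add 1
    static long demo() {
        SummedTable table = new SummedTable();
        table.write("x", 1);
        table.write("x", 1);
        resetAll(table, Collections.singletonList("x"));
        table.write("x", 1);
        return table.read("x");
    }

    public static void main(String[] args) {
        System.out.println("x = " + demo());
    }
}
```

Because the "delete" is just another addend, it stays commutative with the
other writes, so in this model the result is x = 1 no matter which entries
a compaction happens to combine first.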

There may also be a project that the Accumulo dev community would be
interested in, which would be to add a compaction strategy that makes sure
compactions always include a contiguous range of timestamps. I think this
would remove the requirement for commutativity in iterator operations and
wouldn't introduce performance problems in most cases.

Cheers,
Adam


On Tue, Sep 1, 2015 at 9:13 AM, z11373 <z11...@outlook.com> wrote:

> Thanks Eric and Josh.
>
> There shouldn't be delete marker because my code doesn't perform any delete
> operation, right?
>
> Josh: if that out-of-the-box SummingCombiner cannot handle delete marker,
> then I'd think that's bug :-)
>
>
> Thanks,
> Z
>
>
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/exception-thrown-during-minor-compaction-tp15010p15025.html
> Sent from the Developers mailing list archive at Nabble.com.
>
