What happens when you subtract the time it took to read all of your rows?
deleteRows is designed so you don't have to read any data -- you can compute
a range to delete. For instance, in a time series table, it's trivial to supply
a start and end date as your row boundaries and call deleteRows.
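As a rough sketch of that usage (Accumulo 1.x client API; the table name, date format, and connector setup are placeholders, not from this thread -- this assumes row ids sort lexically by date):

```java
// Hypothetical example: delete a date range from a time-series table whose
// row ids are ISO dates (e.g. "2015-11-01"), so they sort chronologically.
import org.apache.accumulo.core.client.Connector;
import org.apache.hadoop.io.Text;

public class DeleteRangeExample {

    // conn and table are assumed to be set up elsewhere.
    static void deleteDateRange(Connector conn, String table) throws Exception {
        // deleteRows(table, start, end): start row is exclusive, end row is
        // inclusive. Whole tablets inside the range can be dropped server-side,
        // so no data is read back to the client.
        conn.tableOperations().deleteRows(table,
                new Text("2015-10-31"),   // exclusive start row
                new Text("2015-11-15"));  // inclusive end row
    }
}
```

Because the start row is exclusive, passing null for either bound deletes from the beginning or through the end of the table, respectively.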

On Mon, Nov 16, 2015 at 10:35 AM, z11373 <z11...@outlook.com> wrote:

> Last week on a separate thread it was suggested that I use
> tableOperations.deleteRows for deleting rows that match specific ranges.
> I was curious to try it out and see whether it's better than my current
> implementation, which iterates over all rows and calls putDelete for each.
> While researching, I also found that Accumulo already provides a
> BatchDeleter, which does the same thing.
> I tried all three; below are my test results against three different
> tables (numbers are in milliseconds):
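[For reference, the two client-side approaches being compared might look roughly like this (Accumulo 1.x client API; connector setup, table name, authorizations, and thread count are assumptions, not taken from the original post):]

```java
// Sketch of the two client-side delete strategies discussed in the thread.
import java.util.Collections;
import java.util.Map.Entry;

import org.apache.accumulo.core.client.BatchDeleter;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class ClientSideDeletes {

    // "Test 1" style: scan the range and write a delete mutation per entry.
    static void scanAndPutDelete(Connector conn, String table, Range range)
            throws Exception {
        Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
        scanner.setRange(range);
        BatchWriter writer = conn.createBatchWriter(table, new BatchWriterConfig());
        for (Entry<Key, Value> entry : scanner) {
            Key k = entry.getKey();
            Mutation m = new Mutation(k.getRow());
            // Insert a delete marker for this column of this row.
            m.putDelete(k.getColumnFamily(), k.getColumnQualifier());
            writer.addMutation(m);
        }
        writer.close();
    }

    // "Test 2" style: BatchDeleter does the scan-and-delete internally.
    static void batchDelete(Connector conn, String table, Range range)
            throws Exception {
        BatchDeleter deleter = conn.createBatchDeleter(table, Authorizations.EMPTY,
                4 /* query threads */, new BatchWriterConfig());
        deleter.setRanges(Collections.singletonList(range));
        deleter.delete();
        deleter.close();
    }
}
```

Both approaches read every matching entry and write delete markers for it, which is why they scale with data volume; deleteRows instead truncates by range on the server side.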
>
> Test 1 (using iterator and call putDelete for each):
> Table 1: 5,702
> Table 2: 6,912
> Table 3: 4,694
>
> Test 2 (using BatchDeleter class):
> Table 1: 8,089
> Table 2: 10,405
> Table 3: 7,818
>
> Test 3 (using tableOperations.deleteRows; note that I first iterate over all
> rows just to get the last row id, which is then passed as an argument to the
> function):
> Table 1: 196,597
> Table 2: 226,496
> Table 3: 8,442
>
>
> I ran the tests a few times and got pretty consistent results, as shown
> above.
> I didn't look at what deleteRows is really doing, but looking at my test
> results, I can say it sucks!
> Note that for that test I did scan and iterate just to get the last row id,
> but even if I subtract the time for doing that, it's still way too slow.
> Therefore, I'd recommend avoiding deleteRows for this scenario.
> YMMV, but I'll stick with my original approach, which is the same as
> Test 1 above.
>
>
> Thanks,
> Z
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/delete-rows-test-result-tp15569.html
> Sent from the Developers mailing list archive at Nabble.com.
>