We do this on large tables: a custom iterator ages entries off based on
our key structure, and a cron job calls compact on a row range to delete
specific days.
https://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#compact-java.lang.String-org.apache.hadoop.io.Text-org.apache.hadoop.io.Text-boolean-boolean-
However, this works because the tables in question have row keys that can
be computed from a date range, so the job compacts (and thereby deletes)
only that specific date range.
Since you have the time of arrival in the key you could likely do the
same thing.
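
Here is a rough sketch of how the row bounds for one day could be computed
before handing them to compact. It assumes row IDs are the arrival time as
13-digit, zero-padded epoch milliseconds in UTC; your actual key encoding
may differ. The class name AgeOffRange, the method names, and the table
name "mytable" are made up for illustration. It also assumes an age-off
iterator is already configured on the table for major compactions,
otherwise the compaction rewrites the data without dropping anything.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

public class AgeOffRange {
    // Assumption: a row ID is the arrival time as zero-padded epoch millis,
    // so that lexicographic order matches chronological order.
    public static String row(long epochMillis) {
        return String.format("%013d", epochMillis);
    }

    // Returns {startRow, endRow} covering one UTC day.
    // TableOperations.compact treats the start row as exclusive and the
    // end row as inclusive, so the start is one millisecond before midnight.
    public static String[] boundsForDay(LocalDate day) {
        long first = day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        long last = day.plusDays(1).atStartOfDay(ZoneOffset.UTC)
                       .toInstant().toEpochMilli() - 1;
        return new String[] { row(first - 1), row(last) };
    }

    public static void main(String[] args) {
        // Example: the day that has just passed the 2-day retention window.
        String[] b = boundsForDay(LocalDate.now(ZoneOffset.UTC).minusDays(3));
        System.out.println(b[0] + " .. " + b[1]);
        // With a live Connector (not shown here), the cron job would then call:
        //   connector.tableOperations().compact("mytable",
        //       new Text(b[0]), new Text(b[1]), true, false);
        // where the last two arguments are flush and wait.
    }
}
```

Since the compaction works on whole tablets overlapping the range, bounds
that line up with your split points keep it from touching newer data.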
Andrew
On 2/14/19 2:57 AM, Krzysztof Martyn wrote:
Hi Accumulo,
I want to build a system that stores data for a fixed period of time,
say 2 days, and deletes it after that.
New data is ingested continuously and in large volumes.
The key is the time of arrival, and splits are generated every second so
that each second of data has its own tablet.
Is there any way to remove a range of tablets without having to run a
major compaction and a merge?
I have tested AgeOffFilter, but it requires manually launching a major
compaction, which makes it almost impossible to scan the database.
I have also tested BatchDeleter and deleteRows from tableOperations,
but they are even worse than AgeOffFilter.
Krzysztof