Deleting many rows that match a given criterion

2013-10-22 Thread Mike Drob
I'm attempting to delete all rows from a table that contain a specific word in the value of a specified column. My current process looks like: accumulo shell -e 'egrep .*EXPRESSION.* -np -t tab -c col' | awk 'BEGIN {print "table tab"}; {print "deletemany -f -np -r" $1}; END {print "exit"}' > rows.
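
For readers following along, here is a minimal sketch of what the awk stage in the command above generates. The two input rows are made-up sample data standing in for the egrep output, and the variable names `scan_output` and `script` are illustrative only. The awk program is the one quoted in the message, so the row ID is concatenated directly onto `-r` with no space in between.

```shell
#!/bin/sh
# Made-up stand-in for the output of the egrep scan: row ID first, then column and value.
scan_output='row1 col: [] has EXPRESSION inside
row2 col: [] another EXPRESSION here'

# Same awk program as in the message: emit a "table" line, one deletemany
# command per input line (keyed on the first field, the row ID), then "exit".
script=$(printf '%s\n' "$scan_output" | awk 'BEGIN {print "table tab"}; {print "deletemany -f -np -r" $1}; END {print "exit"}')

# Show the generated shell script.
printf '%s\n' "$script"
```

Each input line becomes its own deletemany command, so a row with several matching entries would appear more than once in the generated script.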

Re: Deleting many rows that match a given criterion

2013-10-22 Thread Aru Sahni
You could use a RegEx filter to get the row IDs, and then pass them to the RowDeletingIterator. ~A On Tue, Oct 22, 2013 at 11:45 AM, Mike Drob wrote: > I'm attempting to delete all rows from a table that contain a specific > word in the value of a specified column. My current process looks lik

Re: Deleting many rows that match a given criterion

2013-10-22 Thread Keith Turner
If it's a significant amount of data, you could create a class that extends RowFilter and set it as a compaction iterator. On Tue, Oct 22, 2013 at 11:45 AM, Mike Drob wrote: > I'm attempting to delete all rows from a table that contain a specific > word in the value of a specified column. My cu

Re: Deleting many rows that match a given criterion

2013-10-23 Thread Mike Drob
Thanks for the feedback, Aru and Keith. I've had some more time to play around with this, and here are some additional observations. My existing process is very slow. I think this is due to each deletemany command starting up a new scanner and batchwriter, and creating a lot of RPC overhead. I didn

Re: Deleting many rows that match a given criterion

2013-10-31 Thread Terry P.
Hi Mike, Did you wind up writing Java code to do this? Did you go with a RowFilter? I have a similar circumstance where I need to delete millions of rows daily and the criteria for deletion are not in the row key. Thanks in advance, Terry On Wed, Oct 23, 2013 at 4:21 PM, Mike Drob wrote: > Th

Re: Deleting many rows that match a given criterion

2013-10-31 Thread Mike Drob
Terry, Yeah, a RowFilter + full compaction takes care of the issue. Note that simply setting a RowFilter at scan time and expecting the data to be deleted naturally might not work if your clients set varying fetch columns on their scanners. Mike On Thu, Oct 31, 2013 at 5:11 PM, Terry P. wrote: >

Re: Deleting many rows that match a given criterion

2013-11-01 Thread Terry P.
Thanks Mike. It looks like the AgeOffFilter class would be a good starting point as a template for my filter: override the logic in the *init* method as appropriate and put the criteria in the *accept* method. What I can't figure out is where is the magic to remove entries? I don't see anything i