Hey devs, I was presenting at GOTO Amsterdam yesterday and I got a question about a scenario that I've never thought about before. I'm wondering what others think.
How do you efficiently wipe out random data in HBase? For example, you have a website and a user asks you to close their account and get rid of their data. Would you say "sure can do, lemme just issue a couple of Deletes!" and call it a day?
What if you really have to delete the data, not just mask it behind delete markers, because of contractual obligations or local laws? Major compaction is the obvious solution, but it seems really inefficient. Let's say you've got some truly random data to delete and it so happens that you have at least one row per region to get rid of... then you basically need to rewrite the whole table?
That's what I answered, and I told the attendee that it's not an easy use case to manage in HBase.
Thoughts?
J-D
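P.S. To put a rough number on the "at least one row per region" worry, here is a back-of-the-envelope sketch in plain Python (no HBase calls). It assumes every delete lands in a uniformly random region and that regions are equally likely targets, which is a simplification; the function names and the example figures are made up for illustration.

```python
import random


def expected_fraction_touched(num_regions: int, num_deletes: int) -> float:
    """Expected share of regions that hold at least one deleted row.

    Assumption: each delete hits a region chosen uniformly at random,
    independently of the others.
    """
    # A given region is missed by one delete with probability 1 - 1/num_regions,
    # so it is missed by all of them with that probability to the power num_deletes.
    return 1.0 - (1.0 - 1.0 / num_regions) ** num_deletes


def simulate_fraction_touched(num_regions: int, num_deletes: int,
                              trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo check of the closed form above (same uniform assumption)."""
    rng = random.Random(seed)
    touched_total = 0
    for _ in range(trials):
        # Set of distinct regions hit by this batch of deletes.
        touched = {rng.randrange(num_regions) for _ in range(num_deletes)}
        touched_total += len(touched)
    return touched_total / (trials * num_regions)


if __name__ == "__main__":
    # Hypothetical table with 100 regions and growing delete batches.
    for deletes in (10, 100, 1000):
        share = expected_fraction_touched(100, deletes)
        print(f"{deletes:>5} random deletes -> {share:.1%} of regions affected")
```

Under those assumptions, 100 random deletes over 100 regions already touch about 63% of the regions, and ten times as many touch practically all of them. So even if you only major compact the regions that actually contain deleted rows, it converges on rewriting the whole table pretty quickly.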
