On 11/14/17 3:04 PM, Mike Drob wrote:
I don't think the second part of my email ever got addressed.

I see "HBase Backup/Restore Phase 3: Security"[1] resolved as "Later",
with a claim that it will be implemented in the client. Both of these make me
uncomfortable. "Security Later" is generally bad practice, and it is very
rarely correct to rely on client-side security for anything.
Is there another issue that covers security? Do we rely completely on
HDFS security here for more than just the DistCP? What kind of testing has
been done with security? Do we have assurances that the backups aren't
accidentally exposing tables to the world?

"Security" as you phrase it is pretty open-ended, no? The current security model is based around filesystem permissions and the enforcement of an HBase superuser to execute the necessary service operations behind the BackupAdmin "facade" (WAL roll procedure execution, snapshot creation, snapshot restore, and updates to hbase:backup are the HBase client actions actually being performed). That's the state of things right now and, yes, it does rely on the filesystem that backups are sent to (e.g. HDFS, S3, Isilon, WASB) being properly secured. We certainly don't want to be testing the correctness of those systems in HBase.

I can see adding a small section to the documentation update I've already been hacking on, covering the point "We can't help you secure where you put the data". Given how many instances of "globally readable S3 bucket" I've seen recently, this strikes me as prudent.

The final issue then is about the backup containing other tables' data: somehow a backup would reference data from a table other than the one the admin intended to access. For full backups, this is out of scope (the full backup relies on Snapshots, and we shouldn't be testing correctness of Snapshots via B&R). For incremental backups, specifically when we're filtering WALs, this is a concern. Thankfully, it's a problem analogous to "correctness". We have unit test coverage in this area already, and we should get good coverage in the upcoming integration test.

Does that help paint a better picture, Mike? Have I missed or glossed over any points?
