[ https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220530#comment-13220530 ]
Francis Liu commented on HBASE-5498:
------------------------------------

So here's another solution. It's computationally secure, just like delegation tokens.

The idea is to have an hbase-owned staging directory which is world-traversable (711):

/hbase/staging

A user writes out MR data to his secure output directory:

/user/foo/data

A call is made to hbase to create a secret staging directory which is rwx (777):

/hbase/staging/averylongandrandomdirectoryname

The user makes the data world-readable and writable, then moves it into the secret staging directory, then calls completeBulkLoad.

Like delegation tokens, the strength of the security lies in the length and randomness of the secret directory name. If we mimic SHA-1 it'd be a 40-character hex string, though we might need something longer since delegation tokens include timestamps and a nonce.

Some issues:

* Automated way of cleaning up secret directories in the absence of hbase (i.e. hbase failure).
* Side channels leaking the secret directory (e.g. logs), though this may be only on secured nodes?

> Secure Bulk Load
> ----------------
>
> Key: HBASE-5498
> URL: https://issues.apache.org/jira/browse/HBASE-5498
> Project: HBase
> Issue Type: Improvement
> Reporter: Francis Liu
>
> Design doc:
> https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
> Short summary:
> Security as it stands does not cover the bulkLoadHFiles() feature. Users
> calling this method will bypass ACLs. Also loading is made more cumbersome in
> a secure setting because of hdfs privileges. bulkLoadHFiles() moves the data
> from user's directory to the hbase directory, which would require certain
> write access privileges set.
> Our solution is to create a coprocessor which makes use of AuthManager to
> verify if a user has write access to the table. If so, launches a MR job as
> the hbase user to do the importing (ie rewrite from text to hfiles).
> One tricky part this job will have to do is impersonate the calling user when
> reading the input files. We can do this by expecting the user to pass an hdfs
> delegation token as part of the secureBulkLoad() coprocessor call and extend
> an inputformat to make use of that token. The output is written to a
> temporary directory accessible only by hbase and then bulkloadHFiles() is
> called.
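
For illustration, here is a minimal sketch of how the secret directory name from the comment above might be generated. It assumes the JDK's SecureRandom as the randomness source and 20 random bytes, which gives the 40-character hex string the comment compares to SHA-1. The class SecretStagingDir and its methods are hypothetical names, not HBase APIs. Creating the directory in HDFS and setting its 777 permissions are deliberately left out.

```java
import java.security.SecureRandom;

// Hypothetical helper for illustration only; not part of HBase.
final class SecretStagingDir {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private SecretStagingDir() {}

    /** Returns numBytes cryptographically random bytes as a lowercase hex string. */
    static String randomName(int numBytes) {
        if (numBytes <= 0) {
            throw new IllegalArgumentException("numBytes must be positive");
        }
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(numBytes * 2);
        for (byte b : bytes) {
            // High nibble first, then low nibble.
            sb.append(HEX[(b >> 4) & 0x0f]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    /** Joins the staging root (assumed here to be /hbase/staging) with a fresh secret name. */
    static String stagingPath(String stagingRoot, int numBytes) {
        String root = stagingRoot.endsWith("/")
                ? stagingRoot.substring(0, stagingRoot.length() - 1)
                : stagingRoot;
        return root + "/" + randomName(numBytes);
    }
}
```

Calling SecretStagingDir.stagingPath("/hbase/staging", 20) yields a path with a 40-character hex name. Passing a larger byte count is the obvious knob if a longer secret is wanted, as the comment suggests it might be.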