Adding security@hadoop list as well...

On Fri, Oct 20, 2017 at 2:29 PM, larry mccay <lmc...@apache.org> wrote:
> All -
>
> Given the maturity of Hadoop at this point, I would like to propose that
> we start doing explicit security audits of features at merge time.
>
> There are a few reasons that I think this is a good place/time to do the
> review:
>
> 1. It represents a specific snapshot of where the feature stands as a
> whole. This means that we can more easily identify the attack surface of
> a given feature.
> 2. We can identify any security gaps that need to be fixed before a
> release that carries the feature can be considered ready.
> 3. We - in extreme cases - can block a feature from merging until some
> baseline of security coverage is achieved.
> 4. The folks that are interested and able to review security aspects
> can't scale to review every iteration of every JIRA, but they can review
> the checklist and follow pointers for specific areas of interest.
>
> I have provided an impromptu security audit checklist on the DISCUSS
> thread for merging Ozone - HDFS-7240 - into trunk.
>
> I don't want to pick on it particularly, but I think it is a good way to
> bootstrap this audit process and figure out how to incorporate it
> without being too intrusive.
>
> The questions that I provided below are a mix of general questions that
> could be on a standard checklist provided along with the merge thread
> and some that are specific to what I read about Ozone in the excellent
> docs provided. So, we should consider some subset of the following as a
> proposal for a general checklist.
>
> Perhaps a shared document can be created to iterate over the list and
> fine tune it?
>
> Any thoughts on this, any additional datapoints to collect, etc.?
>
> thanks!
>
> --larry
>
> 1. UIs
> I see there are at least two UIs - Storage Container Manager and Key
> Space Manager. There are a number of typical vulnerabilities that we
> find in UIs.
>
> 1.1. What sort of validation is being done on any accepted user input?
> (pointers to code would be appreciated)
> 1.2. What explicit protections have been built in for (pointers to code
> would be appreciated):
> 1.2.1. cross site scripting (XSS)
> 1.2.2. cross site request forgery (CSRF)
> 1.2.3. clickjacking (X-Frame-Options)
> 1.3. What sort of authentication is required for access to the UIs?
> 1.4. What authorization is available for determining who can access
> which capabilities of the UIs, whether for viewing or modifying data or
> for affecting object stores and related processes?
> 1.5. Are the UIs built with proxying in mind by leveraging X-Forwarded
> headers?
> 1.6. Is there any input that will ultimately be persisted in
> configuration for executing shell commands or processes?
> 1.7. Do the UIs support the trusted proxy pattern with doas
> impersonation?
> 1.8. Is there TLS/SSL support?
>
> 2. REST APIs
>
> 2.1. Do the REST APIs support the trusted proxy pattern with doas
> impersonation capabilities?
> 2.2. What explicit protections have been built in for:
> 2.2.1. cross site scripting (XSS)
> 2.2.2. cross site request forgery (CSRF)
> 2.2.3. XML External Entity (XXE)
> 2.3. What is being used for authentication - the Hadoop Auth module?
> 2.4. Are there separate processes for the HTTP resources (UIs and REST
> endpoints) or are they part of existing HDFS processes?
> 2.5. Is there TLS/SSL support?
> 2.6. Are there new CLI commands and/or clients for accessing the REST
> APIs?
> 2.7. The Bucket Level API allows for setting ACLs on a bucket - what
> authorization is required here, and is a restrictive ACL set on
> creation?
> 2.8. The Bucket Level API allows for deleting a bucket - I assume this
> is dependent on ACL-based access control?
> 2.9. The Bucket Level API to list a bucket returns up to 1000 keys - is
> paging available?
> 2.10. The Storage Level APIs indicate "Signed with User Authorization" -
> what does this refer to exactly?
> 2.11. The Object Level APIs indicate that there is no ACL support and
> only bucket owners can read and write - but there are ACL APIs at the
> Bucket Level; are they meaningless for now?
> 2.12. How does a REST client know which Ozone Handler to connect to, or
> am I missing some well-known NN-type endpoint in the architecture doc
> somewhere?
>
> 3. Encryption
>
> 3.1. Is there any support for encryption of persisted data?
> 3.2. If so, are KMS and the hadoop key command used for key management?
>
> 4. Configuration
>
> 4.1. Are there any passwords or secrets being added to configuration?
> 4.2. If so, are they accessed via Configuration.getPassword() to allow
> for provisioning in credential providers?
> 4.3. Are there any settings that are used to launch docker containers or
> shell out commands, etc.?
>
> 5. HA
>
> 5.1. Are there provisions for HA?
> 5.2. Are we leveraging the existing HA capabilities in HDFS?
> 5.3. Is the Storage Container Manager a SPOF?
> 5.4. I see HA listed as future work in the architecture doc - is this
> still an open issue?
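A few concrete reference points for the checklist items, for anyone wiring up answers. On 1.2/2.2 (CSRF and clickjacking), these are the kinds of knobs that exist for HDFS HTTP endpoints in recent releases - property names should be verified against the target version, and Ozone would need its own equivalents:

```xml
<!-- hdfs-site.xml: illustrative only; confirm property names and
     defaults against the release being audited -->
<property>
  <name>dfs.webhdfs.rest-csrf.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.xframe.enabled</name>
  <value>true</value>
</property>
```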
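On 1.7/2.1, the trusted proxy pattern is conventionally wired up through the standard hadoop.proxyuser.* properties in core-site.xml; the question is whether the new UIs and REST endpoints honor them. A sketch, where the "knox" principal name and host are just example values:

```xml
<!-- core-site.xml: allow the "knox" service principal (example name)
     to impersonate members of "users" via doAs, but only from the
     listed gateway host -->
<property>
  <name>hadoop.proxyuser.knox.hosts</name>
  <value>gateway.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.knox.groups</name>
  <value>users</value>
</property>
```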
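On 4.2, the reason Configuration.getPassword() matters is that it consults any credential providers listed under hadoop.security.credential.provider.path before falling back to the config file, so secrets can be provisioned with the hadoop credential create command instead of sitting in cleartext XML. A sketch, where the store path is just an example:

```xml
<!-- core-site.xml: point Configuration.getPassword() at a JCEKS
     credential store (example path) rather than keeping secrets
     inline in configuration -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://file/etc/hadoop/conf/creds.jceks</value>
</property>
```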
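To make the doas semantics in 1.7/2.1 concrete: before honoring impersonation, the server must check that the already-authenticated caller is a configured proxy for the requested end user. This is a minimal sketch of that check - not Hadoop's actual implementation; the function and data-structure names are illustrative only:

```python
# Sketch of a trusted-proxy ("doas") authorization check, modeled on
# hadoop.proxyuser.<name>.{hosts,groups} semantics. Illustrative only.

def impersonation_allowed(proxy_user, end_user_groups, remote_host, acls):
    """Return True if proxy_user may impersonate a user belonging to
    end_user_groups when calling from remote_host, per the ACL table."""
    acl = acls.get(proxy_user)
    if acl is None:
        return False  # caller is not a configured proxy at all
    host_ok = acl["hosts"] == "*" or remote_host in acl["hosts"]
    group_ok = acl["groups"] == "*" or bool(
        set(end_user_groups) & set(acl["groups"])
    )
    return host_ok and group_ok

# ACL table mirroring hadoop.proxyuser.knox.{hosts,groups} above
acls = {"knox": {"hosts": {"gateway.example.com"}, "groups": {"users"}}}

print(impersonation_allowed("knox", ["users"], "gateway.example.com", acls))
print(impersonation_allowed("knox", ["admins"], "gateway.example.com", acls))
print(impersonation_allowed("evil", ["users"], "gateway.example.com", acls))
```

The key property: failing closed. An unconfigured caller, a wrong host, or a disallowed end-user group all deny impersonation rather than fall through.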