[ https://issues.apache.org/jira/browse/HDFS-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094501#comment-15094501 ]
Colin Patrick McCabe edited comment on HDFS-9395 at 2/2/16 9:47 PM: -------------------------------------------------------------------- [edit: see my later comments about this. I was writing this assuming that audit log strategy #1 was correct, but without considering the other possibilities] If I understand correctly, the relevant code block is here: {code} ContentSummary getContentSummary(final String src) throws IOException { checkOperation(OperationCategory.READ); readLock(); boolean success = true; try { checkOperation(OperationCategory.READ); return FSDirStatAndListingOp.getContentSummary(dir, src); } catch (AccessControlException ace) { success = false; throw ace; } finally { readUnlock(); logAuditEvent(success, "contentSummary", src); } } {code} The code appears to be making the assumption that the only IOE that can be thrown is {{AccessControlException}}. I don't think this is correct. It would be better to change this to something like this, similar to our other audit log use-cases: {code} ContentSummary getContentSummary(final String src) throws IOException { checkOperation(OperationCategory.READ); readLock(); boolean success = false; try { checkOperation(OperationCategory.READ); ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src); success = true; return csum; } catch (AccessControlException ace) { throw ace; } finally { readUnlock(); logAuditEvent(success, "contentSummary", src); } } {code} bq. It's by design? HDFS-5163 No, it's a bug. Also, I looked at the code prior to the HDFS-4949 branch merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 JIRAs. Hope this helps. was (Author: cmccabe): If I understand correctly, the relevant code block is here: {code} ContentSummary getContentSummary(final String src) throws IOException { checkOperation(OperationCategory.READ); readLock(); boolean success = true; try { checkOperation(OperationCategory.READ); return FSDirStatAndListingOp.getContentSummary(dir, src); } catch (AccessControlException ace) { success = false; throw ace; } finally { readUnlock(); logAuditEvent(success, "contentSummary", src); } } {code} The code appears to be making the assumption that the only IOE that can be thrown is {{AccessControlException}}. I don't think this is correct. It would be better to change this to something like this, similar to our other audit log use-cases: {code} ContentSummary getContentSummary(final String src) throws IOException { checkOperation(OperationCategory.READ); readLock(); boolean success = false; try { checkOperation(OperationCategory.READ); ContentSummary csum = FSDirStatAndListingOp.getContentSummary(dir, src); success = true; return csum; } catch (AccessControlException ace) { throw ace; } finally { readUnlock(); logAuditEvent(success, "contentSummary", src); } } {code} bq. It's by design? HDFS-5163 No, it's a bug. Also, I looked at the code prior to the HDFS-4949 branch merge, and the bug existed prior to HDFS-5163 or any of the other HDFS-4949 JIRAs. Hope this helps. > HDFS operations vary widely in which failures they put in the audit log and > which they leave out > ------------------------------------------------------------------------------------------------ > > Key: HDFS-9395 > URL: https://issues.apache.org/jira/browse/HDFS-9395 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Kihwal Lee > Assignee: Kuhu Shukla > Attachments: HDFS-9395.001.patch, HDFS-9395.002.patch > > > So, the big question here is what should go in the audit log? All failures, > or just "permission denied" failures? Or, to put it a different way, if > someone attempts to do something and it fails because a file doesn't exist, > is that worth an audit log entry? > We are currently inconsistent on this point. For example, concat, > getContentSummary, addCacheDirective, and setErasureEncodingPolicy create an > audit log entry for all failures, but setOwner, delete, and setAclEntries > attempt to only create an entry for AccessControlException-based failures. > There are a few operations, like allowSnapshot, disallowSnapshot, and > startRollingUpgrade that never create audit log failure entries at all. They > simply log nothing for any failure, and log success for a successful > operation. > So to summarize, different HDFS operations currently fall into 3 categories: > 1. audit-log all failures > 2. audit-log only AccessControlException failures > 3. never audit-log failures > Which category is right? And how can we fix the inconsistency -- This message was sent by Atlassian JIRA (v6.3.4#6332)