[ https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427692#comment-13427692 ]

Aaron T. Myers commented on HDFS-3680:
--------------------------------------

bq. Yes, a faulty custom logger will affect the functionality of the NN. The 
question is: what is the level of risk to the NN, and is it acceptable? 

I'll reiterate my previous point: this is not arbitrary users writing and 
installing custom logger implementations into the NN. These will likely be 
written by a handful of people, and operators will have to consciously install 
them into the NN. The people involved in writing a custom logger should be 
aware of the inherent risks of doing so and should write defensive code. This 
is not the place in the Hadoop code base where we should be holding users' 
hands.
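
To make the defensive-coding point concrete, here is a minimal sketch of what 
a careful custom logger might look like, assuming a pluggable interface 
roughly along the lines of the attached patch; the class, method, and sink 
names here are illustrative assumptions, not the patch's actual API:

{code:java}
import java.net.InetAddress;

// Hypothetical defensive custom logger: it contains failures of its own
// plumbing so they never propagate into FSNamesystem.
public class DefensiveAuditLogger {

  /** Stand-in for the real event destination (database, message bus, ...). */
  public interface EventSink {
    void write(boolean succeeded, String user, InetAddress addr,
               String cmd, String src, String dst);
  }

  private final EventSink sink;

  public DefensiveAuditLogger(EventSink sink) {
    this.sink = sink;
  }

  public void logAuditEvent(boolean succeeded, String user, InetAddress addr,
                            String cmd, String src, String dst) {
    try {
      sink.write(succeeded, user, addr, cmd, src, dst);
    } catch (RuntimeException e) {
      // Swallow (or locally report) failures of the logger's own plumbing
      // rather than letting them take the NN down. Whether a lost audit
      // event should instead be fatal is the policy question discussed below.
      System.err.println("audit sink failure: " + e);
    }
  }
}
{code}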

I think this point is further supported by your observation that people seem to 
agree that shutting down the NN is the right answer if an event fails to get 
logged by a custom logger. If that's what we intend to have happen in the event 
of a custom audit log failure, then the worst-case scenario of a custom audit 
logger segfaulting or calling System.exit really isn't that bad. The NN already 
has to handle an ungraceful shutdown and maintain data integrity, so the 
marginal increase in risk should be low.
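
For what it's worth, that fail-stop policy on the NN side might look roughly 
like the sketch below, assuming the NN iterates over the configured loggers 
and treats any failure as fatal; the dispatcher and its names are hypothetical, 
not code from the attached patch:

{code:java}
import java.util.List;

// Hypothetical NN-side dispatch: if any configured audit logger fails to
// record an event, shut down rather than continue with a gap in the trail.
public class AuditLogDispatcher {

  /** Stand-in for the pluggable audit logger interface. */
  public interface Logger {
    void logAuditEvent(String serializedEvent);
  }

  private final List<Logger> loggers;

  public AuditLogDispatcher(List<Logger> loggers) {
    this.loggers = loggers;
  }

  public void dispatch(String event) {
    for (Logger l : loggers) {
      try {
        l.logAuditEvent(event);
      } catch (Throwable t) {
        // Fail stop: an unlogged audit event is treated like any other
        // ungraceful shutdown the NN already has to survive.
        System.err.println("Audit logger failed, terminating NN: " + t);
        Runtime.getRuntime().halt(1);
      }
    }
  }
}
{code}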

bq. If it's in a daemon, the NN will notice the RPC server crashed and may 
initiate a clean shutdown.

This is one possible implementation of a custom audit logger that would reduce 
the risk, but we should not force this design on the authors of custom 
loggers.

bq. I'd assume it [latency of RPC call] can't be that bad since most of Hadoop 
uses it.

I agree with Marcelo that the latency of doing an RPC per audit log event is 
likely unacceptably high. Just because the latency of the Hadoop Server/Client 
implementations is acceptable for FS/MR job operations doesn't mean it's 
sufficient for audit logging. My guess is that it's not acceptable for 
anything but the smallest use cases. At the very least, it seems that to make 
the performance anywhere near acceptable we couldn't do an RPC per log event, 
but would instead have to buffer and group the events into fewer calls, 
effectively doing a group commit on the log events. That sort of complexity 
should be left up to the custom logger author, if using a separate daemon is 
actually required.
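
If someone did go the daemon route, a group-commit logger could look roughly 
like the following sketch: events are queued cheaply on the request path, and 
a background thread ships them in batches so one call is amortized over many 
events. The BatchSink interface stands in for whatever RPC transport the 
author picks; all names here are assumptions:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BatchingAuditLogger {

  /** Stand-in for the remote transport, e.g. one RPC carrying many events. */
  public interface BatchSink {
    void send(List<String> events);
  }

  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(100_000);
  private final BatchSink sink;
  private final int maxBatch;

  public BatchingAuditLogger(BatchSink sink, int maxBatch) {
    this.sink = sink;
    this.maxBatch = maxBatch;
    Thread drainer = new Thread(this::drainLoop, "audit-batcher");
    drainer.setDaemon(true);
    drainer.start();
  }

  /** Called on the NN request path; must stay cheap. */
  public void logAuditEvent(String serialized) {
    // offer() rather than put(): the author must decide explicitly whether
    // to drop or block when the buffer is full. That is the risk trade-off
    // discussed in this thread.
    queue.offer(serialized);
  }

  private void drainLoop() {
    List<String> batch = new ArrayList<>(maxBatch);
    while (true) {
      try {
        // Block for the first event, then grab whatever else is queued.
        String first = queue.poll(1, TimeUnit.SECONDS);
        if (first == null) continue;
        batch.add(first);
        queue.drainTo(batch, maxBatch - 1);
        sink.send(batch);  // group commit: one call for the whole batch
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      } finally {
        batch.clear();
      }
    }
  }
}
{code}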
                
> Allows customized audit logging in HDFS FSNamesystem
> ----------------------------------------------------
>
>                 Key: HDFS-3680
>                 URL: https://issues.apache.org/jira/browse/HDFS-3680
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>         Attachments: accesslogger-v1.patch, accesslogger-v2.patch, 
> hdfs-3680-v3.patch, hdfs-3680-v4.patch, hdfs-3680-v5.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.
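
To make the database use case from the description concrete, here is a hedged 
sketch of a logger writing events directly to a table over JDBC, with no 
appender or text-parsing layer in between; the table, columns, and event 
fields are illustrative assumptions, not part of the patch:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical database-backed logger: audit events go straight into a
// table instead of through a log appender and text parsing.
public class JdbcAuditLogger implements AutoCloseable {

  private final Connection conn;
  private final PreparedStatement insert;

  public JdbcAuditLogger(String jdbcUrl) throws SQLException {
    conn = DriverManager.getConnection(jdbcUrl);
    insert = conn.prepareStatement(
        "INSERT INTO hdfs_audit (succeeded, ugi, cmd, src, dst) "
            + "VALUES (?, ?, ?, ?, ?)");
  }

  public void logAuditEvent(boolean succeeded, String ugi, String cmd,
                            String src, String dst) throws SQLException {
    insert.setBoolean(1, succeeded);
    insert.setString(2, ugi);
    insert.setString(3, cmd);
    insert.setString(4, src);
    insert.setString(5, dst);
    insert.executeUpdate();
  }

  @Override
  public void close() throws SQLException {
    insert.close();
    conn.close();
  }
}
{code}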


