[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128844#comment-14128844
 ] 

Maysam Yabandeh commented on HDFS-6982:
---------------------------------------

bq. Given that the second architecture has been proven in production – maybe we 
can first push the code that can help the second architecture into the 
repository first?

That is also an option to pursue. I do not have a strong preference for one or 
the other, so I will let the community's interest determine the next steps.

If we go with the second architecture, there will be absolutely no change in 
the NameNode. In fact, I am not sure whether HDFS would be the right place to 
submit the patch for that. In this architecture, a separate process tails the 
audit logs, which are produced by the default audit logger, off the local hard 
disk. It then parses and aggregates them before emitting the top users via JMX 
(or the web page).
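
To make the second architecture concrete, here is a rough sketch of the 
tail/parse/aggregate loop. It is illustrative only and is not the code that 
runs in production. The function names (follow, parse_audit_line, top_users) 
are hypothetical, and the line layout (tab-separated key=value pairs that 
include ugi= and cmd=) is an assumption about what the default audit logger 
writes. Time windows, log rotation, and the JMX/web export are left out.

```python
import os
import re
import time
from collections import Counter, defaultdict

# Assumption: audit lines carry tab-separated key=value pairs such as
# "allowed=true\tugi=alice (auth:SIMPLE)\tip=/10.0.0.1\tcmd=open\tsrc=/f".
AUDIT_FIELD = re.compile(r"(\w+)=([^\t]*)")


def follow(path, poll_seconds=1.0):
    """Yield lines appended to the file at `path`, similar to `tail -f`.

    Sketch only: it does not handle log rotation or partially written lines.
    """
    with open(path, "r") as f:
        f.seek(0, os.SEEK_END)  # start from the current end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(poll_seconds)
                continue
            yield line


def parse_audit_line(line):
    """Return (user, cmd), or None if the line lacks ugi= or cmd= fields."""
    fields = dict(AUDIT_FIELD.findall(line))
    if "ugi" not in fields or "cmd" not in fields:
        return None
    user = fields["ugi"].split(" ")[0]  # drop a trailing "(auth:...)" part
    return user, fields["cmd"].strip()


def top_users(lines, n=3):
    """Count requests per (cmd, user) and return the top n users per cmd."""
    counts = defaultdict(Counter)
    for line in lines:
        parsed = parse_audit_line(line)
        if parsed is None:
            continue
        user, cmd = parsed
        counts[cmd][user] += 1
    return {cmd: users.most_common(n) for cmd, users in counts.items()}
```

A real deployment would feed follow(...) into the aggregation continuously 
and publish the per-command rankings periodically; top_users can also be run 
on any finite batch of lines, for example a slice of an existing audit log.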

Let us know if this is the path you would like to be pursued, and where you 
think the right place would be for the code of this separate process to reside.

> nntop: top-like tool for name node users
> ----------------------------------------
>
>                 Key: HDFS-6982
>                 URL: https://issues.apache.org/jira/browse/HDFS-6982
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, lists the top users of the HDFS name node and gives 
> insight into which users are sending the majority of each traffic type to 
> the name node. This information turns out to be most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop, which has been in production at Twitter for the past 10 
> months. nntop has proved to have low CPU overhead (< 2% in a cluster of 4K 
> nodes) and a low memory footprint (less than a few MB), and to be quite 
> efficient on the write path (only two hash lookups to update a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)