[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126226#comment-14126226
 ] 

Maysam Yabandeh commented on HDFS-6982:
---------------------------------------

bq. Which architecture did you guys run in production?

We ran the second architecture in production. However, when we shared the idea 
with some folks in the community, they showed interest in the first 
architecture, where the aggregator is inside the nn process. The rationale was 
to avoid the extra complexity of maintaining an additional process and also 
to avoid the overhead of tailing the audit logs. The purpose of the patch is 
therefore not to support both architectures but only the first. So, 
if you see some complexity in the patch that is unnecessary for the first 
architecture, feel free to point it out.

bq. If you only aim for the first architecture that's totally fine too, but I 
would prefer to somehow push the concept of a rolling window into the hadoop 
metric system.

Yeah, the attached patch aims for the first architecture since we have already 
received interest in that. I am however open to alternatives, either to be 
considered as part of this jira or to be investigated later. About the 
particular case of letting ganglia perform the aggregation, the 
concern is the volume of data that needs to be transferred to the external 
aggregator.
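
To make the first architecture a bit more concrete, here is a rough sketch of 
an in-process rolling-window counter whose write path costs two hash lookups 
(operation, then user). This is only an illustration and not the code in the 
attached patch: the class name RollingTopSketch, its methods, and the bucket 
layout are all hypothetical, and the caller passes the timestamp explicitly to 
keep the example deterministic.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: per-operation, per-user counts over a rolling window.
final class RollingTopSketch {
    // One user's counts, split into fixed-width time buckets.
    static final class Window {
        private final long[] counts;
        private final long[] epochs; // which time bucket each slot currently holds

        Window(int numBuckets) {
            counts = new long[numBuckets];
            epochs = new long[numBuckets];
            Arrays.fill(epochs, -1L);
        }

        synchronized void add(long epoch, long delta) {
            int slot = (int) (epoch % counts.length);
            if (epochs[slot] != epoch) { // slot holds an expired bucket: reuse it
                epochs[slot] = epoch;
                counts[slot] = 0;
            }
            counts[slot] += delta;
        }

        synchronized long sum(long nowEpoch) {
            long total = 0;
            for (int i = 0; i < counts.length; i++) {
                // Only count buckets that still fall inside the window.
                if (epochs[i] > nowEpoch - counts.length && epochs[i] <= nowEpoch) {
                    total += counts[i];
                }
            }
            return total;
        }
    }

    private final int numBuckets;
    private final long bucketMillis;
    // operation name -> (user name -> rolling window)
    private final Map<String, Map<String, Window>> metrics = new ConcurrentHashMap<>();

    RollingTopSketch(int numBuckets, long bucketMillis) {
        this.numBuckets = numBuckets;
        this.bucketMillis = bucketMillis;
    }

    // Write path: two hash lookups, then an in-place bucket increment.
    void record(String op, String user, long nowMillis) {
        Window w = metrics
            .computeIfAbsent(op, k -> new ConcurrentHashMap<>())  // lookup 1
            .computeIfAbsent(user, k -> new Window(numBuckets));  // lookup 2
        w.add(nowMillis / bucketMillis, 1);
    }

    long count(String op, String user, long nowMillis) {
        Map<String, Window> users = metrics.get(op);
        Window w = (users == null) ? null : users.get(user);
        return (w == null) ? 0 : w.sum(nowMillis / bucketMillis);
    }

    // Read path: the n users with the highest count for one operation.
    List<Map.Entry<String, Long>> top(String op, int n, long nowMillis) {
        List<Map.Entry<String, Long>> result = new ArrayList<>();
        long epoch = nowMillis / bucketMillis;
        Map<String, Window> users =
            metrics.getOrDefault(op, Collections.<String, Window>emptyMap());
        users.forEach((user, w) -> {
            long c = w.sum(epoch);
            if (c > 0) {
                result.add(new AbstractMap.SimpleImmutableEntry<>(user, c));
            }
        });
        result.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(result.subList(0, Math.min(n, result.size())));
    }
}
```

In a sketch like this only the aggregated top-N lists would need to leave the 
nn process, whereas an external aggregator would need every raw update shipped 
to it, which is the data volume concern mentioned above.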

> nntop: top-like tool for name node users
> -----------------------------------------
>
>                 Key: HDFS-6982
>                 URL: https://issues.apache.org/jira/browse/HDFS-6982
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight into which users are sending the majority of each traffic type 
> to the name node. This information turns out to be most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop, which has been in production at Twitter for the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), a low memory footprint (less than a few MB), and an efficient 
> write path (only two hash lookups for updating a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
