[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208995#comment-14208995
 ] 

Haohui Mai commented on HDFS-6982:
----------------------------------

bq. However, my understanding is that there's no direct link between the alpha 
parameter and a time-based window, e.g. 1 min, 5 min, 30 min.

Let n equal the number of observations per window. Setting {{alpha = (n-1) 
/ n}} would make the math right, assuming that the number of requests follows 
a Poisson distribution.
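
To make the mapping concrete, here is a minimal sketch of a decayed average that 
derives alpha from n. The update rule {{value = alpha * value + (1 - alpha) * x}} 
is the standard exponentially weighted form and is an assumption here, as are the 
names {{alpha_for_window}} and {{DecayedAverage}}; none of them come from the patch.

```python
def alpha_for_window(n: int) -> float:
    """Decay factor alpha = (n - 1) / n for n observations per window."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) / n


class DecayedAverage:
    """Exponentially weighted average (hypothetical helper, not nntop code)."""

    def __init__(self, n: int):
        self.alpha = alpha_for_window(n)
        self.value = None  # no observation seen yet

    def update(self, x: float) -> float:
        if self.value is None:
            # Seed with the first observation instead of decaying from zero.
            self.value = float(x)
        else:
            # Old value is weighted by alpha, the new observation by (1 - alpha).
            self.value = self.alpha * self.value + (1 - self.alpha) * x
        return self.value
```

With one observation per second, a 1 min window would use n = 60 and a 5 min 
window n = 300. With n = 1, alpha is 0 and only the latest observation counts.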

bq. IIUC the situation you describe will lead to small errors, not big ones. If 
there are bigger correctness issues, I think we can fix them by adding more 
synchronization. Thanks.

Depending on the timing, the errors will lead to one of the following: (1) 
correct results, (2) consistently missing one measurement from some users, (3) 
inconsistent measurements for the same users. These artificial errors make 
nntop less valuable.

I don't quite understand your concerns about fixing the issue. This is a 
variant of the online counting problem, which is relatively well studied. 
Applying the de facto solution can eliminate the errors and make the 
implementation simpler. I'm not sure why we need to reinvent the wheel here.
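
The comment does not name the de facto solution. One common reading is a 
per-user counter that is decayed by a factor alpha at every interval, so no 
sliding-window bookkeeping is needed. The sketch below assumes that reading; 
{{DecayedTopUsers}}, {{end_interval}}, and {{top}} are hypothetical names, and 
the class is single-threaded, so it says nothing about the synchronization 
discussed above.

```python
import heapq
from collections import defaultdict


class DecayedTopUsers:
    """Per-user exponentially decayed request counts (illustrative only)."""

    def __init__(self, alpha: float):
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must be in [0, 1)")
        self.alpha = alpha
        self.counts = defaultdict(float)

    def end_interval(self, requests_by_user: dict) -> None:
        # Age every existing count once per interval...
        for user in list(self.counts):
            self.counts[user] *= self.alpha
        # ...then add the requests observed during this interval.
        for user, count in requests_by_user.items():
            self.counts[user] += count

    def top(self, k: int) -> list:
        # The k users with the largest decayed counts, largest first.
        return heapq.nlargest(k, self.counts.items(), key=lambda kv: kv[1])
```

For example, with alpha = 0.5, a user who sent 8 requests one interval ago 
ranks below a user who sent 2 requests then and 4 requests now (4.0 vs. 5.0).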

> nntop: top-like tool for name node users
> ----------------------------------------
>
>                 Key: HDFS-6982
>                 URL: https://issues.apache.org/jira/browse/HDFS-6982
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, HDFS-6982.v3.patch, 
> HDFS-6982.v4.patch, HDFS-6982.v5.patch, HDFS-6982.v6.patch, 
> nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, lists the top users of the HDFS name node and shows 
> which users are sending the majority of each traffic type to the name node. 
> This information is most critical when the name node is under pressure and 
> the HDFS admin needs to know which user is hammering the name node and with 
> what kind of requests. Here we present the design of nntop, which has been 
> in production at Twitter for the past 10 months. nntop has proved to have 
> low CPU overhead (< 2% in a cluster of 4K nodes), a low memory footprint 
> (less than a few MB), and an efficient write path (only two hash lookups 
> to update a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
