[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856500#action_12856500
 ] 

Joshua Harlow commented on HDFS-708:
------------------------------------

For the distributions, I was thinking of the following.
Both the input and the output would be expected to lie in [0,1].
The mapper would take the current time and divide it by the maximum time 
(both known). On each iteration of the mapper's inner loop (the one producing 
& running operations) it would evaluate the simple formulas below for each 
operation type and its given distribution. This yields a list of numbers in 
[0,1], each of which is then multiplied by a new config variable 
(slive.ops.per.iteration) and by that operation's ratio (percentage) to 
determine how many operations of that type should occur in that iteration. 
The loop would stop once the total number of operations reaches slive.map.ops 
or the current time reaches the maximum time, and the results would then be 
sent to the reducer.
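
A rough sketch of that loop, in Python for brevity rather than as actual SLive 
code. The names ops_for_iteration and run_mapper, the injectable clock, and 
the truncation to whole operations are all assumptions made for illustration; 
max_ops and ops_per_iteration stand in for slive.map.ops and 
slive.ops.per.iteration.

```python
import time


def ops_for_iteration(x, ops_per_iteration, ratios, distributions):
    """Return {op_type: count} for one inner-loop iteration.

    x is the current time divided by the maximum time, so it lies in [0,1].
    ratios maps each operation type to its share of operations (0..1).
    distributions maps each operation type to a function [0,1] -> [0,1].
    """
    counts = {}
    for op, ratio in ratios.items():
        weight = distributions[op](x)
        # Assumption: fractional operations are truncated to whole ones.
        counts[op] = int(weight * ops_per_iteration * ratio)
    return counts


def run_mapper(max_time, max_ops, ops_per_iteration, ratios, distributions,
               clock=time.monotonic):
    """Build the per-iteration operation counts until a limit is reached."""
    start = clock()
    total = 0
    schedule = []
    while total < max_ops:
        elapsed = clock() - start
        if elapsed >= max_time:
            break  # the maximum time has been reached
        x = elapsed / max_time
        counts = ops_for_iteration(x, ops_per_iteration, ratios, distributions)
        schedule.append(counts)  # a real mapper would run the operations here
        total += sum(counts.values())
    return schedule
```

Note that the total is only checked after each iteration, as described above, 
so the last iteration may overshoot the operation limit slightly.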

Here are possible equations to be used, with x in [0,1]:
Beg would be defined by x^2 (approaching 1 at the end)
End would be defined by (x-1)^2 (approaching 0 at the end) 
Mid would be defined by -2*(x-1/2)^2+1/2 (a bell-shaped curve, peaking at 1/2 
when x = 1/2)
Uniform would just return 1/3 (each of the above equations has an area of 1/3 
over [0,1], so this seems to make sense)
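
For reference, the four candidates can be written down and checked 
numerically. This is only a sketch in Python; the function names and the 
midpoint-rule helper area() are made up for illustration and are not part of 
any existing code.

```python
def beg(x):
    return x ** 2


def end(x):
    return (x - 1) ** 2


def mid(x):
    return -2 * (x - 0.5) ** 2 + 0.5


def uniform(x):
    return 1 / 3


def area(f, n=10000):
    """Approximate the integral of f over [0,1] with the midpoint rule."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


if __name__ == "__main__":
    for f in (beg, end, mid, uniform):
        print(f.__name__, f(0.0), f(0.5), f(1.0), round(area(f), 6))
```

Each curve integrates to 1/3 over [0,1]. Mid never exceeds 1/2, so with these 
formulas it would never request the full slive.ops.per.iteration in a single 
iteration.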
Suggestions are welcome.



> A stress-test tool for HDFS.
> ----------------------------
>
>                 Key: HDFS-708
>                 URL: https://issues.apache.org/jira/browse/HDFS-708
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: test, tools
>    Affects Versions: 0.22.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.22.0
>
>         Attachments: SLiveTest.pdf
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.


        
