[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852501#action_12852501
 ] 

Joshua Harlow commented on HDFS-708:
------------------------------------

Looks good to me as well.
Just a couple of thoughts/questions:

1. Would it make sense to have a "create" job that ensures, before any 
reads/deletes/writes, that the files exist (instead of generating them in a 
previous job)? That way the data is created on demand, rather than needing a 
separate job that runs beforehand just to populate data. This stage would not 
count against the overall time allotted and could be done at the start of the 
testing.
2. It would probably be useful to add a seed number so that the tests can be 
"mostly" repeated (i.e. writes and deletes can't truly be repeated, since 
they modify the underlying storage)?
3. Might it be useful to add, in the future, the ability to specify your own 
distribution "objects" that "generate" operation objects, so that the current 
set of operations can be expanded without core changes? That is, a plugin-like 
framework for generating the distribution and for generating the actual set of 
operations that will occur (allowing for something like an AppendReadDelete 
operation, or similar, whose instances would be distributed according to a 
square wave, as an example).
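
To make points 2 and 3 a bit more concrete, here is a rough sketch of what a 
seeded, pluggable operation generator could look like. All of the names below 
(OpType, OperationDistribution, UniformDistribution, SquareWaveDistribution, 
OperationPlanner) are made up for illustration and are not taken from the 
proposed tool. The sketch only plans a sequence of operation types and does 
not touch HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical operation kinds; the real tool may define these differently.
enum OpType { READ, WRITE, DELETE, APPEND }

// Hypothetical plugin point: decides which operation runs at a given step.
interface OperationDistribution {
    OpType next(int step, Random rng);
}

// Picks uniformly among all operation types using the shared seeded RNG.
final class UniformDistribution implements OperationDistribution {
    @Override
    public OpType next(int step, Random rng) {
        OpType[] all = OpType.values();
        return all[rng.nextInt(all.length)];
    }
}

// Square wave: emits 'high' for halfPeriod steps, then 'low' for halfPeriod
// steps, and repeats. It ignores the RNG, so it is deterministic by design.
final class SquareWaveDistribution implements OperationDistribution {
    private final OpType high;
    private final OpType low;
    private final int halfPeriod;

    SquareWaveDistribution(OpType high, OpType low, int halfPeriod) {
        if (halfPeriod <= 0) {
            throw new IllegalArgumentException("halfPeriod must be positive");
        }
        this.high = high;
        this.low = low;
        this.halfPeriod = halfPeriod;
    }

    @Override
    public OpType next(int step, Random rng) {
        return ((step / halfPeriod) % 2 == 0) ? high : low;
    }
}

// Builds the planned sequence of operations from a distribution and a seed.
final class OperationPlanner {
    static List<OpType> plan(OperationDistribution dist, long seed, int count) {
        Random rng = new Random(seed); // same seed gives the same sequence
        List<OpType> ops = new ArrayList<>(count);
        for (int step = 0; step < count; step++) {
            ops.add(dist.next(step, rng));
        }
        return ops;
    }
}
```

With this shape, two calls such as 
OperationPlanner.plan(new UniformDistribution(), 42L, 100) return the same 
planned sequence, though the resulting file system state can still differ 
between runs, as noted in point 2. A new distribution would be added by 
implementing the interface, without changing the planner.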

> A stress-test tool for HDFS.
> ----------------------------
>
>                 Key: HDFS-708
>                 URL: https://issues.apache.org/jira/browse/HDFS-708
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: test, tools
>    Affects Versions: 0.22.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.22.0
>
>         Attachments: SLiveTest.pdf
>
>
> It would be good to have a tool for automatically stress-testing HDFS, which 
> would provide an IO-intensive load on an HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

