In our YARN application, we are considering whether to store temporary data with replication=1 or replication=3 (or give the user an option). Obviously there is a tradeoff between reliability and performance, but on smaller clusters I'd expect this to be less of an issue.
What is the difference in write performance between replication=1 and replication=3? For reads, I'd expect the performance to be roughly equivalent.
john