Thanks, that makes sense.
john

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Sunday, September 15, 2013 12:39 PM
To: <user@hadoop.apache.org>
Subject: Re: HDFS performance with an without replication

Write performance improves with fewer replicas (as a result of the synchronous, sequenced write pipelines in HDFS). Reads would be the same unless, in the worst case, you are unable to schedule a rack-local read because only one (busy) rack holds the data.

On Sun, Sep 15, 2013 at 10:38 PM, John Lilley <john.lil...@redpoint.net> wrote:
> In our YARN application, we are considering whether to store temporary
> data with replication=1 or replication=3 (or give the user an option).
> Obviously there is a tradeoff between reliability and performance, but
> on smaller clusters I'd expect this to be less of an issue.
>
> What is the difference in write performance using replication=1 vs 3?
> For reading I'd expect the performance to be roughly equivalent.
>
> john

--
Harsh J
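
To make the "synchronous and sequenced write pipeline" point concrete, here is a rough back-of-the-envelope sketch in Python. It is not a benchmark and not HDFS code: the function names, the per-packet service times, and the block and packet sizes are all assumptions chosen for illustration. It treats the replicas as stages in a simple pipeline, where each packet must pass through every datanode in order, so the slowest stage bounds throughput and every extra replica adds fill latency. It ignores acknowledgement traffic, cluster-wide contention, and local-write optimizations.

```python
def pipeline_write_seconds(num_packets, stage_seconds):
    """Estimate the time to stream packets through a chain of replicas.

    stage_seconds: one hypothetical per-packet service time (in seconds)
    per datanode in the pipeline, so len(stage_seconds) is the replication
    factor. Stages work concurrently on different packets, so the total is
    the time for the first packet to traverse every stage plus one
    bottleneck interval for each remaining packet.
    """
    if num_packets < 1 or not stage_seconds:
        raise ValueError("need at least one packet and one replica")
    fill = sum(stage_seconds)        # first packet passes through every stage
    bottleneck = max(stage_seconds)  # slowest stage paces all later packets
    return fill + (num_packets - 1) * bottleneck


def physical_bytes(logical_bytes, replication):
    """Bytes the cluster must actually store for a given logical size."""
    return logical_bytes * replication


if __name__ == "__main__":
    packets = 2048  # assumed: a 128 MiB block sent as 64 KiB packets
    one = pipeline_write_seconds(packets, [0.001])
    # Assumed: the third replica sits on a slower (busier) node.
    three = pipeline_write_seconds(packets, [0.001, 0.001, 0.002])
    print(f"replication=1: {one:.3f} s, replication=3: {three:.3f} s")
    print(f"physical bytes for 1 GiB at replication=3: {physical_bytes(2**30, 3)}")
```

Under these assumptions, a single writer with equally fast nodes sees only a small penalty from extra replicas (the fill latency). The larger cost shows up when any node in the pipeline is slower than the others, or when many writers compete for the three-fold increase in total disk and network I/O.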