Solution 1: Throw more hardware at the cluster. That's the whole point of Hadoop.
Solution 2: Try to optimize the MapReduce jobs themselves; which optimizations help depends on what kind of jobs you are running.
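For Solution 2, one optimization that pays off for most aggregation-style jobs is adding a combiner, since it collapses map output locally before the shuffle. Here is a minimal driver sketch, assuming a word-count-style job (TokenizerMapper and IntSumReducer are illustrative class names along the lines of the stock Hadoop examples, not something from this thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(TokenizerMapper.class);   // illustrative mapper
            // The combiner runs the reduce logic on each mapper's local
            // output, so far less data crosses the network in the shuffle.
            job.setCombinerClass(IntSumReducer.class);   // illustrative reducer
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

One caveat: a combiner is only safe when the reduce function is commutative and associative (sums, counts, max), since Hadoop may run it zero or more times per map output.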
I wouldn't suggest decreasing the number of replications, as it kind of defeats the purpose of using Hadoop. You could do this if you can't get more hardware and are running experimental, non-critical, non-production data (if you do, see the sketch below the quoted thread). What kind of Hadoop monitoring are you talking about?

Regards,
Vinayak.

On Thu, Sep 5, 2013 at 7:51 PM, Chris Embree <cemb...@gmail.com> wrote:

> I think you just went backwards. More replicas (generally speaking) are
> better.
>
> I'd take 60 cheap, 1U servers over 20 "highly fault tolerant" ones for
> almost every problem. I'd get them for the same or less $ too.
>
> On Thu, Sep 5, 2013 at 8:41 PM, Sundeep Kambhampati <
> kambh...@cse.ohio-state.edu> wrote:
>
>> Hi all,
>>
>> I am looking for ways to configure Hadoop in order to speed up data
>> processing. Assuming all my nodes are highly fault tolerant, will making
>> the data replication factor 1 speed up processing? Is there some way to
>> disable the failure monitoring done by Hadoop?
>>
>> Thank you for your time.
>>
>> -Sundeep
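If you do go that route, here is a minimal sketch using Hadoop's Java FileSystem API (the path below is hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SingleReplicaDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Client-side default: files created through this Configuration
            // are written with a single replica. Existing files are untouched.
            conf.setInt("dfs.replication", 1);
            FileSystem fs = FileSystem.get(conf);

            // For a file that already exists, lower the factor explicitly.
            // setReplication applies per file; for a whole directory tree the
            // shell equivalent is `hadoop fs -setrep -R 1 /path`.
            Path existing = new Path("/user/sundeep/scratch/part-00000"); // hypothetical path
            boolean ok = fs.setReplication(existing, (short) 1);
            System.out.println("Replication lowered: " + ok);
        }
    }

Keep in mind the NameNode deletes the excess replicas asynchronously, so the disk space comes back gradually rather than instantly.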