Re: How to speed up Hadoop?

2013-09-05 Thread Harsh J
I'd recommend reading Eric Sammer's "Hadoop Operations" (O'Reilly) book. It goes over a lot of this stuff - building, monitoring, tuning, optimizing, etc. If your goal is just speed and quicker results, and not retention or safety, by all means use a replication factor of 1. Note that it's difficult

Re: How to speed up Hadoop?

2013-09-05 Thread Sundeep Kambhampati
On 9/5/2013 8:57 PM, Preethi Vinayak Ponangi wrote: Solution 1: Throw more hardware at the cluster. That's the whole point of Hadoop. Solution 2: Try to optimize the MapReduce jobs. It depends on what kind of jobs you are running. I wouldn't suggest decreasing the number of replications as it

Re: How to speed up Hadoop?

2013-09-05 Thread Peyman Mohajerian
How about this: http://hadoop.apache.org/docs/stable/vaidya.html I've never tried it myself, I was just reading about it today. On Thu, Sep 5, 2013 at 5:57 PM, Preethi Vinayak Ponangi < vinayakpona...@gmail.com> wrote: > Solution 1: Throw more hardware at the cluster. That's the whole point of >

Re: How to speed up Hadoop?

2013-09-05 Thread Preethi Vinayak Ponangi
Solution 1: Throw more hardware at the cluster. That's the whole point of Hadoop. Solution 2: Try to optimize the MapReduce jobs. It depends on what kind of jobs you are running. I wouldn't suggest decreasing the number of replications as it kind of defeats the purpose of using Hadoop. You could d

Re: How to speed up Hadoop?

2013-09-05 Thread Chris Embree
I think you just went backwards. More replicas (generally speaking) are better. I'd take 60 cheap 1U servers over 20 "highly fault tolerant" ones for almost every problem. I'd get them for the same or less $ too. On Thu, Sep 5, 2013 at 8:41 PM, Sundeep Kambhampati < kambh...@cse.ohio-stat

How to speed up Hadoop?

2013-09-05 Thread Sundeep Kambhampati
Hi all, I am looking for ways to configure Hadoop in order to speed up data processing. Assuming all my nodes are highly fault tolerant, will setting the data replication factor to 1 speed up the processing? Is there some way to disable the failure monitoring done by Hadoop? Thank you for your time.