I'd recommend reading Eric Sammer's "Hadoop Operations" (O'Reilly)
book. It goes over a lot of this stuff: building, monitoring, tuning,
optimizing, etc.
If your goal is just speed and quicker results, and not retention or
safety, by all means use a replication factor of 1. Note that it's
difficult
On 9/5/2013 8:57 PM, Preethi Vinayak Ponangi wrote:
Solution 1: Throw more hardware at the cluster. That's the whole point
of Hadoop.
Solution 2: Try to optimize the MapReduce jobs. It depends on what
kind of jobs you are running.
I wouldn't suggest decreasing the number of replications as it
How about this: http://hadoop.apache.org/docs/stable/vaidya.html
I've never tried it myself, I was just reading about it today.
On Thu, Sep 5, 2013 at 5:57 PM, Preethi Vinayak Ponangi <
vinayakpona...@gmail.com> wrote:
> Solution 1: Throw more hardware at the cluster. That's the whole point of
>
Solution 1: Throw more hardware at the cluster. That's the whole point of
Hadoop.
Solution 2: Try to optimize the MapReduce jobs. It depends on what kind of
jobs you are running.
I wouldn't suggest decreasing the number of replications as it kind of
defeats the purpose of using Hadoop. You could d
I think you just went backwards. More replicas (generally speaking) are
better.
I'd take 60 cheap 1U servers over 20 "highly fault tolerant" ones for
almost every problem. I'd get them for the same or less $ too.
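
One way to see why more replicas can help: each extra copy of a block is
another node where a task can read it locally. Below is a rough sketch of
that arithmetic, assuming replicas land on distinct nodes chosen uniformly
at random (real HDFS placement is rack-aware, so treat this as a
simplification). The function name p_local and the 60-node / 6-free-node
figures are made up for illustration.

```python
from math import comb


def p_local(nodes: int, free: int, replicas: int) -> float:
    """Chance that at least one replica of a block sits on a free node.

    Assumes replicas are on distinct nodes picked uniformly at random,
    which is a simplification of real HDFS block placement.
    """
    if not (0 <= free <= nodes and 1 <= replicas <= nodes):
        raise ValueError("need 0 <= free <= nodes and 1 <= replicas <= nodes")
    # P(no replica on a free node) = C(nodes - free, replicas) / C(nodes, replicas)
    return 1 - comb(nodes - free, replicas) / comb(nodes, replicas)


if __name__ == "__main__":
    # Hypothetical cluster: 60 nodes, 6 of them with a free task slot.
    for r in (1, 2, 3):
        print(f"replicas={r}: P(local read possible) = {p_local(60, 6, r):.3f}")
```

Under these assumptions the chance of a local read goes up with every
added replica, which is the "generally speaking" case above.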
On Thu, Sep 5, 2013 at 8:41 PM, Sundeep Kambhampati <
kambh...@cse.ohio-stat
Hi all,
I am looking for ways to configure Hadoop in order to speed up data
processing. Assuming all my nodes are highly fault tolerant, will making
the data replication factor 1 speed up the processing? Is there some way to
disable the failure monitoring done by Hadoop?
Thank you for your time.
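
For reference, the replication factor asked about here is normally set
through the dfs.replication property in hdfs-site.xml. A minimal sketch
that just prints that property fragment for a given factor; the helper
name replication_property is hypothetical and nothing here talks to a
real cluster.

```python
import xml.etree.ElementTree as ET


def replication_property(factor: int) -> str:
    """Build the <property> fragment that sets dfs.replication."""
    if factor < 1:
        raise ValueError("replication factor must be at least 1")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.replication"
    ET.SubElement(prop, "value").text = str(factor)
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    print(replication_property(1))
```

The fragment would go inside the <configuration> element of
hdfs-site.xml. As far as I know this default only applies to files
written after the change; files already in HDFS keep their current
replication unless it is changed separately, for example with
hadoop fs -setrep.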