The /tmp default has caught us once or twice too. Now we put the files elsewhere.

[EMAIL PROTECTED] wrote:
The DFS is stored in /tmp on each box. The developers who own the machines occasionally reboot and reprofile them

Won't you lose your blocks after a reboot, since /tmp gets cleaned up? Could
this be the reason you see the data corruption?
A good idea is to configure the DFS to live anywhere other than /tmp.
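If I'm remembering the 0.15 configuration correctly, dfs.data.dir and
dfs.name.dir default to directories under hadoop.tmp.dir, which itself
defaults to /tmp/hadoop-${user.name}. Something along these lines in
conf/hadoop-site.xml on each box should move it (the /data path below is
just an example; use whatever local disk survives your reboots):

  <property>
    <!-- base directory for Hadoop scratch and DFS storage; default is under /tmp -->
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-${user.name}</value>
  </property>
  <property>
    <!-- where the datanodes keep their blocks -->
    <name>dfs.data.dir</name>
    <value>/data/hadoop-${user.name}/dfs/data</value>
  </property>
  <property>
    <!-- where the namenode keeps its image and edit log -->
    <name>dfs.name.dir</name>
    <value>/data/hadoop-${user.name}/dfs/name</value>
  </property>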
Thanks,
Lohit
----- Original Message ----
From: Jeff Eastman <[EMAIL PROTECTED]>
To: hadoop-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 9:32:41 AM
Subject: Platform reliability with Hadoop


I've been running Hadoop 0.14.4 and, more recently, 0.15.2 on a dozen
machines in our CUBiT array for the last month. During this time I have
experienced two major data-corruption losses on relatively small amounts
of data (<50 GB), which make me wonder about the suitability of this
platform for hosting Hadoop. CUBiT is one of our products for managing a
pool of development servers, allowing developers to check out machines,
install various OS profiles on them, and monitor their utilization via
the web. With most machines reporting very low utilization, it seemed a
natural place to run Hadoop in the background. I have an NFS-mounted
account on all of the machines and have installed Hadoop there. The DFS
is stored in /tmp on each box. The developers who own the machines
occasionally reboot and reprofile them, but this occurs infrequently and
does not clobber /tmp. Hadoop is designed to deal with slave failures of
this nature, though this platform may well be an acid test.

My initial cloud was configured with a replication factor of 3, and I
have now increased that to 4 in hopes of improving data reliability in
the face of these more-prevalent slave outages. Ted Dunning has
suggested aggressive rebalancing in his recent posts, and I have done
this by increasing replication to 5 (from 3) and then dropping it back
to 4. Are there other rebalancing or configuration techniques that might
improve my data reliability? Or is this platform just too unstable to be
a good fit for Hadoop?
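
(If anyone wants to try the same thing, the bump-and-drop can be done
with the FsShell's setrep; the path below is just an example, and I'm
quoting the 0.15 syntax from memory, so it may need adjusting:

  bin/hadoop dfs -setrep -R 5 /user/jeff    # raise replication on the data
  bin/hadoop fsck /user/jeff                # wait until no under-replicated blocks remain
  bin/hadoop dfs -setrep -R 4 /user/jeff    # then drop back down

The extra copy forces the namenode to re-replicate blocks onto additional
nodes before the drop removes the excess replicas.)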

Jeff
