I am not sure what the real concern is... You can set it to 1.0 (or even 1.1 :)) if you prefer. Many admins do.
Raghu.

On Tue, Oct 6, 2009 at 5:20 PM, Manhee Jo <j...@nttdocomo.com> wrote:
> Thank you, Raghu.
> Then, when the percentage is below 0.999, how can you tell
> whether some datanodes are just slower than others or some of the
> data blocks are lost?
> I think a percentage of 1 should have a special meaning: it guarantees
> the integrity of the data in HDFS. If it's below 1, then integrity is
> not guaranteed.
>
> Or are there other useful means by which a NameNode can fix the lost
> blocks, so that it doesn't matter even if 0.1% of the data is lost?
>
> Thanks,
> Manhee
>
> ----- Original Message ----- From: "Raghu Angadi" <rang...@apache.org>
> To: <common-user@hadoop.apache.org>
> Sent: Wednesday, October 07, 2009 1:26 AM
> Subject: Re: A question on dfs.safemode.threshold.pct
>
>> Yes, it is mostly geared toward replication greater than 1. One of the
>> reasons for waiting for this threshold is to avoid HDFS starting
>> unnecessary replications of blocks at startup, when some of the
>> datanodes are slower to start up.
>>
>> When the replication is 1, you don't have that issue. A block either
>> exists or does not.
>>
>> Raghu
>>
>> 2009/10/5 Manhee Jo <j...@nttdocomo.com>
>>
>>> Hi all,
>>>
>>> Why isn't dfs.safemode.threshold.pct 1 by default?
>>> When dfs.replication.min=1 with dfs.safemode.threshold.pct=0.999,
>>> there might be a chance for a NameNode to check in with incomplete
>>> data in its file system. Am I right? Is it permissible? Or is it
>>> assuming that replication would always be more than 1?
>>>
>>> Thanks,
>>> Manhee
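
For anyone following along, the two properties discussed in this thread are set in hdfs-site.xml. A minimal sketch (the values here are illustrative, matching the example in the thread, not recommendations):

```xml
<!-- hdfs-site.xml: fragment only, merge into your existing config -->
<configuration>
  <!-- Fraction of blocks that must meet dfs.replication.min before the
       NameNode leaves safe mode automatically. Raising it to 1.0 (or
       above 1, as suggested earlier in the thread, which keeps safe
       mode on permanently until an admin leaves it manually) makes the
       NameNode wait for every block to be reported. -->
  <property>
    <name>dfs.safemode.threshold.pct</name>
    <value>0.999</value>
  </property>

  <!-- Minimum number of replicas a block needs before it counts toward
       the safe-mode threshold above. -->
  <property>
    <name>dfs.replication.min</name>
    <value>1</value>
  </property>
</configuration>
```

You can check or control safe mode from the command line with `hadoop dfsadmin -safemode get` (or `enter`, `leave`, `wait`), which is handy when experimenting with these thresholds.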