Hi Todd,

Yes, sorry, I should have said we are running CDH3u2.

Thanks,
James.

On 14 Jun 2012, at 17:48, Todd Lipcon <t...@cloudera.com> wrote:

> Hi James,
>
> Could you please let us know exactly what version of Hadoop you're
> running? This is an area that has had some bug fixes throughout the
> last year, so identifying the particular version is important.
>
> -Todd
>
> On Thu, Jun 14, 2012 at 9:16 AM, James Kinley
> <jamesrobertkin...@gmail.com> wrote:
>> Hi,
>>
>> Our production cluster started reporting "too many open files" this
>> afternoon and subsequently was unable to save any snapshots to disk.
>> We have been able to recover it OK, but I would have expected the NN
>> to complain more if it cannot save a snapshot. All I saw in the log
>> was...
>>
>> "WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
>> removing storage <local dir>"
>> "WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
>> removing storage <nfs dir>"
>>
>> Do you think this should trigger the NN to enter safe mode? The longer
>> this goes unnoticed, the more data could be lost if the NN cannot be
>> recovered.
>>
>> Regards,
>>
>> James.
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera