Hi,

Our production cluster started reporting "too many open files" this
afternoon and was subsequently unable to save any snapshots to disk.
We were able to recover it OK, but I would have expected the NN to
complain more loudly if it cannot save a snapshot. All I saw in the
log was:

"WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
removing storage <local dir>"
"WARN org.apache.hadoop.hdfs.server.common.Storage: rollEdidLog:
removing storage <nfs dir>"

Do you think this should trigger the NN to enter safe mode? The longer
this goes unnoticed, the more data could be lost if the NN cannot be
recovered.

Regards,

James.
