So we were storing the hadoop.job.history.user.location
(attempt_blah) files on local disk on each node. We used to keep them around for
about a week. We have had to reduce this to 1 day, because as the number of
files in that directory increases, eventually jobs fail to run on that
machine until I clear/move the logs out. I am guessing that this is a glob
failure.
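
For reference, the cleanup I do by hand could be scripted roughly like this. It is only a sketch: prune_old_files and its parameters are names I made up for illustration (nothing Hadoop provides), and it assumes the files can safely be deleted based on modification time alone. It lists the directory with os.scandir instead of expanding a shell glob, so it should not depend on how many files are in there.

```python
import os
import time


def prune_old_files(log_dir, max_age_days=1, now=None):
    """Delete regular files in log_dir older than max_age_days.

    Hypothetical helper: log_dir is whatever local directory holds the
    history files. Returns the sorted names of the files removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age limit in seconds

    # Collect candidates first, then delete, so the directory is not
    # modified while it is being iterated.
    stale = []
    with os.scandir(log_dir) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue  # skip subdirectories and symlinks
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.name)

    for name in stale:
        os.remove(os.path.join(log_dir, name))
    return sorted(stale)
```

Something like this could run from cron on each node with max_age_days set to whatever retention the directory can tolerate.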
