On 5 December 2013 18:52, Mark Shuttleworth <m...@ubuntu.com> wrote:
> On 04/12/13 17:34, Peter Waller wrote:
>> This situation is now resolved with thanks to Roger, Gustavo and
>> others in real time. There is no way we could have resolved it
>> ourselves since there was corruption of the juju database caused by
>> running out of disk space, which was unfortunate. We as a team were
>> not aware that it is necessary to keep a backup of the juju database.
>
> Thanks for letting us dive in on it together, Peter.
>
> Would it help if Juju could maintain an awareness of the disk situation
> and gracefully avoid making the problem worse (and avoid corruption) by
> going read-only when disk is low?

That's an interesting idea. It would need careful thought though - how
would we make the decision when the database is actually distributed
over several machines?

I believe that the corruption was caused by the fact that we were not
making sure that mongo journal writes are synced to disk before
returning from a database operation. If we can avoid corruption by
enabling that safety mode, I think that would probably be preferable.

The main problem in this case was that one problem caused a cascade of
sub-problems (the above corruption occurring quite late in the chain).
The principal issue was the fact that log files expanded incredibly
rapidly.

I think that there are a few things that could help here, most
important points first:

- We should limit agent restarting in some way (exponential backoff or
  retry limits or both).
- We should rotate log files and compress old ones.
- We should have some kind of policy for expiring and deleting old log
  files.
- We should have some way of garbage collecting the transaction log.

We *could* consider disabling logging when the disk is tending towards
full, but I suspect that could make a bad problem worse by losing any
possibility of seeing what has actually been going on.

An awareness of the disk situation could help towards deciding when
some of the above actions might be triggered.

  cheers,
    rog.