-------- Forwarded Message --------
From: Alva Couch <[EMAIL PROTECTED]>
To: Mark Burgess <[EMAIL PROTECTED]>
Subject: [Fwd: Re: convergence and undoing changes]
Date: Fri, 18 Nov 2005 14:36:52 -0500

Alas, my post to the list bounced. Perhaps you could forward it?

email message attachment (Re: convergence and undoing changes)

-------- Forwarded Message --------
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: convergence and undoing changes
Date: Fri, 18 Nov 2005 14:05:11 -0500

You are not allowed to post to this mailing list, and your message has been automatically rejected. If you think that your messages are being rejected in error, contact the mailing list owner at [EMAIL PROTECTED]

email message attachment

-------- Forwarded Message --------
From: Alva Couch <[EMAIL PROTECTED]>
To: Mark Burgess <[EMAIL PROTECTED]>
Cc: christian pearce <[EMAIL PROTECTED]>, Viraj Alankar <[EMAIL PROTECTED]>, [email protected]
Subject: Re: convergence and undoing changes
Date: Fri, 18 Nov 2005 14:03:38 -0500

Mark Burgess wrote:
>>Alva Couch gave a presentation about this:
>>http://homepages.informatics.ed.ac.uk/group/lssconf/config2005e/Slides/cfengine.pdf
>
> I have not seen this before. In fact it annoys me a little because it is
> factually incorrect and seems just to be an unnecessary slur on
> cfengine.

It is not intended as a slur, but perhaps I have been too bold in my statements. I apologise for the offense. I am trying, instead, to focus upon an aspect of cfengine that many users have found problematic, and propose an effective solution, including using cfengine but also things from outside of the cfengine framework.

I think that the key to any progress is to identify the problem. In that talk, I identified the problem but didn't focus so much upon the solution. Let me make some amends by trying to focus upon the solution here in a constructive way.

No tool can be effectively utilized without appropriate practice. I have been working in detail on the practice cfengine demands. The effective practice for utilizing cfengine has a drawback that I call the "observability" quandary.
Like other incremental approaches, one is at the mercy of what one decides to control, in perpetuity. Once one has asserted a state for something, one cannot stop controlling that state unless one knows that the network has converged as a whole to the desired state; without that assurance, further commands must presume the co-existence of perhaps *both* formerly desired and currently desired states. Since this network-wide convergence is almost always impractical to assure, this means that cfengine configurations tend to "ratchet" up in size and complexity and never decrease in size.

My conclusions from this are controversial. I do not consider the typical cfengine configuration file "specific enough" by nature to constitute a "configuration"; some things are left the way they are in a pseudo-default state. This is both a strength and a weakness. The lack of specificity leads cfengine to be extremely effective in legacy environments where we need to leave most everything alone. It is also appropriate in an environment where major upgrades will not occur for the operating system itself.

The talk also contains the claim that "cfengine is not a configuration management system". It is instead a configuration enforcer; from a typical configuration file of cfengine alone, one cannot determine the whole configuration or its intent. It is best to think of the cfengine configuration as a "differential template" between what we start with and what we want. It is thus particularly hard to describe the intent of a random cfengine configuration file; more structure needs to be imposed from outside the configuration file in order to be able to make that interpretation. I think it is more accurate to think of the actual configuration as something external to the cfengine configuration, that is a combination of cfengine and pre-existing state.

The "partial" nature of the cfengine configuration is problematic when, e.g., a major upgrade of the operating system occurs.
The configuration file describes incremental changes to "one" baseline and we now have another baseline. Because changes are incremental in nature, latent variables can crop up easily and cause things to break. We are thus forced to completely re-validate the cfengine configuration's correctness, one statement at a time. This is not a scalable thing and there is no way to "port" the old configuration to the new OS.

Comparing this to a "generative" solution for configuration management, a re-implementation of the generators fixes *all* porting problems. So there is a scalability problem: in one case, one fix fixes "all" porting problems, whereas for a cfengine configuration file, every instance must be fixed and revalidated separately.

> This network level divergence example is highly misleading.
> Of course there can be periods of divergence, but the talk seems to
> imply that some other tool could improve on the problem. In fact I would
> suggest that no other tool provides more consistency than cfengine
> today.

The claim was not that it does not provide long-term consistency, but that there is a practice that must be utilized to achieve that consistency, and that practice is external to cfengine itself. And just about every user I know has fallen into the trap of creating such a divergence:

a) "ad-hoc editing" of a cfengine configuration file can produce latent effects on a network.
b) thus "change management" is required in a cfengine environment in order to avoid producing latent variables and effects.
c) cfengine doesn't support that change management itself.
d) something needs to manage, therefore, the *input* to cfengine.

The solution to the problem under discussion is easy. One must enter a command into the configuration that performs the rollback. Probably by copying the appropriate file from the original build of the machine. That command must stay in the configuration until every host has converged back to the original state.
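
As a rough illustration of why a rollback command has to stay in the configuration, here is a minimal Python sketch. It is not cfengine syntax, and every name in it (RollbackRule, can_retire, the host names and the file path) is hypothetical, invented only for this example. It assumes a rollback is "copy the original file back" and that each host reports when it has applied the rule.

```python
from dataclasses import dataclass, field


@dataclass
class RollbackRule:
    """Hypothetical rule that restores one file to its original build content."""
    path: str
    baseline_content: str
    converged_hosts: set = field(default_factory=set)

    def apply(self, host_name, host_files):
        """Restore the file on one host and record that this host has converged."""
        host_files[self.path] = self.baseline_content
        self.converged_hosts.add(host_name)

    def can_retire(self, all_hosts):
        """The rule may be dropped only once every known host has applied it."""
        return set(all_hosts) <= self.converged_hosts


# Three hypothetical hosts that all received the unwanted change.
hosts = {
    "alpha": {"/etc/motd": "changed"},
    "beta": {"/etc/motd": "changed"},
    "gamma": {"/etc/motd": "changed"},  # imagine this host is offline for now
}
rule = RollbackRule("/etc/motd", "original")

# Only two hosts run the rule at first; the third has not been seen yet.
for name in ("alpha", "beta"):
    rule.apply(name, hosts[name])
retire_early = rule.can_retire(hosts)

# Once the last host comes back and converges, the rule can finally be removed.
rule.apply("gamma", hosts["gamma"])
retire_late = rule.can_retire(hosts)
```

Removing the rule while retire_early is still false would leave "gamma" in the formerly desired state with nothing left in the configuration to correct it, which is the coexistence of old and new states described above.
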
A systematic approach to rollback is to keep a "gold server" around that reflects the initial build state, or otherwise snapshot an initial build for rollback purposes.

Note that I am not advocating that people stop using cfengine to enforce configurations. I think, instead, that they should *start* thinking about generating the cfengine configuration from a higher-level configuration, iteratively, so that the incidental divergence and latent variable problems disappear. The ideal tool is one that would interpret an *existing* cfengine configuration, and a new set of "intents", and produce a configuration free of latent effects in producing those intents. This is a problem that has proven particularly difficult for humans (as my experience with other cfengine admins has shown) but might be easier for a computer program.

In short, I am not saying that one should stop using cfengine, but that one should realize its limitations and look for solutions to those limitations. It is still the best solution for enforcing configurations. It needs something else, however, which is the ability to generate a "complete" configuration consistent with a particular "intent" and combine that with the changes it has already made to a system. Ideally, this generation and combination is portable; i.e., it can be constructed so that the target architecture is independent of the intent.

As an aside, however, one does not need rollback in any such system. Baselining, consistent observability and control, and carry-forward are enough. It is not a history of changes, but a history of configurations, that is needed. To roll back a change, rebaseline the affected files and replay changes up to the point before the crucial change.

I certainly do not mean to attack the cfengine community. I think, however, that we must remain realistic about what our tools do and do not do. That is the spirit of my presentation and my response here.
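
The "rebaseline and replay" idea mentioned in the aside can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a configuration is modelled as a dictionary from file path to content, a change is a (change id, path, new content) tuple, and the function name replay, the change ids, and the file contents are all hypothetical. Nothing here is cfengine's own mechanism.

```python
def replay(baseline, changes, stop_before=None):
    """Rebuild a configuration from the baseline by replaying recorded changes.

    Replay stops just before the change whose id equals stop_before, so that
    change and everything recorded after it are left out.
    """
    files = dict(baseline)  # start again from the baseline; do not mutate it
    for change_id, path, content in changes:
        if change_id == stop_before:
            break
        files[path] = content
    return files


# Hypothetical baseline snapshot, e.g. taken from a "gold server".
baseline = {
    "/etc/resolv.conf": "nameserver 10.0.0.1",
    "/etc/motd": "welcome",
}

# Hypothetical ordered record of changes made since the baseline.
changes = [
    ("c1", "/etc/motd", "maintenance tonight"),
    ("c2", "/etc/resolv.conf", "nameserver 10.0.0.99"),  # the crucial change
    ("c3", "/etc/motd", "back to normal"),
]

current = replay(baseline, changes)
restored = replay(baseline, changes, stop_before="c2")
```

Here restored is a whole configuration in its own right rather than the result of undoing c2 in place, which is the sense in which a history of configurations, not of changes, is what is kept.
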
_______________________________________________
Help-cfengine mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/help-cfengine
