Arch is designed for "programming in the large" -- dealing with very, very large collections of source. Ludovic's comments are an occasion to refresh people's memory about that.

Ludovic argues that very large trees are questionable practice and that, although it might not be the revision control system's job to impose practice, nevertheless some large-tree projects could be improved by using configs. Ok, I want to refine those arguments but first some context:

The ultimate "huge source tree", and the one that was one of the main inspirations for configs, is a tree of complete system source including kernel, system configuration files, all libraries, and all installed programs. In such a tree there are lots of components which are separately maintained and many which are used by more than one tool. In some cases, we have to expect a single component to occur in more than one place in the huge tree. Realistically, we even have to expect the huge tree to contain multiple distinct versions of some components.

Developers need to create multiple instances of such huge trees in their workspaces and to have some of these instances be coherent subsets. For example: "give me everything other than the X11 libraries and libc that I need to build Evolution".

Administrators, distribution maintainers, and users need to be able to audit these huge trees -- to accurately summarize the layout and the list of included components in terms of global names for the particular version of each component. Such a list of components is a good *definition* for a named release of a (source-based) distribution.

Configs are largely for those kinds of developer, admin, distribution maintainer, and user needs. They facilitate the separate development of logically separate components and they give us a global name-space for specific constellations of components.

It also helps to make sure that the config system works on top of a revision control system which is not only distributed but features peer-to-peer replication, dumb-fs servers, and good cryptographic integrity checking and authentication.
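To make the audit and "coherent subset" needs described above a bit more concrete, here is a minimal sketch in Python. It is not tla's config implementation: the `Entry` type, the function names, and every archive and version name in the sample are hypothetical, and the only thing assumed about a config is that it pairs a location in the huge tree with a global name for one component version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    location: str  # where the component sits in the huge tree
    version: str   # global name of the exact component version


def parse_config(text):
    """Parse 'location <whitespace> version' lines; skip blanks and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        location, version = line.split(None, 1)
        entries.append(Entry(location, version.strip()))
    return entries


def audit(entries):
    """Summarize the tree: global version name -> sorted locations using it."""
    summary = {}
    for e in entries:
        summary.setdefault(e.version, []).append(e.location)
    return {v: sorted(locs) for v, locs in summary.items()}


def subset(entries, exclude_prefixes):
    """Coherent subset: everything except components at or under the given locations."""
    def excluded(loc):
        return any(loc == p or loc.startswith(p + "/") for p in exclude_prefixes)
    return [e for e in entries if not excluded(e.location)]


# All names below are invented for illustration.
SAMPLE = """
# location              global version name
./lib/libc              alice@example.org--2005/libc--stable--2.3
./lib/x11               bob@example.org--2005/x11--stable--6.8
./lib/glib              carol@example.org--2005/glib--stable--2.6
./apps/gimp/glib        carol@example.org--2005/glib--stable--2.6
./apps/evolution        dave@example.org--2005/evolution--devo--2.2
./apps/evolution/glib   carol@example.org--2005/glib--stable--2.4
"""

if __name__ == "__main__":
    entries = parse_config(SAMPLE)
    # "Everything other than the X11 libraries and libc"
    for e in subset(entries, ["./lib/libc", "./lib/x11"]):
        print(e.location, e.version)
```

The sample deliberately has one component version occurring in two places and two distinct versions of the same component, since a huge tree has to tolerate both. The `audit` summary is the kind of list that could serve as the definition of a named release.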
For example, if we had a network of dumb boxes being constantly, incrementally flood-filled with updates to components, the distribution business would be improved. A distribution publisher could simply sign a config saying "This has passed our testing and is dubbed distribution release 10.0." No need for a central, closely held network of "update servers" (though they would still have a limited utility for some customers). Instead, most systems could update from the closest public mirror. For an extra layer of paranoia reduction, public crawlers could be built which compare these mirrors to one another, etc. Security would be increased. Emergency distribution updates would be less vulnerable to denial of service attacks.

Of course, good revision control is only half of the equation. We would also need well designed conventions for configure/build/install infrastructure. A huge tree of all system sources needs, for example, something like "make world" which, at the root, configures, builds, and installs all subtrees. Subtrees have to be independently constructable. Installation conventions have to be flexible enough so that there is a universal mechanism for installing and running test versions without interfering with or accidentally using the system install. Of course nearly every package has something kinda-sorta like this, but there is too little consistency among packages, and so getting from the upstreams of individual components to a source-based distribution requires too much work.

Autoconf was never designed with programming in the large in mind -- it takes the view that people work with one package at a time. The package-framework component that tla uses was never intended to be more than a sketch of what is actually needed.

It's politically tricky (not impossible, I think) to get upstreams to adopt coherent distributed revision control practices and improved configure/build/install conventions. Tools designed with those needs in mind are part of the solution.
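As an aside, the mirror-crawler idea mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each mirror is modeled as an in-memory mapping from component name to content, the mirror and component names are made up, and "disagrees with the most common digest" is a heuristic chosen here, not anything Arch specifies. Verifying the publisher's signature on a config is a separate step that this sketch does not cover.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """SHA-256 hex digest of one component's content."""
    return hashlib.sha256(data).hexdigest()


def snapshot(mirror):
    """Map each component name on one mirror to the digest of its content."""
    return {name: digest(content) for name, content in mirror.items()}


def compare_mirrors(mirrors):
    """Return {component: [mirror names]} for mirrors whose copy differs from
    (or is missing relative to) the most common digest across all mirrors."""
    snaps = {m: snapshot(files) for m, files in mirrors.items()}
    names = set().union(*(s.keys() for s in snaps.values()))
    report = {}
    for name in sorted(names):
        seen = {m: s.get(name) for m, s in snaps.items()}  # None if missing
        majority, _count = Counter(seen.values()).most_common(1)[0]
        odd = [m for m, d in seen.items() if d != majority]
        if odd:
            report[name] = sorted(odd)
    return report


if __name__ == "__main__":
    # Hypothetical mirrors and components, for illustration only.
    mirrors = {
        "mirror-a": {"libc--2.3": b"libc source", "glib--2.6": b"glib source"},
        "mirror-b": {"libc--2.3": b"libc source", "glib--2.6": b"glib source"},
        "mirror-c": {"libc--2.3": b"libc source", "glib--2.6": b"tampered"},
    }
    print(compare_mirrors(mirrors))
```

A real crawler would walk archives over the network, and a client would still check what it fetched against the digests named by a signed config. The point here is only that dumb mirrors can be cross-checked cheaply by anyone.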
Obtaining or simulating a critical mass of upstream projects that use such tools might do the trick.

Part of the political problem is that the FOSS community has lost (mistaken for solved) the problem of constructing a "complete GNU system" (or something morally comparable). The vendors and Debian have taught people to think of each upstream project as separate -- to think of the harmonizing of components into a complete system as "somebody else's problem". In fact, it would take only minor shifts in tools and conventional practices to eliminate most of the expensive drudge-work of distribution assembly, freeing up those resources for more appropriate tasks like component testing, review, and other forms of vetting, not to mention for more aggressive programs of forward-thinking R&D.

So, Ludovic: Yes, factoring into configs or something very much like them is not only best practice, it's just about the only practice that makes good sense. Revision control's proper role there is to provide the config-like thing. Performance limitations of revision control are clearly not a happy excuse for using configs. Reports about achievable tla performance on things like gcc and the kernel are mixed: my understanding is that some people have obtained performance far better than that recently reported here. People are complaining about `baz status' but my understanding is that that command has long-known, unfixed implementation bugs which give rise to that bad performance -- it seems odd to tar tla with that.

In any event there's the Arch 2.0 direction, gathering dust on a shelf. No matter how many times Matthieu calls it a "complete rewrite" that doesn't make it true. *`revc'* is, indeed, a completely newly coded storage manager. It does replace one part of Arch but not the rest. It can do things like give git-like speed for commits and filename-based tree comparisons. It rests for want of resources to port inventory and merging features from tla.

-t

_______________________________________________
Gnu-arch-users mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnu-arch-users

GNU arch home page:
http://savannah.gnu.org/projects/gnu-arch/
