Codename: apt-fetcher
Mentors: Michael Vogt, David Kalnischkies
Project proposal page: [0]
Project design page: [1]

Here is a summary of weeks 3 and 4 of the coding period:

What I've done:
- refactored the public parser to be ABI compatible
- designed the pluggable acquire framework, the format of the plugins and plugin objects
- partially implemented this framework

What problems I've run into:
- C++ problems when refactoring the parser - SIGSEGV everywhere
- not really a problem, but it was pretty time consuming to understand the whole present process and think of a new design for metadata download

What I plan to do further:
- finish the implementation of the framework
- provide unit tests for it
- integrate with the acquire module and finish the default apt-get update functionality

Two weeks ago, at the last report's milestone, the implementation of the public parser was in a functional state, with some unit test coverage. This public parser is one of the project's deliverables, intended to be exported from libapt and used not only by APT itself, but also by other package management applications, as needed. Its purpose is to parse the sources.list entries and to expose them as abstract objects with an access interface and predicate-based iteration. These Source objects contain and provide all sources.list information, in a structured and coherent way.

The first thing I did at the beginning of weeks 3 and 4 of the GSoC coding period was to integrate the parser tests into the libapt tests. I'm not yet sure how to update the makefile hierarchy so that a simple `make test` from the project root would run the parser tests along with the others - at the moment some dependencies are not resolved. I've also refactored the parser with forward ABI compatibility in mind.

The next step was to make use of these sources to download the Debian Archive metadata. To achieve this, I first had to understand and follow the current flow of the metadata download algorithm. The sources.list entries are parsed by a specific internal parser and transformed into metaIndex objects.
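
To make the public parser's role more concrete, here is a minimal sketch of the idea: sources.list entries become Source objects that can be selected with a predicate. This is not the code from parser.h [3]. The names Source, ParseSourceLine, ParseSources and SelectSources are hypothetical, and only the classic one-line entry format (type, optional [options], URI, distribution, sections) is handled.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser's Source object; the real class in
// parser.h may look quite different.
struct Source {
    std::string type;                            // "deb" or "deb-src"
    std::map<std::string, std::string> options;  // e.g. arch=amd64
    std::string uri;
    std::string distribution;
    std::vector<std::string> sections;           // main, contrib, non-free
};

// Parses one entry of the form:
//   type [opt=val opt=val] uri distribution section...
// Returns false for blank lines, comments and malformed entries.
bool ParseSourceLine(const std::string &line, Source &out) {
    std::istringstream in(line);
    std::string tok;
    if (!(in >> tok) || tok[0] == '#') return false;
    Source src;
    src.type = tok;
    if (!(in >> tok)) return false;
    if (tok[0] == '[') {
        // Collect option tokens up to the one that ends with ']'.
        bool closed = false;
        tok.erase(0, 1);
        while (true) {
            if (!tok.empty() && tok.back() == ']') {
                tok.pop_back();
                closed = true;
            }
            std::string::size_type eq = tok.find('=');
            if (eq != std::string::npos)
                src.options[tok.substr(0, eq)] = tok.substr(eq + 1);
            if (closed) break;
            if (!(in >> tok)) return false;  // unterminated option block
        }
        if (!(in >> tok)) return false;      // URI is missing
    }
    src.uri = tok;
    if (!(in >> src.distribution)) return false;
    while (in >> tok) src.sections.push_back(tok);
    out = src;
    return true;
}

// Parses a whole sources.list text, skipping lines that are not entries.
std::vector<Source> ParseSources(const std::string &text) {
    std::vector<Source> result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        Source s;
        if (ParseSourceLine(line, s)) result.push_back(s);
    }
    return result;
}

// Predicate iteration: returns the sources for which pred() holds.
std::vector<Source> SelectSources(const std::vector<Source> &all,
                                  const std::function<bool(const Source &)> &pred) {
    assert(static_cast<bool>(pred));  // an empty std::function is a caller bug
    std::vector<Source> result;
    for (const Source &s : all)
        if (pred(s)) result.push_back(s);
    return result;
}
```

A caller could, for example, pass a lambda that checks `s.type == "deb-src"` to get only the source package entries.
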
A metaIndex object represents a unique (URI, Distribution) tuple, i.e. a unique distribution of Debian from a specific location. Each such distribution has a Release file in its root, which enumerates all the Debian index files in this distribution. A Debian index file is the main storage format for Debian metadata. There are currently index files for package sources, binaries, translations, tags, contents, etc. A metaIndex object contains the URI and the Distribution of the Release file, and some additional information such as the sections - main / contrib / non-free - the architectures and the trusted nature of the metaIndex. So a metaIndex object fully describes what data + metadata to download from a specific Debian archive distribution.

The metaIndex object offers primitives to download the metadata - GetIndexes() - and to build the objects used for downloading the actual Debian packages - GetIndexFiles(). These objects are pkgIndexFile objects, and their main use is to download actual package data.

Of course, my main point of interest was downloading the metadata, and as I've seen, the process follows a pipelined flow:
- it first produces metaIndex objects from the entries in the sources.list file(s), merging the entries that share the same URI and Distribution.
- it downloads all the metadata for a specific metaIndex - the GetIndexes() method has an intermediate, private step - ComputeIndexTargets() - that builds the specific download locations for Debian index files.
- it produces objects to download actual packages.

The goal of these two weeks was to design a framework that would refactor and enhance the metadata download process, and make it plugin-based. The framework's input is a list of Source objects, as they are parsed by the previously developed public parser. The framework will build metaIndex objects for these Sources, using specific metaIndexPlugins - currently there is only one metaIndex plugin, for Debian, as currently all Sources are Debian Sources.
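
The first two steps of the flow described above (merging entries per (URI, Distribution), then computing index file locations) can be sketched roughly as follows. This is a simplified illustration, not APT's actual metaIndex or ComputeIndexTargets() code. SourceEntry, MetaIndexSketch, BuildMetaIndexes and ComputePackagesLocations are hypothetical names, and only binary Packages indexes are computed, using the usual archive layout `<uri>/dists/<distribution>/<section>/binary-<arch>/Packages`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the real APT classes.
struct SourceEntry {
    std::string uri;
    std::string distribution;
    std::vector<std::string> sections;
};

struct MetaIndexSketch {
    std::string uri;
    std::string distribution;
    std::set<std::string> sections;       // union over all merged entries
    std::set<std::string> architectures;
};

// Produces one metaIndex per unique (URI, Distribution) pair; entries that
// share the pair are merged by taking the union of their sections.
std::vector<MetaIndexSketch> BuildMetaIndexes(const std::vector<SourceEntry> &entries,
                                              const std::set<std::string> &archs) {
    std::map<std::pair<std::string, std::string>, MetaIndexSketch> merged;
    for (const SourceEntry &e : entries) {
        MetaIndexSketch &m = merged[{e.uri, e.distribution}];
        m.uri = e.uri;
        m.distribution = e.distribution;
        m.architectures = archs;
        m.sections.insert(e.sections.begin(), e.sections.end());
    }
    std::vector<MetaIndexSketch> result;
    for (const auto &kv : merged) result.push_back(kv.second);
    return result;
}

// Rough analogue of the ComputeIndexTargets() step, limited to the binary
// Packages index of every (section, architecture) combination.
std::vector<std::string> ComputePackagesLocations(const MetaIndexSketch &m) {
    assert(!m.distribution.empty());
    std::string root = m.uri;
    if (!root.empty() && root.back() == '/') root.pop_back();  // avoid "//"
    const std::string base = root + "/dists/" + m.distribution + "/";
    std::vector<std::string> targets;
    for (const std::string &sec : m.sections)
        for (const std::string &arch : m.architectures)
            targets.push_back(base + sec + "/binary-" + arch + "/Packages");
    return targets;
}
```

Keying the map on the (URI, Distribution) pair is what makes two sources.list lines for the same archive and distribution collapse into a single metaIndex.
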
The metaIndex, as in the current APT versions, will contain additional information, such as trusted, sections, architectures, and one more type of info: METADATA TO DOWNLOAD. The default metadata types are Packages, Translations and Sources. Other types of metadata can be requested through sources.list options, like [contents=true] (apt-file metadata) or [tags=true] (debtags metadata). This metaIndex implements the same interface as before, but the metaIndexPlugin provides an interface to build acquireIndex objects.

An acquireIndex is a new abstract object for the framework, which corresponds to an individual Debian Archive index file. The metaIndex plugin, using the metaIndex information, will build acquireIndex objects, using the framework's registered acquireIndex plugins. An acquireIndex plugin will build the acquireIndex objects for a specific type of metadata. So the framework, by default, will have acquireIndex plugins installed for Packages, Translations and Sources. Developing support for new types of metadata will be as simple as extending the acquireIndex plugin and acquireIndex classes.

The acquireIndex is used to build IndexTarget objects, which represent locations of index files in the Debian Archive. These are then used by the APT acquire module to download the files, using available technologies - diff files, compression. From what I've read in the APT source code, the acquire module can support the download of generic index files, through the pkgAcqIndex class. This way, new metadata files can be added to the Debian Archive and downloaded with the current acquire module. The pluggable acquire framework must provide a way to translate sources.list metadata information into remote index file locations.

Here is a framework usage example, as I've thought of it so far:

* the framework object is instantiated, and several plugins are registered to it:
  - the metaIndex plugins - the Debian metaIndex plugin by default.
    To determine which metaIndex plugin will be used to create the metaIndex, the Type information in the Source object is used. In the future, if more types of metaIndexes are to be supported (new Release files, new types of metaIndex information, etc.), new metaIndex plugins can be implemented.
  - the acquireIndex plugins - the Packages, Sources and Translations acquireIndex plugins by default, plus the plugins provided by any other installed applications - in the future, we plan to implement plugins for debtags and apt-file. These acquireIndex plugins are associated with Debian's metaIndex plugin.
* the framework receives a list of Source objects as input and transforms them into metaIndex objects, using the metaIndex plugin. Multiple metaIndex objects can be merged, if they refer to the same URI and Distribution.
* for each of these metaIndex objects, the metaIndex plugin builds acquireIndex objects, according to the framework's installed acquireIndex plugins and the metadata types of the metaIndex.
* the acquireIndex objects are then used to compute IndexTarget objects, which represent index file locations in the Debian Archive.
* in metaIndex->GetIndexes(), these IndexTarget objects are used to build pkgAcqIndex objects and download the metadata files.

At this point, I've done most of the implementation of this framework, but it is not finished yet. I'm hoping to finish it during week 5 and also build some unit tests.

Regarding the initial timeline, I think it will be slightly adjusted: as it turns out, there is no need to provide new download and security mechanisms for the backend, since the present ones in the acquire module are usable. Also, I had to first design the plugin model and implement the support for building IndexTarget objects, before I could fully integrate with the default apt-get update functionality.
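
Coming back to the usage example listed above, here is a rough sketch of how plugin registration and IndexTarget computation could fit together. It is only an illustration under several assumptions, not the code in framework.h [6]. All class and method names are hypothetical, the metaIndex plugin layer is left out, and the acquireIndex plugin and the acquireIndex object are collapsed into a single class for brevity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; framework.h may define things differently.
struct IndexTarget {
    std::string uri;  // remote location of one index file
};

struct MetaIndexInfo {
    std::string uri, distribution;
    std::set<std::string> sections, architectures;
    std::set<std::string> metadataTypes;  // "Packages", "Sources", ...
};

// Knows how to locate the index files of one type of metadata.
class AcquireIndexPlugin {
public:
    virtual ~AcquireIndexPlugin() {}
    virtual std::string MetadataType() const = 0;
    virtual std::vector<IndexTarget> Targets(const MetaIndexInfo &m) const = 0;
};

class PackagesPlugin : public AcquireIndexPlugin {
public:
    std::string MetadataType() const override { return "Packages"; }
    std::vector<IndexTarget> Targets(const MetaIndexInfo &m) const override {
        std::vector<IndexTarget> out;
        for (const std::string &sec : m.sections)
            for (const std::string &arch : m.architectures)
                out.push_back({m.uri + "/dists/" + m.distribution + "/" + sec +
                               "/binary-" + arch + "/Packages"});
        return out;
    }
};

class SourcesPlugin : public AcquireIndexPlugin {
public:
    std::string MetadataType() const override { return "Sources"; }
    std::vector<IndexTarget> Targets(const MetaIndexInfo &m) const override {
        std::vector<IndexTarget> out;
        for (const std::string &sec : m.sections)
            out.push_back({m.uri + "/dists/" + m.distribution + "/" + sec +
                           "/source/Sources"});
        return out;
    }
};

class AcquireFramework {
public:
    // Registers a plugin under the metadata type it reports; a later
    // registration for the same type replaces the earlier one.
    void Register(std::unique_ptr<AcquireIndexPlugin> plugin) {
        assert(plugin != nullptr);
        std::string type = plugin->MetadataType();
        plugins_[type] = std::move(plugin);
    }

    // Collects the targets of every metadata type the metaIndex asks for.
    // Types without a registered plugin are silently skipped.
    std::vector<IndexTarget> ComputeTargets(const MetaIndexInfo &m) const {
        std::vector<IndexTarget> all;
        for (const std::string &type : m.metadataTypes) {
            auto it = plugins_.find(type);
            if (it == plugins_.end()) continue;
            std::vector<IndexTarget> t = it->second->Targets(m);
            all.insert(all.end(), t.begin(), t.end());
        }
        return all;
    }

private:
    std::map<std::string, std::unique_ptr<AcquireIndexPlugin>> plugins_;
};
```

With this shape, supporting a new kind of metadata means writing one more AcquireIndexPlugin subclass and registering it. The framework and the download code that consumes the IndexTarget list do not need to change.
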
So even though I haven't managed to provide the default apt-get update functionality with the new components by now, I've done the generic plugin definition and interface, which was planned after that. Also, the parser implementation is pretty much final.

In conclusion, I'm feeling a lot more comfortable with the APT code, and I think the project is slowly converging on its goal. Just a few more steps and apt-get update will work with the new parser and metadata acquire framework. After this, the code will be thoroughly tested, new plugins will be built, and certain aspects - acquire logic, download security - will be optimized.

My contributions to the APT package can be found in the repo [2]: the header file [3], the implementation file [4] and the tester [5]. The framework: header file [6] and implementation file [7].

Bogdan Purcareata

[0] http://wiki.debian.org/SummerOfCode2012/Projects#Pluggable_acquire-system_for_APT
[1] http://wiki.debian.org/BogdanPurcareata/PluggableAptBackend
[2] https://launchpad.net/apt-fetcher
[3] http://bazaar.launchpad.net/~bogdan-purcareata/apt-fetcher/trunk/view/head:/apt-pkg/parser.h
[4] http://bazaar.launchpad.net/~bogdan-purcareata/apt-fetcher/trunk/view/head:/apt-pkg/parser.cc
[5] http://bazaar.launchpad.net/~bogdan-purcareata/apt-fetcher/trunk/view/head:/test/libapt/parser_tester.cc
[6] http://bazaar.launchpad.net/~bogdan-purcareata/apt-fetcher/trunk/view/head:/apt-pkg/framework.h
[7] http://bazaar.launchpad.net/~bogdan-purcareata/apt-fetcher/trunk/view/head:/apt-pkg/framework.cc
