On Sat, 16 Dec 2006, seth vidal wrote:
On Sat, 2006-12-16 at 20:56 +0200, Panu Matilainen wrote:
Or maybe I'm being crazy, but I don't see it as another copy of the
metadata. I see it as the same copy, just removing the extra processing
of it from every person's machine and having the processing only occur
once.
On second thought...
What do we need the xml for then, anyway? Even on fast computers the
repodata format is very expensive to process (otherwise we wouldn't have
had several generations of pickle and sqlite caches).
If we were going to get rid of the xml format altogether I would
recommend:
1. using the sqlitedb's as an optimization and non-api-breaking test in
3.0.X or so
2. figure out what improvements we could make to the db format to make
searching faster or to make it smaller on disk. This would mean working
out the right indexes, etc.
I wrote a quick-n-dirty sqlite "backend" for apt to see how things would
look for another implementation (with totally different usage
patterns) of repodata.
The initial version performs so badly you wouldn't believe it :D
On a 2.4GHz AMD64 the creation of the dependency cache (which involves walking
over all packages and recording the dependency data to the memory-mapped
dependency cache) takes over 3 MINUTES (!) for just FC6 core
data. With xml repodata, that's a ~6.5s operation on this system, fully
reading filelists.xml info as well, which the sqlite version doesn't do at
all at this point.
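
For anyone not familiar with the access pattern, here is a rough sketch of
what "walking over all packages and recording the dependency data" looks like
against an sqlite repodata db. This is not the actual apt backend code (which
isn't included in this mail). The two-column tables are a simplified stand-in
for the real schema, and make_sample_db() and walk_dependencies() are
hypothetical names used only for illustration.

```python
import sqlite3

# The four dependency tables named in this thread.
DEP_TABLES = ("provides", "requires", "conflicts", "obsoletes")


def make_sample_db():
    """Build a tiny in-memory db (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (pkgKey INTEGER PRIMARY KEY, name TEXT)")
    for table in DEP_TABLES:
        conn.execute(f"CREATE TABLE {table} (name TEXT, pkgKey INTEGER)")
    conn.executemany("INSERT INTO packages VALUES (?, ?)",
                     [(1, "foo"), (2, "bar")])
    conn.executemany("INSERT INTO provides VALUES (?, ?)",
                     [("foo", 1), ("libfoo.so.1", 1), ("bar", 2)])
    conn.executemany("INSERT INTO requires VALUES (?, ?)",
                     [("libfoo.so.1", 2)])
    return conn


def walk_dependencies(conn):
    """Visit every package and collect its dependency data by pkgKey."""
    cache = {}
    packages = conn.execute(
        "SELECT pkgKey, name FROM packages ORDER BY pkgKey").fetchall()
    for pkg_key, pkg_name in packages:
        deps = {}
        for table in DEP_TABLES:
            # One lookup per package per table: four queries per package.
            rows = conn.execute(
                f"SELECT name FROM {table} WHERE pkgKey = ? ORDER BY name",
                (pkg_key,))
            deps[table] = [row[0] for row in rows]
        cache[pkg_name] = deps
    return cache


cache = walk_dependencies(make_sample_db())
```

Every inner query is a lookup by pkgKey, so the total cost is dominated by how
quickly the db can find the rows belonging to one package.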
After a bit of investigation, the major bottleneck turns out to be that the
provides, requires, conflicts and obsoletes tables don't have an index on
pkgKey. After creating those indexes, it's back to ~6.5s even with the
naive initial implementation.
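
The fix described above amounts to one CREATE INDEX per dependency table. A
minimal sketch using Python's sqlite3 module, under some assumptions: the
two-column tables are a simplified stand-in for the real repodata schema, and
the index names (idx_<table>_pkgKey) and helper functions are made up for
illustration. EXPLAIN QUERY PLAN is used to check whether a lookup by pkgKey
scans the whole table or goes through the index.

```python
import sqlite3

# The four dependency tables named in this thread.
DEP_TABLES = ("provides", "requires", "conflicts", "obsoletes")


def make_db():
    """Create empty dependency tables (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    for table in DEP_TABLES:
        conn.execute(f"CREATE TABLE {table} (name TEXT, pkgKey INTEGER)")
    return conn


def query_plan(conn, table):
    """Return sqlite's plan text for a per-package lookup on one table."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT name FROM {table} WHERE pkgKey = ?",
        (1,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(str(row[-1]) for row in rows)


def add_pkgkey_indexes(conn):
    """Add an index on pkgKey to each dependency table."""
    for table in DEP_TABLES:
        conn.execute(
            f"CREATE INDEX IF NOT EXISTS idx_{table}_pkgKey "
            f"ON {table} (pkgKey)")
    conn.commit()


conn = make_db()
before = {t: query_plan(conn, t) for t in DEP_TABLES}  # full table scans
add_pkgkey_indexes(conn)
after = {t: query_plan(conn, t) for t in DEP_TABLES}   # index searches
```

Printing before and after shows the plan changing from a table scan to a
search using the new index; the exact wording of the plan text varies between
sqlite versions.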
I haven't done any timings on how those indexes would affect yum's usage
patterns. The effect is probably not *that* dramatic, but it might be
something to look at.
- Panu -
_______________________________________________
Yum-devel mailing list
https://lists.dulug.duke.edu/mailman/listinfo/yum-devel