On Thu, 21 Dec 2006, seth vidal wrote:
On Mon, 2006-12-18 at 09:33 +0200, Panu Matilainen wrote:

The overall repodata size could be cut down somewhat in at least a couple
of ways:
- Drop the filenames redundancy from primary.xml. That would of course
   require the full filelists file to be downloaded every time (diffs
   would help a lot here), but that's what apt and smart need to do
   anyway (because both calculate the full dependency tree at all times).
   Only yum benefits from the primary.xml stuff to some extent, and sooner
   or later it needs the full filelists too.

We're punishing low bandwidth clients more, then, by requiring they
download all of filelists to do anything.

For yum users, yes. OTOH Smart and apt need the full filelists anyway, so their users end up downloading quite a bit of redundant data because of the partial filelists in primary.xml.

- other.xml is not typically loaded, but it could be made quite a bit
   smaller by storing the changelogs just once per source rpm. The
   difference is *huge* - e.g. FC6 SRPMS/repodata/other.xml.gz is roughly 2M,
   but ~6M for i386 and ~8M for x86_64. With that kind of size savings
   somebody might even want to use it for something :)

I see what you mean here, but I'm not sure how that's possible w/o a lot
of substantial changes in how we look up changelogs. Not impossible,
just invasive, I think.

cur.execute("select changelog.date as date, "
            "changelog.author as author, "
            "changelog.changelog as changelog "
            "from packages,changelog where packages.pkgId = %s "
            "and packages.pkgKey = changelog.pkgKey", self.pkgId)

becomes something like

cur.execute("select changelog.date as date, "
            "changelog.author as author, "
            "changelog.changelog as changelog "
            "from packages,changelog where packages.pkgId = %s "
            "and packages.rpm_sourcerpm = changelog.rpm_sourcerpm", self.pkgId)

Yes, it needs the schema of the "other" database changed a bit, so it might not be something you'll want to deal with in, say, yum-3.0.x, but if we're looking at the scale of "next gen repodata", things like this *should* be dealt with IMHO.
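
To make the idea a bit more concrete, here is a rough, self-contained sketch using Python's sqlite3 module with an in-memory database. The table layout is an assumption for illustration only (in particular the rpm_sourcerpm column on the changelog table, plus the made-up package ids and changelog entries), not the actual yum schema, and it uses sqlite3's "?" placeholders rather than the %s style above. The point is just that binary packages built from the same source rpm share one set of changelog rows.

```python
import sqlite3


def build_demo_db():
    """Create a toy 'other' database; schema and data are hypothetical."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("create table packages ("
                "pkgKey integer primary key, pkgId text, "
                "name text, rpm_sourcerpm text)")
    # Changelog rows are keyed by source rpm, not by binary package.
    cur.execute("create table changelog ("
                "rpm_sourcerpm text, author text, "
                "date integer, changelog text)")
    cur.executemany(
        "insert into packages (pkgId, name, rpm_sourcerpm) values (?, ?, ?)",
        [("id-foo", "foo", "foo-1.0-1.src.rpm"),
         ("id-foo-devel", "foo-devel", "foo-1.0-1.src.rpm"),
         ("id-bar", "bar", "bar-2.0-1.src.rpm")])
    cur.executemany(
        "insert into changelog values (?, ?, ?, ?)",
        [("foo-1.0-1.src.rpm", "A. Packager", 1166000000, "- initial build"),
         ("foo-1.0-1.src.rpm", "A. Packager", 1166600000, "- fix spec"),
         ("bar-2.0-1.src.rpm", "B. Packager", 1166300000, "- update to 2.0")])
    conn.commit()
    return conn


def changelog_for(conn, pkg_id):
    """Look up changelog entries for a binary package via its source rpm."""
    cur = conn.cursor()
    cur.execute("select changelog.date as date, "
                "changelog.author as author, "
                "changelog.changelog as changelog "
                "from packages, changelog where packages.pkgId = ? "
                "and packages.rpm_sourcerpm = changelog.rpm_sourcerpm "
                "order by changelog.date desc", (pkg_id,))
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in changelog_for(conn, "id-foo-devel"):
        print(row)
```

With this layout foo and foo-devel each see both foo changelog entries while the entries are stored only once, which is where the size difference between the SRPMS and binary other.xml would come from.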

We really ought to take this to the metadata list though :)

        - Panu -
_______________________________________________
Yum-devel mailing list
[email protected]
https://lists.dulug.duke.edu/mailman/listinfo/yum-devel
