Re: Software Management call for RFEs

Alek Paunov Tue, 28 May 2013 13:04:12 -0700

On 28.05.2013 21:18, seth vidal wrote:

On Tue, 28 May 2013 20:42:13 +0300
Alek Paunov <a...@declera.com> wrote:

So, it seems that yum already has the "filelists on demand"
optimization implemented. Why are you asking to remove a feature
that does not make things worse ... ?


I'm not.

But when you download the filelists - it is A LOT of data.

It is, of course :-). It is big and slow now, but it implements one more distinguishing and convenient Fedora feature ... and with a careful schema and encoding, it can be scaled down several times in both space and query time.

Actually, every "positive" (install, update) yum operation implies access to the repos. Repos contain everything. If our software were perfectly optimized, not only filelists but all other parts of the database (including primary.files, which you cited initially) should be lazily synced, right?


I'd rather not have filedeps so it doesn't get pulled in for other
things in depsolving.

Sorry, I do not know how this amount of data will impact libsolv in the future. IMO, for yum (I mean in the sqlite-based solution) it is a matter of optimization.

I have a few questions:

   * What is the reasoning behind the splitting of the database across
many .sqlite files?


many? it's 3 afaik. primary, filelists, other.

how do you mean 'many'?

Multiplied by the number of repos. That is what I am trying to understand: why not just a single .sqlite file for the whole yum database?
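
To make the single-file idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the combined schema, the `repo` table, and the helper names `open_combined` and `import_repo` are hypothetical, and each per-repo file is assumed to have a simplified `packages` table, not the real primary.sqlite layout. It uses SQLite's `ATTACH DATABASE` to pull each repo's rows into one file, tagged by repo.

```python
import sqlite3

# Hypothetical combined schema: one file for all repos, rows tagged by repo_id.
SCHEMA = """
CREATE TABLE IF NOT EXISTS repo (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS packages (
    repo_id INTEGER NOT NULL REFERENCES repo(id),
    name    TEXT NOT NULL,
    arch    TEXT NOT NULL,
    epoch   TEXT,
    version TEXT,
    release TEXT
);
CREATE INDEX IF NOT EXISTS packages_name ON packages(name);
"""


def open_combined(path=":memory:"):
    """Open (or create) the single combined database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def import_repo(conn, repo_name, repo_db_path):
    """Replace one repo's rows with the contents of its per-repo file."""
    conn.commit()  # ATTACH is not allowed inside an open transaction
    conn.execute("ATTACH DATABASE ? AS src", (repo_db_path,))
    try:
        conn.execute("INSERT OR IGNORE INTO repo (name) VALUES (?)", (repo_name,))
        repo_id = conn.execute(
            "SELECT id FROM repo WHERE name = ?", (repo_name,)
        ).fetchone()[0]
        # Drop the stale rows of this repo only; other repos are untouched.
        conn.execute("DELETE FROM main.packages WHERE repo_id = ?", (repo_id,))
        conn.execute(
            "INSERT INTO main.packages "
            "(repo_id, name, arch, epoch, version, release) "
            "SELECT ?, name, arch, epoch, version, release FROM src.packages",
            (repo_id,),
        )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.execute("DETACH DATABASE src")
```

With this layout a cross-repo question ("which repos carry bash?") is one query against one file, and refreshing a single repo is a delete plus insert scoped by `repo_id`.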

   * Why is the sql schema so denormalized (IMO, this leads to both
bandwidth and disk overspending without speed benefits)? For
example: why do the provides and requires tables not use a common
domain table?


B/c it was designed 8yrs ago and we were going for compressable space
and making it as quick as possible to search?

In the provides and requires example, we gain no space or speed benefit from the missing common domain (dependency + dependency_evr tables). In the current situation we have fat and slow text duplication and indexes instead of integer references to the domain subnodes (dependencies is the biggest domain in the primary). Yes, in a bunch of cases a little denormalization is inevitable when we fight for speed, but IMO this and a few other space flaws have a negative impact on the speed too.
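
A rough sketch of what I mean by a common domain, in Python with the stdlib sqlite3 module. The table and column names below are my own illustration, not the actual primary.sqlite schema, and the helpers `intern_dep`, `add_edge` and `providers_for` are hypothetical. Each dependency name and each name+EVR combination is stored once; provides and requires hold only integer references.

```python
import sqlite3

# Hypothetical normalized layout (illustrative names, not the real yum schema).
SCHEMA = """
CREATE TABLE dependency (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE dependency_evr (
    id            INTEGER PRIMARY KEY,
    dependency_id INTEGER NOT NULL REFERENCES dependency(id),
    flags   TEXT NOT NULL DEFAULT '',
    epoch   TEXT NOT NULL DEFAULT '',
    version TEXT NOT NULL DEFAULT '',
    release TEXT NOT NULL DEFAULT '',
    UNIQUE (dependency_id, flags, epoch, version, release)
);
CREATE TABLE provides (
    pkg_key    INTEGER NOT NULL,
    dep_evr_id INTEGER NOT NULL REFERENCES dependency_evr(id)
);
CREATE TABLE requires (
    pkg_key    INTEGER NOT NULL,
    dep_evr_id INTEGER NOT NULL REFERENCES dependency_evr(id)
);
CREATE INDEX provides_dep ON provides(dep_evr_id);
CREATE INDEX requires_pkg ON requires(pkg_key);
"""


def intern_dep(conn, name, flags="", epoch="", version="", release=""):
    """Return the dependency_evr id, inserting name and EVR rows only once."""
    conn.execute("INSERT OR IGNORE INTO dependency (name) VALUES (?)", (name,))
    dep_id = conn.execute(
        "SELECT id FROM dependency WHERE name = ?", (name,)
    ).fetchone()[0]
    key = (dep_id, flags, epoch, version, release)
    conn.execute(
        "INSERT OR IGNORE INTO dependency_evr "
        "(dependency_id, flags, epoch, version, release) VALUES (?, ?, ?, ?, ?)",
        key,
    )
    return conn.execute(
        "SELECT id FROM dependency_evr WHERE dependency_id = ? AND flags = ? "
        "AND epoch = ? AND version = ? AND release = ?",
        key,
    ).fetchone()[0]


def add_edge(conn, table, pkg_key, dep_evr_id):
    """Record that a package provides or requires an interned dependency."""
    if table not in ("provides", "requires"):
        raise ValueError("table must be 'provides' or 'requires'")
    conn.execute(
        f"INSERT INTO {table} (pkg_key, dep_evr_id) VALUES (?, ?)",
        (pkg_key, dep_evr_id),
    )


def providers_for(conn, pkg_key):
    """Packages providing any name required by pkg_key (EVR comparison omitted)."""
    rows = conn.execute(
        "SELECT DISTINCT p.pkg_key FROM requires r "
        "JOIN dependency_evr re ON re.id = r.dep_evr_id "
        "JOIN dependency_evr pe ON pe.dependency_id = re.dependency_id "
        "JOIN provides p ON p.dep_evr_id = pe.id "
        "WHERE r.pkg_key = ? ORDER BY p.pkg_key",
        (pkg_key,),
    ).fetchall()
    return [row[0] for row in rows]
```

The matching here is by name only, so real version-range checks would still have to happen on top of it. The point is just that the joins run over integer keys, and a name required by thousands of packages is stored as text exactly once.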

   * Why was an incremental update mechanism (e.g. applying xml diffs to
the sqlite database) not considered from the very beginning?


It wasn't necessary? There was a massively smaller number of pkgs to
consider.

Indeed. Also, 8 years ago the possibilities and the number of ideas to reuse were definitely different :-)
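
For illustration only, applying an xml diff to the local sqlite database could look roughly like the Python sketch below. Everything about the diff format is an assumption of mine: the `<diff from=".." to="..">` root, the `<remove>` and `<add>` elements, the simplified `packages` and `meta` tables, and the function name `apply_diff` are all hypothetical; no such format exists in the current repodata.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical local schema: a package table plus the repodata revision
# the local copy currently corresponds to.
SCHEMA = """
CREATE TABLE packages (pkgid TEXT PRIMARY KEY, name TEXT, version TEXT);
CREATE TABLE meta (revision TEXT NOT NULL);
"""


def apply_diff(conn, diff_xml):
    """Apply one hypothetical XML diff to the local database atomically."""
    root = ET.fromstring(diff_xml)
    current = conn.execute("SELECT revision FROM meta").fetchone()[0]
    if root.get("from") != current:
        # The diff was built against another revision; a full sync is needed.
        raise ValueError(f"diff starts at {root.get('from')}, local is {current}")
    with conn:  # single transaction: commit on success, roll back on error
        for el in root.findall("remove"):
            conn.execute("DELETE FROM packages WHERE pkgid = ?", (el.get("pkgid"),))
        for el in root.findall("add"):
            conn.execute(
                "INSERT OR REPLACE INTO packages (pkgid, name, version) "
                "VALUES (?, ?, ?)",
                (el.get("pkgid"), el.get("name"), el.get("version")),
            )
        conn.execute("UPDATE meta SET revision = ?", (root.get("to"),))
```

The client would then download only the diffs between its local revision and the current one, and fall back to a full download when the chain is broken.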


Thank you,
Alek

--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
