On 05/27/2013 09:32 AM, Jan Zelený wrote:
On 25. 5. 2013 at 09:34:32, Nico Kadel-Garcia wrote:
On Wed, May 22, 2013 at 11:55 AM, Michael Ekstrand <mich...@elehack.net>
wrote:
Performance improvement: improve scaling to 5K+ installed packages.

* Amen. This is compounded by poor default caching behavior, so that
a few yum commands in a row each end up downloading the metadata
again, and again, and again.

I think this can be addressed by moving the metadata updates to a
different function, and calling it *separately* only as needed. The
Debian "apt" tool does this quite effectively.
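
To illustrate the idea of a separate refresh step, here is a minimal sketch in Python. It is not yum or apt code: the class name, the `fetch` callable, and the `max_age_seconds` parameter are all hypothetical, chosen only to show queries that never trigger a download while a distinct `update()` call does.

```python
import time


class MetadataCache:
    """Hypothetical cache: metadata is downloaded only on an explicit update()."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch              # assumed callable that downloads metadata
        self._max_age = max_age_seconds  # assumed staleness threshold
        self._clock = clock              # injectable clock, useful for testing
        self._data = None
        self._fetched_at = None

    def update(self):
        """Explicit refresh, called separately and only as needed."""
        self._data = self._fetch()
        self._fetched_at = self._clock()

    def is_stale(self):
        """Report whether the cached copy is missing or older than max_age."""
        if self._fetched_at is None:
            return True
        return self._clock() - self._fetched_at > self._max_age

    def query(self, name):
        """Read-only lookup against the cached copy; never downloads anything."""
        if self._data is None:
            raise RuntimeError("no metadata cached; call update() first")
        return self._data.get(name)
```

With this split, several queries in a row reuse the same cached copy, and the caller decides when a refresh is worth the cost, for example by checking `is_stale()` first.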

Unfortunately there is not much we can do about this. Debian has a completely
different repository policy: they keep all versions of packages in the repo, so
there is no need to update metadata on client machines every time.

I don't quite understand this comment.

Debian repository policy varies quite a bit. Some repositories keep old versions, some don't. Mostly the latter, actually, because not all repository managers (there are a couple of implementations) can deal with multiple versions for a single package/architecture combination.

As far as I can tell, the main difference is that apt-get and apt-cache read very few, relatively large files at the beginning, so they don't block on disk reads early.

dpkg, on the other hand, uses a database scattered across many small files on disk, so you get the delay only when you actually install or remove packages. At the beginning, this is quite fast, but eventually, the files will be scattered quite badly, and there is a considerable delay at this step.
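
The two layouts can be sketched roughly as follows, in Python. This is only an illustration of the access patterns, not the actual apt or dpkg on-disk formats; the function names and the "name version" record format are made up for the example.

```python
import os


def write_single(path, records):
    """Store every record in one larger file, one 'name version' pair per line."""
    with open(path, "w", encoding="utf-8") as f:
        for name, version in records.items():
            f.write(f"{name} {version}\n")


def read_single(path):
    """One open() and one sequential read cover the whole database."""
    with open(path, encoding="utf-8") as f:
        return dict(line.split() for line in f)


def write_scattered(directory, records):
    """Store each record in its own small file, named after the package."""
    for name, version in records.items():
        with open(os.path.join(directory, name), "w", encoding="utf-8") as f:
            f.write(version)


def read_scattered(directory):
    """One open() per package; on a cold cache each may need its own disk seek."""
    result = {}
    for name in os.listdir(directory):
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            result[name] = f.read()
    return result
```

Both readers return the same mapping. The difference is the number of files that have to be opened and located on disk, which is where the delay described above would come from once the small files are spread out.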

--
Florian Weimer / Red Hat Product Security Team
--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
