On 13/12/2009, at 2:31 AM, Nagy Gabor wrote:

On Sat, 12 Dec 2009, Dimitrios Apostolou wrote:
Hello list,

I have been investigating the slow performance of pacman regarding
the cold caches scenario and I'm trying to write some proof of
concept code that improves things a lot for some cases. However I
need your help regarding some facts I might have misunderstood, and
any pointers to the source code you also give me would also help a
lot. I wouldn't like to lose time changing stuff that would break
current functionality. So here are some first questions that come
to mind, just by using strace:

When doing "pacman -Q blah" I can see that besides the getdents()
syscalls in /var/lib/pacman/local (probably caused by readdir()),
there are also stat() and access() calls for every single
subdirectory. Why are the last ones necessary? Isn't readdir enough?

The same goes when doing "pacman -S blah". But in that case it
stat()'s both 'local' and 'sync' directories, so worst case is
really bad, it will stat() all contents of local, core, extra and
community...

Regarding the stat() and access() operations I finally found out why
they happen exactly:

In case of corrupted db the sync, for example, directory might
contain files, not subdirectories. So in that case
_alpm_db_populate() just makes sure it's a directory. However
stat()ing thousands of files is too much of a price to pay.
Similarly, access() checks it is accessible by the user.

In the attached patch I have just removed the relevant lines, with
the following rationale: In the rare case of corrupted db, even if we
do open("sync/not_a_dir/depends") it will still fail and we'll catch
the failure there, no need to investigate the cause further, just
write a message like "couldn't access sync/not_a_dir/depends".

By dropping caches ("echo 3 > /proc/sys/vm/drop_caches") before
running, I measure a nice performance boost on my old laptop: "pacman
-Q gdb" time is reduced from about 7s to 2.5s.

Hm. This is a nice time boost... Did you test this with other
operations, too?

What do you think? Is it possible to remove those checks?
Dimitris

The best solution would be to rewrite our whole database crap as Dan
said. I am pretty sure that this patch would not cause any harm irl, but
our code would become a little bit more dangerous: As I see,
db_read(INFRQ_BASE) would become a ~NOP function and db_populate would
become a simple "ls" function (the only remaining sanity check is
splitname).

It occurs to me time and time again, that it would be a good idea to try and abstract the database functions, to allow different database backends to be "plugged in." This would make experimentation with backends a lot easier, since you just compile a different file in (the interface remains the same). A library called libpkg[1] does something along these lines, by leveraging function pointers. Unfortunately I don't have a lot of time to look into it further, but it's an interesting idea.

[1]: http://libpkg.berlios.de/doc/trunk/html/pkg__db_8h_source.html


Reply via email to