On Sat, 16 Dec 2006, Paul Nasrat wrote:
The only way I can think of would be a different format so we don't have
to parse the xml or pre-parsing the metadata into a sqlite db. This
would make downloads of the metadata larger but maybe it would be faster
for operations.
For example - fedora extras:
-rw-r--r-- 1 root root 1.6M Dec 16 03:06 primary.xml.gz
-rw-r--r-- 1 root root 2.2M Dec 16 03:09 primary.xml.sqlite.bz2
Bzipped, the primary xml sqlite db is 2.2M vs 1.6M for the gzipped xml
itself.
The reason I didn't go this route for FC5 anaconda is that it's just
the same problem as having hdlist, etc. Multiple versions of the same
metadata, the problem we were trying to avoid by moving to repodata.
I'd strongly argue this is the wrong approach.
+1
I suggest looking closer at where the time is *really* spent. Remember the
libxml2 "slowness" which turned out to be something in the way things are
copied between C and python? Is it really the xml parsing where most of
the time is spent, or is it something else like sqlite interactions or...?
Parsing those xml files sure isn't cheap, but it's not *that* slow in
C/C++ - I'd look for other places first.
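A minimal sketch of how one could check where the time goes, using
Python's standard cProfile and pstats modules. The function
fake_makecache is a hypothetical stand-in, not yum code; in practice you
would pass whatever callable drives the metadata import.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the expensive call chains show up first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, out.getvalue()


def fake_makecache(n):
    # Hypothetical stand-in for the real metadata import; not yum code.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(fake_makecache, 100000)
    print(report)
```

If xml parsing really dominates, it should sit at the top of the
cumulative column; if the time is in sqlite inserts or terminal writes,
that should show up there instead.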
Here's one easy target for optimization (the time difference is
consistent over successive runs):
[EMAIL PROTECTED] yum]# yum clean dbcache
Loading "installonlyn" plugin
3 cache files removed
[EMAIL PROTECTED] yum]# time ./yummain.py -C --disablerepo='*' --enablerepo='core' makecache
Loading "installonlyn" plugin
Setting up repositories
################################################## 2931/2931
################################################## 2931/2931
################################################## 2931/2931
Metadata Cache Created
real 0m9.509s
user 0m6.634s
sys 0m0.664s
[EMAIL PROTECTED] yum]# yum clean dbcache
Loading "installonlyn" plugin
3 cache files removed
[EMAIL PROTECTED] yum]# time ./yummain.py -d0 -C --disablerepo='*' --enablerepo='core' makecache
real 0m8.093s
user 0m6.003s
sys 0m0.469s
---
9.5 vs 8.1 seconds is one helluva big difference in percentage (roughly
15% of the wall-clock time) just to tell the user "something is
happening". This is with a reasonably fast display adapter; I could
imagine OLPC suffers even more from this. I didn't try it, but simply
making the progress callbacks (well, the writes to the screen) less
frequent should shave off quite a bit of time. The user doesn't *really*
need to know we're now processing exactly record 1654 of 2001; a rough
idea of progress (an update every 5 or 10 percent, for example) is quite
enough.
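A rough sketch of the throttling idea, untested against yum itself. The
class name ThrottledProgress and its parameters are made up for
illustration and are not part of yum's callback API; the point is only
that the screen write happens once per percentage step instead of once
per record.

```python
import sys


class ThrottledProgress:
    """Write a progress update only once per `step_percent` of the work."""

    def __init__(self, total, step_percent=10, write=sys.stdout.write):
        self.total = total
        self.step_percent = step_percent
        self.write = write
        self.last_bucket = 0  # nothing is printed until the first step is reached

    def update(self, current):
        if self.total <= 0:
            return
        # Integer percentage done, then the step ("bucket") it falls into.
        bucket = (current * 100 // self.total) // self.step_percent
        if bucket != self.last_bucket:
            self.last_bucket = bucket
            self.write("\r%3d%%" % (bucket * self.step_percent))


if __name__ == "__main__":
    progress = ThrottledProgress(2931, step_percent=10)
    for record in range(1, 2932):
        progress.update(record)  # cheap integer math on most calls
    sys.stdout.write("\n")
```

With 2931 records and a 10 percent step this does 10 writes instead of
2931; whether that recovers most of the 1.4 seconds would need measuring.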
- Panu -