Hi!

I gathered some statistics on how much space the SQLite indexes need. I dropped the indexes by hand (using sqlite3) and ran VACUUM afterwards to shrink the databases to the size they actually need. After that I added all the indexes that I think are useful (the current ones plus the one that can be found here: http://devel.linux.duke.edu/gitweb/?p=yum.git;a=commitdiff;h=42283902f929ac131cda7b3497ae047b497e02bc ).
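
The drop-and-VACUUM measurement can be reproduced in miniature. What follows is only a sketch on a throwaway database: the table and column names are borrowed from the attached script, but the rows are made up and the helper name index_overhead is hypothetical, so the sizes it reports say nothing about the real Fedora metadata.

```python
import os
import sqlite3
import tempfile


def index_overhead(rows=20000):
    """Return (bytes_without_index, bytes_with_index) for a scratch database."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    db = sqlite3.connect(path)
    try:
        db.execute("CREATE TABLE filelist (pkgKey INTEGER, dirname TEXT)")
        # Made-up rows, only there to give the index something to store.
        db.executemany(
            "INSERT INTO filelist VALUES (?, ?)",
            [(i % 500, "/usr/share/doc/pkg%d" % i) for i in range(rows)])
        db.commit()
        # VACUUM cannot run inside a transaction, hence the commit above.
        db.execute("VACUUM")
        without = os.path.getsize(path)

        db.execute("CREATE INDEX dirnames ON filelist (dirname)")
        db.commit()
        db.execute("VACUUM")
        with_index = os.path.getsize(path)
        return without, with_index
    finally:
        db.close()
        os.remove(path)
```

Calling index_overhead() returns both file sizes in bytes; the relative growth is then 100.0 * (with_index - without) / without.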

The growth is relative to the files without indexes. "Now" is the current set of indexes, "All indexes" is the current set plus the additional one.

-------------------------------------------------------
|                   Fedora 7 Everything               |
|-----------------------------------------------------|
|           |  files  | primary |files.bz2|primary.bz2|
|No indexes | 34.8 MB | 15.8 MB |  7.2 MB |  3.3 MB   |
|Now        | 37.2 MB | 20.0 MB |  8.0 MB |  4.9 MB   |
|All indexes| 45.5 MB | 25.3 MB |  9.2 MB |  5.5 MB   |
|-----------------------------------------------------|
|Growth now |   6.9%  |  26.6%  |  11.1%  |  48.5%    |
| Sum       |       13.0%       |       22.9%         |
|Growth all |  30.7%  |  60.1%  |  27.8%  |  66.7%    |
| Sum       |       40.0%       |       40.0%         |
-------------------------------------------------------
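
The growth rows can be recomputed from the sizes in the table. This is a small sketch, not part of the attached script; the names SIZES, growth and growth_row are made up for illustration. Because it works from the rounded MB values printed above, a result can differ in the last digit from a figure computed on the exact file sizes.

```python
# Sizes in MB, copied from the table above.
SIZES = {
    "No indexes":  {"files": 34.8, "primary": 15.8,
                    "files.bz2": 7.2, "primary.bz2": 3.3},
    "Now":         {"files": 37.2, "primary": 20.0,
                    "files.bz2": 8.0, "primary.bz2": 4.9},
    "All indexes": {"files": 45.5, "primary": 25.3,
                    "files.bz2": 9.2, "primary.bz2": 5.5},
}


def growth(new, base):
    """Percentage growth of new relative to base, rounded to one decimal."""
    return round((new - base) / base * 100.0, 1)


def growth_row(name):
    """Per-file growth plus the combined growth of each file pair."""
    base, cur = SIZES["No indexes"], SIZES[name]
    result = dict((key, growth(cur[key], base[key])) for key in base)
    # "Sum" rows: add both files first, then compare the totals.
    result["sum"] = growth(cur["files"] + cur["primary"],
                           base["files"] + base["primary"])
    result["sum.bz2"] = growth(cur["files.bz2"] + cur["primary.bz2"],
                               base["files.bz2"] + base["primary.bz2"])
    return result
```

growth_row("Now") and growth_row("All indexes") each return a dictionary with one entry per column plus the two combined values.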

I attached a small Python program that times the creation of the indexes. On my machine it takes 6 seconds. I don't know how that weighs against saving 2.4 MB of download size (4.2 MB with all indexes).
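
One way to put the two numbers side by side is to ask at which download speed fetching the extra index data takes as long as building the indexes locally. This is plain arithmetic on the figures quoted above; the function name is made up, and the sketch assumes the same 6 seconds for both index sets, which the timing script does not measure separately.

```python
def break_even_speed(saved_mb, build_seconds):
    """Download speed in MB/s at which fetching saved_mb takes build_seconds."""
    return saved_mb / float(build_seconds)


current = break_even_speed(2.4, 6)      # indexes shipped now
everything = break_even_speed(4.2, 6)   # all indexes
```

On a link slower than the returned speed, building the indexes on the client is quicker than downloading them; on a faster link, downloading is quicker. The comparison only covers wall-clock time on the client.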

Any opinions?


BTW: What happened to the delta metadata approach? Is there any code that should be reviewed? Any help wanted? Project already abandoned?

Florian
#!/usr/bin/python
import sqlite3
import time

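class DB:
    def __init__(self, filename):
        # Fail early if the file is missing or not writable; sqlite3.connect()
        # would otherwise silently create a new, empty database.
        open(filename, 'r+').close()
        self.db = sqlite3.connect(filename)
        self.cursor = self.db.cursor()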

class Filelists(DB):

    def createIndices(self):
        cur = self.cursor
        cur.execute("CREATE INDEX keyfile ON filelist (pkgKey)")
        cur.execute("CREATE INDEX pkgId ON packages (pkgId)")
        cur.execute("CREATE INDEX dirnames ON filelist (dirname)")

    def removeIndices(self):
        for index in ("keyfile", "pkgId", "dirnames"):
            self.cursor.execute("DROP INDEX IF EXISTS %s" % index)
        self.cursor.execute("VACUUM")

class Primary(DB):

    def createIndices(self):
        cur = self.cursor
        cur.execute("CREATE INDEX packagename ON packages (name)")
        cur.execute("CREATE INDEX providesname ON provides (name)")
        cur.execute("CREATE INDEX pkgprovides ON provides (pkgKey)")
        cur.execute("CREATE INDEX requiresname ON requires (name)")
        cur.execute("CREATE INDEX pkgrequires ON requires (pkgKey)")
        cur.execute("CREATE INDEX pkgconflicts ON conflicts (pkgKey)")
        cur.execute("CREATE INDEX pkgobsoletes ON obsoletes (pkgKey)")
        cur.execute("CREATE INDEX packageId ON packages (pkgId)")
        cur.execute("CREATE INDEX filenames ON files (name)")

    def removeIndices(self):
        for index in ("packagename", "providesname",
                      "pkgprovides", "requiresname",
                      "pkgrequires", "pkgconflicts",
                      "pkgobsoletes", "packageId",
                      "filenames"):
            self.cursor.execute("DROP INDEX IF EXISTS %s" % index)
        self.cursor.execute("VACUUM")


def main():
    primary = Primary('/var/cache/yum/fedora/primary.sqlite')
    filelists = Filelists('/var/cache/yum/fedora/filelists.sqlite')
    print "Removing indices"
    t1 = time.time()
    primary.removeIndices()
    filelists.removeIndices()
    print "\ttook %.1f s" % (time.time() - t1)
    print "Creating primary indices"
    t1 = time.time()
    primary.createIndices()
    print "\ttook %.1f s" % (time.time() - t1)
    print "Creating filelists indices"
    t2 = time.time()
    filelists.createIndices()
    print "\ttook %.1f s" % (time.time() - t2)
    print "Creating both together took %.1f s" % (time.time() - t1)


if __name__ == '__main__':
    main()
_______________________________________________
Yum-devel mailing list
[email protected]
https://lists.dulug.duke.edu/mailman/listinfo/yum-devel
