The largest chunk of data needed for font matching is the Unicode coverage 
for each font; this coverage is represented internally by a sparse bit 
array stored in a btree -- each level of the btree represents one byte 
from the unicode coverage, so trees are at most four levels deep.

At the bottom, a bit array represents the font coverage within a single 
page; 256 bits for each of the members of that page.

A BMP font with glyphs in 8 pages takes:

        1 branch node (256 pointers + overhead)                 1280
        8 leaves (256/8 bytes)                          8 * 32   256
        1 header                                                  16
                                                                ----
                                                                1552

Multiply that by the number of fonts in the system to find that 400 fonts
takes over 600K of memory.  Seems a bit excessive to me.

I've taken steps to reduce this, using the obvious fact that many fonts 
cover the same glyphs in each page.  I now share branches and leaves 
wherever possible.  The results are encouraging.  In my configuration with 
383 fonts:
                        total   total mem       unique  unique mem
        leaves:         4962    158784          793      25376
        branches:        360    460800           75      96000

        total                   619584                  121376

This reduces memory consumption by a factor of 5.

Savings improve when adding more fonts; increase the configuration to 
include several thousand of the ugliest fonts you've ever seen and
with 4803 fonts:

                        total   total mem       unique  unique mem
        leaves:         46572   1490304         1705     54560
        branches:       4779    6117120         519     664320

        total                   7607424                 718880

This reduces memory consumption by a factor of 10.

(for the curious, there are 20 some fonts in this list which have no 
Unicode mapping, and hence have no branch allocated).

This analysis highlights one other direction for savings -- the actual 
bitmaps are taking only  a small fraction of the total space; I can 
represent the coverage of nearly 5000 fonts in only 50KB, but holding the 
various branches above that takes more than 10 times as much space.

I like the current data structure's performance in locating a leaf; it 
consists of walking down the btree a byte at a time.  However, wasting 
this much memory for inactive fonts doesn't justify the cost.  In the 519
branches above, only 6071 of their 132864 entries had non-zero values.

I think I'll convert the branches to a packed representation and take the 
performance hit when searching.  That should reduce memory usage for
the branches from 664320 to 32403 bytes.  I'll take the performance hit 
for the 630K savings; the reduction in cache misses should more than 
compensate.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab


_______________________________________________
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts

Reply via email to