In the interest of contributing to (instead of just trashing) the proposed implementation, I wrote a simple Perl script to get some counts out of a real-world XSL-FO file.
Input: the XSL-FO file produced from a DocBook file I have left over from a dormant project. The Perl program counts the number of properties in the source file.

  PDF size:            130 pages     // some users have a lot more
  FO file size:        1.2M bytes
  Properties:          22,815
  Unique prop names:   89            // bounded by the spec
  Unique prop values:  2,227         // bounded by the real world

Note that storing the property name and value refs supplied to the Property constructor will use 45,630 strings (two per property). If the Property implementation employs canonical mapping to ensure that only one copy of each unique string is stored, then just over 2,300 strings are required.

The property strings are given to the Property object constructor by some path beginning with a SAX parser. It is reasonable to assume that the SAX parser loses its refs to most of these strings and that the Property implementation retains the only references to these String objects.

How big are String objects?
  At least 16 bytes plus storage for the characters.

What does this save us?
  Probably only about 1,600,000 bytes for this file. The CPU cost of creating strings is probably similar to the cost of checking the string table for a copy.

What does it buy for us?
  It bounds a source of what is currently O(n) memory growth.
  It gets us in the habit of using another good technique.

I am already thinking along the lines of: the property lists for these FOs are usually generated by programs and will be repeated many times. Perhaps we could use larger, faster-working Property Lists consolidated with Canonical Mappings to save both time and space.

I am thinking again along the lines of handling properties more like a C++ virtual function table (vTable). This object is larger than Peter's ordered Property array, but would be faster. That's a reason C++ has fast virtual function dispatching.

--
John Austin <[EMAIL PROTECTED]>
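
To make the canonical mapping idea above concrete, here is a minimal sketch, assuming a modern Java (generics and Map.putIfAbsent). CanonicalStrings, the cut-down Property class and CanonicalDemo are hypothetical names used only for illustration, not the project's actual classes. The demo uses new String(...) to mimic a parser handing over a fresh String object for every attribute.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical string table: hands back one shared instance per distinct string. */
final class CanonicalStrings {
    private final Map<String, String> table = new HashMap<>();

    /** Returns the stored copy equal to s; stores s itself if it is the first one seen. */
    String canonical(String s) {
        String existing = table.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    /** Number of distinct strings retained so far. */
    int size() {
        return table.size();
    }
}

/** Hypothetical stand-in for the real Property class (assumption, not the actual API). */
final class Property {
    final String name;
    final String value;

    Property(CanonicalStrings strings, String name, String value) {
        // Keep only the canonical refs; the caller's duplicates become garbage.
        this.name = strings.canonical(name);
        this.value = strings.canonical(value);
    }
}

class CanonicalDemo {
    public static void main(String[] args) {
        CanonicalStrings strings = new CanonicalStrings();
        // new String(...) forces distinct objects with equal contents.
        Property p1 = new Property(strings, new String("font-size"), new String("10pt"));
        Property p2 = new Property(strings, new String("font-size"), new String("10pt"));
        System.out.println(p1.name == p2.name);   // true: same String object
        System.out.println(p1.value == p2.value); // true: same String object
        System.out.println(strings.size());       // 2 distinct strings retained
    }
}
```

The table grows with the number of distinct strings (89 names plus 2,227 values for the file measured above) rather than with the number of properties. String.intern() is the JVM's built-in variant of the same technique; a private table like this one just keeps the lifetime of the strings under the application's control.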