I've been re-visiting the code I wrote some months ago, and discussed
here, to provide us with a faster and more readily extensible way of
having access to all the Win32 API constants that we might need.
This is long. Sorry.
The main purpose of this mail is to document the research and test
results that I have from the last week, but first I'd like to write down
what I intend to change in Win32::GUI itself:
Proposal for change in Win32::GUI
=================================
- Win32::GUI::Constants will become the module that supports constants.
It will be possible for end-users to use this module directly, or to
pass the same export requests to Win32::GUI directly (as we previously
discussed). So either
use Win32::GUI qw(export_list);
or
use Win32::GUI::Constants qw(export_list);
will have the same behaviour with respect to exporting constants.
- I will deprecate the way Win32::GUI currently exports over 300
constants without being explicitly asked. In the 1.04 release
use Win32::GUI;
will still export the same constants, but will generated a 'deprecated'
warning, including instructions about the correct syntax to use.
- I will deprecate calling the Win32::GUI::constant() function directly.
Again, it will continue to work in the 1.04 release and generate a
'deprecated' warning.
- There will need to be a small XS based constant lookup function within
Win32::GUI to support the small number of places that Win32::GUI itself
currently uses Win32::GUI::constant() - only to lookup the Win32__GUI__
constants, which don't belong in the new Win32::GUI::Constants package
anyway.
Changes to Win32::GUI::Constants
================================
The last time we talked about this I had implemented
Win32::GUI::Constants in pure perl, basing the lookup on a perl hash. I
had reservations about the memory usage and speed of this
implementation, and have done some further research to look at the
reality. It turns out that the perl hash is *very* fast, but uses a lot
of memory.
I have looked into 3 alternative hashing strategies, all implemented in
XS (as an aside, the overhead of calling an XS function that does
nothing compares very favourably with a perl hash lookup, and so
previous concerns that I had about XS calls being slow seem unfounded):
Strategy 1: non-order preserving minimal perfect hash, based on the
algorithm and C source from
http://burtleburtle.net/bob/hash/perfect.html I will call this the
'Perfect' algorithm
Strategy 2: non-order preserving (non-minimal) perfect hash, generated
using gperf: http://gnuwin32.sourceforge.net/packages/gperf.htm I will
call this the 'Gperf' algorithm.
Strategy 3: Order-preserving minimal perfect hash. based on the code
from: http://www.ibiblio.org/pub/Linux/devel/lang/c/mph-1.2.tar.gz I
will call this the 'MPH' algorithm
The original ladder-of-if's found in GUI_constants.cpp will be called
the 'Win32::GUI' algorithm
The current hash implementation discussed here for Win32::GUI::Constants
will be called the 'Hash' algorithm.
There are a number of results to share. It has been difficult to ensure
that I am really comparing like-with-like; where there appears to be
data points missing, it is because I have not found a satisfactory (read
quick enough) way of generating these points, and I feel that I have
enough knowledge of the algorithms to extrapolate further behaviour.
All performance tests were made on Win98, ActiveState Perl 5.8.7 (build
813), using the Benchmark module's cmpthese() function and the
:hireswallclock pragma specified.
The performance figures are measured while retrieving every constant -
this will give us a good average figure. I have made no attempt to
benchmark worst and best case figures. The figures are from a lightly
loaded 750Mhz PIII machine, and tests were repeated to ensure
consistancy - in all case a maximum vriation of +/-5% was achieved.
All XS parts were compiled with MS VC++ 6 compiler with the -O flag set
Unless otherwise stated all memory figures were obtained using WinTop
(from Microsoft's Windows 95 kernel toys distribution)
- Comparison of existing Win32::GUI and new Hash algorithms
-----------------------------------------------------------
No memory comparison available, as it's too hard to separate out the
constant code from Win32::GUI. Performance benchmarks for retrieving
all 364 constants:
Rate Win32::GUI Hash
Win32::GUI 691/s -- -58%
Hash 1653/s 139% --
showing the new hash implementation to be well over twice as fast as the
existing implementation, and we know that the existing linear search
performed by Win32::GUI will get slower as the number of constants
increases.
- Comparison of Hash, Perfect, Gperf and MPH algorithms
-------------------------------------------------------
Performance comparisons, retrieving 1588 constants:
Rate MPH Gperf Perfect Hash
MPH 182/s -- -0% -13% -40%
Gperf 183/s 0% -- -13% -40%
Perfect 211/s 15% 15% -- -30%
Hash 303/s 66% 66% 44% --
Memory usage comparisons, as deltas from a baseline, having loaded the
module (and nothing else) over and above the baseline. It should be
noted that a number of the generation tools (most notably gperf) have
many options for trading off memory usage for speed. In general I was
aiming for better memory usage, as it quickly became clear that if
memory is not an issue then the Hash alogithm wins.
All figures in Kbytes:
Hash Perfect Gperf MPH
DLL Size 0 57 94 78
Memory Usage 416 160 164 176 (delta over common baseline)
I was a little surprised at the memory overhead of the Hash algorithm,
as Devel::Size estimated the hash size at a little over 88Kbytes, but
loading the module with out populating the hash clearly showed that
nearly all of the 416Kbytes (all bar about 50Kbytes) was attributable to
the hash.
On the basis of these numbers I am going to choose the 'Perfect'
algorithm to implement the Win32::GUI::Constants module: it's
performance is clearly better than what we have today, whilst using well
under half the memory footprint of the Hash algorithm.
Regards,
Rob.