Glenn Linderman wrote:
On approximately 11/1/2005 8:51 AM, came the following characters from
the keyboard of Jeremy White:
If the Win32::GUI::Constant package is a XS/C code we'll be able to
come up with various algorithms that compress the text strings
meaning the resulting dll could be relatively small.
The more I think about it the more I like a pure perl solution. Most of
the techniques for space saving can be applied to pure perl too.
Perhaps most attractive to me about a pure perl approach is the ability
to use a perl hash for fast lookup, rather than having to (re)invent our
own in C. Having looked at how the Exporter module works, we should be
able to have our own implementation that would use less memory than the
current implementation before adding the GUI_Constants.cpp overhead.
I'm most of the way to a proof of concept implementation that I'll share
as soon as I have something stable.
But that imports the name into the current package, which is true
namespace clutter, and using bare-name constants from the current
package doesn't reflect the origin of the constant, and using the
fully-qualified name anyway to reflect the origin of the constant is
extremely long.
>>
>> I think that's what Win32::GUI does now, it pollutes each name space
>> if Win32::GUI () is used - hence I was losing around 100k per package.
>
> Hmm. I thought that "use Win32::GUI ();" avoided the exports that are
> done by "use Win32::GUI;"
Current implementation:
use Win32::GUI;
Imports *all* the currently defined constants into the calling
namespace: i.e. the symbol table (stash) gets an entry for each constant
containing a reference to the corresponding symbol table entry in the
Win32::GUI package. So, as well as cluttering the namespace of the
calling package it also clutters the namespace of Win32::GUI *AND* uses
lots of symbol table entries, taking up memory. AUTOLOADing is used to
save compile time (and memory: only constants that are actually used get
subroutine bodies), at the expense of some run time (compiling must
happen the first time any constant is used), and the inability to allow
the compiler to inline the constants, requiring a subroutine call each
time a constant is referenced.
In this case you can use bareword constants (either plain ES_WANTRETURN
or Win32::GUI::ES_WANTRETURN)
use Win32::GUI ();
Imports nothing into the calling namespace, saving symbol table entries
in both the calling and Win32::GUI namespaces. AUTOLOADing has the same
advantages and disadvantages, but fully qualified constants must be used
with parentheses to allow the compiler to know that it is a subroutine
call (otherwise it will be treated as a bareword: i.e. treated as an
error under strict pragma, or as a string if not). So in the calling
package you write Win32::GUI::ES_WANTRETURN();
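
To make the bareword point concrete, here is a minimal sketch. The package My::Lazy is a made-up stand-in with a trivial AUTOLOAD, not the real Win32::GUI; string eval is used only so that the compile-time error can be caught and inspected.

```perl
use strict;
use warnings;

package My::Lazy;    # hypothetical stand-in, not the real Win32::GUI

our $AUTOLOAD;

# Nothing is predefined: every constant is resolved at run time.
sub AUTOLOAD {
    return if $AUTOLOAD =~ /::DESTROY$/;
    return 0x1000 if $AUTOLOAD =~ /::ES_WANTRETURN$/;
    die "Unknown constant $AUTOLOAD";
}

package main;

# Without parentheses the compiler has seen no such subroutine, so the
# name is a bareword and strict refuses to compile it.
my $bare       = eval q{ My::Lazy::ES_WANTRETURN };
my $bare_error = $@;

# With parentheses it is compiled as a call and AUTOLOAD answers it.
my $called     = eval q{ My::Lazy::ES_WANTRETURN() };
my $call_error = $@;

print "bare:  $bare_error";
printf "call:  0x%04X\n", $called;
```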
You can, of course, do
use Win32::GUI qw( ES_WANTRETURN );
to import just ES_WANTRETURN into both calling and Win32::GUI symbol
tables, affording reduced namespace clutter and reduced memory usage,
whilst adding the advantages (if you see them that way) of being able to
use barewords for constants listed this way.
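
As an illustration of the AUTOLOAD approach described above, here is a minimal sketch. My::Consts and its two-entry table are assumptions made for the example; this is not the actual Win32::GUI code, only the general technique of creating a subroutine body the first time a constant is used.

```perl
use strict;
use warnings;

package My::Consts;    # hypothetical stand-in, not the real Win32::GUI

our $AUTOLOAD;

# Name => value table; a real module would hold many more entries.
my %VALUE = (
    ES_MULTILINE  => 0x0004,
    ES_WANTRETURN => 0x1000,
);

sub AUTOLOAD {
    my $full = $AUTOLOAD;
    (my $name = $full) =~ s/.*:://;
    return if $name eq 'DESTROY';
    die "Unknown constant $full" unless exists $VALUE{$name};

    my $value = $VALUE{$name};
    no strict 'refs';
    # Install a real subroutine body on first use, so that later calls
    # to this constant no longer pass through AUTOLOAD.
    *{$full} = sub () { $value };
    goto &{$full};
}

package main;

my $first  = My::Consts::ES_WANTRETURN();    # resolved via AUTOLOAD
my $second = My::Consts::ES_WANTRETURN();    # body now exists

# A constant that is never used never gets a subroutine body.
my $unused_defined = defined &My::Consts::ES_MULTILINE ? 1 : 0;

printf "0x%04X\n", $first;    # prints 0x1000
```

Note that the call sites above were compiled before the body existed, so they remain ordinary subroutine calls and are not inlined.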
I'm not 100% sure how all this mechanism works either, I'm glad Rob
clearly understands it better, and it does sound like there are some
issues, and although he doesn't currently believe we can have our cake
and eat it too, I'm willing to keep some somewhat educated and somewhat
uneducated discussion going to see if it helps him discover a different
flavor of cake that perhaps we could both have and eat :)
I've done some background reading. If you want to get into this I thoroughly
recommend the Exporter module docs, the 'Constant Subroutines' and
'AutoLoading' sections from perlsub.
I've also had a good trawl through Exporter.pm and Exporter/Heavy.pm.
I'm sure I don't understand it all yet, but definitely understand more
than I did 24 hours ago.
Flavours of cake, and eating it all:
There's really no right way to do everything that I wanted to achieve,
and it all comes down to a trade-off between speed at compile time vs.
speed at run time. Memory usage doesn't really come into it (except for
controlling how many entries get into the various symbol tables), as
the memory for an autoloaded subroutine always gets used (assuming you
use the constant) - it's just a question of when the memory gets used.
So, I'm working on the following:
(1) A separate module (currently Win32::GUI::Constants, but I'm open to
suggestions). I like this for 2 reasons: (a) If you don't want it you
don't use it, removing some bloat from your distributable. (b) From a
maintenance point of view it's easier to have the functionality
separated. (I'll address Glenn's issue with longer namespace in a moment)
(2) An import syntax that follows the Exporter module's syntax. Everyone
should be at least somewhat familiar with this - if you've never read
about its capabilities then read the docs - I didn't know about the
regex and negation options to select which symbols to import.
I'll think about classes (e.g. :toolbar :all), as I don't think most
people will want to list each constant individually - it's only an issue
if you really want to tune for memory usage.
Following precedent that I've seen in the CGI module there will be some
import tokens that act as pragmata:
-inline - will sacrifice compile time speed to create the constant
functions at compile time (when the use statement is issued). This
allows the compiler to see the subroutine body and inline the constants,
increasing runtime speed by avoiding subroutine calls at runtime.
-noexport - will similarly sacrifice compile time speed, autoloading the
subroutines for inlining, but without exporting the constants. This
will reduce caller namespace clutter (and symbol table entries) and
allow bareword usage of Win32::GUI::Constants::SYMBOL_NAME, while also
allowing inlining of the constants.
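
A rough sketch of how an import routine might handle such tokens follows. Everything here is an assumption for illustration: the package name My::Constants, the three-entry table, and the simplification that bodies are always created eagerly (so -inline is accepted but changes nothing, and the deferred AUTOLOAD default is not shown). It is not the prototype itself.

```perl
use strict;
use warnings;

package My::Constants;    # hypothetical name for illustration

my %VALUE;
BEGIN {
    # Filled at compile time so import() can see it when called from BEGIN.
    %VALUE = (
        ES_WANTRETURN => 0x1000,
        TBSTYLE_FLAT  => 0x0800,
        TBSTYLE_LIST  => 0x1000,
    );
}

sub import {
    my ($class, @args) = @_;
    my $caller = caller;

    # Separate pragma-like tokens from constant names.
    my (%opt, @names);
    for my $arg (@args) {
        if ($arg =~ /^-(inline|noexport)$/) { $opt{$1} = 1 }
        else                                { push @names, $arg }
    }

    no strict 'refs';
    for my $name (@names) {
        die "Unknown constant '$name'" unless exists $VALUE{$name};
        my $value = $VALUE{$name};

        # Sketch simplification: always create the body now, with an
        # empty prototype, so later code can have the constant inlined.
        *{"${class}::$name"} = sub () { $value }
            unless defined &{"${class}::$name"};

        # -noexport: leave the caller's symbol table untouched.
        next if $opt{noexport};
        *{"${caller}::$name"} = \&{"${class}::$name"};
    }
}

package main;

# Equivalent of "use My::Constants qw(...)" for a module kept in this file.
BEGIN { My::Constants->import(qw(-inline ES_WANTRETURN)) }
BEGIN { My::Constants->import(qw(-noexport TBSTYLE_FLAT)) }

# Exported name used bare; non-exported name used fully qualified.
my $style = ES_WANTRETURN | My::Constants::TBSTYLE_FLAT;
printf "0x%04X\n", $style;    # prints 0x1800
```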
(3) I see no reason why we couldn't subsequently use/require
Win32::GUI::Constants from within Win32::GUI and allow Win32::GUI to
have the same use statement semantics, although there will be some
symbol table overhead. In fact I see this as a necessary migration
step, so we don't break existing scripts. (we probably need a
':original' import class that exports the current set of symbols that is
used by default when/if we change Win32::GUI)
The advantage of having a separate package/dll is that if you don't
want to use it, you don't have to - this could become an important
issue if the constant code becomes large (especially if it's generated
automatically via the headers).
Right. My current prototype has about 1500 constants (and I think
that's about a third of them from the main header files), and weighs in
at 32KB. I've not made much effort at compression yet, but do have it
down from the 75KB that it was at this morning. It's nearly all
constant names and values; there are fewer than 100 lines of code.
If the groups of constants are separately importable via the imports
list in the use statement, then the only size cost would be the DLL
size, correct? (except for constants you actually want to import and
use, and pay the extra space for) And I suspect with a good algorithm,
that wouldn't be large compared to the size of Win32::GUI itself.
Measurements will tell the tale. I'd be pretty disappointed to have to
use really long names like
Win32::GUI::Constant::Windows_Names_Are_Long_Enough_Already but maybe
there would be the possibility that Win32::GUI::Constant could actually
export the names into the Win32::GUI namespace? As an option, of course.
Indeed, I had independently thought of this. There is an overhead for
this though (as both Win32::GUI and Win32::GUI::Constants will have
symbol table entries). If you're happy with the current subroutine call
per constant reference at runtime, then it would be possible to get
Win32::GUI's AUTOLOAD to delegate directly to Win32::GUI::Constants, but
there's a (small) speed penalty for this. In this case it's a true
trade-off between speed and memory usage.
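
The delegation idea might look something like the sketch below. The packages My::GUI and My::GUI::Constants and the constant() lookup function are hypothetical names chosen for the example. No subroutine is ever installed in My::GUI, so every reference pays for an AUTOLOAD dispatch plus a hash lookup.

```perl
use strict;
use warnings;

package My::GUI::Constants;    # hypothetical stand-in for the constants module

my %VALUE = ( ES_WANTRETURN => 0x1000 );

# Hypothetical lookup function: returns the value for a constant name.
sub constant {
    my ($name) = @_;
    die "Unknown constant '$name'" unless exists $VALUE{$name};
    return $VALUE{$name};
}

package My::GUI;               # hypothetical stand-in for the main module

our $AUTOLOAD;

# Hand every unknown name to the constants package without creating a
# subroutine here, trading a little run time for fewer symbol table
# entries.
sub AUTOLOAD {
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';
    return My::GUI::Constants::constant($name);
}

package main;

my $value = My::GUI::ES_WANTRETURN();
printf "0x%04X\n", $value;    # prints 0x1000
```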
Importing exactly the constants used (per your first example) into
the local namespace would (I guess) allow the performance speedup of
inlined constants, which many may like.
Inlining has little to do with which namespace holds the symbol
reference. To be inlined, the subroutine must have an empty prototype
(i.e. be defined as sub SYMBOL_NAME() { ... } ), and the subroutine body
must have been seen by the compiler as suitable for inlining before the
compiler sees the subroutine call. See perldoc perlsub.
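
One way to see the difference is to deparse a small piece of code that uses each kind of subroutine. This is only a sketch: the subroutine not_inlined is made up for the comparison, and B::Deparse (a core module) is used just to show what the compiler produced.

```perl
use strict;
use warnings;
use B::Deparse;

# Empty prototype, constant body, compiled before any call site:
# a candidate for inlining.
sub ES_WANTRETURN () { 0x1000 }

# Same value but no prototype: each use remains a real subroutine call.
sub not_inlined { 0x1000 }

my $deparse = B::Deparse->new;

# In the first sub the compiler replaces the name with its value; in the
# second the call to not_inlined() is still present.
my $inlined_text = $deparse->coderef2text(sub { return ES_WANTRETURN });
my $call_text    = $deparse->coderef2text(sub { return not_inlined() });

print "$inlined_text\n$call_text\n";
```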
Watch this space.
Regards,
Rob.