Author: leo
Date: Wed Sep 27 12:57:22 2006
New Revision: 14774

Added:
   trunk/docs/pdds/clip/pddXX_cstruct.pod   (contents, props changed)
   trunk/docs/pdds/clip/pddXX_pmc.pod   (contents, props changed)
Changes in other areas also in this revision:
Modified:
   trunk/MANIFEST

Log:
add 2 new design docs - see also mail

Added: trunk/docs/pdds/clip/pddXX_cstruct.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_cstruct.pod	Wed Sep 27 12:57:22 2006
@@ -0,0 +1,324 @@
+=head1 TITLE
+
+C Structure Class
+
+=head1 STATUS
+
+Proposal.
+
+=head1 AUTHOR
+
+Leopold Toetsch
+
+=head1 ABSTRACT
+
+The ParrotClass PMC is the default implementation (and the meta class)
+of parrot's HLL classes. It provides attribute access and (TODO)
+introspection of attribute names. It also handles method
+dispatch and inheritance.
+
+C structures used all over in parrot (PMCs) and user-visible C
+structures provided by the C<{Un,}ManagedStruct> PMC don't have this
+flexibility.
+
+The proposed C<CStruct> PMC tries to bridge this gap.
+
+=head1 DESCRIPTION
+
+The C<CStruct> PMC is the class PMC of classes that are not
+based on PMC-only attributes but on the general case of a C structure.
+That is, the C<CStruct> is actually the parent class of
+C<ParrotClass>, which is a PMC-only special case. And it is the
+theoretical ancestor class of all PMCs (including itself :).
+
+The relationship of C<CStruct> to other PMCs is like this:
+
+                    PASM/PIR code            C code
+    Class           ParrotClass              CStruct
+    Object          ParrotObject             *ManagedStruct
+                                             (other PMCs)
+
+That is, it is the missing piece of already existing PMCs. The current
+*ManagedStruct PMCs provide the class and object functionality in
+one and the same PMC (as, BTW, all other existing PMCs do). But
+this totally prevents proper inheritance and reusability of such PMCs.
+
+The C<CStruct> class provides the necessary abstract backing to get
+rid of the current limitations.
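As a rough illustration of what a class object for a C structure has to record, the following C sketch computes attribute offsets from attribute types. It is not Parrot code: the names C<attr_type>, C<attr_desc> and C<layout_attrs> are invented here, and the layout rule (declaration order, natural alignment, tail padding) is an assumption that matches common ABIs but is not guaranteed by the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute types; Parrot's real datatype codes are not
 * reproduced here. */
typedef enum { ATTR_CHAR, ATTR_INT, ATTR_DOUBLE } attr_type;

/* One entry of a hypothetical class description. */
typedef struct {
    const char *name;    /* attribute name, e.g. "a" */
    attr_type   type;    /* natural C type of the attribute */
    size_t      offset;  /* filled in by layout_attrs() */
} attr_desc;

static size_t attr_size(attr_type t) {
    switch (t) {
    case ATTR_CHAR:   return sizeof(char);
    case ATTR_INT:    return sizeof(int);
    case ATTR_DOUBLE: return sizeof(double);
    }
    return 0;
}

static size_t attr_align(attr_type t) {
    switch (t) {
    case ATTR_CHAR:   return _Alignof(char);
    case ATTR_INT:    return _Alignof(int);
    case ATTR_DOUBLE: return _Alignof(double);
    }
    return 1;
}

/* Assign offsets in declaration order, rounding each offset up to the
 * attribute's alignment. Returns the total size, padded to the largest
 * alignment seen (assumed layout rule, see lead-in). */
size_t layout_attrs(attr_desc *attrs, size_t n) {
    size_t offset = 0, max_align = 1;
    assert(attrs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t align = attr_align(attrs[i].type);
        offset = (offset + align - 1) / align * align;
        attrs[i].offset = offset;
        offset += attr_size(attrs[i].type);
        if (align > max_align)
            max_align = align;
    }
    return (offset + max_align - 1) / max_align * max_align;
}
```

Called with the attributes of a structure holding an C<int> followed by a C<char>, C<layout_attrs> assigns the offsets a C compiler on a typical platform would use, so a class built at runtime and a structure compiled into a library can describe the same memory.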
+
+=head1 SYNTAX BITS
+
+=head2 Constructing a CStruct
+
+A typical C structure:
+
+    struct foo {
+      int a;
+      char b;
+    };
+
+could be created in PIR with:
+
+    cs = subclass 'CStruct', 'foo'    # or maybe  cs = new_c_class 'foo'
+    addattribute cs, 'a'
+    addattribute cs, 'b'
+
+The semantics of a C structure are the same as those of a Parrot Class.
+But we need the types of the attributes too:
+
+Handwavingly TBD 1)
+
+with ad-hoc existing syntax:
+
+    .include "datatypes.pasm"
+    cs['a'] = .DATATYPE_INT
+    cs['b'] = .DATATYPE_CHAR
+
+Handwavingly TBD 2)
+
+with new variants of the C<addattribute> opcode:
+
+    addattribute cs, 'a', .DATATYPE_INT
+    addattribute cs, 'b', .DATATYPE_CHAR
+
+Probably desired and with not much effort TBD 3):
+
+    addattribute(s) cs, <<'DEF'
+      int a;
+      char b;
+    DEF
+
+The possible plural in the opcode name would match semantics, but it is not
+necessary. The syntax is just using Parrot's here documents to define
+all the attributes and types.
+
+    addattribute(s) cs, <<'DEF'
+      int "a";
+      char "b";
+    DEF
+
+The generalization to arbitrary attribute names would of course be
+possible too, but isn't likely needed.
+
+=head2 Syntax variant
+
+    cs = subclass 'CStruct', <<'DEF'
+      struct foo {
+        int a;
+        char b;
+      };
+    DEF
+
+I.e. create all in one big step.
+
+=head2 Object creation and attribute usage
+
+This is straightforward and conforms to current ParrotObjects:
+
+    o = new 'foo'                 # a ManagedStruct instance
+    setattribute o, 'a', 4711
+    setattribute o, 'b', 22
+    ...
+
+The only needed extension would be C<{get,set}attribute> variants with
+natural types.
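On the C side, a C<{get,set}attribute> variant with natural types would boil down to a typed load or store at the attribute's offset in the structure memory. The sketch below shows that idea for C<struct foo>. All names in it (C<field_desc>, C<set_attr_int>, C<get_attr_int>) are hypothetical, and C<offsetof> stands in for the offsets a real class object would hold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct foo {
    int  a;
    char b;
};

typedef enum { T_INT, T_CHAR } field_type;

typedef struct {
    const char *name;
    field_type  type;
    size_t      offset;
} field_desc;

/* Hypothetical class description of struct foo. */
static const field_desc foo_fields[] = {
    { "a", T_INT,  offsetof(struct foo, a) },
    { "b", T_CHAR, offsetof(struct foo, b) },
};

static const field_desc *find_field(const char *name) {
    for (size_t i = 0; i < sizeof foo_fields / sizeof foo_fields[0]; i++)
        if (strcmp(foo_fields[i].name, name) == 0)
            return &foo_fields[i];
    return NULL;
}

/* Store an integer into the named attribute, converting to the
 * attribute's natural type. Returns 0 on success, -1 for an unknown name. */
int set_attr_int(void *obj, const char *name, long value) {
    assert(obj != NULL);
    const field_desc *f = find_field(name);
    if (f == NULL)
        return -1;
    char *p = (char *)obj + f->offset;
    switch (f->type) {
    case T_INT:  { int  v = (int)value;  memcpy(p, &v, sizeof v); break; }
    case T_CHAR: { char v = (char)value; memcpy(p, &v, sizeof v); break; }
    }
    return 0;
}

/* Load the named attribute as an integer. Returns 0 on success,
 * -1 for an unknown name. */
int get_attr_int(const void *obj, const char *name, long *out) {
    assert(obj != NULL && out != NULL);
    const field_desc *f = find_field(name);
    if (f == NULL)
        return -1;
    const char *p = (const char *)obj + f->offset;
    switch (f->type) {
    case T_INT:  { int  v; memcpy(&v, p, sizeof v); *out = v; break; }
    case T_CHAR: { char v; memcpy(&v, p, sizeof v); *out = v; break; }
    }
    return 0;
}
```

With this, C<set_attr_int(&o, "a", 4711)> plays the role of C<setattribute o, 'a', 4711> without boxing the value in a PMC first.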
+
+With some nice-to-have IMCC syntax sugar, this could even be written as:
+
+    o.a = 4711                    # setattribute
+    o.b = 22
+    $I0 = o.a                     # getattribute
+
+=head2 Nested Structures
+
+    foo_cs = subclass 'CStruct', 'foo'
+    addattribute(s) foo_cs, <<'DEF'
+      int a;
+      char b;
+    DEF
+    bar_cs = subclass 'CStruct', 'bar'
+    addattribute(s) bar_cs, <<'DEF'
+      double x;
+      foo foo;                    # the foo class is already defined
+      foo *fptr;
+    DEF
+    o = new 'bar'
+    setattribute o, 'x', 3.14
+    setattribute o, ['foo'; 'a'], 4711      # o.foo.a = 4711
+    setattribute o, ['fptr'; 'b'], 255
+
+Attribute access is similar to current *ManagedStruct's hash syntax
+but with a syntax matching ParrotObjects.
+
+=head2 Array Structure Elements
+
+    foo_cs = subclass 'CStruct', 'foo'
+    addattribute(s) foo_cs, <<'DEF'
+      int a;
+      char b[100];
+    DEF
+
+=head2 Possible future extensions
+
+    cs = subclass 'CStruct', 'todo'
+    addattribute(s) cs, <<'DEF'
+      union {                     # union keyword
+        int a;
+        double b;
+      } u;
+      char b[100] :ro;            # attributes like r/o
+    DEF
+
+=head2 Managed vs. Unmanaged Structs
+
+The term "managed" in current structure usage defines the owner of the
+structure memory. C<ManagedStruct> means that parrot is the owner of
+the memory and that GC will eventually free the structure memory. This
+is typically used when C structures are created in parrot and passed
+into external C code.
+
+C<UnManagedStruct> means that there's some external owner of the
+structure memory. Such structures are typically the return values of
+external code.
+
+E.g.:
+
+    $P0 = some_c_func()           # UnManagedStruct result
+    assign $P0, foo_cs            # assign a structure class to it
+
+    o = new 'foo'                 # ManagedStruct instance
+    setattribute o, 'a', 100
+    setattribute o, ['b'; 99], 255     # set last elem
+
+=head1 RATIONALE
+
+Parrot as the planned interpreter glue language should have access to
+all possible C libraries and structures. It has to abstract the
+low-level bindings in a HLL independent way and should still be able to
+communicate all information "upstairs" to the HLL users.
+
+But it's not only HLL usage; parrot itself is already suffering from
+a lack of abstraction at the PMC level.
+
+=head2 Inheritance
+
+I've implemented an OO-ified HTTP server named F<httpd2.pir>. The
+C<HTTP::Connection> class ought to be a subclass of C<ParrotIO> (we
+don't have a base socket class, but ParrotIO would do for now).
+This kind of inheritance isn't possible. The implementation is now a
+connection B<hasa> ParrotIO, instead of B<isa>. With that it of course
+loses all inheritance, which leads to delegation code and
+workarounds.
+
+The same workarounds are all over the F<SDL/*> classes. There are
+I<layout> helpers and raw structure accessors and what not. Please
+read the code. It's really not a problem of the implementation (which is
+totally fine), it's just the lack of usability of parrot (when it comes
+to native structures (or PMCs)).
+
+All these experiments to use a C structure or a PMC as base class
+end with a C<has> relationship instead of the natural C<isa>.
+Any useful OO-ish abstraction is lost, which leads to clumsy code,
+and - no - implementing interfaces/traits/mixins can't help here, as
+these are all based on the abstraction that is described here.
+
+=head2 Inheritance and attribute access
+
+This proposal alone doesn't solve all inheritance problems. It is also
+necessary that the memory layout of PMCs and of ParrotObjects deriving
+from PMCs is the same. E.g.
+
+    cl = subclass 'Integer', 'MyInt'
+
+The C<int_val> attribute of the core C<Integer> type is located in the
+C<cache> union of the PMC. The integer item in the subclass is hanging
+off the C<data> array of attributes and, worse, it is a PMC too, not a
+natural int. This not only causes additional indirections (see
+F<deleg_pmc.pmc>) but also negatively impacts C<Integer> PMCs, as all
+access to the C<int_val> has to be indirected through C<get_integer()>
+or C<set_integer_native()> to be able to deal with subclassed integers.
+
+Again the implementation of the above is: MyInt B<hasa> Integer, instead of
+the desired B<isa> int_val.
+
+With the abstraction of a C<CStruct> describing the C<Integer> PMC and
+with differently sized PMCs, we can create an object layout where the
+C<int_val> attributes of C<Integer> and C<MyInt> are at the same
+location and of the same type.
+
+Given this (internal) definition of the C<Integer> PMC:
+
+    intpmc_cl = subclass 'CStruct', 'Integer'
+    addattribute(s) intpmc_cl, <<'DEF'
+      INTVAL int_val;             # PMC internals are hidden
+    DEF
+
+we can transparently subclass it as C<MyInt>, as all the needed
+information is present in the C<CStruct intpmc_cl> class.
+
+=head2 Introspection, PMCs and more
+
+    cc = subclass 'CStruct', 'Complex'
+    addattribute(s) cc, <<'DEF'
+      FLOATVAL re;
+      FLOATVAL im;
+    DEF
+
+This is the (hypothetical) description of a C<Complex> PMC class. An
+equivalent syntax can be translated by the PMC compiler to achieve the
+same result.
+
+This definition of the attributes of that PMC automagically provides
+access to all the information stored in the PMC. All such access is
+currently hand-crafted in F<complex.pmc>. Not only could this
+accessor code be abandoned (and unified with common syntax), but
+all classes inheriting from that PMC could use this
+information.
+
+=head1 Implementation
+
+C<CStruct> is basically yet another PMC and can be implemented and put
+to use without any interference with existing code. It is
+also orthogonal to possible PMC layout changes.
+
+The internals of C<CStruct> can reuse much of the code from F<src/objects.c>
+to deal with inheritance or object instantiation. The main difference
+is that attributes additionally have a type attached to them and
+consequently that the attribute offsets are calculated differently
+depending on type, alignment, and padding. These calculations are
+already done in F<unmanagedstruct.pmc>.
+
+C<CStruct> classes can be attached to existing PMCs gradually (and by
+far not all PMCs need that abstract backing). But think e.g. of the
+C<Sub> PMC. Attaching a C<CStruct> to it would instantly give access
+to all its attributes and vastly simplify introspection.
+
+Only the final step ("Inheritance and attribute access") needs all
+parts to play together.
+
+=head1 All together now
+
+=over
+
+=item Differently sized PMCs
+
+Provide the flexible PMC layout.
+
+=item CStruct classes
+
+Describe the structure of PMCs (or any C structure).
+
+=item R/O vtables
+
+Prohibit modification of readonly PMCs like the C<Sub> PMC. These are
+already coded within the C<STM> project.
+
+=back
+
+=head1 SEE ALSO
+
+pddXX_pmc.pod (proposal for a flexible PMC layout)
+
+=cut
+
+vim: expandtab sw=2 tw=70:

Added: trunk/docs/pdds/clip/pddXX_pmc.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_pmc.pod	Wed Sep 27 12:57:22 2006
@@ -0,0 +1,295 @@
+# Copyright (C) 2001-2006, The Perl Foundation.
+# $Id:$
+
+=head1 NAME
+
+docs/pdds/pddXX_pmc.pod - PMCs
+
+=head1 STATUS
+
+Proposal
+
+=head1 VERSION
+
+$Revision:$
+
+=head1 ABSTRACT
+
+This document defines Parrot Magic Cookies (PMCs).
+
+[[ maybe rename PMC to PO (Parrot Objects) or such to reduce confusion
+with Perl5's PMC (compiled .pm files). ]]
+
+=head1 TERMINOLOGY
+
+This document uses C<OPMC> when speaking of "old" PMCs of Parrot
+Version 0.4.6 or less. C<PMC> is the new layout as proposed in this
+document.
+
+An C<attribute> is otherwise also known as a C<field> or structure
+element, but I'm using C<attribute> here because the difference between
+PMCs and Parrot Objects should be minimized.
+
+=head1 DESCRIPTION
+
+PMCs are Parrot's low-level objects implemented in C. PMCs are small
+structures whose size varies per PMC type but which are not resizable.
+The C<PMC> itself is the common part of all PMCs. The per-PMC payload
+holds PMC-specific attributes.
+
+=head2 PMC Layout
+
+    +---------------+
+    |   vtable      |     # common PMC attribute
+    +---------------+
+    |   flags       |     # common PMC attribute
+    +---------------+
+    |   attrib_1    |     # "user" defined part
+    |   ...         |
+    +---------------+
+
+    #define THE_PMC \
+      struct _vtable* vtable; \
+      UINTVAL flags
+
+An Integer Value PMC could be defined with:
+
+    struct VInt_PMC {
+      THE_PMC;
+      INTVAL val;
+    };
+
+Such PMC definitions are typically private to the F<.pmc> files. All
+access to PMCs shall be through VTABLE functions or methods. OTOH some
+widely used PMCs might export their attributes for public use and are
+then part of the Parrot API.
+
+A typical C<VInt> vtable function would look like this:
+
+    INTVAL get_integer() {
+      VInt_PMC *me = (VInt_PMC*)SELF;     /* [1] */
+      return me->val;
+    }
+
+The C<OPMC> can be defined in terms of a PMC by rearranging the
+structure elements.
+
+[1] The PMC compiler could provide this line automagically and define
+a convenience variable C<ME> similar to the current C<SELF>.
+
+=head2 PMC creation
+
+PMCs are created via C<VTABLE_new> or variants of C<_new>. It's up to
+the PMC to initialize its attributes. C<new> is a class method, i.e.
+it's called with the PMC's C<class> as C<SELF>.
+
+    PMC* new() {
+      VInt_PMC *me = new_bufferlike_header(INTERP, sizeof(VInt_PMC));
+      me->val = 0;
+      return (PMC*)me;
+    }
+
+[[ rename C<new_bufferlike_header> to something more meaningful ]]
+
+=head3 Optimization
+
+The vtable can provide a pointer to the sized header pool to possibly
+speed up allocation.
+
+=head3 OPMC vs. PMC creation
+
+PMCs with a non-default C<new> method are PMCs. The old scheme via
+C<pmc_new> and C<VTABLE_init> provides a fallback for creating C<OPMCs>.
+
+=head1 Additional PMC attributes
+
+=head2 pmc->_next_for_GC / opmc->pmc_ext->_next_for_GC
+
+All PMCs that refer to other PMCs have a 3rd mandatory attribute
+C<_next_for_GC>, used for garbage collection. The presence of this
+attribute is signaled by the flag bit C<PObj_is_PMC_EXT_FLAG>.
+
+    +---------------+
+    |   vtable      |
+    +---------------+
+    |   flags       |
+    +---------------+
+    | _next_for_GC  |
+    +---------------+
+    |   ...         |
+    +---------------+
+
+=head2 Properties opmc->pmc_ext->_metadata
+
+PMCs do not support properties universally. If properties are still
+desired, these can be implemented in one of the following ways:
+
+[[ TODO define something canonical ]]
+
+=head3 Per PMC type
+
+Each PMC that wants this extra hash can just provide an attribute for
+it and implement the property vtable functions.
+
+=head3 interpreter->prop_hash
+
+This will be a Hash, indexed by the PMC's address, containing the property
+Hash. An additional flag can signal whether such a property hash
+exists for a PMC. During collection of a PMC, this hash is invalidated
+too.
+
+=head3 PropRef
+
+A transparent C<Ref> PMC can point to a structure holding the original
+PMC and the property Hash.
+
+=head2 Locking or opmc->pmc_ext->_synchronize
+
+PMCs do not support locking universally. Creating sharable PMCs at
+runtime (from standard PMCs) is again done by transparent Refs like
+C<SharedRef> or C<STMRef>.
+
+=head2 Shared PMCs
+
+If needed, we can define shared PMCs by allocating the C<_Sync>
+structure in front of the PMC:
+
+    +---------------+
+    |   struct      |
+    |   _Sync       |
+    +---------------+    <--- pmc points here
+    |   vtable      |
+    +---------------+
+    |   flags       |
+    +---------------+
+
+Of course, this only works if PMCs are created as C<shared> in the
+first place. The presence of the C<_Sync> structure is stated by a PMC
+flag bit.
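The shared layout amounts to allocating the C<_Sync> structure and the PMC in one block and handing out a pointer just past the prefix. The C sketch below illustrates this under loud assumptions: C<sync_prefix>, C<pmc_head>, C<FLAG_HAS_SYNC> and the three functions are invented for illustration, C<calloc> stands in for Parrot's header pools, and the real C<_Sync> structure holds a lock rather than a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for Parrot's _Sync structure (hypothetical contents). */
typedef struct { int lock_count; } sync_prefix;

#define FLAG_HAS_SYNC 0x1u   /* hypothetical flag bit */

/* The common part of every PMC in the proposed layout. */
typedef struct {
    void     *vtable;
    unsigned  flags;
} pmc_head;

/* Size of the prefix, rounded up so the PMC behind it stays aligned. */
static size_t prefix_size(void) {
    size_t a = _Alignof(max_align_t);
    return (sizeof(sync_prefix) + a - 1) / a * a;
}

/* Allocate prefix and PMC as one zeroed block; return the PMC pointer. */
pmc_head *new_shared_pmc(size_t pmc_size) {
    assert(pmc_size >= sizeof(pmc_head));
    char *block = calloc(1, prefix_size() + pmc_size);
    if (block == NULL)
        return NULL;
    pmc_head *pmc = (pmc_head *)(block + prefix_size());
    pmc->flags |= FLAG_HAS_SYNC;
    return pmc;
}

/* Find the prefix of a PMC, or NULL if the flag says there is none. */
sync_prefix *pmc_sync(pmc_head *pmc) {
    assert(pmc != NULL);
    if (!(pmc->flags & FLAG_HAS_SYNC))
        return NULL;
    return (sync_prefix *)((char *)pmc - prefix_size());
}

/* Release a PMC that was created by new_shared_pmc(). */
void free_shared_pmc(pmc_head *pmc) {
    if (pmc != NULL)
        free((char *)pmc - prefix_size());
}
```

Code that does not care about sharing sees an ordinary PMC pointer. Only code that checks the flag bit ever looks in front of it, which is why the prefix cannot be added to a PMC after it has been allocated.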
+
+=head2 PMCs and morphing
+
+PMCs (like current OPMCs) which may morph themselves, and thereby
+change their vtable and the meaning of their attributes, shall use a
+union of the desired attributes, e.g.:
+
+    struct Integer_PMC {
+      THE_PMC;
+      union {
+        INTVAL int_val;
+        FLOATVAL num_val;
+        STRING *str_val;
+      } u;
+    };
+
+=head1 ATTACHMENTS
+
+(none)
+
+=head1 REFERENCES
+
+    TODO pdd02_vtables.pod
+    TODO pddXX_interfaces.pod
+    TODO pddXX_classes.pod
+    TODO pddXX_objects.pod
+    TODO pddXX_cstruct.pod [2]
+
+[2] PMCs need a class object that defines their attributes to properly
+allow subclassing. The attribute definition is held by a C<CStruct>
+PMC, the meta class of all Parrot PMCs. It's a list of attribute
+names, their types, and possibly the offsets in the PMC structure. See
+also the C<UnManagedStruct> PMC.
+
+=head1 RATIONALE
+
+Current OPMCs are too rigid: mostly either too small or too big. A lot
+of information is hanging off secondary malloced structures like
+C<PMC_sub> in the C<Sub> OPMC.
+
+But more importantly, OPMCs don't properly allow subclassing. E.g.
+
+    cl = subclass 'Hash', 'PGE::Match'
+
+is currently done by creating a C<ParrotClass>. When instantiated,
+the object has a "hidden" C<__value> element as its first attribute,
+which is a pointer to the hash parent PMC. This is creating internal
+structures which aren't compatible, because the object attributes are
+differently arranged. That is, the above subclassing is mainly:
+C<PGE::Match> I<hasa> C<Hash> instead of I<isa>, when it comes to
+attribute relationship.
+
+This limitation prevents further implementation of already (at least
+partially) documented APIs, like the C<Compiler> one.
+
+A C<Compiler> object is either a Parrot C<Sub> like C<PGE::P6Regex> or
+a builtin, that is an C<NCI> compiler like C<PIR>. But C<Sub> and C<NCI>
+objects are so different that even currently needed attributes aren't
+consistently arranged (e.g. C<multi_sig> or C<namespace_stash>).
+Creating proper compiler objects like C<PIR_Compiler> with common and
+needed C<Compiler> attributes isn't possible now.
+
+[[ Well, with another one or two indirections all can be implemented,
+but that's just adding to code complexity. ]]
+
+Please note that the mentioned C<Compiler> PMC is just one of many
+PMCs that exhibit the same problem.
+
+In combination with a proper metaclass for PMCs, PMCs and "real"
+Parrot objects should be able to work together seamlessly.
+
+=head1 PERFORMANCE CONSIDERATIONS
+
+Due to reduced memory consumption and reduced allocation of secondary
+helper structures, this change will very likely speed up Parrot
+slightly to moderately. No negative performance impact is
+foreseeable due to these changes.
+
+=head1 IMPLEMENTATION
+
+Parrot core needs very few changes to be able to deal with
+differently sized PMCs. All the GC infrastructure is already there for
+C<Buffer_like> objects, which are managed in sized header pools.
+
+Changing PMCs to the new scheme can be done as needed and isn't
+mandatory.
+
+C<OPMC> attribute access is currently already done through C macros,
+like C<PMC_int_val> or C<PMC_struct_val>. These macros can cast the
+passed pointer to C<(OPMC*)>, so that all these C<OPMCs> will still be
+working. New PMCs shall use explicit and more verbose attribute names,
+which don't collide with present C<OPMC> attributes.
+
+There'll be no implications for existing PASM or PIR code nor for
+existing dynamic PMCs.
+
+=head1 ALTERNATIVE PMC IMPLEMENTATIONS
+
+In another document (F<PMC.pod>) I've already laid out a PMC scheme
+optimized for generational garbage collection. That PMC layout also
+uses user parts of PMCs that differ in size per type but are fixed,
+with the difference that these are subject to one more indirection.
+If we see the need for optimized GC,
+this PMC scheme can still be implemented. We probably could take
+provisions that such a change is not too intrusive by cleverly using
+the PMC compiler and/or C macros like the proposed C<ME> in [1] above.
+
+The payload of PMCs in this scheme is hanging off a C<pmc_body>
+pointer:
+
+    +---------------+
+    |   pmc_body    |  --> body (buffer) memory
+    +---------------+
+    |   vtable      |
+    +---------------+
+    |   flags       |
+    +---------------+
+
+The implementation of C<PMC> might take into consideration that the PMC
+layout could change further.
+
+=cut
+
+__END__
+Local Variables:
+  fill-column:78
+End:
+
+vim: expandtab sw=2 tw=70: