Author: leo
Date: Wed Sep 27 12:57:22 2006
New Revision: 14774

Added:
   trunk/docs/pdds/clip/pddXX_cstruct.pod   (contents, props changed)
   trunk/docs/pdds/clip/pddXX_pmc.pod   (contents, props changed)

Changes in other areas also in this revision:
Modified:
   trunk/MANIFEST

Log:
add 2 new design docs - see also mail

Added: trunk/docs/pdds/clip/pddXX_cstruct.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_cstruct.pod      Wed Sep 27 12:57:22 2006
@@ -0,0 +1,324 @@
+=head1 TITLE
+
+C Structure Class
+
+=head1 STATUS
+
+Proposal.
+
+=head1 AUTHOR
+
+Leopold Toetsch
+
+=head1 ABSTRACT
+
+The ParrotClass PMC is the default implementation (and the meta class)
+of parrot's HLL classes. It provides attribute access and (TODO)
+introspection of attribute names. It also handles method dispatch
+and inheritance.
+
+C structures used all over in parrot (PMCs) and user-visible C
+structures provided by the C<{Un,}ManagedStruct> PMC don't have this
+flexibility.
+
+The proposed C<CStruct> PMC is trying to bridge this gap.
+
+=head1 DESCRIPTION
+
+The C<CStruct> PMC is the class PMC of classes that are not
+based on PMC-only attributes but on the general case of a C structure.
+That is, the C<CStruct> is actually the parent class of
+C<ParrotClass>, which is a PMC-only special case. And it is the
+theoretical ancestor class of all PMCs (including itself :).
+
+The relationship of C<CStruct> to other PMCs is like this:
+
+                PASM/PIR code         C code
+  Class         ParrotClass           CStruct
+  Object        ParrotObject          *ManagedStruct
+                                      (other PMCs) 
+
+That is, it is the missing piece of already existing PMCs. The current
+*ManagedStruct PMCs are providing the class and object functionality in
+one and the same PMC (as BTW all other existing PMCs are doing). But
+this totally prevents proper inheritance and reusability of such PMCs.
+
+The C<CStruct> class provides the necessary abstract backings to get
+rid of current limitations.
+
+=head1 SYNTAX BITS
+
+=head2 Constructing a CStruct
+
+A typical C structure:
+
+  struct foo {
+    int a;
+    char b;
+  };
+
+could be created in PIR with:
+
+  cs = subclass 'CStruct', 'foo'   # or maybe  cs = new_c_class 'foo'
+  addattribute cs, 'a'
+  addattribute cs, 'b'
+
+The semantics of a C structure are the same as those of a Parrot Class.
+But we need the types of the attributes too:
+
+Handwavingly TBD 1)
+
+with ad-hoc existing syntax:
+
+  .include "datatypes.pasm"
+  cs['a'] = .DATATYPE_INT
+  cs['b'] = .DATATYPE_CHAR
+
+Handwavingly TBD 2)
+
+with new variants of the C<addattribute> opcode:
+  
+  addattribute cs, 'a', .DATATYPE_INT
+  addattribute cs, 'b', .DATATYPE_CHAR
+
+Probably desired and with not much effort TBD 3):
+
+  addattribute(s) cs, <<'DEF'
+    int a;
+    char b;
+  DEF   
+
+The possible plural in the opcode name would match semantics, but it is not
+necessary. The syntax is just using Parrot's here documents to define
+all the attributes and types.
+
+  addattribute(s) cs, <<'DEF'
+    int "a";
+    char "b";
+  DEF   
+
+The generalization of arbitrary attribute names would of course be
+possible too, but isn't likely needed.
+
+=head2 Syntax variant
+
+  cs = subclass 'CStruct', <<'DEF'
+    struct foo {
+      int a;
+      char b;
+    };
+  DEF
+
+I.e. create all in one big step.
+
+=head2 Object creation and attribute usage
+
+This is straightforward and conforms to current ParrotObjects:
+
+  o = new 'foo'                 # a ManagedStruct instance
+  setattribute o, 'a', 4711
+  setattribute o, 'b', 22
+  ...
+
+The only needed extension would be C<{get,set}attribute> variants with
+natural types.
+
+Even (with nice to have IMCC syntax sugar):
+
+  o.a = 4711    # setattribute
+  o.b = 22
+  $I0 = o.a     # getattribute
+
+=head2 Nested Structures
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, <<'DEF'
+    int a;
+    char b;
+  DEF   
+  bar_cs = subclass 'CStruct', 'bar'
+  addattribute(s) bar_cs, <<'DEF'
+    double x;
+    foo foo;        # the foo class is already defined
+    foo *fptr;
+  DEF   
+  o = new 'bar'
+  setattribute o, 'x', 3.14
+  setattribute o, ['foo'; 'a'], 4711         # o.foo.a = 4711
+  setattribute o, ['fptr'; 'b'], 255
+
+Attribute access is similar to current *ManagedStruct's hash syntax
+but with a syntax matching ParrotObjects.
+
+=head2 Array Structure Elements
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, <<'DEF'
+    int a;
+    char b[100];
+  DEF   
+
+=head2 Possible future extensions
+
+  cs = subclass 'CStruct', 'todo'
+  addattribute(s) cs, <<'DEF'
+    union {              # union keyword
+      int a;
+      double b;
+    } u;  
+    char b[100]  :ro;    # attributes like r/o     
+  DEF   
+
+=head2 Managed vs. Unmanaged Structs
+
+The term "managed" in current structure usage defines the owner of the
+structure memory. C<ManagedStruct> means that parrot is the owner of
+the memory and that GC will eventually free the structure memory. This
+is typically used when C structures are created in parrot and passed
+into external C code.
+
+C<UnManagedStruct> means that there's some external owner of the
+structure memory. Such structures are typically the return values of
+external code.
+
+E.g.:
+
+  $P0 = some_c_func()          # UnManagedStruct result
+  assign $P0, foo_cs           # assign a structure class to it 
+
+  o = new 'foo'                # ManagedStruct instance
+  setattribute o, 'a', 100
+  setattribute o, ['b'; 99], 255  # set last elem
+
+=head1 RATIONALE
+
+Parrot as the planned interpreter glue language should have access to
+all possible C libraries and structures. It has to abstract the
+low-level bindings in a HLL independent way and should still be able to
+communicate all information "upstairs" to the HLL users.
+
+But it's not HLL usage only, parrot itself is already suffering from
+lack of abstraction at PMC level.
+
+=head2 Inheritance
+
+I've implemented an OO-ified HTTP server named F<httpd2.pir>. The
+C<HTTP::Connection> class ought to be a subclass of C<ParrotIO> (we
+don't have a base socket class, but ParrotIO would do it for now).
+This kind of inheritance isn't possible. The implementation is now a
+connection B<hasa> ParrotIO, instead of B<isa>. Of course, this loses
+all inheritance, which leads to delegation code and
+workarounds.
+
+The same workarounds are all over F<SDL/*> classes. There are
+I<layout> helpers and raw structure accessors and what not. Please
+read the code. It's really not a problem of the implementation (which
+is totally fine); it's just the lack of usability of parrot when it
+comes to native structures (or PMCs).
+
+All these experiments to use a C structure or a PMC as base class
+end with a C<has> relationship instead of the natural C<isa>.
+Any useful OO-ish abstraction is lost, which leads to clumsy code,
+and - no - implementing interfaces/traits/mixins can't help here, as
+these are all based on the abstraction described here.
+
+=head2 Inheritance and attribute access
+
+This proposal alone doesn't solve all inheritance problems. It is also
+needed that the memory layout of PMCs and ParrotObjects deriving from
+PMCs is the same. E.g.
+
+  cl = subclass 'Integer', 'MyInt'
+
+The C<int_val> attribute of the core C<Integer> type is located in the
+C<cache> union of the PMC. The integer item in the subclass is hanging
+off the C<data> array of attributes and, worse, it is a PMC too, not a
+natural int. This not only causes additional indirections (see
+F<deleg_pmc.pmc>) but also negatively impacts C<Integer> PMCs, as all
+access to the C<int_val> has to be indirected through C<get_integer()>
+or C<set_integer_native()> to be able to deal with subclassed integers.
+
+Again, the implementation of the above is MyInt B<hasa> Integer,
+instead of the desired B<isa> int_val.
+
+With the abstraction of a C<CStruct> describing the C<Integer> PMC and
+with differently sized PMCs, we can create an object layout, where the
+C<int_val> attribute of C<Integer> and C<MyInt> are at the same
+location and of the same type.
+
+Given this (internal) definition of the C<Integer> PMC:
+
+  intpmc_cl = subclass 'CStruct', 'Integer'
+  addattribute(s) intpmc_cl, <<'DEF'
+    INTVAL int_val;            # PMC internals are hidden
+  DEF   
+
+we can transparently subclass it as C<MyInt>, as all the needed
+information is present in the C<CStruct intpmc_cl> class.
+
+=head2 Introspection, PMCs and more
+
+  cc = subclass 'CStruct', 'Complex'
+  addattribute(s) cc, <<'DEF'
+    FLOATVAL re; 
+    FLOATVAL im; 
+  DEF   
+
+This is the (hypothetical) description of a C<Complex> PMC class. An
+equivalent syntax can be translated by the PMC compiler to achieve the
+same result.
+
+This definition of the attributes of that PMC automagically provides
+access to all the information stored in the PMC. All such access is
+currently hand-crafted in F<complex.pmc>. Not only could this
+accessor code be abandoned (and unified with common syntax), but
+all classes inheriting from that PMC could also use this
+information.
+
+=head1 Implementation
+
+C<CStruct> is basically yet another PMC and can be implemented and put
+to functionality without any interference with existing code. It is
+also orthogonal with possible PMC layout changes.
+
+The internals of C<CStruct> can vastly reuse code from F<src/objects.c>
+to deal with inheritance or object instantiation. The main difference
+is that each attribute additionally has a type attached to it and,
+consequently, that the attribute offsets are calculated differently
+depending on type, alignment, and padding. These calculations are
+already done in F<unmanagedstruct.pmc>.
+
+C<CStruct> classes can be attached to existing PMCs gradually (and by
+far not all PMCs need that abstract backing). But think e.g. of the
+C<Sub> PMC. Attaching a C<CStruct> to it would instantly give access
+to all its attributes and vastly simplify introspection.
+
+Only the final step ("Inheritance and attribute access") needs all
+parts to play together.
+
+=head1 All together now
+
+=over
+
+=item Differently sized PMCs
+
+Provide the flexible PMC layout.
+
+=item CStruct classes
+
+Describe the structure of PMCs (or any C structure).
+
+=item R/O vtables
+
+Prohibit modification of readonly PMCs like the C<Sub> PMC. These are
+already coded within the C<STM> project.
+
+=back
+
+=head1 SEE ALSO
+
+pddXX_pmc.pod (proposal for a flexible PMC layout)
+
+=cut
+
+vim: expandtab sw=2 tw=70:
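
The Implementation section of pddXX_cstruct.pod above says that attribute
offsets depend on type, alignment, and padding. A minimal sketch of that
calculation in plain C follows. It is not the code in unmanagedstruct.pmc;
the names AttrType, align_up, layout_attrs, and foo_attrs are hypothetical
and exist only for this illustration. It assumes a C11 compiler (for
_Alignof) and natural alignment in declaration order, as on mainstream ABIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one typed attribute; not Parrot API. */
typedef struct {
    const char *name;
    size_t size;   /* sizeof the C type */
    size_t align;  /* required alignment, a power of two */
} AttrType;

/* Round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Fill offsets[] for n attributes laid out in declaration order and
 * return the total structure size, including trailing padding. */
static size_t layout_attrs(const AttrType *attrs, size_t n, size_t *offsets) {
    size_t offset = 0, max_align = 1, i;
    for (i = 0; i < n; i++) {
        offset = align_up(offset, attrs[i].align);  /* leading padding */
        offsets[i] = offset;
        offset += attrs[i].size;
        if (attrs[i].align > max_align)
            max_align = attrs[i].align;
    }
    return align_up(offset, max_align);             /* trailing padding */
}

/* The struct foo from "Constructing a CStruct", plus its description. */
struct foo { int a; char b; };

static const AttrType foo_attrs[] = {
    { "a", sizeof(int),  _Alignof(int)  },
    { "b", sizeof(char), _Alignof(char) },
};
```

Calling layout_attrs(foo_attrs, 2, offsets) should reproduce what the C
compiler computes for struct foo, so the results can be checked against
offsetof and sizeof. Unions, nested structures, and arrays are left out
of this sketch.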

Added: trunk/docs/pdds/clip/pddXX_pmc.pod
==============================================================================
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_pmc.pod  Wed Sep 27 12:57:22 2006
@@ -0,0 +1,295 @@
+# Copyright (C) 2001-2006, The Perl Foundation.
+# $Id:$
+
+=head1 NAME
+
+docs/pdds/pddXX_pmc.pod - PMCs
+
+=head1 STATUS
+
+Proposal
+
+=head1 VERSION
+
+$Revision:$
+
+=head1 ABSTRACT
+
+This document defines Parrot Magic Cookies (PMCs).
+
+[[ maybe rename PMC to PO (Parrot Objects) or such to reduce confusion
+with Perl5's PMC (compiled .pm files). ]]
+
+=head1 TERMINOLOGY
+
+This document uses C<OPMC>, when speaking of "old" PMCs of Parrot
+Version 0.4.6 or earlier. C<PMC> is the new layout as proposed in this
+document.
+
+An C<attribute> is otherwise also known as a C<field> or structure
+element, but I'm using C<attribute> here because the difference between
+PMCs and Parrot Objects should be minimized.
+
+=head1 DESCRIPTION
+
+PMCs are Parrot's low-level objects implemented in C. PMCs are small
+non-resizable variable-sized structures. The C<PMC> itself is the
+common part of all PMCs. The per-PMC payload holds PMC-specific
+attributes.
+
+=head2 PMC Layout
+
+  +---------------+
+  |   vtable      |    # common PMC attribute
+  +---------------+
+  |   flags       |     # common PMC attribute
+  +---------------+
+  |   attrib_1    |     # "user" defined part 
+  |   ...         |
+  +---------------+
+
+  #define THE_PMC \
+    struct _vtable* vtable; \
+    UINTVAL flags
+
+An Integer Value PMC could be defined with:
+
+  struct VInt_PMC {
+    THE_PMC;
+    INTVAL val;
+  };
+
+Such PMC definitions are typically private to the F<.pmc> files. All
+access to PMCs shall be through VTABLE functions or methods. OTOH some
+widely used PMCs might export their attributes for public use and are
+then part of the Parrot API.
+
+A typical C<VInt> vtable function would look like this:
+
+  INTVAL get_integer() {
+    VInt_PMC *me = (VInt_PMC*)SELF;     /* [1] */
+    return me->val;
+  }
+
+The C<OPMC> can be defined in terms of a PMC by rearranging the
+structure elements.
+
+[1] The PMC compiler could provide this line automagically and define
+a convenience variable C<ME> similar to the current C<SELF>.
+
+=head2 PMC creation
+
+PMCs are created via C<VTABLE_new> or variants of C<_new>. It's up to
+the PMC to initialize its attributes. C<new> is a class method, i.e.
+it's called with the PMC's C<class> as C<SELF>.
+
+  PMC* new() {
+    VInt_PMC *me = new_bufferlike_header(INTERP, sizeof(VInt_PMC));
+    me->val = 0;
+    return (PMC*)me;
+  }
+
+[[ rename C<new_bufferlike_header> to something more meaningful ]]
+
+=head3 Optimization
+
+The vtable can provide a pointer to the sized header pool to possibly
+speed up allocation.
+
+=head3 OPMC vs. PMC creation
+
+PMCs with a non-default C<new> method are PMCs. The old scheme via
+C<pmc_new> and C<VTABLE_init> provides a fallback of creating C<OPMCs>.
+
+=head1 Additional PMC attributes
+
+=head2 pmc->_next_for_GC  / opmc->pmc_ext->_next_for_GC  
+
+All PMCs that refer to other PMCs have a 3rd mandatory attribute
+C<_next_for_GC>, used for garbage collection. The presence of this
+attribute is signaled by the flag bit C<PObj_is_PMC_EXT_FLAG>.
+
+  +---------------+
+  |   vtable      |    
+  +---------------+
+  |   flags       |
+  +---------------+
+  | _next_for_GC  |
+  +---------------+
+  |   ...         |
+  +---------------+
+
+=head2 Properties opmc->pmc_ext->_metadata  
+
+PMCs do not support properties universally. If properties are still
+desired, these can be implemented in one of the following ways:
+
+[[ TODO define something canonical ]]
+
+=head3 Per PMC type
+
+Each PMC that wants this extra hash can just provide an attribute for
+it and implement the property vtable functions.
+
+=head3 interpreter->prop_hash
+
+This will be a Hash, indexed by the PMC's address, containing the property
+Hash. An additional flag can signal that such a property hash
+exists for a PMC. During collection of a PMC, this hash is invalidated
+too.
+
+=head3 PropRef
+
+A transparent C<Ref> PMC can point to a structure holding the original
+PMC and the property Hash.
+
+=head2 Locking or opmc->pmc_ext->_synchronize
+
+PMCs do not support locking universally. Creating sharable PMCs at
+runtime (from standard PMCs) is again done by transparent Refs like
+C<SharedRef> or C<STMRef>.
+
+=head2 Shared PMCs
+
+If needed, we can define shared PMCs by allocating the C<_Sync>
+structure in front of the PMC:
+
+  +---------------+
+  |   struct      |    
+  |   _Sync       |    
+  +---------------+ <--- pmc points here
+  |   vtable      |    
+  +---------------+
+  |   flags       |
+  +---------------+
+
+Of course, this works only if PMCs are created as C<shared> in the
+first place. The presence of the C<_Sync> structure is stated by a PMC
+flag bit.
+
+=head2 PMCs and morphing 
+
+PMCs (like current OPMCs) that may morph themselves, and thereby
+change their vtable and the meaning of their attributes, shall use a
+union of the desired attributes, e.g.:
+
+  struct Integer_PMC {
+    THE_PMC;
+    union {
+      INTVAL int_val;
+      FLOATVAL num_val;
+      STRING *str_val;
+    } u;
+  };
+
+=head1 ATTACHMENTS
+
+(none)
+
+=head1 REFERENCES
+
+  TODO pdd02_vtables.pod
+  TODO pddXX_interfaces.pod
+  TODO pddXX_classes.pod
+  TODO pddXX_objects.pod
+  TODO pddXX_cstruct.pod   [2]
+
+[2] PMCs need a class object that defines their attributes to properly
+allow subclassing. The attribute definition is held by a C<CStruct>
+PMC, the meta class of all Parrot PMCs. It's a list of attribute
+names, their types, and possibly the offsets in the PMC structure. See
+also the C<UnManagedStruct> PMC.
+
+=head1 RATIONALE
+
+Current OPMCs are too rigid: mostly either too small or too big. A lot
+of information is hanging off secondary malloced structures like
+C<PMC_sub> in the C<Sub> OPMC.
+
+But more importantly, OPMCs don't properly allow subclassing. E.g.
+
+  cl = subclass 'Hash', 'PGE::Match'
+
+is currently done by creating a C<ParrotClass>. When instantiated,
+there is a "hidden" C<__value> element as first attribute, which is a
+pointer to the hash parent PMC. This creates internal structures
+which aren't compatible, because the object attributes are arranged
+differently. That is, the above subclassing is mainly C<PGE::Match>
+I<hasa> C<Hash> instead of I<isa>, as far as attributes are concerned.
+
+This limitation prevents further implementation of already (at least
+partially) documented APIs, like the C<Compiler> one.
+
+A C<Compiler> object is either a Parrot C<Sub> like C<PGE::P6Regex> or
+a builtin, that is, an C<NCI> compiler like C<PIR>. But C<Sub> and
+C<NCI> objects are so different that even currently needed attributes
+aren't consistently arranged (e.g. C<multi_sig> or C<namespace_stash>).
+Creating proper compiler objects like C<PIR_Compiler> with common and
+needed C<Compiler> attributes isn't possible now.
+
+[[ Well, with another one or two indirections all can be implemented,
+but that's just adding to code complexity. ]]
+
+Please note that the mentioned C<Compiler> PMC is just one of many
+PMCs that exhibit the same problem.
+
+In combination with a proper metaclass for PMCs, PMCs and "real"
+Parrot objects should be able to work together seamlessly.
+
+=head1 PERFORMANCE CONSIDERATIONS
+
+Due to reduced memory consumption and reduced allocation of secondary
+helper structures, this change will very likely speed up Parrot
+performance slightly to moderately. No negative performance impact is
+foreseeable due to these changes.
+
+=head1 IMPLEMENTATION
+
+Parrot core needs very few changes to be able to deal with
+differently sized PMCs. All the GC infrastructure is already there for
+C<Buffer_like> objects, which are managed in sized header pools.
+
+Changing PMCs to the new scheme can be done as needed and isn't
+mandatory.
+
+C<OPMC> attribute access is currently already done through C macros,
+like C<PMC_int_val> or C<PMC_struct_val>. These macros can cast the
+passed pointer to C<(OPMC*)>, so that all these C<OPMCs> will still be
+working. New PMCs shall use explicit and more verbose attribute names,
+which don't collide with present C<OPMC> attributes.
+
+There'll be no implications to existing PASM or PIR code nor to
+existing dynamic PMCs.
+
+=head1 ALTERNATIVE PMC IMPLEMENTATIONS
+
+In another document (F<PMC.pod>) I've already laid out a PMC scheme
+optimized for generational garbage collection. That PMC layout also
+uses differently sized, but fixed-size, user parts of PMCs, but these
+are subject to one more indirection. If we see the need for optimized
+GC, this PMC scheme can still be implemented. We could probably take
+provisions so that such a change is not too intrusive, by cleverly using
+the PMC compiler and/or C macros like the proposed C<ME> in [1] above.
+
+The payload of PMCs in this scheme is hanging off a C<pmc_body>
+pointer:
+
+  +---------------+
+  |   pmc_body    |  --> body (buffer) memory
+  +---------------+
+  |   vtable      |
+  +---------------+
+  |   flags       |
+  +---------------+
+
+The implementation of C<PMC> might take into consideration that the PMC
+layout could change further.
+
+=cut
+
+__END__
+Local Variables:
+  fill-column:78
+End:
+
+vim: expandtab sw=2 tw=70:
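
The "PMC Layout" and "PMC creation" sections of pddXX_pmc.pod above
describe a common header (vtable, flags) followed by per-PMC attributes.
A minimal standalone sketch of that idea in plain C follows. It is not
Parrot code: the INTVAL and UINTVAL typedefs are stand-ins, calloc stands
in for new_bufferlike_header, and the vint_* function names are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef long INTVAL;            /* stand-in for Parrot's INTVAL  */
typedef unsigned long UINTVAL;  /* stand-in for Parrot's UINTVAL */

struct _vtable;                 /* opaque in this sketch */

/* Common part of every PMC, as proposed in the document. */
#define THE_PMC \
    struct _vtable *vtable; \
    UINTVAL flags

/* A PMC seen only through its common header. */
typedef struct PMC { THE_PMC; } PMC;

/* An integer PMC: the common header followed by its own payload. */
typedef struct VInt_PMC {
    THE_PMC;
    INTVAL val;
} VInt_PMC;

/* Allocate exactly as much memory as this PMC type needs. */
static PMC *vint_new(void) {
    VInt_PMC *me = calloc(1, sizeof *me);
    if (me == NULL)
        return NULL;
    me->val = 0;
    return (PMC *)me;
}

/* Vtable-style accessors: cast the generic pointer back to the
 * concrete type before touching the payload. */
static INTVAL vint_get_integer(PMC *self) {
    VInt_PMC *me = (VInt_PMC *)self;
    return me->val;
}

static void vint_set_integer(PMC *self, INTVAL value) {
    VInt_PMC *me = (VInt_PMC *)self;
    me->val = value;
}
```

Because every PMC type starts with the same two members, generic code can
pass PMC pointers around while each type's functions cast back to the
full structure. The sketch only touches the payload through VInt_PMC, and
the caller releases the object with free, since there is no GC here.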
