Re: Object spec [x-adr][x-bayes]

2003-03-10 Thread Christopher Armstrong
On Sun, Mar 09, 2003 at 03:46:39PM -0500, Dan Sugalski wrote:

 Presenting internal state in a rational form is a rather 
 significantly different thing than being able to serialize things, 
 and I don't think it's feasible, unfortunately. It'll require too 
 much consistency to be useful (as I don't think you'll get that 
 consistency) and you'll likely end up just dropping back and 
 grovelling over the internal nasty bits anyway.
 
 Serialization is a specific and very useful application, and one that 
 we need in general (otherwise we won't be able to have constant PMCs) 
 so it'll be there, but being able to expect to walk a pool of 
 heterogeneous objects from a variety of different object systems is 
 more than I think you're likely to get.
 
 More to the point, it's a level of complexity I'm unwilling to commit 
 to, because it's not needed for our required functionality, useful 
 though it might be. (Even in the limited circumstances it's likely to 
 be available in)

Forgive me for jumping into the conversation late, but I'm not really
sure where this complexity is coming from. I've worked quite a bit
with serialization mechanisms in Python and other languages, but maybe
I'm missing something. In Python, we just have a method named
__getstate__ that must return one of the basic types: dict (hash),
list (vector), None, int, string. The marshaller must know how to
write these things to a file, send them across a socket via some
protocol, or whatever.  The dicts and lists can contain other objects,
of course, and those will be queried for their state, and so on. 

This isn't really that hard to implement, IME, and it's never been an
issue to put the state of an object into these datastructures. If an
object doesn't have a __getstate__, we just grab its attribute
__dict__. This bit is obviously not friendly to other languages, but
it's also unnecessary - each language can implement a default
`get_state'-type method on its root object PMC that can either return
something useful or crash out (Python's would check for
self.__getstate__, and if not found, return self.__dict__).
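
A minimal sketch of the default described above, in Python. The helper name
get_state and the example classes Point and Line are hypothetical and chosen
only for illustration. The sketch assumes object graphs without cycles, and it
"crashes out" with an AttributeError for objects that have neither a custom
__getstate__ nor a __dict__.

```python
BASIC_TYPES = (type(None), bool, int, float, str)

# object.__getstate__ only exists on Python 3.11+; treat it as "no custom hook".
_DEFAULT_GETSTATE = getattr(object, "__getstate__", None)


def get_state(obj):
    """Reduce obj to basic types (hypothetical helper, not a real API)."""
    if isinstance(obj, BASIC_TYPES):
        return obj
    if isinstance(obj, list):
        return [get_state(item) for item in obj]
    if isinstance(obj, dict):
        return {key: get_state(value) for key, value in obj.items()}
    # Prefer a class-defined __getstate__; otherwise fall back to __dict__.
    hook = getattr(type(obj), "__getstate__", None)
    if hook is not None and hook is not _DEFAULT_GETSTATE:
        raw = obj.__getstate__()
    else:
        raw = obj.__dict__  # raises AttributeError if there is no __dict__
    # The returned state may itself contain objects, so query those too.
    return get_state(raw)


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


class Line:
    def __init__(self, start, end):
        self.start, self.end = start, end

    def __getstate__(self):
        return [self.start, self.end]


state = get_state(Line(Point(1, 2), Point(3, 4)))
print(state)
```

Line supplies its own state as a list, while Point falls through to the
__dict__ default, so the marshaller only ever sees lists, dicts and scalars.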

Unserializing basically instantiates an object without initializing it
and calls __setstate__(data) on it where `data' is what was previously
returned from its __getstate__ call.
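
A rough sketch of that unserializing step, again in Python. The function
restore and the class Account are hypothetical names used for illustration;
the init_calls counter exists only to show that __init__ is not run a second
time.

```python
class Account:
    init_calls = 0  # counts how often __init__ actually runs

    def __init__(self, owner):
        Account.init_calls += 1
        self.owner = owner
        self.cache = {}  # transient data, not part of the saved state

    def __getstate__(self):
        return {"owner": self.owner}

    def __setstate__(self, data):
        self.owner = data["owner"]
        self.cache = {}  # rebuild transient parts from scratch


def restore(cls, data):
    """Instantiate cls without initializing it, then hand it its state."""
    obj = cls.__new__(cls)  # allocates the instance, skips __init__
    setter = getattr(obj, "__setstate__", None)
    if setter is not None:
        setter(data)
    else:
        obj.__dict__.update(data)  # assumed default: state is the attribute dict
    return obj


original = Account("ann")
copy = restore(Account, original.__getstate__())
print(copy.owner, Account.init_calls)
```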

I'm not on a crusade to get a cross-language serialization mechanism,
but it would be very convenient and it doesn't seem it would require
that much hassle.

-- 
 Twisted | Christopher Armstrong: International Man of Twistery
  Radix  |  Release Manager,  Twisted Project
-+ http://twistedmatrix.com/users/radix.twistd/


Re: Object spec [x-adr][x-bayes]

2003-03-09 Thread Dan Sugalski
At 7:29 AM +1300 3/8/03, Sam Vilain wrote:
 On Sat, 08 Mar 2003 06:58, Dan Sugalski wrote:
  At 2:08 PM +1300 3/7/03, Sam Vilain wrote:
   As long as mechanisms are put in place to allow modules to bypass
   object encapsulation and private/public constraints, and given that
   Parrot will have no XS,

  It wouldn't be wise to jump from Parrot won't do perl 5's XS scheme
  to Parrot won't have a way to write code for it in C. It will,
  arguably, be *easier* in parrot to write parts of your program in C,
  thus making it more likely that less of an object will be guaranteed
  to be done entirely in parrot-space.

 OK.  Perhaps those structures should have a method/PMC that they must
 export which will dump their internal state into a near equivalent Parrot
 data structure rather than just having serialisation methods, for the sake
 of the tools that want to traverse it rather than just freeze/thaw it to a
 stream.

Presenting internal state in a rational form is a rather 
significantly different thing than being able to serialize things, 
and I don't think it's feasible, unfortunately. It'll require too 
much consistency to be useful (as I don't think you'll get that 
consistency) and you'll likely end up just dropping back and 
grovelling over the internal nasty bits anyway.

Serialization is a specific and very useful application, and one that 
we need in general (otherwise we won't be able to have constant PMCs) 
so it'll be there, but being able to expect to walk a pool of 
heterogeneous objects from a variety of different object systems is 
more than I think you're likely to get.

More to the point, it's a level of complexity I'm unwilling to commit 
to, because it's not needed for our required functionality, useful 
though it might be. (Even in the limited circumstances it's likely to 
be available in)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Object spec [x-adr][x-bayes]

2003-03-07 Thread Dan Sugalski
At 2:08 PM +1300 3/7/03, Sam Vilain wrote:
 As long as mechanisms are put in place to allow modules to bypass object
 encapsulation and private/public constraints, and given that Parrot will
 have no XS,

It wouldn't be wise to jump from Parrot won't do perl 5's XS scheme 
to Parrot won't have a way to write code for it in C. It will, 
arguably, be *easier* in parrot to write parts of your program in C, 
thus making it more likely that less of an object will be guaranteed 
to be done entirely in parrot-space.
--
Dan



Re: Object spec [x-adr][x-bayes]

2003-03-07 Thread Sam Vilain
On Sat, 08 Mar 2003 06:58, Dan Sugalski wrote:
 At 2:08 PM +1300 3/7/03, Sam Vilain wrote:
 As long as mechanisms are put in place to allow modules to bypass
  object encapsulation and private/public constraints, and given that
  Parrot will have no XS,

 It wouldn't be wise to jump from Parrot won't do perl 5's XS scheme
 to Parrot won't have a way to write code for it in C. It will,
 arguably, be *easier* in parrot to write parts of your program in C,
 thus making it more likely that less of an object will be guaranteed
 to be done entirely in parrot-space.

OK.  Perhaps those structures should have a method/PMC that they must 
export which will dump their internal state into a near equivalent Parrot 
data structure rather than just having serialisation methods, for the sake 
of the tools that want to traverse it rather than just freeze/thaw it to a 
stream.  ie, something that extracts their state and can be passed to 
`bless' et al to reconstruct the original object.  The structure freezer &
heater can ask objects to present themselves as core types for 
serialisation.  I think this would be a great debugging win as well.
-- 
Sam Vilain, [EMAIL PROTECTED]

Real computer scientists like C's structured constructs, but they are
suspicious of it because it's compiled.  (Only Batch freaks and
efficiency weirdos bother with compilers, they're so un-dynamic.)


RE: Object spec [x-adr][x-bayes]

2003-03-06 Thread Garrett Goebel
Sam Vilain wrote:
 
 On Thu, 06 Mar 2003 05:10, Garrett Goebel wrote:
  Several people have mentioned a desire to see Perl6
  and Parrot facilitate object persistence. Should
  such issues be tackled in Parrot?
 
 Not necessarily.  Just be friendly to object persistence 
 frameworks by exporting object relationships in a 
 sensible and consistent manner.  It is those relationships
 that drive the object persistence frameworks, and 
 implementations of the object structure in various
 languages.

"exporting object relationships in a sensible and consistent manner"

Sounds like something worthy of more than a passing reference. In the
context of parrot, what falls within the scope of a sensible and consistent
manner? 

 
 The exact semantics and mechanisms of object persistence
 are still quite a research topic, so it is not appropriate
 yet to select one method as the best.  However, I believe
 that the concepts of Object, Attribute, Association,
 Methods are stable.

  What does parrot need to facilitate object persistence
  and cross-language OO? Obviously the OMG's UML and
  family of specifications deal with these issues. What
  other practical approaches exist?
 
 UML does not deal with persistence.  It deals with
 specifying and modelling objects.

When I said UML and family I was lumping in Meta-Object Facility (MOF),
Common Warehouse Metamodel (CWM), Common Object Request Broker (CORBA),
Object Constraint Language (OCL), etc. 10,000+ pages of spine tingling
object modeling, persistence, and interchange specifications... Important in
itself not just as the provider of a common vocabulary and semantics for
programming, but also a helpful sleep aid.

MOF nails down Packages, Classes, Associations, Attributes, and Operations.
Objects being instances of classes... And while the OMG specs are overkill,
there are probably some gems of relevant insight in them. Especially if it is
a goal for parrot to allow python code to invoke methods on objects coded in
perl.

Over on perl6-internals you've been talking about the need for Associations.
Is the addition of associations all that's missing from Parrot to support
exporting object relationships in a sensible and consistent manner?

What are the traps and pitfalls that'll need to be avoided in Parrot's
design to make object persistence not just possible but realistically
feasible? For example: serializing objects without capturing their
associations. Will the absence of associations force us to rely on
attributes and convention to serialize an object's associations? I.e., as I
thought you were implying, a messy hack that'll come back to haunt us?

--
Garrett Goebel
IS Development Specialist

ScriptPro           Direct: 913.403.5261
5828 Reeds Road     Main:   913.384.1008
Mission, KS 66202   Fax:    913.384.2180
www.scriptpro.com   garrett at scriptpro dot com


Re: Object spec [x-adr][x-bayes]

2003-03-06 Thread Sam Vilain
On Fri, 07 Mar 2003 05:48, Garrett Goebel wrote:
 Over on perl6-internals you've been talking about the need for
 Associations. Is the addition of associations all that's missing from
 Parrot to support exporting object relationships in a sensible and
 consistent manner?

A prudent question.  I believe so.

That is, with a description of a Class, its Attributes, Associations and
Methods I can generate Perl classes, a Tangram schema to map them to a SQL
Database, and ECMAScript `classes' that have the same intrinsic structure. 
And then use a JavaScript::Dumper class I've written to dump them to be
compliant with those ECMAScript classes 8-).

I'd like to be able to do that with Classes others have written, too.

 What are the traps and pitfalls that'll need to be avoided in Parrot's
 design to make object persistence not just possible but realistically
 feasible? For example: serializing objects without capturing their
 associations. 

Here's a summary that I hope backs up my assertion that adding associations
will make it sufficient.  Let's consider three different cases of
persistence:

   - Mapping to a Table structure, either via an RDBMS or an ISAM library
   - Mapping to a Hash structure, either via a Berkeley DB or a filesystem
   - Serialising a structure to a stream - either a Data::Dumper/Storable
     object or an XML document

1.  Table structures.

OO to RDBMS/ISAM mapping in a nutshell: You generally map each Class to a
table, and each attribute to a column in the database.  You add a `type'
column whose value is unique to each Class in the storage.  Extra
attributes in sub-classes are either put in separate tables (``vertical''
mapping) or extra columns (``horizontal'' mapping).  To a SQL database,
vertical mapping is equivalent to horizontal mapping once both tables are
joined on their ID column.
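
To make the vertical/horizontal distinction concrete, here is a small sketch
in Python using the standard sqlite3 module. The class descriptions (Person,
Employee), the helper names, and the choice to put the `type' column on the
root class's table are all assumptions made for illustration; this is not how
Tangram or any particular mapper generates its schema.

```python
import sqlite3

# Hypothetical class descriptions: name -> (superclass, {attribute: column type})
CLASSES = {
    "Person": (None, {"name": "VARCHAR(255)"}),
    "Employee": ("Person", {"salary": "INTEGER"}),
}


def root_of(classes, name):
    """Follow superclass links up to the root class."""
    while classes[name][0] is not None:
        name = classes[name][0]
    return name


def vertical_ddl(classes):
    """One table per class; a subclass table holds only its extra attributes."""
    statements = []
    for name, (parent, attrs) in classes.items():
        columns = ["id INTEGER PRIMARY KEY"]
        if parent is None:
            columns.append("type VARCHAR(255)")
        columns.extend(f"{attr} {sqltype}" for attr, sqltype in attrs.items())
        statements.append(f"CREATE TABLE {name} ({', '.join(columns)})")
    return statements


def horizontal_ddl(classes):
    """One table per root class; subclass attributes become extra columns."""
    columns_by_root = {}
    for name, (_parent, attrs) in classes.items():
        columns = columns_by_root.setdefault(
            root_of(classes, name), ["id INTEGER PRIMARY KEY", "type VARCHAR(255)"]
        )
        columns.extend(f"{attr} {sqltype}" for attr, sqltype in attrs.items())
    return [
        f"CREATE TABLE {root} ({', '.join(columns)})"
        for root, columns in columns_by_root.items()
    ]


# Store the same Employee both ways and compare what a reader gets back.
vertical = sqlite3.connect(":memory:")
for statement in vertical_ddl(CLASSES):
    vertical.execute(statement)
vertical.execute("INSERT INTO Person VALUES (1, 'Employee', 'Ann')")
vertical.execute("INSERT INTO Employee VALUES (1, 50000)")
joined_rows = vertical.execute(
    "SELECT p.id, p.type, p.name, e.salary "
    "FROM Person p JOIN Employee e ON p.id = e.id"
).fetchall()

horizontal = sqlite3.connect(":memory:")
for statement in horizontal_ddl(CLASSES):
    horizontal.execute(statement)
horizontal.execute("INSERT INTO Person VALUES (1, 'Employee', 'Ann', 50000)")
flat_rows = horizontal.execute(
    "SELECT id, type, name, salary FROM Person"
).fetchall()
```

Joining the vertical tables on their ID column yields the same row as the
single horizontal table, which is the equivalence claimed above.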

So, you can see that Association entities are different.  They won't all
get mapped to a single column.  One to many relationships are (well,
possibly to two columns: an ID field and a `type' field, to allow for
arbitrary inheritance on the target).  Many to one relationships are mapped
in the *destination* class' table.  Many to Many relationships are mapped in
a *separate* link table.

Any of these relationships may have the addition of a slot *key* (ie a hash
key) and/or a slot *number* (ie, an array key).  Possibly both.

So, you need to know at least this about the classes:

   - All of their allowed attributes, and enough information on each to
     practically map that to a column type.

     There's the sticky issue of the maximum length of scalars; the
     approach taken by Tangram is to default to VARCHAR(255) columns and
     force the user to specify columns which should be allowed more.  But
     that is an implementation detail.

     Practically speaking, you have:

       - string fields of N octet maximum length
       - integers, floats, fixed point numbers
       - times & dates
       - enumerated types
       - sets of enumerated types (ie, bitwise mappable fields)

     I see other types as derivatives :-)

     It is up to the mapper to map these standard types to SQL column
     types.  It is up to the user of the mapper to provide mapping
     functions for their own custom, non-structural attribute types to a
     storage representation.

   - Enough information about the structure of their associations to
     generate database linking information.

     So, we're talking about:

       - the source class of the association.
       - the minimum/maximum destination multiplicity
       - whether the association has an order, and/or key associated with
         it.
       - a flag to say whether this is an aggregation or a composition

     In addition, if you want to be able to place constraints on the source
     multiplicity of the relationship (such as specifying a 1 to many
     relationship, which might be implemented with a foreign key), or to
     navigate back the other way, you need to make it a 2-way association,
     so you need:

       - the destination class of the association.
       - the minimum/maximum source multiplicity.
       - whether the reverse association has an order, and/or key
         associated with it *independent* of the one coming *to* it.

   - You also need to know all of the superclass relationships.

   - I'm considering object methods as unimportant for data storage.
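
The association information listed above could be captured in a record along
these lines. This is only a sketch in Python; the class name Association, its
field names, and the Order/LineItem example are hypothetical and do not come
from Tangram or Parrot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Association:
    # One-way information: always required.
    source_class: str
    dest_min: int
    dest_max: Optional[int]  # None means unbounded ("many")
    ordered: bool = False  # has a slot number (array index)
    keyed: bool = False  # has a slot key (hash key)
    composition: bool = False  # False means plain aggregation

    # Two-way information: only needed to constrain or navigate the reverse.
    dest_class: Optional[str] = None
    source_min: Optional[int] = None
    source_max: Optional[int] = None  # None means unbounded ("many")
    reverse_ordered: bool = False
    reverse_keyed: bool = False

    def is_two_way(self):
        return self.dest_class is not None

    def kind(self):
        """Name the relationship, e.g. 'one-to-many'; needs a 2-way association."""
        if not self.is_two_way():
            raise ValueError("source multiplicity is unknown for a 1-way association")
        source = "one" if self.source_max == 1 else "many"
        dest = "one" if self.dest_max == 1 else "many"
        return f"{source}-to-{dest}"


# An order owns an ordered list of line items; each item belongs to one order.
items = Association(
    source_class="Order",
    dest_min=0,
    dest_max=None,
    ordered=True,
    composition=True,
    dest_class="LineItem",
    source_min=1,
    source_max=1,
)
print(items.kind())
```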

2. Hash structures.

[note: I'm making this bit up as I go along, as I don't know of any mappers
that actually take this approach.  Please take with as many grains of salt
as you wish and feel free to pipe up and correct me where appropriate.]

Strict Hash databases, such as Berkeley, are a slightly different beast. 
You'd generally map chunks of your data structure to a single hash entity. 
A filesystem directory could also be considered a variant of a hash.

A straightforward approach would be to map from the store's `object ID' to
a serialisation of just that object.  I will consider 

RE: Object spec [x-adr][x-bayes]

2003-03-06 Thread Dan Sugalski
At 10:48 AM -0600 3/6/03, Garrett Goebel wrote:
 Sam Vilain wrote:
  On Thu, 06 Mar 2003 05:10, Garrett Goebel wrote:
   Several people have mentioned a desire to see Perl6
   and Parrot facilitate object persistence. Should
   such issues be tackled in Parrot?
  Not necessarily.  Just be friendly to object persistence
  frameworks by exporting object relationships in a
  sensible and consistent manner.  It is those relationships
  that drive the object persistence frameworks, and
  implementations of the object structure in various
  languages.

 "exporting object relationships in a sensible and consistent manner"

 Sounds like something worthy of more than a passing reference. In the
 context of parrot, what falls within the scope of a sensible and consistent
 manner?

Each PMC has a serialize and deserialize entry. The core will have 
calls to serialize various data element types and a way to note 
contained PMCs that should be serialized, and we'll be using a 
variant on the GC tracing system to make sure everything gets dumped 
properly, and only once, so data structures with cyclic references 
get handled right.
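
The "dumped properly, and only once" property can be illustrated with a small
Python analogue. This is not Parrot's GC tracing or its PMC serialize entry;
it is just a visited-table walk over lists and dicts, and the function name
freeze and the record layout are made up for the sketch.

```python
def freeze(root):
    """Flatten a graph of lists/dicts/scalars, visiting each container once.

    Returns (reference_to_root, records), where records maps a small integer
    handle to the serialized form of one container.
    """
    handles = {}  # id(container) -> handle, doubles as the "already seen" set
    records = {}  # handle -> ["list", [...]] or ["dict", {...}]

    def visit(value):
        if not isinstance(value, (list, dict)):
            return {"value": value}
        key = id(value)
        if key in handles:
            # Seen before (shared or cyclic): emit a reference, do not descend.
            return {"ref": handles[key]}
        # Register the handle *before* descending so cycles can find it.
        handle = handles[key] = len(handles)
        if isinstance(value, list):
            records[handle] = ["list", [visit(item) for item in value]]
        else:
            records[handle] = ["dict", {k: visit(v) for k, v in value.items()}]
        return {"ref": handle}

    return visit(root), records


# A list that contains itself: the walk terminates and records it once.
loop = [1]
loop.append(loop)
loop_ref, loop_records = freeze(loop)

# A dict referenced twice: it is dumped once and referenced twice.
shared = {"n": 1}
shared_ref, shared_records = freeze([shared, shared])
```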

There isn't going to be anything special about objects as such--the 
engine treats them the same as plain scalars, arrays, or hashes.

I don't particularly see anything that needs any special treatment 
here. If some object needs to do weird things to properly serialize 
itself, because it's got a connection to something external for 
example, it just overrides its own serialize and deserialize entries.
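
As an analogy only, here is how an object holding an external connection
might override its own serialization hooks in Python. The __getstate__ and
__setstate__ methods stand in for the PMC serialize and deserialize entries;
the Counter class and the freeze and thaw helpers are hypothetical, and JSON
is used merely as a convenient stream format.

```python
import json
import sqlite3


class Counter:
    """Plain data plus a live database connection (the external resource)."""

    def __init__(self, database):
        self.database = database
        self.count = 0
        self.conn = sqlite3.connect(database)

    def __getstate__(self):
        state = dict(self.__dict__)
        del state["conn"]  # a live connection cannot be written to a stream
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.conn = sqlite3.connect(self.database)  # reconnect when thawed


def freeze(obj):
    """Serialize whatever state the object chooses to present."""
    return json.dumps(obj.__getstate__())


def thaw(cls, text):
    """Rebuild an instance without running __init__, then restore its state."""
    obj = cls.__new__(cls)
    obj.__setstate__(json.loads(text))
    return obj


counter = Counter(":memory:")
counter.count = 3
clone = thaw(Counter, freeze(counter))
print(clone.count, clone.conn.execute("SELECT 1").fetchone())
```

Everything else about the object is serialized the ordinary way; only the part
tied to the outside world needs the override.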
--
Dan
