Dear Sage developers,
Summary of this rather long e-mail which builds up on section "2.9
Mutability" of the programming guide.
- In MuPAD, the memory behavior (and up to some point the data
structure) of objects is essentially forced. This relieves the
programmer of tedious memory management details. Luckily enough,
this behavior turned out in practice to be appropriate for our (=
MuPAD-Combinat people) purposes, at least most of the time. I could
be biased by habits there, but can give some rationale.
- Python being a general purpose language gives more freedom: the
programmer of a class can choose, but has to implement, the memory
behavior of the objects in that class.
- I would like the best of both worlds in sage: that is to have a
bookshelf of standard behaviors (and b.t.w. data structures) that
the programmer of a class could choose from, without having to
reimplement them.
- I make below a first list of such behaviors (different communities
use different terminology, so please tell me if I should call them
differently). Most of them, I already discussed with Mike to check
that they were implementable in python. Certainly 99% of the
technical code is already there, either directly in sage or
somewhere in the python world. So the question is about the exact
specifications of each behavior, the guidelines to choose between
behaviors, the choice among existing implementations, and the
programmer's interface.
An example:
Patrick is a young mathematician without much programming
background. He wants to implement a class whose instances are integer
compositions, that is lists of positive integers, like [3,1,4]. First
he has to take certain decisions:
- data structure: obvious here, a list (in the python sense)
- element or parent: obvious here: element
- memory behavior: i.e. what happens if one does:
sage: c = Composition([3,1,4])
sage: d = c
sage: d[1] = 2
sage: d
[3,2,4]
sage: c
[3,?,4]
Option 1: Compositions have a reference semantic: the last result
is [3,2,4]. That is, upon the assignment d = c, d becomes a
reference to the same object as c; modifying d also modifies
c. That's the behavior of lists in python.
Option 2: Compositions are immutable. The third command is just not
valid. This is the suggested default behavior in sage.
Option 3: Composition have copy semantic: the last result is
[3,1,4]. Upon the assignment d = c, d becomes a copy of c, so
modifying d does not modify c.
Following the sage-combinat guidelines, Patrick chooses copy semantic
(see below why). Now he picks up the already implemented data
structure and behavior from the bookshelf, and just has to implement
the mathematical methods associated to compositions. His code will
look like:
class Composition ( list, Element, CopySemantic):
def size(self):
...
def conjugate(self):
...
Comments:
- It is essential for us to make it that easy to implement new
classes, and factor out once for all technicalities (immutability
may be easy to implement, but not copy semantic). Our contributers
won't have the time, and often not the technical expertise for
those. Besides, factoring out promotes concise code, and consistent
behaviors.
- I used an inheritance syntax above, but that's just for the sake of
the example. All I really care about is that it should be easy to
have Composition get its data structure and copy semantic for free
directly from the bookshelf. The exact syntax and mechanism is open for
discussions (typically inheriting directly from list and Element
probably breaks currently, so something else needs to be done).
==============================================================================
More on "reference semantic" versus "immutability" versus "copy semantic".
To begin with, a followup on the example above:
sage: def f(d):
... d[1] = 2
...
sage: f(c)
sage: c
[2,?,4]
In this example, there is an implicit assignment of c to the local
variable d of f. My belief is that what happens there should be
consistent with what happens upon a usual assignment; in other words,
in a call f(c), it is not the function f who should decide whether the
argument c is passed by copy or by reference, but the object c. Of
course, this displays my bias toward functional programming, where
results are always passed as return values, not through reference
effect on the arguments.
- Reference semantic:
Pros: avoid unnecessary copies; necessary in some specific
algorithm to achieve optimal complexity
Cons: non-locality: it is hard to keep track of all the ways an
object you are manipulating in a given lexical scope can be
modified: it might have been passed to another function which
passed it to another which passed it to another defined in a
very different lexical scope, and which could inadvertently
modify it. This is error prone, and hard to debug (no
const-safeness to help us).
I guess that's why immutability is promoted in sage
- Immutability:
Pros: easy to implement, safe
Cons (combinatorics context): most of the time, new objects are
constructed from existing ones by small local modifications
(think exchanging two adjacent entries in a permutation, or
grafting a tree onto another one). Immutability imposes
to reconstruct the new object from scratch, usually pushing
the programmer to reveal the internal data structure
(look e.g. at ishift in combinat/permutation.py).
- Copy semantic:
Pros: safety
Does "what you expect"
With appropriate copy-on-write, can achieve most of the time
the same efficiency has for objects with reference semantic
Cons: more complicated to implement; the copy-on-write part can be
quite tricky, as we want to have instructions like:
c[i] = 1
or
c = c.modifying_method()
*not* to trigger a copy of c if there is no other reference
to c elsewhere.
==============================================================================
Preliminary bookshelf of typical behaviors:
- Unique representation:
class UniqueRepresentation
def __new__(...):
does whatever is appropriate to ensure unique representation
... equality test using the unique id ...
class MyParent (Parent, UniqueRepresentation):
def __init__(...):
- Copy semantic:
see above
- Object with cache:
Goal: have a very short mantra to cache the results of
complicated methods. For memory management, caching is best
stored inside the object (it will pass out of scope at the
same time as the object). The cache should be ignored in
equality tests. Typically the object will be semantically
immutable: caching a value does not change the object;
otherwise, all cached value (like the hash value!) should be
automatically cleared upon modification. We may want some
parts of the cache to be ignored upon pickling as well.
class MyGroup(Parent, ObjectWithCache):
@remember("is_solvable")
def is_solvable(self):
return ...
@remember("__hash__")
def __hash__(self):
return ...
@remember("my_function", pickle=false)
def my_function(self):
"""
Return a function that is costly to build but can't be pickled
so we will just reconstruct it after pickling/unpickling
"""
...
Typically, our Elements will have Copy Semantic, whereas our Parents
will be Semantically Immutable, have Unique Representation, and be
Objects With Cache (that's how it is in MuPAD, domains playing the
role of parents).
By the way, here are some typical data structures that should be on
the bookshelf as well:
- vectors with entries indexed by any set (possibly infinite)
(essentially done)
- Sparse matrices with rows and columns indexed by any set
- Lists, sets, multisets, which are Elements at the same time
- Tableaux, trees, ...
Suggestions and comments extremely welcome! I plan to post in the
coming days one ticket per behavior, with a detailed test suite. Btw:
Adrien Boussicault (student in Marne-la-vallée, in CC) offered to
participate to the implementation once the specs are precise.
Cheers,
Nicolas
--
Nicolas M. Thiery "Isil" <[EMAIL PROTECTED]>
http://Nicolas.Thiery.name/
--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~----------~----~----~----~------~----~------~--~---