On Mon, 2004-12-13 at 17:23, jfj wrote:
Yo.

Why can't we __setitem__ for tuples?
The way I see it is that if we enable __setitem__ for tuples there
doesn't seem to be any performance penalty if the users don't use it
(aka, python performance independent of tuple mutability).

On the other hand, right now we have to use a list if we want to
__setitem__ on a sequence. If we could use tuples in the cases where
we want to modify items but not modify the length of the sequence,
programs could be considerably faster. Yes?

Enlighten me.

G.

Lets forget discussions of performance of tuples vs. lists for a moment, performance issues associated with allowing for changing keys in dictionaries and consider the larger question.

Q: Why do we have immutable data types?

A: Abstraction, logical consistency and annoying theoretical computer science.

Let us consider the Set type.  Ignore for now the fact that it is basically a dictionary.   We can all agree that a Set is an unordered collection of distinct objects.  Similar to their theoretical counterparts we can add elements, remove elements and perform a variety of set arithmetic operations on sets.  There is no theoretical set operation called "transmute" - that is, you can't mutate an element to something else.  It isn't part of the model.  You can remove an element and add another, but that isn't the same.  This is probably a good thing, because if you allow mutation of elements withing a set, look at what can happen:

Set A has A, B and C.
Set B has C, D and E.

The intersection of the two sets is a Set containing C.  The cardinality of the sets (len) is the same.

Now, lets "mutate" the B in Set A to C.  First of all, A ceases to be a set, for we have the elements A C and C.   Do we keep this non-set like set?  Or do we remove the element?  The first is out of the question; Set carries with it a definition that is borrowed from mathematics.  Keeping the extra C around makes our object a collection, not a set, and should be called such.  I, for one, don't want to use a purposefully misleading language. 

Perhaps we resolve our issue by defining the mutation of B to C to result in set A containing the elements A and C.  This would still be a set, but there is another problem.  We didn't *do* anything to Set A, but it changed in a fundamental manner.  Because of a change in one place, the mutation of B, the cardinality of our set has suddenly changed without saying "remove B from Set A."  That is a Bad Programming practice.  Touching something in one place shouldn't break something else. 

The solution to this problem is to recognize that certain mathematical data-structures basically require the use of parameters that cannot be serendipitously changed.  But why kowtow to mathematics at the expense of cool hack-ability?

When you set your thoughts down to code, your writing serves two purposes. 

1) Instruct the machine on your intent.
2) Argue to the reader the correctness of #1.

Around here, we call #2 "self documenting code."  Now, I've seen a lot of discussion of this on python-list.  Frequently silly pieces of code such as (forgive me):

flatten = sum
list of strings = flatten( list of lists of strings )

are written and argued for in the name of self documenting code.  Personally, I think that is the "This line adds 1 to X and assigns it to Y" variety of self documenting code. 

What is really meant by self documenting code is code that argues in a semi-rigorous sense for its own correctness.  Programming is fundamentally an engineering discipline.  It differs from the more physical engineering disciplines - the parts are all abstract mathematical entities, but their reliability as such is offset by the sheer complexity of the constructed machine.   Much as a mechanical engineer might discuss a tie rod end or pressure vessel in the language of calculus and statics, a computer programmer really should wrap their discussion in the language of discrete mathematics and computational linguistics. 

A tuple has a specific mathematical meaning that admittedly differs somewhat depending on the context.  Same for a Set, List and so forth.  Cast your problem in terms of mathematics and keep your language's nomenclature and behavior similar to existing mathematical constructs.  Doing this will afford you the opportunity to stand upon the shoulders of giants as you go about your work.  And to properly stand upon those shoulders, we must match the theoetical models used.  That means we need more, not fewer, inmutable counterparts to most of our datatypes.



Adam DePrince

-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to