Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Noam Raphael
On 11/1/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
...
>
> I still consider it dead.
>"If the implementation is hard to explain, it's a bad idea."

It is sometimes true, but not always. It may mean two other things:
1. The one trying to explain is not talented enough.
2. The implementation is really not very simple. A hash table, used so
widely in Python, is really not a simple idea, and it's not that easy
to explain.
>
> Also, not all user-defined classes have a __dict__, and not all
> user-defined classes can have arbitrary attributes added to them.
>
> c>>> class foo(object):
> ... __slots__ = ['lst']
> ... def __init__(self):
> ... self.lst = []
> ...
> >>> a = foo()
> >>> a.bar = 1
> Traceback (most recent call last):
>  File "", line 1, in ?
> AttributeError: 'foo' object has no attribute 'bar'
> >>>
It doesn't matter. It only means that the implementation would have to
make frozen copies also of __slots__ items, when freezing a
user-defined class.

I am afraid that this question proves that I didn't convey my idea to
you. If you like, please forgive my inability to explain it clearly,
and try again to understand my idea, by going over what I wrote again,
and thinking on it. You can also wait for the PEP that I intend to
write. And you can also forget about it, if you don't want to bother
with it - you've already helped a lot.

Noam
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Josiah Carlson
> That's fine. I wish that you read my answer, think about it a little,
> and just tell me in a yes or a no if you still consider it dead. I
> think that I have answered all your questions, and I hope that at
> least others would be convinced by them, and that at the end my
> suggestion would be accepted.

I still consider it dead.
"If the implementation is hard to explain, it's a bad idea."

Also, not all user-defined classes have a __dict__, and not all
user-defined classes can have arbitrary attributes added to them.

c>>> class foo(object):
... __slots__ = ['lst']
... def __init__(self):
... self.lst = []
...
>>> a = foo()
>>> a.bar = 1
Traceback (most recent call last):
  File "", line 1, in ?
AttributeError: 'foo' object has no attribute 'bar'
>>> 

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Noam Raphael
On 11/1/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
...
>
> I am an advocate for PEP 351.  However, I am against your proposed
> implementation/variant of PEP 351 because I don't believe it ads enough
> to warrant the additional complication and overhead necessary for every
> object (even tuples would need to get a .frozen_cache member).
>
> Give me a recursive freeze from PEP 351 (which handles objects that are
> duplicated, but errors out on circular references), and I'll be happy.
>
That's fine - but it doesn't mean that I must be happy with it.
>
...
> >
> > This isn't correct - freezing a set won't require a single copy to be
> > performed, as long as the frozen copy isn't saved after the original
> > is changed. Copy+cache always requires one copy.
>
> You are wrong, and you even say you are wrong..."freezing a set doesn't
> require a COPY, IF the frozen COPY isn't saved after the original is
> CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO
> copies, and you admit as much, but then say the equivalent of "copying
> isn't copying because I say so".

No, I am not wrong. I am just using misleading terms. I will call a
"frozen copy" a "frozen image". Here it goes: "freezing a set doesn't
require a COPY, IF the frozen IMAGE isn't saved after the original is
CHANGED". I suggest that there would be a way to create a frozenset
without COPYING an O(n) amount of MEMORY. When a frozen set is created
by a call frozen(x), it would not copy all the data, but would rather
reference the existing data, which was created by the non-frozen set.
Only if the original set changes, when there's a frozen set
referencing the data, the MEMORY would be actually copied.

I call it a "frozen copy" because it behaves as a frozen copy, even
though not all the memory is being copied. When you call the COPY
function in the COPY module with a string, it doesn't really copy
memory - the same string is returned. When you copy a file inside
subversion, it doesn't actually copy all the data associated with it,
but does something smarter, which takes O(1). The point is, for the
user, it's a copy. Whether or not memory is actually being copied, is
an implementation detail.
>
...
>
> I think that adding an additional attribute to literally every single
> object to handle the caching of 'frozen' objects, as well as a list to
> every object to handle callbacks which should be called on object
> mutation, along with a _call_stuff_when_mutated() method that handles
> these callback calls, IN ADDITION TO the __freeze__ method which is
> necessary to support this, is a little much, AND IS CERTAINLY NOT A
> SIMPLIFICATION!

I don't agree. You don't need to add a list to every object, since you
can store all those relations in one place, with a standard function
for registering them. Anyway, code written in Python (which is the
language we are discussing) WON'T BE COMPLICATED! The frozen
mechanism, along with two new protocols (__frozen__ and __changed__),
would be added automatically! The internal state of a class written in
Python can be automatically frozen, since it's basically a dict. Now
let's see if it's a simplification:

1. No Python code would have to be made more complicated because of the change.
2. There would be no need to find workarounds, like cStringIO, for the
fact that strings and tuples are immutable.
3. You would be able to put any kind of object into a set, or use it
as a dict key.
4. Classes (like the graph example) would be able to give users things
without having to make a choice between risking their users with
strange bugs, making a complicated interface, making very inefficient
methods, and writing complicated wrapper classes.

I will ask you: Is this a complication?
The answer is: it requires a significent change of the CPython
implementation. But about the Python language: it's definitely a
simplification.
>
> Let us pause for a second and consider:
> Original PEP proposed 1 new method: __freeze__, which could be
> implemented as a subclass of the original object (now), and integrated
> into the original classes as time goes on.  One could /register/
> __freeze__ functions/methods a'la Pickle, at which point objects
> wouldn't even need a native freeze method.
>
> Your suggestion offers 2 new methods along with 2 new instance variables.
> Let's see, a callback handler, __freeze__, the cache, and the callback
> list.  Doesn't that seem a little excessive to you to support freezing?
> It does to me.  If Guido were to offer your implementation of freeze, or
> no freeze at all, I would opt for no freeze, as implementing your freeze
> on user-defined classes would be a pain in the ass, not to mention
> implementing them in C code would be more than I would care to do, and
> more than I would ask any of the core developers to work on.
>
As I said above: this suggestion would certainly require more change
in the Python implementation than your suggestion. But the Python
language would gain a lot more. Imp

Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)

2005-11-01 Thread Josiah Carlson

Noam Raphael <[EMAIL PROTECTED]> wrote:
> On 10/31/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:

> > > About the users-changing-my-internal-data issue:
> ...
> > You can have a printout before it dies:
> > "I'm crashing your program because something attempted to modify a data
> > structure (here's the traceback), and you were told not to."
> >
> > Then again, you can even raise an exception when people try to change
> > the object, as imdict does, as tuples do, etc.
> 
> Both solutions would solve the problem, but would require me to wrap
> the built-in set with something which doesn't allow changes. This is a
> lot of work - but it's quite similiar to what my solution would
> actually do, in a single built-in function.

I am an advocate for PEP 351.  However, I am against your proposed
implementation/variant of PEP 351 because I don't believe it ads enough
to warrant the additional complication and overhead necessary for every
object (even tuples would need to get a .frozen_cache member).

Give me a recursive freeze from PEP 351 (which handles objects that are
duplicated, but errors out on circular references), and I'll be happy.


> > > You suggest two ways for solving the problem. The first is by copying
> > > my mutable objects to immutable copies:
> >
> > And by caching those results, then invalidating them when they are
> > updated by your application.  This is the same as what you would like to
> > do, except that I do not rely on copy-on-write semantics, which aren't
> > any faster than freeze+cache by your application.
> 
> This isn't correct - freezing a set won't require a single copy to be
> performed, as long as the frozen copy isn't saved after the original
> is changed. Copy+cache always requires one copy.

You are wrong, and you even say you are wrong..."freezing a set doesn't
require a COPY, IF the frozen COPY isn't saved after the original is
CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO
copies, and you admit as much, but then say the equivalent of "copying
isn't copying because I say so".


> > In any case, whether you choose to use freeze, or use a different API,
> > this particular problem is solvable without copy-on-write semantics.
> 
> Right. But I think that a significant simplification of the API is a
> nice bonus for my solution. And about those copy-on-write semantics -
> it should be proven how complex they are. Remember that we are talking
> about frozen-copy-on-write, which I think would simplify matters
> considerably - for example, there are at most two instances sharing
> the same data, since the frozen copy can be returned again and again.

I think that adding an additional attribute to literally every single
object to handle the caching of 'frozen' objects, as well as a list to
every object to handle callbacks which should be called on object
mutation, along with a _call_stuff_when_mutated() method that handles
these callback calls, IN ADDITION TO the __freeze__ method which is
necessary to support this, is a little much, AND IS CERTAINLY NOT A
SIMPLIFICATION!

Let us pause for a second and consider:
Original PEP proposed 1 new method: __freeze__, which could be
implemented as a subclass of the original object (now), and integrated
into the original classes as time goes on.  One could /register/
__freeze__ functions/methods a'la Pickle, at which point objects
wouldn't even need a native freeze method.

Your suggestion offers 2 new methods along with 2 new instance variables. 
Let's see, a callback handler, __freeze__, the cache, and the callback
list.  Doesn't that seem a little excessive to you to support freezing?
It does to me.  If Guido were to offer your implementation of freeze, or
no freeze at all, I would opt for no freeze, as implementing your freeze
on user-defined classes would be a pain in the ass, not to mention
implementing them in C code would be more than I would care to do, and
more than I would ask any of the core developers to work on.


> > Even without validation, there are examples that force a high number of
> > calls, which are not O(1), ammortized or otherwise.
> >
> [Snap - a very interesting example]
> >
> > Now, the actual time analysis on repeated freezings and such gets ugly.
> > There are actually O(k) objects, which take up O(k**2) space.  When you
> > modify object b[i][j] (which has just been frozen), you get O(k)
> > callbacks, and when you call freeze(b), it actually results in O(k**2)
> > time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
> > obvious that this IS NOT AMMORTIZABLE to original object creation time.
> >
> That's absolutely right. My ammortized analysis is correct only if you
> limit yourself to cases in which the original object doesn't change
> after a frozen() call was made. In that case, it's ok to count the
> O(k**2) copy with the O(k**2) object creation, because it's made only
> once.

But here's the crucial observation which you are missing.  You yourself
have stated tha