Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)
On 11/1/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: ... > > I still consider it dead. >"If the implementation is hard to explain, it's a bad idea." It is sometimes true, but not always. It may mean two other things: 1. The one trying to explain is not talented enough. 2. The implementation is really not very simple. A hash table, used so widely in Python, is really not a simple idea, and it's not that easy to explain. > > Also, not all user-defined classes have a __dict__, and not all > user-defined classes can have arbitrary attributes added to them. > > c>>> class foo(object): > ... __slots__ = ['lst'] > ... def __init__(self): > ... self.lst = [] > ... > >>> a = foo() > >>> a.bar = 1 > Traceback (most recent call last): > File "", line 1, in ? > AttributeError: 'foo' object has no attribute 'bar' > >>> It doesn't matter. It only means that the implementation would have to make frozen copies also of __slots__ items, when freezing a user-defined class. I am afraid that this question proves that I didn't convey my idea to you. If you like, please forgive my inability to explain it clearly, and try again to understand my idea, by going over what I wrote again, and thinking on it. You can also wait for the PEP that I intend to write. And you can also forget about it, if you don't want to bother with it - you've already helped a lot. Noam ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)
> That's fine. I wish that you read my answer, think about it a little, > and just tell me in a yes or a no if you still consider it dead. I > think that I have answered all your questions, and I hope that at > least others would be convinced by them, and that at the end my > suggestion would be accepted. I still consider it dead. "If the implementation is hard to explain, it's a bad idea." Also, not all user-defined classes have a __dict__, and not all user-defined classes can have arbitrary attributes added to them. c>>> class foo(object): ... __slots__ = ['lst'] ... def __init__(self): ... self.lst = [] ... >>> a = foo() >>> a.bar = 1 Traceback (most recent call last): File "", line 1, in ? AttributeError: 'foo' object has no attribute 'bar' >>> - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)
On 11/1/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: ... > > I am an advocate for PEP 351. However, I am against your proposed > implementation/variant of PEP 351 because I don't believe it ads enough > to warrant the additional complication and overhead necessary for every > object (even tuples would need to get a .frozen_cache member). > > Give me a recursive freeze from PEP 351 (which handles objects that are > duplicated, but errors out on circular references), and I'll be happy. > That's fine - but it doesn't mean that I must be happy with it. > ... > > > > This isn't correct - freezing a set won't require a single copy to be > > performed, as long as the frozen copy isn't saved after the original > > is changed. Copy+cache always requires one copy. > > You are wrong, and you even say you are wrong..."freezing a set doesn't > require a COPY, IF the frozen COPY isn't saved after the original is > CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO > copies, and you admit as much, but then say the equivalent of "copying > isn't copying because I say so". No, I am not wrong. I am just using misleading terms. I will call a "frozen copy" a "frozen image". Here it goes: "freezing a set doesn't require a COPY, IF the frozen IMAGE isn't saved after the original is CHANGED". I suggest that there would be a way to create a frozenset without COPYING an O(n) amount of MEMORY. When a frozen set is created by a call frozen(x), it would not copy all the data, but would rather reference the existing data, which was created by the non-frozen set. Only if the original set changes, when there's a frozen set referencing the data, the MEMORY would be actually copied. I call it a "frozen copy" because it behaves as a frozen copy, even though not all the memory is being copied. When you call the COPY function in the COPY module with a string, it doesn't really copy memory - the same string is returned. When you copy a file inside subversion, it doesn't actually copy all the data associated with it, but does something smarter, which takes O(1). The point is, for the user, it's a copy. Whether or not memory is actually being copied, is an implementation detail. > ... > > I think that adding an additional attribute to literally every single > object to handle the caching of 'frozen' objects, as well as a list to > every object to handle callbacks which should be called on object > mutation, along with a _call_stuff_when_mutated() method that handles > these callback calls, IN ADDITION TO the __freeze__ method which is > necessary to support this, is a little much, AND IS CERTAINLY NOT A > SIMPLIFICATION! I don't agree. You don't need to add a list to every object, since you can store all those relations in one place, with a standard function for registering them. Anyway, code written in Python (which is the language we are discussing) WON'T BE COMPLICATED! The frozen mechanism, along with two new protocols (__frozen__ and __changed__), would be added automatically! The internal state of a class written in Python can be automatically frozen, since it's basically a dict. Now let's see if it's a simplification: 1. No Python code would have to be made more complicated because of the change. 2. There would be no need to find workarounds, like cStringIO, for the fact that strings and tuples are immutable. 3. You would be able to put any kind of object into a set, or use it as a dict key. 4. Classes (like the graph example) would be able to give users things without having to make a choice between risking their users with strange bugs, making a complicated interface, making very inefficient methods, and writing complicated wrapper classes. I will ask you: Is this a complication? The answer is: it requires a significent change of the CPython implementation. But about the Python language: it's definitely a simplification. > > Let us pause for a second and consider: > Original PEP proposed 1 new method: __freeze__, which could be > implemented as a subclass of the original object (now), and integrated > into the original classes as time goes on. One could /register/ > __freeze__ functions/methods a'la Pickle, at which point objects > wouldn't even need a native freeze method. > > Your suggestion offers 2 new methods along with 2 new instance variables. > Let's see, a callback handler, __freeze__, the cache, and the callback > list. Doesn't that seem a little excessive to you to support freezing? > It does to me. If Guido were to offer your implementation of freeze, or > no freeze at all, I would opt for no freeze, as implementing your freeze > on user-defined classes would be a pain in the ass, not to mention > implementing them in C code would be more than I would care to do, and > more than I would ask any of the core developers to work on. > As I said above: this suggestion would certainly require more change in the Python implementation than your suggestion. But the Python language would gain a lot more. Imp
Re: [Python-Dev] apparent ruminations on mutable immutables (was: PEP 351, the freeze protocol)
Noam Raphael <[EMAIL PROTECTED]> wrote: > On 10/31/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: > > > About the users-changing-my-internal-data issue: > ... > > You can have a printout before it dies: > > "I'm crashing your program because something attempted to modify a data > > structure (here's the traceback), and you were told not to." > > > > Then again, you can even raise an exception when people try to change > > the object, as imdict does, as tuples do, etc. > > Both solutions would solve the problem, but would require me to wrap > the built-in set with something which doesn't allow changes. This is a > lot of work - but it's quite similiar to what my solution would > actually do, in a single built-in function. I am an advocate for PEP 351. However, I am against your proposed implementation/variant of PEP 351 because I don't believe it ads enough to warrant the additional complication and overhead necessary for every object (even tuples would need to get a .frozen_cache member). Give me a recursive freeze from PEP 351 (which handles objects that are duplicated, but errors out on circular references), and I'll be happy. > > > You suggest two ways for solving the problem. The first is by copying > > > my mutable objects to immutable copies: > > > > And by caching those results, then invalidating them when they are > > updated by your application. This is the same as what you would like to > > do, except that I do not rely on copy-on-write semantics, which aren't > > any faster than freeze+cache by your application. > > This isn't correct - freezing a set won't require a single copy to be > performed, as long as the frozen copy isn't saved after the original > is changed. Copy+cache always requires one copy. You are wrong, and you even say you are wrong..."freezing a set doesn't require a COPY, IF the frozen COPY isn't saved after the original is CHANGED". Creating an immutable set IS CREATING A COPY, so it ALSO copies, and you admit as much, but then say the equivalent of "copying isn't copying because I say so". > > In any case, whether you choose to use freeze, or use a different API, > > this particular problem is solvable without copy-on-write semantics. > > Right. But I think that a significant simplification of the API is a > nice bonus for my solution. And about those copy-on-write semantics - > it should be proven how complex they are. Remember that we are talking > about frozen-copy-on-write, which I think would simplify matters > considerably - for example, there are at most two instances sharing > the same data, since the frozen copy can be returned again and again. I think that adding an additional attribute to literally every single object to handle the caching of 'frozen' objects, as well as a list to every object to handle callbacks which should be called on object mutation, along with a _call_stuff_when_mutated() method that handles these callback calls, IN ADDITION TO the __freeze__ method which is necessary to support this, is a little much, AND IS CERTAINLY NOT A SIMPLIFICATION! Let us pause for a second and consider: Original PEP proposed 1 new method: __freeze__, which could be implemented as a subclass of the original object (now), and integrated into the original classes as time goes on. One could /register/ __freeze__ functions/methods a'la Pickle, at which point objects wouldn't even need a native freeze method. Your suggestion offers 2 new methods along with 2 new instance variables. Let's see, a callback handler, __freeze__, the cache, and the callback list. Doesn't that seem a little excessive to you to support freezing? It does to me. If Guido were to offer your implementation of freeze, or no freeze at all, I would opt for no freeze, as implementing your freeze on user-defined classes would be a pain in the ass, not to mention implementing them in C code would be more than I would care to do, and more than I would ask any of the core developers to work on. > > Even without validation, there are examples that force a high number of > > calls, which are not O(1), ammortized or otherwise. > > > [Snap - a very interesting example] > > > > Now, the actual time analysis on repeated freezings and such gets ugly. > > There are actually O(k) objects, which take up O(k**2) space. When you > > modify object b[i][j] (which has just been frozen), you get O(k) > > callbacks, and when you call freeze(b), it actually results in O(k**2) > > time to re-copy the O(k**2) pointers to the O(k) objects. It should be > > obvious that this IS NOT AMMORTIZABLE to original object creation time. > > > That's absolutely right. My ammortized analysis is correct only if you > limit yourself to cases in which the original object doesn't change > after a frozen() call was made. In that case, it's ok to count the > O(k**2) copy with the O(k**2) object creation, because it's made only > once. But here's the crucial observation which you are missing. You yourself have stated tha