Re: [Python-Dev] PEP 351, the freeze protocol
I thought about something - > > I think that the performance penalty may be rather small - remember > that in programs which do not change strings, there would never be a > need to copy the string data at all. And since I think that usually > most of the dict lookups are for method or function names, there would > almost never be a need to constuct a new object on dict lookup, > because you search for the same names again and again, and a new > object is created only on the first frozen() call. You might even gain > performance, because s += x would be faster. > Name lookups can take virtually the same time they take now - method names can be saved from the beginning as frozen strings, so finding them in a dict will take just another bit test - is the object frozen - before doing exactly what is done now. Remember, the strings we are familiar with are simply frozen strings... ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
On 10/31/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: ... > And I'm going to point out why you are wrong. I still don't think so. I think that I need to reclarify what I mean. > > About the users-changing-my-internal-data issue: ... > You can have a printout before it dies: > "I'm crashing your program because something attempted to modify a data > structure (here's the traceback), and you were told not to." > > Then again, you can even raise an exception when people try to change > the object, as imdict does, as tuples do, etc. Both solutions would solve the problem, but would require me to wrap the built-in set with something which doesn't allow changes. This is a lot of work - but it's quite similiar to what my solution would actually do, in a single built-in function. > > > You suggest two ways for solving the problem. The first is by copying > > my mutable objects to immutable copies: > > And by caching those results, then invalidating them when they are > updated by your application. This is the same as what you would like to > do, except that I do not rely on copy-on-write semantics, which aren't > any faster than freeze+cache by your application. This isn't correct - freezing a set won't require a single copy to be performed, as long as the frozen copy isn't saved after the original is changed. Copy+cache always requires one copy. ... > I never claimed it was beautiful, I claimed it would work. And it does. > There are 7 methods, which you can reduce if you play the special method > game: > > RemEdge -> __delitem__((node, node)) > RemNode -> __delitem__(node) #forgot this method before > IterNodes -> __iter__() > IterOutgoing,IterIncoming -> IterAdjacent(node) > I just wanted to say that this game is of course a lot of fun, but it doesn't simplify the interface. > In any case, whether you choose to use freeze, or use a different API, > this particular problem is solvable without copy-on-write semantics. Right. But I think that a significant simplification of the API is a nice bonus for my solution. And about those copy-on-write semantics - it should be proven how complex they are. Remember that we are talking about frozen-copy-on-write, which I think would simplify matters considerably - for example, there are at most two instances sharing the same data, since the frozen copy can be returned again and again. > > > > Now, about copy-on-write: > > ... > Thank you for the clarification (btw, your english is far better than > any of the foreign languages I've been "taught" over the years). Thanks! It seems that if you are forced to use a language from time to time it improves a little... ... > Even without validation, there are examples that force a high number of > calls, which are not O(1), ammortized or otherwise. > [Snap - a very interesting example] > > Now, the actual time analysis on repeated freezings and such gets ugly. > There are actually O(k) objects, which take up O(k**2) space. When you > modify object b[i][j] (which has just been frozen), you get O(k) > callbacks, and when you call freeze(b), it actually results in O(k**2) > time to re-copy the O(k**2) pointers to the O(k) objects. It should be > obvious that this IS NOT AMMORTIZABLE to original object creation time. > That's absolutely right. My ammortized analysis is correct only if you limit yourself to cases in which the original object doesn't change after a frozen() call was made. In that case, it's ok to count the O(k**2) copy with the O(k**2) object creation, because it's made only once. Why it's ok to analyze only that limited case? I am suggesting a change in Python: that every object you would like be mutable, and would support the frozen() protocol. When you evaluate my suggestion, you need to take a program, and measure its performance in the current Python and in a Python which implements my suggestion. This means that the program should work also on the current Python. In that case, my assumption is true - you won't change objects after you have frozen them, simply because these objects (strings which are used as dict keys, for example) can't be changed at all in the current Python implementation! I will write it in another way: I am proposing a change that will make Python objects, including strings, mutable, and gives you other advantages as well. I claim that it won't make existing Python programs run slower in O() terms. It would allow you to do many things that you can't do today; some of them would be fast, like editing a string, and some of them would be less fast - for example, repeatedly changing an object and freezing it. I think that the performance penalty may be rather small - remember that in programs which do not change strings, there would never be a need to copy the string data at all. And since I think that usually most of the dict lookups are for method or function names, there would almost never be a need to constuct a new object on dict lookup, because you search for the same names ag
Re: [Python-Dev] PEP 351, the freeze protocol
Noam Raphael <[EMAIL PROTECTED]> wrote: > Hello, > > I have slept quite well, and talked about it with a few people, and I > still think I'm right. And I'm going to point out why you are wrong. > About the users-changing-my-internal-data issue: > > > Again, user semantics. Tell your users not to modify entries, and/or > > you can make copies of objects you return. If your users are too daft > > to read and/or follow directions, then they deserve to have their > > software not work. > ... > > When someone complains that something doesn't work, I tell them to read > > the documentation. If your users haven't been told to RTFM often enough > > to actually make it happen, then you need a RTFM-bat. Want to know how > > you make one? You start wrapping the objects you return which segfaults > > the process if they change things. When they start asking, tell them it > > is documented quite clearly "do not to modify objects returned, or else". > > Then there's the other option, which I provide below. > > I disagree. I read the manual when I don't know what something does. > If I can guess what it does (and this is where conventions are good), > I don't read the manual. And let's say I ought to read the complete > manual for every method that I use, and that I deserve a death > punishment (or a segmentation fault) if I don't. But let's say that I > wrote a software, without reading the manual, and it worked. I have > gone to work on other things, and suddenly a bug arises. When the poor > guy who needs to fix it goes over the code, everything looks > absolutely correct. Should he also read the complete manuals of every > library that I used, in order to fix that bug? And remember that in > this case, the object could have traveled between several places > (including functions in other libraries), before it was changed, and > the original data structure starts behaving weird. You can have a printout before it dies: "I'm crashing your program because something attempted to modify a data structure (here's the traceback), and you were told not to." Then again, you can even raise an exception when people try to change the object, as imdict does, as tuples do, etc. > You suggest two ways for solving the problem. The first is by copying > my mutable objects to immutable copies: And by caching those results, then invalidating them when they are updated by your application. This is the same as what you would like to do, except that I do not rely on copy-on-write semantics, which aren't any faster than freeze+cache by your application. [snip graph API example] > This will work. It's simply... well, not very beautiful. I have to > invent a lot of names, and my users need to remember them all. If I > give them a frozen set, with all the vertices than a vertex points to > (which is an absolutely reasonable API), they will know how to deal > with it without learning a lot of method names, thanks to the fact > that they are already familiar with sets, and that a lot of thought > has gone into the set interface. I never claimed it was beautiful, I claimed it would work. And it does. There are 7 methods, which you can reduce if you play the special method game: RemEdge -> __delitem__((node, node)) RemNode -> __delitem__(node) #forgot this method before IterNodes -> __iter__() IterOutgoing,IterIncoming -> IterAdjacent(node) In any case, whether you choose to use freeze, or use a different API, this particular problem is solvable without copy-on-write semantics. > > > > Now, about copy-on-write: > ... > > > > What you have written here is fairly unintelligible, but thankfully you > > clarify yourself...pity it still doesn't work, I explain below. > > I'm sorry if I am sometimes hard to understand. English is not my > mother tongue, and it degrades as the hour gets later - and sometimes, > things are hard to explain. If I don't explain myself, please say so > and I'll try again. This is an excellent example - I wrote about > callbacks, and went to sleep. Let me try to explain again how it > *does* work. Thank you for the clarification (btw, your english is far better than any of the foreign languages I've been "taught" over the years). > This is not so. When a list creates its frozen copy, it gives itself > to all those frozen() calls. This means that it will be notified if > one of its members is changed. In that case, it has to do two simple > actions: 1. release the reference to its frozen copy, so that > subsequent freezes of the list would create a new frozen copy, and: 2. > notify about the change any object which froze the list and requested > notification. > > This frees us of any validation code. If we haven't been notified > about a change, there was no change, and the frozen copy is valid. > > In case you ask, the cost of notification is O(1), amortized. The > reason is that every frozen() call can cause at most one callback in > the future. Even without validation, there are examples that for
Re: [Python-Dev] PEP 351, the freeze protocol
Steve Holden <[EMAIL PROTECTED]> wrote: > > Josiah Carlson wrote: > [...] > >>Perhaps I didn't make it clear. The difference between wxPython's Grid > >>and my table is that in the table, most values are *computed*. This > >>means that there's no point in changing the values themselves. They > >>are also used frequently as set members (I can describe why, but it's > >>a bit complicated.) > > > > Again, user semantics. Tell your users not to modify entries, and/or > > you can make copies of objects you return. If your users are too daft > > to read and/or follow directions, then they deserve to have their > > software not work. > > > That sounds like a "get out of jail free" card for Microsoft and many > other software vendors ... If/when vendors are COMPLETE in their specifications and documentation, they can have that card, but being that even when they specify such behaviors they are woefully incomplete, there is not a card to be found, and I stand by my opinion. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Hello, I have slept quite well, and talked about it with a few people, and I still think I'm right. About the users-changing-my-internal-data issue: > Again, user semantics. Tell your users not to modify entries, and/or > you can make copies of objects you return. If your users are too daft > to read and/or follow directions, then they deserve to have their > software not work. ... > When someone complains that something doesn't work, I tell them to read > the documentation. If your users haven't been told to RTFM often enough > to actually make it happen, then you need a RTFM-bat. Want to know how > you make one? You start wrapping the objects you return which segfaults > the process if they change things. When they start asking, tell them it > is documented quite clearly "do not to modify objects returned, or else". > Then there's the other option, which I provide below. I disagree. I read the manual when I don't know what something does. If I can guess what it does (and this is where conventions are good), I don't read the manual. And let's say I ought to read the complete manual for every method that I use, and that I deserve a death punishment (or a segmentation fault) if I don't. But let's say that I wrote a software, without reading the manual, and it worked. I have gone to work on other things, and suddenly a bug arises. When the poor guy who needs to fix it goes over the code, everything looks absolutely correct. Should he also read the complete manuals of every library that I used, in order to fix that bug? And remember that in this case, the object could have traveled between several places (including functions in other libraries), before it was changed, and the original data structure starts behaving weird. You suggest two ways for solving the problem. The first is by copying my mutable objects to immutable copies: > Also from the sounds of it, you are storing both source and destination > values in the same table...hrm, that sounds quite a bit like a > spreadsheet. How does every spreadsheet handle that again? Oh yeah, > they only ever store immutables (generally strings which are interpreted). > But I suppose since you are (of course) storing mutable objects, you > need to work a bit harder...so store mutables, and return immutable > copies (which you can cache if you want, and invalidate when your > application updates the results...like a wx.Grid update on changed). This is basically ok. It's just that in my solution, for many objects it's not necessary to make a complete copy just to prevent changing the value: Making frozen copies of objects which can't reference nonfrozen objects (sets, for example), costs O(1), thanks to the copy-on-write. Now, about the graph: > So let me get this straight: You've got a graph. You want to be able to > change the graph, but you don't want your users to accidentally change > the graph. Sounds to me like an API problem, not a freeze()/mutable problem. > Want an API? > > class graph: >... >def IterOutgoing(self, node): >... >def IterIncoming(self, node): >... >def IsAdjacent(self, node1, node2): >... >def IterNodes(self): >... >def AddEdge(self, f_node, t_node): >... >def RemEdge(self, node1, node2): >... >def AddNode(self): >... > > If you are reasonable in your implementation, all of the above > operations can be fast, and you will never have to worry about your > users accidentally mucking about with your internal data structures: > because you aren't exposing them. If you are really paranoid, you can > take the next step and implement it in Pyrex or C, so that only a > malicous user can muck about with internal structures, at which point > you stop supporting them. This will work. It's simply... well, not very beautiful. I have to invent a lot of names, and my users need to remember them all. If I give them a frozen set, with all the vertices than a vertex points to (which is an absolutely reasonable API), they will know how to deal with it without learning a lot of method names, thanks to the fact that they are already familiar with sets, and that a lot of thought has gone into the set interface. > > Now, about copy-on-write: ... > > What you have written here is fairly unintelligible, but thankfully you > clarify yourself...pity it still doesn't work, I explain below. I'm sorry if I am sometimes hard to understand. English is not my mother tongue, and it degrades as the hour gets later - and sometimes, things are hard to explain. If I don't explain myself, please say so and I'll try again. This is an excellent example - I wrote about callbacks, and went to sleep. Let me try to explain again how it *does* work. The frozen() function, and the __frozen__ protocol, would get another optional argument - an object to be notified when the *nonfrozen* object has changed. It may be called at most once - only on the first change to the object, and only
Re: [Python-Dev] PEP 351, the freeze protocol
On 10/31/05, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > > > It allows everything in Python to be both mutable and hashable, > > I don't understand, since it's already the case. Any user-defined object > is at the same time mutable and hashable. By default, user-defined objects are equal iff they are the same object, regardless of their content. This makes mutability a non-issue. If you want to allow different objects be equal you need to implement a consistent equality operator (commutative, etc), a consistent hash function and ensure that any attributes affecting equality or hash value are immutable. If you fail to meet any of these requirements and put such objects in dictionaries or sets it will result in undefined behavior that may change between Python versions and implementations. Oren ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Josiah Carlson wrote: [...] >>Perhaps I didn't make it clear. The difference between wxPython's Grid >>and my table is that in the table, most values are *computed*. This >>means that there's no point in changing the values themselves. They >>are also used frequently as set members (I can describe why, but it's >>a bit complicated.) > > > Again, user semantics. Tell your users not to modify entries, and/or > you can make copies of objects you return. If your users are too daft > to read and/or follow directions, then they deserve to have their > software not work. > That sounds like a "get out of jail free" card for Microsoft and many other software vendors ... regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Noam Raphael <[EMAIL PROTECTED]> wrote: > > Hello, > > It seems that we both agree that freezing is cool (-; . We disagree on > whether a copy-on-write behaviour is desired. Your arguments agains > copy-on-write are: > 1. It's not needed. > 2. It's complicated to implement. > > But first of all, you didn't like my use cases. I want to argue with that. > > > In reading your description of a 'table of values', I can't help but be > > reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It > > offers arbitrary tables of values (whose editors and viewers you can > > change at will), which offers a mechanism by which you can "listen" to > > changes that occur to the contents of a cell. I can't help but think > > that if you offered a protocol by which a user can signal that a cell > > has been changed, perhaps by writing the value to the table itself > > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 > > freeze), etc., that both you and the users of your table would be much > > happier. > > Perhaps I didn't make it clear. The difference between wxPython's Grid > and my table is that in the table, most values are *computed*. This > means that there's no point in changing the values themselves. They > are also used frequently as set members (I can describe why, but it's > a bit complicated.) Again, user semantics. Tell your users not to modify entries, and/or you can make copies of objects you return. If your users are too daft to read and/or follow directions, then they deserve to have their software not work. Also from the sounds of it, you are storing both source and destination values in the same table...hrm, that sounds quite a bit like a spreadsheet. How does every spreadsheet handle that again? Oh yeah, they only ever store immutables (generally strings which are interpreted). But I suppose since you are (of course) storing mutable objects, you need to work a bit harder...so store mutables, and return immutable copies (which you can cache if you want, and invalidate when your application updates the results...like a wx.Grid update on changed). > > As for the graph issue, you've got a bigger problem than users just > > being able to edit edge lists, users can clear the entire dictionary of > > vertices (outgoing.clear()). It seems to me that a more reasonable > > method to handle this particular case is to tell your users "don't > > modify the dictionaries or the edge lists", and/or store your edge lists > > as tuples instead of lists or dictionaries, and/or use an immutable > > dictionary (as offered by Barry in the PEP). > > As I wrote before, telling my users "don't modify the edge lists" is > just like making lists hashable, and telling all Python users, "dont > modify lists that are dictionary keys." There's no way to tell the > users that - there's no convention for objects which should not be > changed. You can write it in the documentation, but who'll bother > looking there? When someone complains that something doesn't work, I tell them to read the documentation. If your users haven't been told to RTFM often enough to actually make it happen, then you need a RTFM-bat. Want to know how you make one? You start wrapping the objects you return which segfaults the process if they change things. When they start asking, tell them it is documented quite clearly "do not to modify objects returned, or else". Then there's the other option, which I provide below. > I don't think that your other suggestions will work: the data > structure of the graph itself can't be made of immutable objects, > because of the fact that the graph is a mutable object - you can > change it. It can be made of immutable objects, but this means copying > all the data every time the graph changes. So let me get this straight: You've got a graph. You want to be able to change the graph, but you don't want your users to accidentally change the graph. Sounds to me like an API problem, not a freeze()/mutable problem. Want an API? class graph: ... def IterOutgoing(self, node): ... def IterIncoming(self, node): ... def IsAdjacent(self, node1, node2): ... def IterNodes(self): ... def AddEdge(self, f_node, t_node): ... def RemEdge(self, node1, node2): ... def AddNode(self): ... If you are reasonable in your implementation, all of the above operations can be fast, and you will never have to worry about your users accidentally mucking about with your internal data structures: because you aren't exposing them. If you are really paranoid, you can take the next step and implement it in Pyrex or C, so that only a malicous user can muck about with internal structures, at which point you stop supporting them. > Now, about copy-on-write: > > > There's also this little issue of "copy on write" semantics with Python. > > Anyone who tells you that "copy on write" is easy, is probably hanging > > o
Re: [Python-Dev] PEP 351, the freeze protocol
> It allows everything in Python to be both mutable and hashable, I don't understand, since it's already the case. Any user-defined object is at the same time mutable and hashable. And if you want the hash value to follow the changes in attribute values, just define an appropriate __hash__ method. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Hello, It seems that we both agree that freezing is cool (-; . We disagree on whether a copy-on-write behaviour is desired. Your arguments agains copy-on-write are: 1. It's not needed. 2. It's complicated to implement. But first of all, you didn't like my use cases. I want to argue with that. > In reading your description of a 'table of values', I can't help but be > reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It > offers arbitrary tables of values (whose editors and viewers you can > change at will), which offers a mechanism by which you can "listen" to > changes that occur to the contents of a cell. I can't help but think > that if you offered a protocol by which a user can signal that a cell > has been changed, perhaps by writing the value to the table itself > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 > freeze), etc., that both you and the users of your table would be much > happier. Perhaps I didn't make it clear. The difference between wxPython's Grid and my table is that in the table, most values are *computed*. This means that there's no point in changing the values themselves. They are also used frequently as set members (I can describe why, but it's a bit complicated.) I want to say that even if sets weren't used, the objects in the table should have been frozen. The fact the sets (and dicts) only allow immutable objects as members/keys is just for protecting the user. They could have declared, "you shouldn't change anything you insert - as long as you don't, we'll function properly." The only reason why you can't compute hash values of mutable objects is that you don't want your user to make mistakes, and make the data structure inconsistent. > As for the graph issue, you've got a bigger problem than users just > being able to edit edge lists, users can clear the entire dictionary of > vertices (outgoing.clear()). It seems to me that a more reasonable > method to handle this particular case is to tell your users "don't > modify the dictionaries or the edge lists", and/or store your edge lists > as tuples instead of lists or dictionaries, and/or use an immutable > dictionary (as offered by Barry in the PEP). As I wrote before, telling my users "don't modify the edge lists" is just like making lists hashable, and telling all Python users, "dont modify lists that are dictionary keys." There's no way to tell the users that - there's no convention for objects which should not be changed. You can write it in the documentation, but who'll bother looking there? I don't think that your other suggestions will work: the data structure of the graph itself can't be made of immutable objects, because of the fact that the graph is a mutable object - you can change it. It can be made of immutable objects, but this means copying all the data every time the graph changes. Now, about copy-on-write: > There's also this little issue of "copy on write" semantics with Python. > Anyone who tells you that "copy on write" is easy, is probably hanging > out with the same kind of people who say that "threading is easy". Of > course both are easy if you limit your uses to some small subset of > interesting interactions, but "copy on write" gets far harder when you > start thinking of dictionaries, lists, StringIOs, arrays, and all the > possible user-defined classes, which may be mutated beyond obj[key] = > value and/or obj.attr = value (some have obj.method() which mutates the > object). As such, offering a callback mechanism similar to weak > references is probably pretty close to impossible with CPython. Let's limit ourselves to copy-on-write of objects which do not contain nonfrozen objects. Perhaps it's enough - the table, the graph, and strings, are perfect examples of these. Implementation doesn't seem to complicated to me - whenever the object is about to change, and there is a connected frozen copy, you make a shallow copy of the object, point the frozen copy to it, release the reference to the frozen copy, and continue as usual. That's all. I really think that this kind of copy-on-write is "correct". The temporary freezing of sets in order to check if they are members of other sets is a not-very-nice way of implementing it. This kind of copy-on-write would allow, in principle, for Python strings to become mutable, with almost no speed penalty. It would allow my table, and other containers, to automatically freeze the objects that get into it, without having to trust the user on not changing the objects - and remember that there's no way of *telling* him not to change the objects. Now, the computer scientist in me wants to explain (and think about) freezing containers of nonfrozen objects. What I actually want is that as long as an object doesn't change after it's freezed, the cost of freezing would be nothing - that is, O(1). Think about a mutable string object, which is used in the same way as the current, immutable strings. It is constructed o
Re: [Python-Dev] PEP 351, the freeze protocol
Noam Raphael <[EMAIL PROTECTED]> wrote: > > Hello, > > I have thought about freezing for some time, and I think that it is a > fundamental need - the need to know, sometimes, that objects aren't > going to change. I agree with this point. > This is mostly the need of containers. dicts need to know that the > objects that are used as keys aren't going to change, because if they > change, their hash value changes, and you end up with a data structure > in an inconsistent state. This is the need of sets too, and of heaps, > and binary trees, and so on. You are exactly mirroring the sentiments of the PEP. > I want to give another example: I and my colleges designed something > which can be described as an "electronic spreadsheet in Python". We > called it a "table". The values in the table are Python objects, and > the functions which relate them are written in Python. Then comes the > problem: the user has, of course, access to the objects stored in the > table. What would happen if he changes them? The answer is that the > table would be in an inconsistent state, since something which should > be the return value of a function is now something else, and there's > no way for the table to know about that. I respectfully disagree with this point and the rest of your email. Why? For two use-cases, you offer 'tables of values' and 'graphs', as well as a possible solution to the 'problem'; copy on write. In reading your description of a 'table of values', I can't help but be reminded of the wxPython (and wxWidget) wx.Grid and its semantics. It offers arbitrary tables of values (whose editors and viewers you can change at will), which offers a mechanism by which you can "listen" to changes that occur to the contents of a cell. I can't help but think that if you offered a protocol by which a user can signal that a cell has been changed, perhaps by writing the value to the table itself (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351 freeze), etc., that both you and the users of your table would be much happier. As for the graph issue, you've got a bigger problem than users just being able to edit edge lists, users can clear the entire dictionary of vertices (outgoing.clear()). It seems to me that a more reasonable method to handle this particular case is to tell your users "don't modify the dictionaries or the edge lists", and/or store your edge lists as tuples instead of lists or dictionaries, and/or use an immutable dictionary (as offered by Barry in the PEP). There's also this little issue of "copy on write" semantics with Python. Anyone who tells you that "copy on write" is easy, is probably hanging out with the same kind of people who say that "threading is easy". Of course both are easy if you limit your uses to some small subset of interesting interactions, but "copy on write" gets far harder when you start thinking of dictionaries, lists, StringIOs, arrays, and all the possible user-defined classes, which may be mutated beyond obj[key] = value and/or obj.attr = value (some have obj.method() which mutates the object). As such, offering a callback mechanism similar to weak references is probably pretty close to impossible with CPython. One of the reasons why I liked the freeze protocol is that it offered a simple mechanism by which Python could easily offer support, for both new and old objects alike. Want an example? Here's the implementation for array freezing: tuple(a). What about lists? tuple(map(freeze, lst)) Freezing may not ultimately be the right solution for everything, but it is a simple solution which handles the majority of cases. Copy on write in Python, on the other hand, is significantly harder to implement, support, and is probably not the right solution for many problems. - Josiah P.S. To reiterate to Barry: map freeze to the contents of containers, otherwise the object can still be modified, and hence is not frozen. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Hello, I have thought about freezing for some time, and I think that it is a fundamental need - the need to know, sometimes, that objects aren't going to change. This is mostly the need of containers. dicts need to know that the objects that are used as keys aren't going to change, because if they change, their hash value changes, and you end up with a data structure in an inconsistent state. This is the need of sets too, and of heaps, and binary trees, and so on. I want to give another example: I and my colleges designed something which can be described as an "electronic spreadsheet in Python". We called it a "table". The values in the table are Python objects, and the functions which relate them are written in Python. Then comes the problem: the user has, of course, access to the objects stored in the table. What would happen if he changes them? The answer is that the table would be in an inconsistent state, since something which should be the return value of a function is now something else, and there's no way for the table to know about that. The solution is to have a "freeze" protocol. It may be called "frozen" (like frozen(set([1,2,3]))), so that it will be clear that it does not change the object itself. The definition of a frozen object is that its value can't change - that is, if you compare it with another object, you should get the same result as long as the other object hasn't changed. As a rule, only frozen objects should be hashable. I want to give another, different, use case for freezing objects. I once thought about writing a graph package in Python - I mean a graph with vertices and edges. The most obvious way to store a directed graph is as a mapping (dict) from a node to the set of nodes that it points to. Since I want to be able to find also which nodes point to a specific node, I will store another mapping, from a node to the set of nodes that point to it. Now, I want a method of the graph which will return the set of nodes that a given node points to, for example to let me write "if y in graph.adjacent_nodes(x) then". The question is, what will the adjacent_nodes method return? If it returns the set which is a part of the data structure, there is nothing (even no convention!) that will prevent the user from playing with it. This will corrupt the data structure, since the change won't be recorded in the inverse mapping. adjacent_nodes can return a copy of the set, it's a waste if you only want to check whether an object is a member of the set. I gave this example to say that the "frozen" protocol should (when possible) return an object which doesn't really contain a copy of the data, but rather gives an "image" of the original object. If the original object changes while there are frozen copies of it, the data will be copied, and all the frozen objects will then reference a version of the data that will never change again. This will solve the graph problem nicely - adjacent_nodes would simply return a frozen copy of the set, and a copy operation would happen only in the rare cases when the returned set is being modified. This would also help the container use cases: they may call the frozen() method on objects that should be inserted into the container, and usually the data won't be copied. Some objects can't be created in their final form, but can only be constructed step after step. This means that they must be non-frozen objects. Sometimes they are constructed in order to get into a container. Unless the frozen() method is copy-on-change the way I described, all the data would have to be copied again, just for the commitment that it won't change. I don't mean to frighten, but in principle, this may mean that immutable strings might be introduced, which will allow us to get rid of all the cStringIO workarounds. Immutable strings would be constructed whenever they are needed, at a low performance cost (remember that a frozen copy of a given object has to be constructed only once - once it has been created, the same object can be returned on additional frozen() calls.) Copy-on-change of containers of non-frozen objects requires additional complication: it requires frozen objects to have a way for setting a callback that will be called when the original object was changed. This is because the change makes the container of the original object change, so it must drop its own frozen copy. This needs to happen only once per frozen object, since after a change, all the containers drop their frozen copies. I think this callback is conceptually similar to the weakref callback. Just an example that copy-on-change (at least of containers of frozen objects) is needed: sets. It was decided that you can test whether a non-frozen set is a member of a set. I understand that it is done by "temporarily freezing" the set, and that it caused some threading issues. A copy-on-change mechanism might solve it more elegantly. What do you think? Noam ___
Re: [Python-Dev] PEP 351, the freeze protocol
Paolino wrote: > Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) > derived class instance as a key to a mapping? It is __hash__=lambda self:id(self) that is terribly slow ,it needs a faster way to state that to let them be useful as key to mapping as all set operations will pipe into the mechanism .In my application that function is eating time like hell, and will keep on doing it even with the PEP proposed .OT probably. Regards Paolino ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Nick Coghlan <[EMAIL PROTECTED]> wrote: > > Josiah Carlson wrote: > > Nick Coghlan <[EMAIL PROTECTED]> wrote: > >> I think having dicts and sets automatically invoke freeze would be a > >> mistake, > >> because at least one of the following two cases would behave unexpectedly: > > > > I'm pretty sure that the PEP was only aslomg if one would freeze the > > contents of dicts IF the dict was being frozen. > > > > That is, which of the following should be the case: > > freeze({1:[2,3,4]}) -> {1:[2,3,4]} > > freeze({1:[2,3,4]}) -> xdict(1=(2,3,4)) > > I believe the choices you intended are: > freeze({1:[2,3,4]}) -> imdict(1=[2,3,4]) > freeze({1:[2,3,4]}) -> imdict(1=(2,3,4)) > > Regardless, that question makes a lot more sense (and looking at the PEP > again, I realised I simply read it wrong the first time). > > For containers where equality depends on the contents of the container (i.e., > all the builtin ones), I don't see how it is possible to implement a sensible > hash function without freezing the contents as well - otherwise your > immutable > isn't particularly immutable. > > Consider what would happen if list "__freeze__" simply returned a tuple > version of itself - you have a __freeze__ method which returns a potentially > unhashable object! I agree completely, hence my original statement on 10/23: "it is of my opinion that a container which is frozen should have its contents frozen as well." - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Josiah Carlson wrote: > Nick Coghlan <[EMAIL PROTECTED]> wrote: >> I think having dicts and sets automatically invoke freeze would be a >> mistake, >> because at least one of the following two cases would behave unexpectedly: > > I'm pretty sure that the PEP was only aslomg if one would freeze the > contents of dicts IF the dict was being frozen. > > That is, which of the following should be the case: > freeze({1:[2,3,4]}) -> {1:[2,3,4]} > freeze({1:[2,3,4]}) -> xdict(1=(2,3,4)) I believe the choices you intended are: freeze({1:[2,3,4]}) -> imdict(1=[2,3,4]) freeze({1:[2,3,4]}) -> imdict(1=(2,3,4)) Regardless, that question makes a lot more sense (and looking at the PEP again, I realised I simply read it wrong the first time). For containers where equality depends on the contents of the container (i.e., all the builtin ones), I don't see how it is possible to implement a sensible hash function without freezing the contents as well - otherwise your immutable isn't particularly immutable. Consider what would happen if list "__freeze__" simply returned a tuple version of itself - you have a __freeze__ method which returns a potentially unhashable object! Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://boredomandlaziness.blogspot.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
[Barry Warsaw] > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. * FWIW, the _as_immutable() protocol was dropped from sets.py for a reason. User reports indicated that it was never helpful in practice. It added complexity and confusion without producing offsetting benefits. * AFAICT, there are no use cases for freezing arbitrary objects when the object types are restricted to just lists and sets but not dicts, arrays, or other containers. Even if the range of supported types were expanded, what applications could use this? Most apps cannot support generic substitution of lists and sets -- they have too few methods in common -- they are almost never interchangeable. * I'm concerned that generic freezing leads to poor design and hard-to-find bugs. One class of bugs results from conflating ordered and unordered collections as lookup keys. It is difficult to assess program correctness when the ordered/unordered distinction has been abstracted away. A second class of errors can arise when the original object mutates and gets out-of-sync with its frozen counterpart. * For a rare app needing mutable lookup keys, a simple recipe would suffice: freeze_pairs = [(list, tuple), (set, frozenset)] def freeze(obj): try: hash(obj) except TypeError: for sourcetype, desttype in freeze_pairs: if isinstance(obj, sourcetype): return desttype(obj) raise else: return obj Unlike the PEP, the recipe works with older pythons and is trivially easy to extend to include other containers. * The name "freeze" is problematic because it suggests an in-place change. Instead, the proposed mechanism creates a new object. In contrast, explicit conversions like tuple(l) or frozenset(s) are obvious about their running time, space consumed, and new object identity. Overall, I'm -1 on the PEP. Like a bad C macro, the proposed abstraction hides too much. We lose critical distinctions of ordered vs unordered, mutable vs immutable, new objects vs in-place change, etc. Without compelling use cases, the mechanism smells like a hyper-generalization. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
On Oct 23, 2005, at 6:43 PM, Barry Warsaw wrote: > I've had this PEP laying around for quite a few months. It was > inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, > uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here > to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html I like this. I'd like it better if it integrated with the adapter PEP, so that the freezing mechanism for a given type could be pluggable, and could be provided even if the original object did not contemplate it. I don't know where the adapter PEP stands: skimming through the (most recent?) thread in January didn't give me a clear idea. As another poster mentioned, in-place freezing is also of interest to me (and why I read the PEP Initially), but as also as mentioned that's probably unrelated to your PEP. Gary ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
I'm not sure I understood completely the idea but deriving freeze function from hash gives hash a wider importance. Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) derived class instance as a key to a mapping? Sure I missed the point. Regards Paolino ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Nick Coghlan <[EMAIL PROTECTED]> wrote: > I think having dicts and sets automatically invoke freeze would be a mistake, > because at least one of the following two cases would behave unexpectedly: I'm pretty sure that the PEP was only aslomg if one would freeze the contents of dicts IF the dict was being frozen. That is, which of the following should be the case: freeze({1:[2,3,4]}) -> {1:[2,3,4]} freeze({1:[2,3,4]}) -> xdict(1=(2,3,4)) - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Christopher Armstrong <[EMAIL PROTECTED]> wrote: > > On 10/24/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: > > "Should dicts and sets automatically freeze their mutable keys?" > > > > Dictionaries don't have mutable keys, > > Since when? > > Maybe you meant something else? I can't think of any way in which > "dictionaries don't have mutable keys" is true. The only rule about > dictionary keys that I know of is that they need to be hashable and > need to be comparable with the equality operator. Good point, I forgot about user-defined classes (I rarely use them as keys myself, it's all too easy to make a mutable whose hash is dependant on mutable contents, as having an object which you can only find if you have the exact object is not quite as useful I generally need). I will, however, stand by, "a container which is frozen should have its contents frozen as well." - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
On 10/24/05, Josiah Carlson <[EMAIL PROTECTED]> wrote: > "Should dicts and sets automatically freeze their mutable keys?" > > Dictionaries don't have mutable keys, Since when? class Foo: def __init__(self): self.x = 1 f = Foo() d = {f: 1} f.x = 2 Maybe you meant something else? I can't think of any way in which "dictionaries don't have mutable keys" is true. The only rule about dictionary keys that I know of is that they need to be hashable and need to be comparable with the equality operator. -- Twisted | Christopher Armstrong: International Man of Twistery Radix|-- http://radix.twistedmatrix.com | Release Manager, Twisted Project \\\V/// |-- http://twistedmatrix.com |o O|| wvw-+ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Barry Warsaw wrote: > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html I think it's definitely worth considering. It may also reduce the need for "x" and "frozenx" builtin pairs. We already have "set" and "frozenset", and the various "bytes" ideas that have been kicked around have generally considered the need for a "frozenbytes" as well. If freeze was available, then "freeze(x(*args))" might server as a replacement for any builtin "frozen" variants. I think having dicts and sets automatically invoke freeze would be a mistake, because at least one of the following two cases would behave unexpectedly: d = {} l = [] d[l] = "Oops!" d[l] # Raises KeyError if freeze() isn't also invoked in __getitem__ d = {} l = [] d[l] = "Oops!" l.append(1) d[l] # Raises KeyError regardless Oh, and the PEP's xdict example is even more broken than the PEP implies, because two imdicts which compare equal (same contents) may not hash equal (different id's). Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://boredomandlaziness.blogspot.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
Barry Warsaw <[EMAIL PROTECTED]> wrote: > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html class xlist(list): def __freeze__(self): return tuple(self) Shouldn't that be: class xlist(list): def __freeze__(self): return tuple(map(freeze, self)) "Should dicts and sets automatically freeze their mutable keys?" Dictionaries don't have mutable keys, but it is of my opinion that a container which is frozen should have its contents frozen as well. - Josiah ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 351, the freeze protocol
On 10/23/05, Barry Warsaw <[EMAIL PROTECTED]> wrote: > I've had this PEP laying around for quite a few months. It was inspired > by some code we'd written which wanted to be able to get immutable > versions of arbitrary objects. I've finally finished the PEP, uploaded > a sample patch (albeit a bit incomplete), and I'm posting it here to see > if there is any interest. > > http://www.python.org/peps/pep-0351.html My sandboxes need freezing for some stuff and ultimately freezable user classes will be desirable, but for performance reasons I prefer freezing inplace. Not much overlap with PEP 351 really. -- Adam Olsen, aka Rhamphoryncus ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] PEP 351, the freeze protocol
I've had this PEP laying around for quite a few months. It was inspired by some code we'd written which wanted to be able to get immutable versions of arbitrary objects. I've finally finished the PEP, uploaded a sample patch (albeit a bit incomplete), and I'm posting it here to see if there is any interest. http://www.python.org/peps/pep-0351.html Cheers, -Barry signature.asc Description: This is a digitally signed message part ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com