Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Noam Raphael
I thought about something -
>
> I think that the performance penalty may be rather small - remember
> that in programs which do not change strings, there would never be a
> need to copy the string data at all. And since I think that usually
> most of the dict lookups are for method or function names, there would
> almost never be a need to constuct a new object on dict lookup,
> because you search for the same names again and again, and a new
> object is created only on the first frozen() call. You might even gain
> performance, because s += x would be faster.
>
Name lookups can take virtually the same time they take now - method
names can be saved from the beginning as frozen strings, so finding
them in a dict will take just another bit test - is the object frozen
- before doing exactly what is done now. Remember, the strings we are
familiar with are simply frozen strings...
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Noam Raphael
On 10/31/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
...
> And I'm going to point out why you are wrong.

I still don't think so. I think that I need to reclarify what I mean.

> > About the users-changing-my-internal-data issue:
...
> You can have a printout before it dies:
> "I'm crashing your program because something attempted to modify a data
> structure (here's the traceback), and you were told not to."
>
> Then again, you can even raise an exception when people try to change
> the object, as imdict does, as tuples do, etc.

Both solutions would solve the problem, but would require me to wrap
the built-in set with something which doesn't allow changes. This is a
lot of work - but it's quite similiar to what my solution would
actually do, in a single built-in function.
>
> > You suggest two ways for solving the problem. The first is by copying
> > my mutable objects to immutable copies:
>
> And by caching those results, then invalidating them when they are
> updated by your application.  This is the same as what you would like to
> do, except that I do not rely on copy-on-write semantics, which aren't
> any faster than freeze+cache by your application.

This isn't correct - freezing a set won't require a single copy to be
performed, as long as the frozen copy isn't saved after the original
is changed. Copy+cache always requires one copy.

...
> I never claimed it was beautiful, I claimed it would work.  And it does.
> There are 7 methods, which you can reduce if you play the special method
> game:
>
> RemEdge -> __delitem__((node, node))
> RemNode -> __delitem__(node) #forgot this method before
> IterNodes -> __iter__()
> IterOutgoing,IterIncoming -> IterAdjacent(node)
>
I just wanted to say that this game is of course a lot of fun, but it
doesn't simplify the interface.

> In any case, whether you choose to use freeze, or use a different API,
> this particular problem is solvable without copy-on-write semantics.

Right. But I think that a significant simplification of the API is a
nice bonus for my solution. And about those copy-on-write semantics -
it should be proven how complex they are. Remember that we are talking
about frozen-copy-on-write, which I think would simplify matters
considerably - for example, there are at most two instances sharing
the same data, since the frozen copy can be returned again and again.

> > > > Now, about copy-on-write:
> > ...
> Thank you for the clarification (btw, your english is far better than
> any of the foreign languages I've been "taught" over the years).
Thanks! It seems that if you are forced to use a language from time to
time it improves a little...

...

> Even without validation, there are examples that force a high number of
> calls, which are not O(1), ammortized or otherwise.
>
[Snap - a very interesting example]
>
> Now, the actual time analysis on repeated freezings and such gets ugly.
> There are actually O(k) objects, which take up O(k**2) space.  When you
> modify object b[i][j] (which has just been frozen), you get O(k)
> callbacks, and when you call freeze(b), it actually results in O(k**2)
> time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
> obvious that this IS NOT AMMORTIZABLE to original object creation time.
>
That's absolutely right. My ammortized analysis is correct only if you
limit yourself to cases in which the original object doesn't change
after a frozen() call was made. In that case, it's ok to count the
O(k**2) copy with the O(k**2) object creation, because it's made only
once.

Why it's ok to analyze only that limited case? I am suggesting a
change in Python: that every object you would like be mutable, and
would support the frozen() protocol. When you evaluate my suggestion,
you need to take a program, and measure its performance in the current
Python and in a Python which implements my suggestion. This means that
the program should work also on the current Python. In that case, my
assumption is true - you won't change objects after you have frozen
them, simply because these objects (strings which are used as dict
keys, for example) can't be changed at all in the current Python
implementation!

I will write it in another way: I am proposing a change that will make
Python objects, including strings, mutable, and gives you other
advantages as well. I claim that it won't make existing Python
programs run slower in O() terms. It would allow you to do many things
that you can't do today; some of them would be fast, like editing a
string, and some of them would be less fast - for example, repeatedly
changing an object and freezing it.

I think that the performance penalty may be rather small - remember
that in programs which do not change strings, there would never be a
need to copy the string data at all. And since I think that usually
most of the dict lookups are for method or function names, there would
almost never be a need to constuct a new object on dict lookup,
because you search for the same names ag

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Josiah Carlson

Noam Raphael <[EMAIL PROTECTED]> wrote:
> Hello,
> 
> I have slept quite well, and talked about it with a few people, and I
> still think I'm right.

And I'm going to point out why you are wrong.

> About the users-changing-my-internal-data issue:
> 
> > Again, user semantics.  Tell your users not to modify entries, and/or
> > you can make copies of objects you return.  If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
> ...
> > When someone complains that something doesn't work, I tell them to read
> > the documentation.  If your users haven't been told to RTFM often enough
> > to actually make it happen, then you need a RTFM-bat. Want to know how
> > you make one?  You start wrapping the objects you return which segfaults
> > the process if they change things. When they start asking, tell them it
> > is documented quite clearly "do not to modify objects returned, or else".
> > Then there's the other option, which I provide below.
> 
> I disagree. I read the manual when I don't know what something does.
> If I can guess what it does (and this is where conventions are good),
> I don't read the manual. And let's say I ought to read the complete
> manual for every method that I use, and that I deserve a death
> punishment (or a segmentation fault) if I don't. But let's say that I
> wrote a software, without reading the manual, and it worked. I have
> gone to work on other things, and suddenly a bug arises. When the poor
> guy who needs to fix it goes over the code, everything looks
> absolutely correct. Should he also read the complete manuals of every
> library that I used, in order to fix that bug? And remember that in
> this case, the object could have traveled between several places
> (including functions in other libraries), before it was changed, and
> the original data structure starts behaving weird.

You can have a printout before it dies:
"I'm crashing your program because something attempted to modify a data
structure (here's the traceback), and you were told not to."

Then again, you can even raise an exception when people try to change
the object, as imdict does, as tuples do, etc.


> You suggest two ways for solving the problem. The first is by copying
> my mutable objects to immutable copies:

And by caching those results, then invalidating them when they are
updated by your application.  This is the same as what you would like to
do, except that I do not rely on copy-on-write semantics, which aren't
any faster than freeze+cache by your application.

[snip graph API example]

> This will work. It's simply... well, not very beautiful. I have to
> invent a lot of names, and my users need to remember them all. If I
> give them a frozen set, with all the vertices than a vertex points to
> (which is an absolutely reasonable API), they will know how to deal
> with it without learning a lot of method names, thanks to the fact
> that they are already familiar with sets, and that a lot of thought
> has gone into the set interface.

I never claimed it was beautiful, I claimed it would work.  And it does. 
There are 7 methods, which you can reduce if you play the special method
game:

RemEdge -> __delitem__((node, node))
RemNode -> __delitem__(node) #forgot this method before
IterNodes -> __iter__()
IterOutgoing,IterIncoming -> IterAdjacent(node)

In any case, whether you choose to use freeze, or use a different API,
this particular problem is solvable without copy-on-write semantics.

> 
> > > Now, about copy-on-write:
> ...
> >
> > What you have written here is fairly unintelligible, but thankfully you
> > clarify yourself...pity it still doesn't work, I explain below.
> 
> I'm sorry if I am sometimes hard to understand. English is not my
> mother tongue, and it degrades as the hour gets later - and sometimes,
> things are hard to explain. If I don't explain myself, please say so
> and I'll try again. This is an excellent example - I wrote about
> callbacks, and went to sleep. Let me try to explain again how it
> *does* work.

Thank you for the clarification (btw, your english is far better than
any of the foreign languages I've been "taught" over the years).

> This is not so. When a list creates its frozen copy, it gives itself
> to all those frozen() calls. This means that it will be notified if
> one of its members is changed. In that case, it has to do two simple
> actions: 1. release the reference to its frozen copy, so that
> subsequent freezes of the list would create a new frozen copy, and: 2.
> notify about the change any object which froze the list and requested
> notification.
> 
> This frees us of any validation code. If we haven't been notified
> about a change, there was no change, and the frozen copy is valid.
> 
> In case you ask, the cost of notification is O(1), amortized. The
> reason is that every frozen() call can cause at most one callback in
> the future.

Even without validation, there are examples that for

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Josiah Carlson

Steve Holden <[EMAIL PROTECTED]> wrote:
> 
> Josiah Carlson wrote:
> [...]
> >>Perhaps I didn't make it clear. The difference between wxPython's Grid
> >>and my table is that in the table, most values are *computed*. This
> >>means that there's no point in changing the values themselves. They
> >>are also used frequently as set members (I can describe why, but it's
> >>a bit complicated.)
> > 
> > Again, user semantics.  Tell your users not to modify entries, and/or
> > you can make copies of objects you return.  If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
> > 
> That sounds like a "get out of jail free" card for Microsoft and many 
> other software vendors ...

If/when vendors are COMPLETE in their specifications and documentation,
they can have that card, but being that even when they specify such
behaviors they are woefully incomplete, there is not a card to be found,
and I stand by my opinion.

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Noam Raphael
Hello,

I have slept quite well, and talked about it with a few people, and I
still think I'm right.

About the users-changing-my-internal-data issue:

> Again, user semantics.  Tell your users not to modify entries, and/or
> you can make copies of objects you return.  If your users are too daft
> to read and/or follow directions, then they deserve to have their
> software not work.
...
> When someone complains that something doesn't work, I tell them to read
> the documentation.  If your users haven't been told to RTFM often enough
> to actually make it happen, then you need a RTFM-bat. Want to know how
> you make one?  You start wrapping the objects you return which segfaults
> the process if they change things. When they start asking, tell them it
> is documented quite clearly "do not to modify objects returned, or else".
> Then there's the other option, which I provide below.

I disagree. I read the manual when I don't know what something does.
If I can guess what it does (and this is where conventions are good),
I don't read the manual. And let's say I ought to read the complete
manual for every method that I use, and that I deserve a death
punishment (or a segmentation fault) if I don't. But let's say that I
wrote a software, without reading the manual, and it worked. I have
gone to work on other things, and suddenly a bug arises. When the poor
guy who needs to fix it goes over the code, everything looks
absolutely correct. Should he also read the complete manuals of every
library that I used, in order to fix that bug? And remember that in
this case, the object could have traveled between several places
(including functions in other libraries), before it was changed, and
the original data structure starts behaving weird.

You suggest two ways for solving the problem. The first is by copying
my mutable objects to immutable copies:

> Also from the sounds of it, you are storing both source and destination
> values in the same table...hrm, that sounds quite a bit like a
> spreadsheet.  How does every spreadsheet handle that again?  Oh yeah,
> they only ever store immutables (generally strings which are interpreted).
> But I suppose since you are (of course) storing mutable objects, you
> need to work a bit harder...so store mutables, and return immutable
> copies (which you can cache if you want, and invalidate when your
> application updates the results...like a wx.Grid update on changed).

This is basically ok. It's just that in my solution, for many objects
it's not necessary to make a complete copy just to prevent changing
the value: Making frozen copies of objects which can't reference
nonfrozen objects (sets, for example), costs O(1), thanks to the
copy-on-write.

Now, about the graph:

> So let me get this straight: You've got a graph.  You want to be able to
> change the graph, but you don't want your users to accidentally change
> the graph. Sounds to me like an API problem, not a freeze()/mutable problem.
> Want an API?
>
> class graph:
>...
>def IterOutgoing(self, node):
>...
>def IterIncoming(self, node):
>...
>def IsAdjacent(self, node1, node2):
>...
>def IterNodes(self):
>...
>def AddEdge(self, f_node, t_node):
>...
>def RemEdge(self, node1, node2):
>...
>def AddNode(self):
>...
>
> If you are reasonable in your implementation, all of the above
> operations can be fast, and you will never have to worry about your
> users accidentally mucking about with your internal data structures:
> because you aren't exposing them.  If you are really paranoid, you can
> take the next step and implement it in Pyrex or C, so that only a
> malicous user can muck about with internal structures, at which point
> you stop supporting them.

This will work. It's simply... well, not very beautiful. I have to
invent a lot of names, and my users need to remember them all. If I
give them a frozen set, with all the vertices than a vertex points to
(which is an absolutely reasonable API), they will know how to deal
with it without learning a lot of method names, thanks to the fact
that they are already familiar with sets, and that a lot of thought
has gone into the set interface.

> > Now, about copy-on-write:
...
>
> What you have written here is fairly unintelligible, but thankfully you
> clarify yourself...pity it still doesn't work, I explain below.

I'm sorry if I am sometimes hard to understand. English is not my
mother tongue, and it degrades as the hour gets later - and sometimes,
things are hard to explain. If I don't explain myself, please say so
and I'll try again. This is an excellent example - I wrote about
callbacks, and went to sleep. Let me try to explain again how it
*does* work.

The frozen() function, and the __frozen__ protocol, would get another
optional argument - an object to be notified when the *nonfrozen*
object has changed. It may be called at most once - only on the first
change to the object, and only 

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Oren Tirosh
On 10/31/05, Antoine Pitrou <[EMAIL PROTECTED]> wrote:
>
> > It allows everything in Python to be both mutable and hashable,
>
> I don't understand, since it's already the case. Any user-defined object
> is at the same time mutable and hashable.

By default, user-defined objects are equal iff they are the same
object, regardless of their content. This makes mutability a
non-issue.

If you want to allow different objects be equal you need to implement
a consistent equality operator (commutative, etc), a consistent hash
function and ensure that any attributes affecting equality or hash
value are immutable. If you fail to meet any of these requirements and
put such objects in dictionaries or sets it will result in undefined
behavior that may change between Python versions and implementations.

  Oren
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-31 Thread Steve Holden
Josiah Carlson wrote:
[...]
>>Perhaps I didn't make it clear. The difference between wxPython's Grid
>>and my table is that in the table, most values are *computed*. This
>>means that there's no point in changing the values themselves. They
>>are also used frequently as set members (I can describe why, but it's
>>a bit complicated.)
> 
> 
> Again, user semantics.  Tell your users not to modify entries, and/or
> you can make copies of objects you return.  If your users are too daft
> to read and/or follow directions, then they deserve to have their
> software not work.
> 
That sounds like a "get out of jail free" card for Microsoft and many 
other software vendors ...

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-30 Thread Josiah Carlson

Noam Raphael <[EMAIL PROTECTED]> wrote:
> 
> Hello,
> 
> It seems that we both agree that freezing is cool (-; . We disagree on
> whether a copy-on-write behaviour is desired. Your arguments agains
> copy-on-write are:
> 1. It's not needed.
> 2. It's complicated to implement.
> 
> But first of all, you didn't like my use cases. I want to argue with that.
> 
> > In reading your description of a 'table of values', I can't help but be
> > reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
> > offers arbitrary tables of values (whose editors and viewers you can
> > change at will), which offers a mechanism by which you can "listen" to
> > changes that occur to the contents of a cell.  I can't help but think
> > that if you offered a protocol by which a user can signal that a cell
> > has been changed, perhaps by writing the value to the table itself
> > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351
> > freeze), etc., that both you and the users of your table would be much
> > happier.
> 
> Perhaps I didn't make it clear. The difference between wxPython's Grid
> and my table is that in the table, most values are *computed*. This
> means that there's no point in changing the values themselves. They
> are also used frequently as set members (I can describe why, but it's
> a bit complicated.)

Again, user semantics.  Tell your users not to modify entries, and/or
you can make copies of objects you return.  If your users are too daft
to read and/or follow directions, then they deserve to have their
software not work.

Also from the sounds of it, you are storing both source and destination
values in the same table...hrm, that sounds quite a bit like a
spreadsheet.  How does every spreadsheet handle that again?  Oh yeah,
they only ever store immutables (generally strings which are interpreted). 
But I suppose since you are (of course) storing mutable objects, you
need to work a bit harder...so store mutables, and return immutable
copies (which you can cache if you want, and invalidate when your
application updates the results...like a wx.Grid update on changed).


> > As for the graph issue, you've got a bigger problem than users just
> > being able to edit edge lists, users can clear the entire dictionary of
> > vertices (outgoing.clear()).  It seems to me that a more reasonable
> > method to handle this particular case is to tell your users "don't
> > modify the dictionaries or the edge lists", and/or store your edge lists
> > as tuples instead of lists or dictionaries, and/or use an immutable
> > dictionary (as offered by Barry in the PEP).
> 
> As I wrote before, telling my users "don't modify the edge lists" is
> just like making lists hashable, and telling all Python users, "dont
> modify lists that are dictionary keys." There's no way to tell the
> users that - there's no convention for objects which should not be
> changed. You can write it in the documentation, but who'll bother
> looking there?

When someone complains that something doesn't work, I tell them to read
the documentation.  If your users haven't been told to RTFM often enough
to actually make it happen, then you need a RTFM-bat. Want to know how
you make one?  You start wrapping the objects you return which segfaults
the process if they change things. When they start asking, tell them it
is documented quite clearly "do not to modify objects returned, or else". 
Then there's the other option, which I provide below.


> I don't think that your other suggestions will work: the data
> structure of the graph itself can't be made of immutable objects,
> because of the fact that the graph is a mutable object - you can
> change it. It can be made of immutable objects, but this means copying
> all the data every time the graph changes.

So let me get this straight: You've got a graph.  You want to be able to
change the graph, but you don't want your users to accidentally change
the graph. Sounds to me like an API problem, not a freeze()/mutable problem.
Want an API?

class graph:
...
def IterOutgoing(self, node):
...
def IterIncoming(self, node):
...
def IsAdjacent(self, node1, node2):
...
def IterNodes(self):
...
def AddEdge(self, f_node, t_node):
...
def RemEdge(self, node1, node2):
...
def AddNode(self):
...

If you are reasonable in your implementation, all of the above
operations can be fast, and you will never have to worry about your
users accidentally mucking about with your internal data structures:
because you aren't exposing them.  If you are really paranoid, you can
take the next step and implement it in Pyrex or C, so that only a
malicous user can muck about with internal structures, at which point
you stop supporting them.


> Now, about copy-on-write:
> 
> > There's also this little issue of "copy on write" semantics with Python.
> > Anyone who tells you that "copy on write" is easy, is probably hanging
> > o

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-30 Thread Antoine Pitrou

> It allows everything in Python to be both mutable and hashable,

I don't understand, since it's already the case. Any user-defined object
is at the same time mutable and hashable.
And if you want the hash value to follow the changes in attribute
values, just define an appropriate __hash__ method.

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-30 Thread Noam Raphael
Hello,

It seems that we both agree that freezing is cool (-; . We disagree on
whether a copy-on-write behaviour is desired. Your arguments agains
copy-on-write are:
1. It's not needed.
2. It's complicated to implement.

But first of all, you didn't like my use cases. I want to argue with that.

> In reading your description of a 'table of values', I can't help but be
> reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
> offers arbitrary tables of values (whose editors and viewers you can
> change at will), which offers a mechanism by which you can "listen" to
> changes that occur to the contents of a cell.  I can't help but think
> that if you offered a protocol by which a user can signal that a cell
> has been changed, perhaps by writing the value to the table itself
> (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351
> freeze), etc., that both you and the users of your table would be much
> happier.

Perhaps I didn't make it clear. The difference between wxPython's Grid
and my table is that in the table, most values are *computed*. This
means that there's no point in changing the values themselves. They
are also used frequently as set members (I can describe why, but it's
a bit complicated.)

I want to say that even if sets weren't used, the objects in the table
should have been frozen. The fact the sets (and dicts) only allow
immutable objects as members/keys is just for protecting the user.
They could have declared, "you shouldn't change anything you insert -
as long as you don't, we'll function properly." The only reason why
you can't compute hash values of mutable objects is that you don't
want your user to make mistakes, and make the data structure
inconsistent.

> As for the graph issue, you've got a bigger problem than users just
> being able to edit edge lists, users can clear the entire dictionary of
> vertices (outgoing.clear()).  It seems to me that a more reasonable
> method to handle this particular case is to tell your users "don't
> modify the dictionaries or the edge lists", and/or store your edge lists
> as tuples instead of lists or dictionaries, and/or use an immutable
> dictionary (as offered by Barry in the PEP).

As I wrote before, telling my users "don't modify the edge lists" is
just like making lists hashable, and telling all Python users, "dont
modify lists that are dictionary keys." There's no way to tell the
users that - there's no convention for objects which should not be
changed. You can write it in the documentation, but who'll bother
looking there?

I don't think that your other suggestions will work: the data
structure of the graph itself can't be made of immutable objects,
because of the fact that the graph is a mutable object - you can
change it. It can be made of immutable objects, but this means copying
all the data every time the graph changes.


Now, about copy-on-write:

> There's also this little issue of "copy on write" semantics with Python.
> Anyone who tells you that "copy on write" is easy, is probably hanging
> out with the same kind of people who say that "threading is easy".  Of
> course both are easy if you limit your uses to some small subset of
> interesting interactions, but "copy on write" gets far harder when you
> start thinking of dictionaries, lists, StringIOs, arrays, and all the
> possible user-defined classes, which may be mutated beyond obj[key] =
> value and/or obj.attr = value (some have obj.method() which mutates the
> object). As such, offering a callback mechanism similar to weak
> references is probably pretty close to impossible with CPython.

Let's limit ourselves to copy-on-write of objects which do not contain
nonfrozen objects. Perhaps it's enough - the table, the graph, and
strings, are perfect examples of these. Implementation doesn't seem to
complicated to me - whenever the object is about to change, and there
is a connected frozen copy, you make a shallow copy of the object,
point the frozen copy to it, release the reference to the frozen copy,
and continue as usual. That's all.

I really think that this kind of copy-on-write is "correct". The
temporary freezing of sets in order to check if they are members of
other sets is a not-very-nice way of implementing it. This kind of
copy-on-write would allow, in principle, for Python strings to become
mutable, with almost no speed penalty. It would allow my table, and
other containers, to automatically freeze the objects that get into
it, without having to trust the user on not changing the objects - and
remember that there's no way of *telling* him not to change the
objects.

Now, the computer scientist in me wants to explain (and think about)
freezing containers of nonfrozen objects. What I actually want is that
as long as an object doesn't change after it's freezed, the cost of
freezing would be nothing - that is, O(1). Think about a mutable
string object, which is used in the same way as the current, immutable
strings. It is constructed o

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-29 Thread Josiah Carlson

Noam Raphael <[EMAIL PROTECTED]> wrote:
> 
> Hello,
> 
> I have thought about freezing for some time, and I think that it is a
> fundamental need - the need to know, sometimes, that objects aren't
> going to change.

I agree with this point.

> This is mostly the need of containers. dicts need to know that the
> objects that are used as keys aren't going to change, because if they
> change, their hash value changes, and you end up with a data structure
> in an inconsistent state. This is the need of sets too, and of heaps,
> and binary trees, and so on.

You are exactly mirroring the sentiments of the PEP.

> I want to give another example: I and my colleges designed something
> which can be described as an "electronic spreadsheet in Python". We
> called it a "table". The values in the table are Python objects, and
> the functions which relate them are written in Python. Then comes the
> problem: the user has, of course, access to the objects stored in the
> table. What would happen if he changes them? The answer is that the
> table would be in an inconsistent state, since something which should
> be the return value of a function is now something else, and there's
> no way for the table to know about that.

I respectfully disagree with this point and the rest of your email. Why?
For two use-cases, you offer 'tables of values' and 'graphs', as well as
a possible solution to the 'problem'; copy on write.

In reading your description of a 'table of values', I can't help but be
reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
offers arbitrary tables of values (whose editors and viewers you can
change at will), which offers a mechanism by which you can "listen" to
changes that occur to the contents of a cell.  I can't help but think
that if you offered a protocol by which a user can signal that a cell
has been changed, perhaps by writing the value to the table itself
(table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351
freeze), etc., that both you and the users of your table would be much
happier.

As for the graph issue, you've got a bigger problem than users just
being able to edit edge lists, users can clear the entire dictionary of
vertices (outgoing.clear()).  It seems to me that a more reasonable
method to handle this particular case is to tell your users "don't
modify the dictionaries or the edge lists", and/or store your edge lists
as tuples instead of lists or dictionaries, and/or use an immutable
dictionary (as offered by Barry in the PEP).


There's also this little issue of "copy on write" semantics with Python. 
Anyone who tells you that "copy on write" is easy, is probably hanging
out with the same kind of people who say that "threading is easy".  Of
course both are easy if you limit your uses to some small subset of
interesting interactions, but "copy on write" gets far harder when you
start thinking of dictionaries, lists, StringIOs, arrays, and all the
possible user-defined classes, which may be mutated beyond obj[key] =
value and/or obj.attr = value (some have obj.method() which mutates the
object). As such, offering a callback mechanism similar to weak
references is probably pretty close to impossible with CPython.


One of the reasons why I liked the freeze protocol is that it offered a
simple mechanism by which Python could easily offer support, for both
new and old objects alike.  Want an example?  Here's the implementation
for array freezing: tuple(a).  What about lists?  tuple(map(freeze, lst)) 
Freezing may not ultimately be the right solution for everything, but it
is a simple solution which handles the majority of cases.  Copy on write
in Python, on the other hand, is significantly harder to implement,
support, and is probably not the right solution for many problems.


 - Josiah

P.S. To reiterate to Barry: map freeze to the contents of containers,
otherwise the object can still be modified, and hence is not frozen.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-29 Thread Noam Raphael
Hello,

I have thought about freezing for some time, and I think that it is a
fundamental need - the need to know, sometimes, that objects aren't
going to change.

This is mostly the need of containers. dicts need to know that the
objects that are used as keys aren't going to change, because if they
change, their hash value changes, and you end up with a data structure
in an inconsistent state. This is the need of sets too, and of heaps,
and binary trees, and so on.

I want to give another example: I and my colleges designed something
which can be described as an "electronic spreadsheet in Python". We
called it a "table". The values in the table are Python objects, and
the functions which relate them are written in Python. Then comes the
problem: the user has, of course, access to the objects stored in the
table. What would happen if he changes them? The answer is that the
table would be in an inconsistent state, since something which should
be the return value of a function is now something else, and there's
no way for the table to know about that.

The solution is to have a "freeze" protocol. It may be called "frozen"
(like frozen(set([1,2,3]))), so that it will be clear that it does not
change the object itself. The definition of a frozen object is that
its value can't change - that is, if you compare it with another
object, you should get the same result as long as the other object
hasn't changed. As a rule, only frozen objects should be hashable.

I want to give another, different, use case for freezing objects. I
once thought about writing a graph package in Python - I mean a graph
with vertices and edges. The most obvious way to store a directed
graph is as a mapping (dict) from a node to the set of nodes that it
points to. Since I want to be able to find also which nodes point to a
specific node, I will store another mapping, from a node to the set of
nodes that point to it. Now, I want a method of the graph which will
return the set of nodes that a given node points to, for example to
let me write "if y in graph.adjacent_nodes(x) then". The question is,
what will the adjacent_nodes method return? If it returns the set
which is a part of the data structure, there is nothing (even no
convention!) that will prevent the user from playing with it. This
will corrupt the data structure, since the change won't be recorded in
the inverse mapping. adjacent_nodes can return a copy of the set, it's
a waste if you only want to check whether an object is a member of the
set.

I gave this example to say that the "frozen" protocol should (when
possible) return an object which doesn't really contain a copy of the
data, but rather gives an "image" of the original object. If the
original object changes while there are frozen copies of it, the data
will be copied, and all the frozen objects will then reference a
version of the data that will never change again.

This will solve the graph problem nicely - adjacent_nodes would simply
return a frozen copy of the set, and a copy operation would happen
only in the rare cases when the returned set is being modified. This
would also help the container use cases: they may call the frozen()
method on objects that should be inserted into the container, and
usually the data won't be copied. Some objects can't be created in
their final form, but can only be constructed step after step. This
means that they must be non-frozen objects. Sometimes they are
constructed in order to get into a container. Unless the frozen()
method is copy-on-change the way I described, all the data would have
to be copied again, just for the commitment that it won't change.

I don't mean to frighten, but in principle, this may mean that
immutable strings might be introduced, which will allow us to get rid
of all the cStringIO workarounds. Immutable strings would be
constructed whenever they are needed, at a low performance cost
(remember that a frozen copy of a given object has to be constructed
only once - once it has been created, the same object can be returned
on additional frozen() calls.)

Copy-on-change of containers of non-frozen objects requires additional
complication: it requires frozen objects to have a way for setting a
callback that will be called when the original object was changed.
This is because the change makes the container of the original object
change, so it must drop its own frozen copy. This needs to happen only
once per frozen object, since after a change, all the containers drop
their frozen copies. I think this callback is conceptually similar to
the weakref callback.

Just an example that copy-on-change (at least of containers of frozen
objects) is needed: sets. It was decided that you can test whether a
non-frozen set is a member of a set. I understand that it is done by
"temporarily freezing" the set, and that it caused some threading
issues. A copy-on-change mechanism might solve it more elegantly.

What do you think?

Noam
___

Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-26 Thread Paolino
Paolino wrote:

> Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) 
> derived class instance as a key to a mapping?
It is __hash__=lambda self:id(self) that is terribly slow ,it needs a 
faster way to state that to let them be useful as key to mapping as all 
set operations will pipe into the mechanism .In my application that 
function is eating time like hell, and will keep on doing it even with 
the PEP proposed .OT probably.

Regards Paolino


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-25 Thread Josiah Carlson

Nick Coghlan <[EMAIL PROTECTED]> wrote:
> 
> Josiah Carlson wrote:
> > Nick Coghlan <[EMAIL PROTECTED]> wrote:
> >> I think having dicts and sets automatically invoke freeze would be a 
> >> mistake, 
> >> because at least one of the following two cases would behave unexpectedly:
> > 
> > I'm pretty sure that the PEP was only aslomg if one would freeze the
> > contents of dicts IF the dict was being frozen.
> > 
> > That is, which of the following should be the case:
> > freeze({1:[2,3,4]}) -> {1:[2,3,4]}
> > freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))
> 
> I believe the choices you intended are:
>   freeze({1:[2,3,4]}) -> imdict(1=[2,3,4])
>   freeze({1:[2,3,4]}) -> imdict(1=(2,3,4))
> 
> Regardless, that question makes a lot more sense (and looking at the PEP 
> again, I realised I simply read it wrong the first time).
> 
> For containers where equality depends on the contents of the container (i.e., 
> all the builtin ones), I don't see how it is possible to implement a sensible 
> hash function without freezing the contents as well - otherwise your 
> immutable 
> isn't particularly immutable.
> 
> Consider what would happen if list "__freeze__" simply returned a tuple 
> version of itself - you have a __freeze__ method which returns a potentially 
> unhashable object!

I agree completely, hence my original statement on 10/23: "it is of my
opinion that a container which is frozen should have its contents frozen
as well."

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-25 Thread Nick Coghlan
Josiah Carlson wrote:
> Nick Coghlan <[EMAIL PROTECTED]> wrote:
>> I think having dicts and sets automatically invoke freeze would be a 
>> mistake, 
>> because at least one of the following two cases would behave unexpectedly:
> 
> I'm pretty sure that the PEP was only aslomg if one would freeze the
> contents of dicts IF the dict was being frozen.
> 
> That is, which of the following should be the case:
> freeze({1:[2,3,4]}) -> {1:[2,3,4]}
> freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))

I believe the choices you intended are:
  freeze({1:[2,3,4]}) -> imdict(1=[2,3,4])
  freeze({1:[2,3,4]}) -> imdict(1=(2,3,4))

Regardless, that question makes a lot more sense (and looking at the PEP 
again, I realised I simply read it wrong the first time).

For containers where equality depends on the contents of the container (i.e., 
all the builtin ones), I don't see how it is possible to implement a sensible 
hash function without freezing the contents as well - otherwise your immutable 
isn't particularly immutable.

Consider what would happen if list "__freeze__" simply returned a tuple 
version of itself - you have a __freeze__ method which returns a potentially 
unhashable object!

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Raymond Hettinger
[Barry Warsaw]
> I've had this PEP laying around for quite a few months.  It was
inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.


* FWIW, the _as_immutable() protocol was dropped from sets.py for a
reason.  User reports indicated that it was never helpful in practice.
It added complexity and confusion without producing offsetting benefits.

* AFAICT, there are no use cases for freezing arbitrary objects when the
object types are restricted to just lists and sets but not dicts,
arrays, or other containers.  Even if the range of supported types were
expanded, what applications could use this?  Most apps cannot support
generic substitution of lists and sets -- they have too few methods in
common -- they are almost never interchangeable.

* I'm concerned that generic freezing leads to poor design and
hard-to-find bugs.  One class of bugs results from conflating ordered
and unordered collections as lookup keys.  It is difficult to assess
program correctness when the ordered/unordered distinction has been
abstracted away.  A second class of errors can arise when the original
object mutates and gets out-of-sync with its frozen counterpart.

* For a rare app needing mutable lookup keys, a simple recipe would
suffice:

freeze_pairs = [(list, tuple), (set, frozenset)]

def freeze(obj):
try:
hash(obj)
except TypeError:
for sourcetype, desttype in freeze_pairs:
if isinstance(obj, sourcetype):
return desttype(obj)
raise
else:
return obj

Unlike the PEP, the recipe works with older pythons and is trivially
easy to extend to include other containers.

* The name "freeze" is problematic because it suggests an in-place
change.  Instead, the proposed mechanism creates a new object.  In
contrast, explicit conversions like tuple(l) or frozenset(s) are obvious
about their running time, space consumed, and new object identity.  

Overall, I'm -1 on the PEP.  Like a bad C macro, the proposed
abstraction hides too much.  We lose critical distinctions of ordered vs
unordered, mutable vs immutable, new objects vs in-place change, etc.
Without compelling use cases, the mechanism smells like a
hyper-generalization.


Raymond

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Gary Poster

On Oct 23, 2005, at 6:43 PM, Barry Warsaw wrote:

> I've had this PEP laying around for quite a few months.  It was  
> inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP,  
> uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here  
> to see
> if there is any interest.
>
> http://www.python.org/peps/pep-0351.html

I like this.  I'd like it better if it integrated with the adapter  
PEP, so that the freezing mechanism for a given type could be  
pluggable, and could be provided even if the original object did not  
contemplate it.  I don't know where the adapter PEP stands: skimming  
through the (most recent?) thread in January didn't give me a clear  
idea.

As another poster mentioned, in-place freezing is also of interest to  
me (and why I read the PEP Initially), but as also as mentioned  
that's probably unrelated to your PEP.

Gary
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Paolino
I'm not sure I understood completely the idea but deriving freeze 
function from hash gives hash a wider importance.
Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) 
derived class instance as a key to a mapping?
Sure I missed the point.


Regards Paolino

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Josiah Carlson

Nick Coghlan <[EMAIL PROTECTED]> wrote:
> I think having dicts and sets automatically invoke freeze would be a mistake, 
> because at least one of the following two cases would behave unexpectedly:

I'm pretty sure that the PEP was only aslomg if one would freeze the
contents of dicts IF the dict was being frozen.

That is, which of the following should be the case:
freeze({1:[2,3,4]}) -> {1:[2,3,4]}
freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Josiah Carlson

Christopher Armstrong <[EMAIL PROTECTED]> wrote:
> 
> On 10/24/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> > "Should dicts and sets automatically freeze their mutable keys?"
> >
> > Dictionaries don't have mutable keys,
> 
> Since when?
> 
> Maybe you meant something else? I can't think of any way in which
> "dictionaries don't have mutable keys" is true. The only rule about
> dictionary keys that I know of is that they need to be hashable and
> need to be comparable with the equality operator.

Good point, I forgot about user-defined classes (I rarely use them as
keys myself, it's all too easy to make a mutable whose hash is dependant
on mutable contents, as having an object which you can only find if you
have the exact object is not quite as useful I generally need).  I will,
however, stand by, "a container which is frozen should have its contents
frozen as well."

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Christopher Armstrong
On 10/24/05, Josiah Carlson <[EMAIL PROTECTED]> wrote:
> "Should dicts and sets automatically freeze their mutable keys?"
>
> Dictionaries don't have mutable keys,

Since when?

class Foo:
def __init__(self):
self.x = 1

f = Foo()
d = {f: 1}
f.x = 2

Maybe you meant something else? I can't think of any way in which
"dictionaries don't have mutable keys" is true. The only rule about
dictionary keys that I know of is that they need to be hashable and
need to be comparable with the equality operator.

--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix|-- http://radix.twistedmatrix.com
|  Release Manager, Twisted Project
  \\\V///   |-- http://twistedmatrix.com
   |o O||
wvw-+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-24 Thread Nick Coghlan
Barry Warsaw wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
> 
> http://www.python.org/peps/pep-0351.html

I think it's definitely worth considering. It may also reduce the need for "x" 
and "frozenx" builtin pairs. We already have "set" and "frozenset", and the 
various "bytes" ideas that have been kicked around have generally considered 
the need for a "frozenbytes" as well.

If freeze was available, then "freeze(x(*args))" might server as a replacement 
for any builtin "frozen" variants.

I think having dicts and sets automatically invoke freeze would be a mistake, 
because at least one of the following two cases would behave unexpectedly:

   d = {}
   l = []
   d[l] = "Oops!"
   d[l] # Raises KeyError if freeze() isn't also invoked in __getitem__

   d = {}
   l = []
   d[l] = "Oops!"
   l.append(1)
   d[l] # Raises KeyError regardless

Oh, and the PEP's xdict example is even more broken than the PEP implies, 
because two imdicts which compare equal (same contents) may not hash equal 
(different id's).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-23 Thread Josiah Carlson

Barry Warsaw <[EMAIL PROTECTED]> wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
> 
> http://www.python.org/peps/pep-0351.html



class xlist(list):
def __freeze__(self):
return tuple(self)

Shouldn't that be:

class xlist(list):
def __freeze__(self):
return tuple(map(freeze, self))


"Should dicts and sets automatically freeze their mutable keys?"

Dictionaries don't have mutable keys, but it is of my opinion that a
container which is frozen should have its contents frozen as well.

 - Josiah

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 351, the freeze protocol

2005-10-23 Thread Adam Olsen
On 10/23/05, Barry Warsaw <[EMAIL PROTECTED]> wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
>
> http://www.python.org/peps/pep-0351.html

My sandboxes need freezing for some stuff and ultimately freezable
user classes will be desirable, but for performance reasons I prefer
freezing inplace.  Not much overlap with PEP 351 really.

--
Adam Olsen, aka Rhamphoryncus
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 351, the freeze protocol

2005-10-23 Thread Barry Warsaw
I've had this PEP laying around for quite a few months.  It was inspired
by some code we'd written which wanted to be able to get immutable
versions of arbitrary objects.  I've finally finished the PEP, uploaded
a sample patch (albeit a bit incomplete), and I'm posting it here to see
if there is any interest.

http://www.python.org/peps/pep-0351.html

Cheers,
-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com