[issue9269] Cannot pickle self-referencing sets

2013-11-29 Thread Alexandre Vassalotti

Changes by Alexandre Vassalotti :


--
assignee:  -> alexandre.vassalotti
resolution:  -> duplicate
stage: patch review -> committed/rejected
status: open -> closed
superseder:  -> Implement PEP 3154 (pickle protocol 4)

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2012-07-27 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
assignee: belopolsky -> 
dependencies:  -Pickle breakage with reduction of recursive structures
stage: needs patch -> patch review
versions: +Python 3.4 -Python 3.2

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2012-07-27 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Attaching patch for fixing a test and adding better testing of sets.

--
Added file: http://bugs.python.org/file26539/sets-test.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2012-07-26 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

I have attached a fix to this issue (and implicitly issue1062277).

This patch allows pickling self-referential sets by implementing a 
set.__reduce__ which uses states as opposed to ctor parameters.

Before:
>>> s=set([1,2,3])
>>> s.__reduce__()
(, ([1, 2, 3],), None)
>>> len(pickle.dumps(s,1))
38

After:
>>> s=set([1,2,3])
>>> s.__reduce__()
(, (), [1, 2, 3])
>>> len(pickle.dumps(s,1))
36

Basically what this does is: instead of unpickling the set by doing 
set([1,2,3]) it does s=set(); s.__setstate__([1,2,3]).
States are supported in all versions of pickle so this shouldn't break anything.
Creating empty data structures and then filling them is the way pickle does it 
for all mutable containers in order to allow self-references (with the 
exception of sets, of course).

Since memoization is performed after the object is created but before its state 
is set, pickling an object's state can contain references to oneself.

class A:
pass

a=A()
s=set([a])
a.s=s

s_=loads(dumps(s,1))
next(iter(s_)).s is s_ # True

Note that this fix only applies for sets, not frozensets. Frozensets are a 
different matter, because their immutability makes it impossible to set their 
state. Self-referential frozensets are currently supported in my implementation 
of pickle4 using a trick similar to what tuples use. But the trick works more 
easily there because frozensets have their own opcodes, like tuples.

Also note that applying this patch makes 
Lib/test/pickletester.py:test_pickle_to_2x fail (DATA3 and DATA6 there contain 
pickled data of sets, which naturally have changed).
I'll upload a patch fixing this as well as adding one or more test for sets 
soon.

--
keywords: +patch
nosy: +mstefanro
Added file: http://bugs.python.org/file26533/self_referential-sets.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2011-12-10 Thread Jesús Cea Avión

Changes by Jesús Cea Avión :


--
nosy: +jcea

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2011-04-23 Thread Alex Gaynor

Changes by Alex Gaynor :


--
nosy: +alex

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2011-03-08 Thread bcroq

Changes by bcroq :


--
nosy: +bcroq

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-09-01 Thread Raymond Hettinger

Changes by Raymond Hettinger :


--
priority: normal -> low

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-08-09 Thread Terry J. Reedy

Changes by Terry J. Reedy :


--
components: +Library (Lib) -Interpreter Core
versions:  -Python 2.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-17 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

Upon further investigation, I am no longer convinced that "reduce cycles" are 
necessarily fatal to pickling.  I am attaching a script testcycle.py that 
allows creating cycles using various containers. It takes the name of container 
class as an argument and prints SUCCESS/FAILURE followed by annotated pickle 
disassembly.  It turns out that self-referential tuples are pickled 
successfully even though being immutable, they must receive their state through 
arguments:


$ ./python.exe testcycle.py tuple
SUCCESS
0: \x80 PROTO  3  Protocol version indicator.
2: cGLOBAL '__main__ C'   Push a global object (module.attr) on the 
stack.
   14: )EMPTY_TUPLE   Push an empty tuple.
   15: \x81 NEWOBJBuild an object instance.
   16: qBINPUT 1  Store the stack top into the memo.  The 
stack is not popped.
   18: }EMPTY_DICTPush an empty dict.
   19: XBINUNICODE 'foo'  Push a Python Unicode string object.
   27: hBINGET 1  Read an object from the memo and push it 
on the stack.
   29: \x85 TUPLE1Build a one-tuple out of the topmost item 
on the stack.
   30: qBINPUT 4  Store the stack top into the memo.  The 
stack is not popped.
   32: sSETITEM   Add a key+value pair to an existing dict.
   33: bBUILD Finish building an object, via 
__setstate__ or dict update.
   34: 0POP   Discard the top stack item, shrinking the 
stack by one item.
   35: hBINGET 4  Read an object from the memo and push it 
on the stack.
   37: .STOP  Stop the unpickling machine.
highest protocol among opcodes = 2

There is a comment in save_tuple() that explains how it works:

# Subtle.  d was not in memo when we entered save_tuple(), so   

# the process of saving the tuple's elements must have saved

# the tuple itself:  the tuple is recursive.  The proper action 

# now is to throw away everything we put on the stack, and  

# simply GET the tuple (it's already constructed).  This check  

# could have been done in the "for element" loop instead, but   

# recursive tuples are a rare thing.


I wonder if the same trick could be used in save_reduce() to resolve #1062277 
by pickling "reduce cycles" correctly instead of bailing out when they are 
encountered.

--
priority: low -> normal

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-16 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> The test case in cycle.py tricks set into accepting mutable objects by > 
> creating a class with default __hash__.  This falls into a category of > 
> "don't do it".

I beg to differ. There is a reason we allow people to define __hash__ and 
that's to define arbitrary hashable types (not only immutable ones).
Furthermore, the default __hash__ (equivalent to id()) is also perfectly useful 
in some cases.
And in object-oriented designs it is very common to have reference cycles.

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread mike bayer

mike bayer  added the comment:

OK, more specifically, here's the kind of situation where items in a set are 
mutable:

company = Session.query(Company).first()

# company.employees is a set()
company.employees

# each employee references the parent company
for e in company.employees:
assert e.company is company

So nothing is mutated relationally in this case.   It's just a plain old 
bidirectional structure.If two objects are related via many-to-many, then 
you might have a set in both directions.   I'm not sure if this specific 
situation produces the pickle bug, however, it's been awhile since I've seen 
which specific case causes the problem.   But there are mutable items in a set 
in these examples and it doesn't seem unreasonable to me.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Thu, Jul 15, 2010 at 8:52 PM, Jack Diederich  wrote:
..
>   If database rows were mutable the results of a JOIN could be nonsensical.

And if your result set is self-referential you have a bigger problem
on your hands than not being able to pickle it. :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

On Thu, Jul 15, 2010 at 8:26 PM, mike bayer  wrote:
..
> where is it defined that sets are not "supposed" to contain mutable items?   
> such a requirement vastly limits the usefulness of sets.
>

Well, there is no such requirement. The actual requirement is that
they should be hashable.  For built-in types, however hashable is the
same as immutable.  Arguably, user-defined classes should emulate
that.

> Consider that relational database rows are mutable.  A result set containing 
> multiple rows which each have a primary key comprises a set, hashed on 
> primary key.  But other attributes of each row can be updated.   Surely this 
> is not controversial ?  What Python data structure should I (and a whole 
> bunch of Python ORMs) be using to represent mutable, relational database 
> rows, unordered and unique on primary key, in memory ?

Why wouldn't you represent a result set as a dict mapping primary key
(tuple) to list of column values?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Jack Diederich

Jack Diederich  added the comment:

Mike, it is better to think of database rows as immutable tuples.  During the 
course of a query the contents of the database are considered static - hence 
all that locking and kvetching about this or that database not having "true" 
foreign key support.  If database rows were mutable the results of a JOIN could 
be nonsensical.

--
nosy: +jackdied

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread mike bayer

mike bayer  added the comment:

where is it defined that sets are not "supposed" to contain mutable items?   
such a requirement vastly limits the usefulness of sets.

Consider that relational database rows are mutable.  A result set containing 
multiple rows which each have a primary key comprises a set, hashed on primary 
key.  But other attributes of each row can be updated.   Surely this is not 
controversial ?  What Python data structure should I (and a whole bunch of 
Python ORMs) be using to represent mutable, relational database rows, unordered 
and unique on primary key, in memory ?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Alexander Belopolsky  added the comment:

I am reclassifying this as an RFE because as a bug, this is a duplicate of 
issue1062277.  The later contains an excellent description of the problem by 
Dima Dorfman:

 Our pickle implementations don't check for reduce
 cycles. This is somewhat cosmetic except that stack
 overflow protection is imperfect (for cPickle), causing
 crashes, and some kinds of cycles trigger asserts (in
 pickle).  [msg47267]

This is undeniably a bug and the solution offered in issue1062277 is quite 
reasonable: detect reduce cycles and bail out with an informative message.  
This will not solve the problem of pickling self-referencing sets, but at least 
once documented can be defended as expected behavior.

What remains after issue1062277 is a feature request to allow pickling of 
self-referencing sets.  I would argue that this is really a pathological case.  
Sets are not supposed to contain mutable items and immutable objects cannot 
create reference cycles among themselves.

The test case in cycle.py tricks set into accepting mutable objects by creating 
a class with default __hash__.  This falls into a category of "don't do it".

I am lowering the priority and adding #1062277 as a dependency.  Once #1062277 
is fixed, I would not mind closing this as "won't fix".

--
dependencies: +Pickle breakage with reduction of recursive structures
priority: normal -> low
type: behavior -> feature request

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Éric Araujo

Changes by Éric Araujo :


--
nosy: +merwok

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Changes by Alexander Belopolsky :


Removed file: http://bugs.python.org/file18020/cycle.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Changes by Alexander Belopolsky :


Added file: http://bugs.python.org/file18021/cycle.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

Changes by Alexander Belopolsky :


--
nosy: +alexandre.vassalotti, grubert, rhettinger, schmir, zzzeek

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue9269] Cannot pickle self-referencing sets

2010-07-15 Thread Alexander Belopolsky

New submission from Alexander Belopolsky :

Attached script, cycle.py demonstrates a simplification of the behavior 
reported by mike bayer in msg77200.

Essentially, the script attempts to pickle a set that contains a class instance 
that has an attribute referring back to the set:

class C:
pass
c = C()
cycle = set([c])
c.foo = cycle

An attempt to pickle the *cycle* object triggers an assertion error in 2.7 or 
in 3.2 with disabled _pickle acceleration and produces a broken cycle in 3.2 or 
if cPickle is used instead of pickle in 2.7.

$ python3 cycle.py
FAIILURE
..

$ python2 cycle.py
Traceback (most recent call last):
  ..
  File ".../pickle.py", line 244, in memoize
assert id(obj) not in self.memo
AssertionError

If you run cycle.py with an argument, it uses a dict instead of set to create 
the cycle and shows that the cycles with dict can be pickled correctly:

$ python3 cycle.py dict
SUCCESS
..

After reporting success or failure, cycle.py, prints a disassembly of the 
pickle stream which makes it clear what happens:

In case of dict, we see the following:

$ python3 cycle.py dict
SUCCESS
 ..
2: }EMPTY_DICT
3: qBINPUT 0
 ..
   26: XBINUNICODE 'foo'
..
   36: hBINGET 0
   38: sSETITEM
..
   40: NNONE
   41: sSETITEM

An empty dict is created and saved in the memo.  Then a C object is built with 
foo attribute is set to the dict retrieved from the memo. Finally, the same 
dict is updated with (C object, None) key-value pair.  The result is the cycle 
identical to the one we built in python code.

The sets, however, are built differently.  There is no pickle opcode to add 
items to a set, so all set items must exist by the time set is built.  So here 
is what we see:  

$ python3 cycle.py
FAIILURE
2: cGLOBAL 'builtins set'
   16: qBINPUT 0
.. instead of empty set the constructor is saved in memo
   42: XBINUNICODE 'foo'
   52: hBINGET 0
..
   63: RREDUCE
.. a set object containing c is constructed 
   66: sSETITEM  
.. and assigned to c.foo
   72: RREDUCE
.. another set object is constructed containing c
 
As a result, we have

cycle = {c}
c.foo = {c}

Instead of 

c.foo = cycle

--
assignee: belopolsky
components: Interpreter Core
files: cycle.py
messages: 110397
nosy: belopolsky
priority: normal
severity: normal
stage: needs patch
status: open
title: Cannot pickle self-referencing sets
type: behavior
versions: Python 2.7, Python 3.2
Added file: http://bugs.python.org/file18020/cycle.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com