[issue5084] unpickling does not intern attribute names

2009-05-02 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Committed in r72223, r72224. Thanks!

--
resolution:  -> fixed
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-05-02 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Thanks, I'll take a look very soon.

--
assignee: alexandre.vassalotti -> pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-05-02 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
stage: needs patch -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-05-01 Thread Jake McGuire

Jake McGuire  added the comment:

This fell through the cracks.  But the following unit test seems like it 
does the right thing (fails with the old module, works with the new ones).

--
Added file: http://bugs.python.org/file13835/pickletester.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-05-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

(s/string/still/, of course...)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-05-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Jake, are you string working on a test?

--
priority:  -> normal
stage:  -> needs patch
versions: +Python 2.7, Python 3.1

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-25 Thread Jake McGuire

Changes by Jake McGuire :


Removed file: http://bugs.python.org/file13169/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-25 Thread Jake McGuire

Changes by Jake McGuire :


Removed file: http://bugs.python.org/file12882/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-25 Thread Jake McGuire

Jake McGuire  added the comment:

Ugh.  Clearly I didn't check to see if it worked or not, as it didn't even 
compile.  A new diff, tested and verified to work, is attached.  I'll work 
on creating a test.

Added file: http://bugs.python.org/file13178/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-24 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

To give an example of what the test could check:

>>> class C(object):
...  def __init__(self):
...self.some_long_attribute_name = 5
... 
>>> c = C()
>>> c.__dict__
{'some_long_attribute_name': 5}
>>> sorted(map(id, c.__dict__))
[140371243499696]
>>> import pickle
>>> d = pickle.loads(pickle.dumps(c, -1))
>>> d
<__main__.C object at 0x7faaba1b0390>
>>> d.__dict__
{'some_long_attribute_name': 5}
>>> sorted(map(id, d.__dict__))
[140371243501232]

The `sorted(map(id, d.__dict__))` should have been the same before and
after pickling.

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-24 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

In your patch, I'm not sure where the `name` variable is coming from.
Have you checked it works?

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-24 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

The test should check that the unpickled strings have been interned.

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-24 Thread Jake McGuire

Jake McGuire  added the comment:

The fromstring/asstring dance was due to my incomplete understanding of 
refcounting.  PyDict_Next returns a borrowed reference but 
PyString_InternInPlace expects an owned reference.  Thanks to Kirk 
McDonald, I have a new patch that does the refcounting correctly.

What sort of test did you have in mind?

Added file: http://bugs.python.org/file13169/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-02 Thread Collin Winter

Changes by Collin Winter :


--
nosy: +collinwinter, jyasskin

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-02-01 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Why do you call PyString_AsString() followed by PyString_FromString()?
Strings are immutable so you shouldn't neek to take a copy.

Besides, it would be nice to add a test.

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-29 Thread Jesús Cea Avión

Changes by Jesús Cea Avión :


--
nosy: +jcea

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Alexandre Vassalotti

Alexandre Vassalotti  added the comment:

Oh, you are right. I was looking at py3k's version of pickle, which uses
PyDict_Update instead of a while loop; that is why I got confused.

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Jake McGuire

Jake McGuire  added the comment:

Are you sure?  I may not have enough context in my diff, which I  
should have done against an anonymous svn checkout, but it looks like  
the slot attributes get set several lines after my diff.  "while  
(PyDict_Next(slotstate, ...))" as opposed to the "while  
(PyDict_Next(state, ...))" in my change...

-jake

On Jan 27, 2009, at 6:54 PM, Alexandre Vassalotti wrote:

>
> Alexandre Vassalotti  added the comment:
>
> The patch for cPickle doesn't do the same thing as the pickle one. In
> the cPickle one, you are only interning slot attributes, which, I
> believe, is not what you intended. :-)
>
> --
> assignee:  -> alexandre.vassalotti
> nosy: +alexandre.vassalotti
>
> ___
> Python tracker 
> 
> ___

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Alexandre Vassalotti

Alexandre Vassalotti  added the comment:

The patch for cPickle doesn't do the same thing as the pickle one. In
the cPickle one, you are only interning slot attributes, which, I
believe, is not what you intended. :-)

--
assignee:  -> alexandre.vassalotti
nosy: +alexandre.vassalotti

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Jake McGuire

Changes by Jake McGuire :


Added file: http://bugs.python.org/file12882/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Jake McGuire

Changes by Jake McGuire :


Removed file: http://bugs.python.org/file12879/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Gabriel Genellina

Gabriel Genellina  added the comment:

Either my browser got crazy, or you uploaded the same patch (.py) twice.

--
nosy: +gagenellina

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Jake McGuire

Changes by Jake McGuire :


Added file: http://bugs.python.org/file12880/pickle.py.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5084] unpickling does not intern attribute names

2009-01-27 Thread Jake McGuire

New submission from Jake McGuire :

Instance attribute names are normally interned - this is done in 
PyObject_SetAttr (among other places).  Unpickling (in pickle and 
cPickle) directly updates __dict__ on the instance object.  This 
bypasses the interning so you end up with many copies of the strings 
representing your attribute names, which wastes a lot of space, both in 
RAM and in pickles of sequences of objects created from pickles.  Note 
that the native python memcached client uses pickle to serialize 
objects.

>>> import pickle
>>> class C(object):
...   def __init__(self, x):
... self.long_attribute_name = x
...
>>> len(pickle.dumps([pickle.loads(pickle.dumps(C(None), 
pickle.HIGHEST_PROTOCOL)) for i in range(100)], 
pickle.HIGHEST_PROTOCOL))
3658
>>> len(pickle.dumps([C(None) for i in range(100)], 
pickle.HIGHEST_PROTOCOL))
1441
>>>

Interning the strings on unpickling makes the pickles smaller, and at 
least for cPickle actually makes unpickling sequences of many objects 
slightly faster.  I have included proposed patches to cPickle.c and 
pickle.py, and would appreciate any feedback.

--
components: Library (Lib)
files: cPickle.c.diff
keywords: patch
messages: 80670
nosy: jakemcguire
severity: normal
status: open
title: unpickling does not intern attribute names
type: resource usage
Added file: http://bugs.python.org/file12879/cPickle.c.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com