[issue17810] Implement PEP 3154 (pickle protocol 4)

2013-06-03 Thread Stefan Mihaila

Stefan Mihaila added the comment:

On 6/3/2013 9:33 PM, Alexandre Vassalotti wrote:
> Alexandre Vassalotti added the comment:
>
> Stefan, could you address my review comments soon? The improved support for 
> globals is the only big piece missing from the implementation of PEP, which I 
> would like to get done and submitted by the end of the month.
Yes, I apologize for the delay again. Today is my last exam this semester, so
I'll do my best to get it done as soon as possible (hopefully this weekend).

--

___
Python tracker 
<http://bugs.python.org/issue17810>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2013-05-11 Thread Stefan Mihaila

Changes by Stefan Mihaila :


Added file: http://bugs.python.org/file30216/d0c3a8d4947a.diff

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue17810] Implement PEP 3154 (pickle protocol 4)

2013-05-10 Thread Stefan Mihaila

Stefan Mihaila added the comment:

On 5/10/2013 11:46 PM, Stefan Mihaila wrote:
> Changes by Stefan Mihaila :
>
>
> --
> nosy: +mstefanro
Hello. I've worked on implementing PEP 3154 as part of GSoC 2012.
My work is available in a repo at [1].
The blog I've used to report my work is at [2] and contains some useful
information.

Here is a list of features that were implemented as part of GSoC:

* Pickling of very large bytes and strings
* Better pickling of small strings and bytes (+ tests)
* Native pickling of sets and frozensets (+ tests)
* Self-referential sets and frozensets (+ tests)
* Implicit memoization (BINPUT is implicit for certain opcodes)
   - The argument against this was that pickletools.optimize would
     not be able to prevent memoization of objects that are not
     referred to later. For such situations, a special flag could be
     added at the beginning, which indicates whether implicit BINPUT
     is enabled. This flag could be added as one of the higher-order
     bits of the protocol version. For instance:
         PROTO \x04 + BINUNICODE ".."
     and
         PROTO \x84 + BINUNICODE ".." + BINPUT 1
     would be equivalent. Then pickletools.optimize could choose whether
     it wants implicit BINPUT or not. Sure, this would complicate matters
     and it's not for me to decide whether it's worth it.
     In my midterm report at [3] there are some examples of what a
     pickled string looks like in v4 without implicit memoization, and
     some size comparisons to v3.
* Pickling of nested globals, methods etc. (+ tests)
* Pickling calls to __new__ with keyword args (+ tests)
* A BAIL_OUT opcode was always output when pickling failed, so that
   the Pickler and Unpickler can both be run at once on different ends
   of a stream. The Pickler could guarantee to always send a
   correct pickle on the stream. The Unpickler would never end up hanging
   when pickling failed mid-work.
   - At the time, Alexandre suggested this would probably not be a great
     idea because it should be the responsibility of the protocol used
     to ensure some consistency. However, this does not appear to be
     a trivial task to achieve. The size of the pickle is not known in
     advance, and waiting for the Pickler to complete before sending
     the data via the stream is not as efficient, because the Unpickler
     would not be able to run at the same time.
     The write and read methods of the stream would have to be wrapped and
     some escape sequence used. This would probably increase the size of
     the pickled string in the worst case of the escape sequence. My
     thought was that it would be beneficial for the average user to have
     the guarantee that the Pickler always outputs a correct pickle to a
     stream, even if it raises an exception.
* Other minor changes that I can't really remember.
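
To make the flag idea from the implicit-memoization item above concrete, here is a minimal sketch of how such a PROTO argument byte could be split. It is purely hypothetical: the flag was only proposed in this message, and the names EXPLICIT_MEMO_FLAG, VERSION_MASK and split_proto_byte are made up for illustration. It assumes, following the \x04 / \x84 example, that the high bit set means BINPUT is written explicitly.

```python
# Hypothetical sketch: this flag was only proposed in the message above
# and is not part of any released pickle protocol.
EXPLICIT_MEMO_FLAG = 0x80  # assumed: high bit set => BINPUT is written explicitly
VERSION_MASK = 0x7F        # assumed: low seven bits carry the protocol version


def split_proto_byte(proto_byte):
    """Return (protocol_version, implicit_memoization) for a PROTO argument."""
    version = proto_byte & VERSION_MASK
    implicit = not (proto_byte & EXPLICIT_MEMO_FLAG)
    return version, implicit


print(split_proto_byte(0x04))  # version 4, implicit memoization on
print(split_proto_byte(0x84))  # version 4, implicit memoization off
```

With this layout an unpickler would read the version as usual and only consult the extra bit when deciding whether to memoize on its own.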

Although I'm sure Alexandre had his good reasons to start the work from
scratch, it would be a shame to waste all this work. The features mentioned
above are working and although the implementation may not be ideal (I don't
have the cpython experience of a regular dev), I'm sure useful bits can be
extracted from it.
Alexandre suggested that I extract bits and post patches, so I have attached,
for now, support for pickling methods and nested globals (+ tests).
I'm willing to do so for some or all of the remaining features, should this be
requested and should I have the necessary time to do so.

[1] https://bitbucket.org/mstefanro/pickle4/
[2] https://pypickle4.wordpress.com/
[3] https://gist.github.com/mstefanro/3086647

--
Added file: http://bugs.python.org/file30213/methods.patch

___
Python tracker 
<http://bugs.python.org/issue17810>
___
diff -r 780722877a3e Lib/pickle.py
--- a/Lib/pickle.py Wed May 01 13:16:11 2013 -0700
+++ b/Lib/pickle.py Sat May 11 03:06:28 2013 +0300
@@ -23,7 +23,7 @@
 
 """
 
-from types import FunctionType, BuiltinFunctionType
+from types import FunctionType, BuiltinFunctionType, MethodType, ModuleType
 from copyreg import dispatch_table
 from copyreg import _extension_registry, _inverted_registry, _extension_cache
 from itertools import islice
@@ -34,10 +34,44 @@
 import io
 import codecs
 import _compat_pickle
+import builtins
+from inspect import ismodule, isclass
 
 __all__ = ["PickleError", "PicklingError", "UnpicklingError", "Pickler",
"Unpickler", "dump", "dumps", "load", "loads"]
 
+# Issue 15397: Unbinding of methods
+# Adds the possibility to unbind methods as well as a few definitio

[issue17810] Implement PEP 3154 (pickle protocol 4)

2013-05-10 Thread Stefan Mihaila

Changes by Stefan Mihaila :


Removed file: http://bugs.python.org/file30211/780722877a3e.diff

___
Python tracker 
<http://bugs.python.org/issue17810>
___



[issue17810] Implement PEP 3154 (pickle protocol 4)

2013-05-10 Thread Stefan Mihaila

Changes by Stefan Mihaila :


Added file: http://bugs.python.org/file30211/780722877a3e.diff

___
Python tracker 
<http://bugs.python.org/issue17810>
___



[issue17810] Implement PEP 3154 (pickle protocol 4)

2013-05-10 Thread Stefan Mihaila

Changes by Stefan Mihaila :


--
nosy: +mstefanro

___
Python tracker 
<http://bugs.python.org/issue17810>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2013-04-22 Thread Stefan Mihaila

Stefan Mihaila added the comment:

Hello. I apologize once again for not finalizing my work, but since I started 
my final year of university and a job, I have been busy pretty much all the 
time. I would really like to finish this as I've really enjoyed working on it, 
and everything in PEP 3154 and some other stuff has already been implemented. 
The only remaining part was finalizing the code review and fixing some memory 
leaks that gave me some headaches at the time.
I would really appreciate it if you could give me a few more days before 
deciding to start a new implementation from scratch. I'll get to fixing those 
memory leaks in the next couple of days and then the code review can be 
finalized.
Would this be acceptable to you?

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue15773] `is' operator returns False on classmethods

2012-08-23 Thread Stefan Mihaila

Changes by Stefan Mihaila :


--
type:  -> behavior

___
Python tracker 
<http://bugs.python.org/issue15773>
___



[issue15773] `is' operator returns False on classmethods

2012-08-23 Thread Stefan Mihaila

New submission from Stefan Mihaila:

Here are a few counter-intuitive outputs:

>>> dict.fromkeys is dict.fromkeys
False

>>> id(dict.fromkeys) == id(dict.fromkeys)
True

>>> x=dict.fromkeys; id(x) == id(x)
True

>>> x=dict.fromkeys; id(x) == id(dict.fromkeys)
False

>>> x=dict.fromkeys; y=dict.fromkeys; id(x),id(y),id(dict.fromkeys)
(3924, 39064632, 39065144)

>>> a=id(dict.fromkeys); x=dict.fromkeys; b=id(dict.fromkeys); a,b
(3924, 39480568)

Attached is a failing test.
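
A minimal sketch of what is probably going on, offered as an assumption rather than a confirmed diagnosis: each access to dict.fromkeys appears to build a fresh builtin-method object bound to dict, so two objects that are alive at the same time are distinct, while temporaries that are freed immediately can end up reusing the same id().

```python
# Each attribute access is assumed to build a new builtin-method object
# bound to dict (CPython behaviour).
x = dict.fromkeys
y = dict.fromkeys

print(x is y)   # identity of two wrapper objects that are both alive
print(x == y)   # equality: same underlying function bound to the same class
print(x.__self__ is dict, y.__self__ is dict)
```

Comparing with == (or comparing __self__ and __name__) sidesteps the identity question entirely.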

--
files: is_on_classmethods.py
messages: 168967
nosy: mstefanro
priority: normal
severity: normal
status: open
title: `is' operator returns False on classmethods
versions: Python 3.3
Added file: http://bugs.python.org/file26978/is_on_classmethods.py

___
Python tracker 
<http://bugs.python.org/issue15773>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2012-08-23 Thread Stefan Mihaila

Stefan Mihaila added the comment:

Are there also some known techniques for tracking down memory leaks?

I've played around with sys.gettotalrefcount to narrow down
the place where the leaks occur, but they seem to only occur in v4,
i.e. pickle.dumps(3.0+1j, 4) leaks but pickle.dumps(3.0+1j, 3) does
not.
However, there appears to be no difference between the code that gets
executed in v3 and the code executed in v4.
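
For reference, the kind of check described above can be sketched as follows. This is only an illustration, not the script that was actually used: the helper name total_refcount_growth is made up, and sys.gettotalrefcount only exists on debug builds of CPython, so the helper returns None elsewhere.

```python
import pickle
import sys


def total_refcount_growth(func, repeat=1000):
    """Return the growth in the interpreter-wide refcount over `repeat` calls.

    Returns None on release builds, where sys.gettotalrefcount is missing.
    """
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None
    func()  # warm-up call so one-time caches are not counted as a leak
    before = get_total()
    for _ in range(repeat):
        func()
    return get_total() - before


for proto in (3, 4):
    growth = total_refcount_growth(lambda: pickle.dumps(3.0 + 1j, proto))
    print(proto, growth)
```

A growth that scales with `repeat` points at a reference leak; a small constant usually does not.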

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2012-08-22 Thread Stefan Mihaila

Stefan Mihaila added the comment:

>- I don't really like the idea of changing the semantics of the PUT and GET 
>opcodes. I would prefer new opcodes if possible.

Well, the semantics of PUT and GET haven't really changed. It's just that the 
PUT opcode is not generated anymore and memoization is done "in agreement" 
(i.e. both the pickler and the unpickler know when to memoize so their memo 
tables stay in sync). So, in fact, it's the semantics of the other opcodes that 
have slightly changed.

>- I would like to see benchmarks for this change.

I've tried the following two snippets with timeit:

./python3.3 -m timeit \
-s 'from pickle import dumps' \
-s 'd=["a"]*100' \
'dumps(d,3)' # replace 3 with 4 for comparison

./python3.3 -m timeit \
-s 'from pickle import dumps' \
-s 'd=list(map(chr,range(0,256)))' \
'dumps(d,3)' # replace 3 with 4 for comparison
             # you can also use loads(dumps(d,3)) here (after importing
             # loads as well) to benchmark both operations at once


The first one generates 99 BINGET opcodes. It generates 1 BINPUT opcode in 
pickle3 and no BINPUT opcodes in pickle4.
There appears to be no noticeable speed difference.

The second one generates no BINGET opcodes. It generates no BINPUT opcodes in 
v4 and 256 BINPUT opcodes in v3. It appears the v4 one is slightly faster, but 
I have a hard time comparing correctly, given that the measurements seem to 
have a very large standard deviation (v4 gets times somewhere between 32.3 and 
44.2, whereas v3 gets times between 37.7 and 52.2).

I'm not sure this is the best way to benchmark, so let me know what is usually 
used.
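
One way to make such comparisons a bit more repeatable, sketched here with the standard pickletools and timeit modules, is to count the opcodes directly and take the best of several timing runs. The helper names opcode_counts and best_time are made up for this sketch, and it only exercises protocol 3 of whatever interpreter runs it; passing 4 would measure protocol 4 as it eventually shipped, which is not the same as the GSoC branch discussed here.

```python
import pickle
import pickletools
import timeit
from collections import Counter


def opcode_counts(obj, protocol):
    """Count opcode names in the pickle of `obj` at the given protocol."""
    data = pickle.dumps(obj, protocol)
    return Counter(op.name for op, arg, pos in pickletools.genops(data))


def best_time(obj, protocol, repeat=5, number=2000):
    """Best-of-`repeat` time in seconds for `number` dumps() calls."""
    timer = timeit.Timer(lambda: pickle.dumps(obj, protocol))
    return min(timer.repeat(repeat=repeat, number=number))


repeated = ["a"] * 100                 # one string referenced 100 times
distinct = list(map(chr, range(256)))  # 256 different one-character strings

for name, data in (("repeated", repeated), ("distinct", distinct)):
    counts = opcode_counts(data, 3)
    print(name, counts["BINPUT"], counts["BINGET"], best_time(data, 3))
```

Taking the minimum over several repeats tends to be less sensitive to background noise than a single average.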

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2012-08-19 Thread Stefan Mihaila

Stefan Mihaila added the comment:

There are still some upcoming changes.

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2012-08-18 Thread Stefan Mihaila

Stefan Mihaila added the comment:

Maybe you can set this issue as the superseder of issue9269, because the 
patches there have already been applied here.

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue15642] Integrate pickle protocol version 4 GSoC work by Stefan Mihaila

2012-08-14 Thread Stefan Mihaila

Stefan Mihaila added the comment:

Maybe we could postpone the review process for a few days
until I fix some known issues?

--

___
Python tracker 
<http://bugs.python.org/issue15642>
___



[issue633930] Nested class __name__

2012-07-29 Thread Stefan Mihaila

Stefan Mihaila added the comment:

Only an issue in Python2.

>>> A.B.__qualname__
'A.B'
>>> repr(A.B)
"<class '__main__.A.B'>"
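
The session above omits the class definition. A minimal sketch that produces a nested class of this shape under Python 3 (the names A and B are taken from the session, the class bodies are assumed to be empty):

```python
class A:
    class B:
        pass


print(A.B.__name__)      # the short name only
print(A.B.__qualname__)  # the dotted path, available since Python 3.3
print(repr(A.B))         # the repr is built from the qualified name
```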

--
nosy: +mstefanro
versions: +Python 2.6, Python 2.7

___
Python tracker 
<http://bugs.python.org/issue633930>
___



[issue9269] Cannot pickle self-referencing sets

2012-07-27 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Attaching patch for fixing a test and adding better testing of sets.

--
Added file: http://bugs.python.org/file26539/sets-test.patch

___
Python tracker 
<http://bugs.python.org/issue9269>
___



[issue1062277] Pickle breakage with reduction of recursive structures

2012-07-27 Thread Stefan Mihaila

Changes by Stefan Mihaila :


--
nosy: +mstefanro

___
Python tracker 
<http://bugs.python.org/issue1062277>
___



[issue9269] Cannot pickle self-referencing sets

2012-07-26 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

I have attached a fix to this issue (and implicitly issue1062277).

This patch allows pickling self-referential sets by implementing a 
set.__reduce__ which uses states as opposed to ctor parameters.

Before:
>>> s=set([1,2,3])
>>> s.__reduce__()
(<class 'set'>, ([1, 2, 3],), None)
>>> len(pickle.dumps(s,1))
38

After:
>>> s=set([1,2,3])
>>> s.__reduce__()
(<class 'set'>, (), [1, 2, 3])
>>> len(pickle.dumps(s,1))
36

Basically what this does is: instead of unpickling the set by doing 
set([1,2,3]) it does s=set(); s.__setstate__([1,2,3]).
States are supported in all versions of pickle so this shouldn't break anything.
Creating empty data structures and then filling them is the way pickle does it 
for all mutable containers in order to allow self-references (with the 
exception of sets, of course).

Since memoization is performed after the object is created but before its state 
is set, an object's pickled state can contain references to the object itself.

from pickle import dumps, loads

class A:
    pass

a=A()
s=set([a])
a.s=s

s_=loads(dumps(s,1))   # with the patch applied
next(iter(s_)).s is s_ # True
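
The same create-then-fill pattern can already be observed with a list, which needs no patch and no helper class. The sketch below pickles a list that contains itself and prints the opcode sequence, so the order "create empty container, memoize it, then append a reference to the memo entry" is visible.

```python
import pickle
import pickletools

lst = []
lst.append(lst)  # the list contains itself

data = pickle.dumps(lst, 1)
restored = pickle.loads(data)
print(restored[0] is restored)

# The opcode order shows the empty list being created and memoized
# before its single item (a reference back to the memo entry) is appended.
names = [op.name for op, arg, pos in pickletools.genops(data)]
print(names)
```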

Note that this fix only applies for sets, not frozensets. Frozensets are a 
different matter, because their immutability makes it impossible to set their 
state. Self-referential frozensets are currently supported in my implementation 
of pickle4 using a trick similar to what tuples use. But the trick works more 
easily there because frozensets have their own opcodes, like tuples.
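
For comparison only, and not taken from the implementation discussed in this message: protocol 4 as it was eventually released in CPython also has dedicated set and frozenset opcodes, which can be listed with pickletools. The helper name opcode_names is made up for this sketch.

```python
import pickle
import pickletools


def opcode_names(obj, protocol):
    """Return the opcode names of the pickle of `obj` at the given protocol."""
    data = pickle.dumps(obj, protocol)
    return [op.name for op, arg, pos in pickletools.genops(data)]


print(opcode_names({1, 2, 3}, 3))             # goes through a generic REDUCE call
print(opcode_names({1, 2, 3}, 4))             # empty set first, items added after
print(opcode_names(frozenset({1, 2, 3}), 4))  # items first, then one FROZENSET
```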

Also note that applying this patch makes 
Lib/test/pickletester.py:test_pickle_to_2x fail (DATA3 and DATA6 there contain 
pickled data of sets, which naturally have changed).
I'll upload a patch fixing this as well as adding one or more tests for sets 
soon.

--
keywords: +patch
nosy: +mstefanro
Added file: http://bugs.python.org/file26533/self_referential-sets.patch

___
Python tracker 
<http://bugs.python.org/issue9269>
___



[issue15397] Unbinding of methods

2012-07-22 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Andrew, thanks for creating a separate issue (the refleak was very rare and I 
thought I'd put it in the same place, but now I realize it was a bad idea).

Richard, actually, the isinstance(self, type) check I mentioned earlier would 
have to be before the hasattr(f, '__func__') check, because Python 
classmethods provide a __func__ too:

    import types

    def unbind(f):
        self = getattr(f, '__self__', None)
        if self is not None and not isinstance(self, types.ModuleType) \
                            and not isinstance(self, type):
            # Python methods expose the plain function directly
            if hasattr(f, '__func__'):
                return f.__func__
            # builtin methods: look the name up on the type of __self__
            return getattr(type(f.__self__), f.__name__)
        raise TypeError('not a bound method')
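
A quick way to see what this version of unbind() accepts and rejects is sketched below. The function body repeats the definition above; the Greeter class and its method names are made up for the example.

```python
import types


def unbind(f):
    self = getattr(f, '__self__', None)
    if self is not None and not isinstance(self, types.ModuleType) \
                        and not isinstance(self, type):
        if hasattr(f, '__func__'):
            return f.__func__
        return getattr(type(f.__self__), f.__name__)
    raise TypeError('not a bound method')


class Greeter:  # hypothetical example class
    def hello(self):
        return "hi"

    @classmethod
    def make(cls):
        return cls()


print(unbind(Greeter().hello) is Greeter.hello)  # Python bound method
print(unbind([].append) is list.append)          # builtin bound method

# Module-level builtins and classmethods are rejected.
for not_bound in (len, Greeter.make, dict.fromkeys):
    try:
        unbind(not_bound)
    except TypeError as exc:
        print(type(exc).__name__, exc)
```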

Anyway, I'm not convinced this is worth adding anymore. As Antoine Pitrou 
suggested on the ml, it would probably be a better idea if I implemented 
__reduce__ for builtin methods as well as Python methods rather than having a 
separate opcode for pickling methods.

--

___
Python tracker 
<http://bugs.python.org/issue15397>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue15397] Unbinding of methods

2012-07-22 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Richard, yes, I think that would work, I didn't think of using f.__self__'s 
type.
You might want to replace
  if self is not None and not isinstance(self, types.ModuleType):
with
  if self is not None and not isinstance(self, types.ModuleType) \
  and not isinstance(self, type):
to correctly raise an exception when called on a classmethod too.

--

___
Python tracker 
<http://bugs.python.org/issue15397>
___



[issue15397] Unbinding of methods

2012-07-19 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Doesn't the definition I've added at the end of methodobject.c suffice? 
(http://codereview.appspot.com/6425052/patch/1/10) Or should the macro be 
removed altogether?

--

___
Python tracker 
<http://bugs.python.org/issue15397>
___



[issue15397] Unbinding of methods

2012-07-19 Thread Stefan Mihaila

Stefan Mihaila  added the comment:

Yes, the patch is at http://codereview.appspot.com/6425052/
The code there also contains some tests I've written for functools.unbind.

--
Added file: http://bugs.python.org/file26439/unbind_test.patch

___
Python tracker 
<http://bugs.python.org/issue15397>
___



[issue15397] Unbinding of methods

2012-07-19 Thread Stefan Mihaila

New submission from Stefan Mihaila :

In order to implement pickling of instance methods, a means of separating
the object and the unbound method is necessary.

This is easily done for Python methods (f.__self__ and f.__func__),
but not all builtins support __func__. Moreover, there currently
appears to be no good way to distinguish functions from bound methods.

As a first step in solving this issue, I have attached a patch which:
1) adds __func__ for all function types
2) adds a few new definitions in the types module (AllFunctionTypes etc.)
3) adds isanyfunction(), isanyboundfunction(), isanyunboundfunction() in
  inspect (admittedly these are bad names)
4) adds functools.unbind

In case applying this patch is being considered, serious review is necessary,
as I'm not knowledgeable about CPython internals.
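
A short sketch of the asymmetry described above, as it can be observed on an unpatched CPython 3 interpreter. The Greeter class is made up for the example.

```python
class Greeter:  # hypothetical example class
    def hello(self):
        return "hi"


g = Greeter()
bound_py = g.hello          # Python bound method
bound_builtin = [].append   # builtin bound method

# Python methods expose both halves.
print(bound_py.__self__ is g, bound_py.__func__ is Greeter.hello)

# Builtin methods expose __self__ but have no __func__.
print(hasattr(bound_builtin, "__self__"), hasattr(bound_builtin, "__func__"))

# A module-level builtin and a bound builtin method share one type,
# so the type alone does not tell them apart.
print(type(len).__name__, type(bound_builtin).__name__)
```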

--
components: Library (Lib)
files: func.patch
keywords: patch
messages: 165845
nosy: mstefanro
priority: normal
severity: normal
status: open
title: Unbinding of methods
type: enhancement
versions: Python 3.3
Added file: http://bugs.python.org/file26438/func.patch

___
Python tracker 
<http://bugs.python.org/issue15397>
___



[issue15397] Unbinding of methods

2012-07-19 Thread Stefan Mihaila

Changes by Stefan Mihaila :


--
nosy: +alexandre.vassalotti, ncoghlan, rhettinger

___
Python tracker 
<http://bugs.python.org/issue15397>
___