[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-26 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

People may use SystemError for other purposes, but the docs are pretty clear it 
is only for internal errors that indicate an interpreter bug: 
http://docs.python.org/dev/library/exceptions.html#SystemError

Extension modules or an embedding application passing in bad data that fails a 
validation check isn't such a situation, so raising TypeError or ValueError is 
more appropriate.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-25 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
 
 Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
 
 On Thu, Feb 24, 2011 at 3:54 PM, Antoine Pitrou rep...@bugs.python.org 
 wrote:
 ..
 I've committed the part of the patch which disallows a NULL data pointer
 with PyMemoryView_FromBuffer in r88550 and r88551.
 
 Is it possible to create such buffer in Python (other than by
 exploiting a bug or writing a rogue extension module)?  If not, this
 should be a SystemError or even just an assert() rather than
 ValueError.

We normally raise a SystemError for invalid NULL pointers, not
ValueErrors.

SystemError is used for things that go wrong at the C level and
indicate a likely programming error at that level. The origin of
that error is not important, but the check will usually happen
in the interpreter.

FWIW, it's also used in C extensions for much the same reason,
so in the wild you'll also find SystemErrors raised by
3rd party extensions.

--
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Committed in r88546 (3.3) and r88548 (3.2).

Note that a simple work-around before 3.2.1 is to spell encoding as 'latin-1' 
or 'iso-8859-1' in pickle.loads().

--
components: +Extension Modules -Library (Lib)
resolution:  - fixed
stage: commit review - committed/rejected
status: open - closed
versions: +Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

I've committed the part of the patch which disallows a NULL data pointer with 
PyMemoryView_FromBuffer in r88550 and r88551.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

On Thu, Feb 24, 2011 at 3:54 PM, Antoine Pitrou rep...@bugs.python.org wrote:
..
 I've committed the part of the patch which disallows a NULL data pointer
 with PyMemoryView_FromBuffer in r88550 and r88551.

Is it possible to create such buffer in Python (other than by
exploiting a bug or writing a rogue extension module)?  If not, this
should be a SystemError or even just an assert() rather than
ValueError.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  I've committed the part of the patch which disallows a NULL data pointer
  with PyMemoryView_FromBuffer in r88550 and r88551.
 
 Is it possible to create such buffer in Python (other than by
 exploiting a bug or writing a rogue extension module)?  If not, this
 should be a SystemError or even just an assert() rather than
 ValueError.

I'm against asserts for such use, since they get disabled in non-debug
mode (which is the mode 99.99% of users run in).

As for SystemError, it means Internal error in the Python interpreter,
which isn't the case here (most likely it's an error in an extension
module instead, possibly a third-party one).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

It seems appropriate to consult python-dev on this.  I thought
ValueError was for values that are valid Python objects but out of
acceptable range of the function.  Errors that can only be triggered
in C code normally handled with either assert() or raise SystemError.
I think you are splitting hairs too thin by distinguishing between
stdlib and 3rd party extensions.

On Thu, Feb 24, 2011 at 4:07 PM, Antoine Pitrou rep...@bugs.python.org wrote:

 Antoine Pitrou pit...@free.fr added the comment:

  I've committed the part of the patch which disallows a NULL data pointer
  with PyMemoryView_FromBuffer in r88550 and r88551.

 Is it possible to create such buffer in Python (other than by
 exploiting a bug or writing a rogue extension module)?  If not, this
 should be a SystemError or even just an assert() rather than
 ValueError.

 I'm against asserts for such use, since they get disabled in non-debug
 mode (which is the mode 99.99% of users run in).

 As for SystemError, it means Internal error in the Python interpreter,
 which isn't the case here (most likely it's an error in an extension
 module instead, possibly a third-party one).

 --

 ___
 Python tracker rep...@bugs.python.org
 http://bugs.python.org/issue11286
 ___


--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 It seems appropriate to consult python-dev on this.  I thought
 ValueError was for values that are valid Python objects but out of
 acceptable range of the function.  Errors that can only be triggered
 in C code normally handled with either assert() or raise SystemError.
 I think you are splitting hairs too thin by distinguishing between
 stdlib and 3rd party extensions.

Hey, I'm not sure who's splitting hairs here... Nick was ok with the
initial patch and that was sufficient as far as I'm concerned, but you
can ask python-dev if you care.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-24 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

A SystemError indicates that an internal API was given bogus input or produces 
bogus output (i.e. we screwed up somewhere, or a third party is messing with 
interfaces they shouldn't be)

If data validation fails for part of the public C API (whether it is visible to 
Python code or not), then ValueError is the right thing to raise.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
 
 I am not sure PyUnicode_Decode() should treat NULL as an empty string.

Definitely not. That would hide programming errors.

--
nosy: +lemburg
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  I am not sure PyUnicode_Decode() should treat NULL as an empty string.
 
 Definitely not. That would hide programming errors.

Well, this could break some third-party code.

--
title: Some trivial python 2.x pickles fails to load in   Python 3.2 - 
Some trivial python 2.x pickles fails to load in Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Antoine Pitrou wrote:
 
 Antoine Pitrou pit...@free.fr added the comment:
 
 I am not sure PyUnicode_Decode() should treat NULL as an empty string.

 Definitely not. That would hide programming errors.
 
 Well, this could break some third-party code.

If that code passes NULL in as buffer s, that 3rd party code is
already broken and the patch would hide this fact.

--
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Antoine Pitrou wrote:
  
  Antoine Pitrou pit...@free.fr added the comment:
  
  I am not sure PyUnicode_Decode() should treat NULL as an empty string.
 
  Definitely not. That would hide programming errors.
  
  Well, this could break some third-party code.
 
 If that code passes NULL in as buffer s, that 3rd party code is
 already broken and the patch would hide this fact.

Why broken? Passing NULL as a pointer and 0 as the length doesn't sound
broken.
Other APIs such as PyString_FromStringAndSize() allow exactly this
convention.

--
title: Some trivial python 2.x pickles fails to load in   Python 3.2 - 
Some trivial python 2.x pickles fails to load in Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Antoine Pitrou wrote:
 
 Antoine Pitrou pit...@free.fr added the comment:
 
 Antoine Pitrou wrote:

 Antoine Pitrou pit...@free.fr added the comment:

 I am not sure PyUnicode_Decode() should treat NULL as an empty string.

 Definitely not. That would hide programming errors.

 Well, this could break some third-party code.

 If that code passes NULL in as buffer s, that 3rd party code is
 already broken and the patch would hide this fact.
 
 Why broken? Passing NULL as a pointer and 0 as the length doesn't sound
 broken.
 Other APIs such as PyString_FromStringAndSize() allow exactly this
 convention.

Right. They allow for this because those are object constructors
that allow for creating objects without pre-defined content
(hence the NULL). The 0 length feature is just a side-effect,
not the main reason why we allow NULLs to be passed in as
content buffer to those constructors.

PyUnicode_Decode() et al. are conversion functions and these
require valid content to work on. Passing in a NULL pointer
does not fit that specification and so allowing for this
would hide programming errors.

E.g. an application using PyUnicode_Decode()
might have hit a C lib error and forgets to check the pointer
for NULL. It then calls PyUnicode_Decode() and the application
continues happily without reporting the error.

That's a broken application.

--
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 PyUnicode_Decode() et al. are conversion functions and these
 require valid content to work on. Passing in a NULL pointer
 does not fit that specification and so allowing for this
 would hide programming errors.

Valid content doesn't mean a lot when the length is 0.
What is a valid 0-length string compared to an invalid one?
What if the pointer is non-NULL but segfaults when trying to dereference
it? Is it valid?

Moreover, malloc() is allowed by POSIX to return NULL when called with a
0 length:

If size is 0, either a null pointer or a unique pointer that can
be successfully passed to free() shall be returned.

(http://www.opengroup.org/onlinepubs/007904875/functions/malloc.html)

... which means that such a pointer can then, depending on the platform,
get passed (legitimately) to PyUnicode_Decode().

So, IMO, practicality beats purity here. Especially since it is bound to
land in a bugfix release (3.2.1), which users don't expect to produce
regressions in their own code.

OTOH, I agree that a NULL pointer combined with non-0 length could
produce an explicit error.

--
title: Some trivial python 2.x pickles fails to load in   Python 3.2 - 
Some trivial python 2.x pickles fails to load in Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Antoine Pitrou wrote:
 
 Antoine Pitrou pit...@free.fr added the comment:
 
 PyUnicode_Decode() et al. are conversion functions and these
 require valid content to work on. Passing in a NULL pointer
 does not fit that specification and so allowing for this
 would hide programming errors.
 
 Valid content doesn't mean a lot when the length is 0.
 What is a valid 0-length string compared to an invalid one?
 What if the pointer is non-NULL but segfaults when trying to dereference
 it? Is it valid?
 
 Moreover, malloc() is allowed by POSIX to return NULL when called with a
 0 length:
 
 If size is 0, either a null pointer or a unique pointer that can
 be successfully passed to free() shall be returned.
 
 (http://www.opengroup.org/onlinepubs/007904875/functions/malloc.html)
 
 ... which means that such a pointer can then, depending on the platform,
 get passed (legitimately) to PyUnicode_Decode().

... and Python has for years made sure that PyMem_Malloc() et al.
return a non-NULL pointer when passed a size 0 value (see pymem.h for
details), since the above was a really poor design choice.

A lot of Python code relies on those functions returning NULL only
in case of an error.

 So, IMO, practicality beats purity here. Especially since it is bound to
 land in a bugfix release (3.2.1), which users don't expect to produce
 regressions in their own code.

Nope. Your suggestion would be a new feature and those are not
allowed in patch level releases.

I'm -1 on the idea for the reasons already stated.

--
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  So, IMO, practicality beats purity here. Especially since it is bound to
  land in a bugfix release (3.2.1), which users don't expect to produce
  regressions in their own code.
 
 Nope. Your suggestion would be a new feature and those are not
 allowed in patch level releases.

What new feature are you talking about? I think you misunderstood the
actual issue: NULL as an empty string *worked* in 3.1 and that's why we
have a regression with _pickle here.

So, in the end, you are the one proposing a change in actual behaviour,
while I'm trying to minimize changes (and breakage).

--
title: Some trivial python 2.x pickles fails to load in   Python 3.2 - 
Some trivial python 2.x pickles fails to load in Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Jesús Cea Avión

Jesús Cea Avión j...@jcea.es added the comment:

What if we commit Antoine patch for 3.2.x, and the correct patch for py3k 
trunk?.

I am actually +1 to Marc-Andre. I feel in my guts that the provided patch is 
hidding a deeper issue. But avoiding surprises for third parties in 3.2.1 is a 
good idea.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  Nope. Your suggestion would be a new feature and those are not
  allowed in patch level releases.
 
 What new feature are you talking about? I think you misunderstood the
 actual issue: NULL as an empty string *worked* in 3.1

And, with a very high likelihood, it also worked in 2.7, 2.6, etc.
So, feel free to suggest a compatibility breakage in 3.3 if you think
that's better (*), but I don't think that's acceptable in 3.2.1, where
the issue should be fixed anyway.

(*) which really means provide a patch :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Please go with Alexander's solution of fixing the higher level code rather than 
silently trying to introduce a new feature in PyUnicode_Decode() that hides 
programming errors.

Thanks.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Please go with Alexander's solution of fixing the higher level code
 rather than silently trying to introduce a new feature in
 PyUnicode_Decode() that hides programming errors.

I'm sorry, I'm perfectly fine with my own patch, so someone else will
have to propose an alternative patch if they want you. I am certainly
not at your disposal for any orders you might give :)

Remember, this is open source, so active contributors have the last say,
not by-standers. Thank you.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Jesús Cea Avión wrote:
 
 Jesús Cea Avión j...@jcea.es added the comment:
 
 What if we commit Antoine patch for 3.2.x, and the correct patch for py3k 
 trunk?.
 
 I am actually +1 to Marc-Andre. I feel in my guts that the provided patch is 
 hidding a deeper issue. But avoiding surprises for third parties in 3.2.1 is 
 a good idea.

Jesus, could you please check whether Alexander's solution fixes
the problem ?

Thanks.

--
title: Some trivial python 2.x pickles fails to load in Python 3.2 - Some 
trivial python 2.x pickles fails to load in  Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Oops, I hadn't seen Alexander's patch. Sorry.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

So, about Alexander's patch:
- it lacks a test
- it doesn't solve the issue with PyUnicode_Decode's confusing error message 
when a NULL is passed (ValueError: operation forbidden on released memoryview 
object); if we want to disallow NULL, we should have a clean error message

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Antoine is right, my patch is only of discussion quality and if my approach 
gets support, I will produce a more polished patch.

While I am fairly certain that this bug should be fixed where it was 
introduced, namely in the _pickle module, I am not sure patching both 
load_binstring() and load_short_binstring() is the right approach.

It may be better to modify _Unpickler_Read() so that it returns 
self-input_buffer (or even self-input_buffer + self-next_read_idx) for zero 
n.  This would be a cleaner design similar to that for PyMem_Malloc() et al. 
(Since _Unpickler_Read() is private API and the comment documenting it does not 
specify that it returns NULL for n = 0, I think this can be done in a bugfix 
release.  Furthermore, I reviewed the uses of _Unpickler_Read() with variable 
and thus potentially zero size and one of them is followed by a null check for 
s.)

On the other hand, my patch also eliminates redundant call to _Unpickler_Read() 
and makes load_binstring() and load_short_binstring() logic similar to that in 
load_counted_long().  The main advantage, of course is skipping 
PyUnicode_Decode() which will load a codec potentially triggering an import and 
execution of python code.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 It may be better to modify _Unpickler_Read() so that it returns
 self-input_buffer (or even self-input_buffer + self-next_read_idx)
 for zero n.

I agree this would be better for readability and maintainability.

 On the other hand, my patch also eliminates redundant call to
 _Unpickler_Read() and makes load_binstring() and
 load_short_binstring() logic similar to that in load_counted_long().
 The main advantage, of course is skipping PyUnicode_Decode() which
 will load a codec potentially triggering an import and execution of
 python code.

Well, a theoretical argument could be made that some codec could return
a non-empty string when asked to decode an empty bytestring, but I'm not
sure it has much practical worth :)

--
title: Some trivial python 2.x pickles fails to load in   Python 3.2 - 
Some trivial python 2.x pickles fails to load in Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

On Wed, Feb 23, 2011 at 10:22 AM, Antoine Pitrou rep...@bugs.python.org wrote:
..
 Well, a theoretical argument could be made that some codec could return
 a non-empty string when asked to decode an empty bytestring, but I'm not
 sure it has much practical worth :)

I was thinking about that as well.  Note that the opposite is quite
common, for example any encoding that uses BOM will turn empty unicode
string into a non-empty byte string.  I don't think a codec that
decodes b'' into non-empty string exists, but it would be reasonable
for a codec that requires BOM or some other metadata to reject raise
an error on b''.  If we rely on decode(b'') == '', these errors will
go unnoticed.  I am going to prepare a Unpickler_Read() patch with
tests and play with it a little.  It is a good idea to separate
performance optimizations from bug fixes anyways.  If we want to
bypass decode on empty strings, we can do it independently.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


--
assignee:  - belopolsky

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

I am attaching a new version of issue11286.diff which fixes the issue by 
removing special handling of n == 0 case from _Unpickler_Read().  Note that 
_Unpickler_Read() (formerly known as unpickler_read()) only started to return 
null pointer instead of a pointer to an empty string when given n == 0 after 
optimizations implemented in issue #9410.  This observation makes me more 
comfortable with changing the behavior of this function because current 
behavior is itself a regression from 3.1.

(An off-topic remark:  What was the point of renaming static functions in 
_pickle.c made in r84653?  There is no need to prefix static C functions with 
an underscore to make them private and the convention seems to be not to use 
CamelCase in the names of non-CAPI functions.)

--
stage: patch review - commit review
Added file: http://bugs.python.org/file20870/issue11286.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-23 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


Removed file: http://bugs.python.org/file20856/issue11286.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Jesús Cea Avión

New submission from Jesús Cea Avión j...@jcea.es:

I have 10MB pickled structure generated in Python 2.7. I only use basic
types (no clases) like sets, dictionaries, lists, strings, etc.

The pickle stores a lot of strings. Some of them should be bytes,
while other should be unicode. My idea is to import ALL the strings as
bytes in Python 3.2 and navigate the data structure to convert the
appropiate values to unicode, in a one-time operation (I version the
structure, so I can know if this conversion is already done, simply
storing a new version value).

But I get this error:


Python 3.2 (r32:88445, Feb 21 2011, 13:34:07)
[GCC 4.4.3] on linux2
Type help, copyright, credits or license for more information.
  f=open(file.pickle, mode=rb).read()
  len(f)
9847316
  b=pickle.loads(f,encoding=latin1)
Traceback (most recent call last):
  File stdin, line 1, in module
ValueError: operation forbidden on released memoryview object


I use the encoding latin1 for transparent byte/unicode conversion (do
not touch the values!).

This seems to be a bug in Python 3.2. Any suggestion?.

The bytestream is protocol 2.


PYTHON 3.1.3 loads the pickle just fine.

Trying to generate a testcase. I can't upload the pickle, because it contains 
propietary information.

--
components: Library (Lib)
messages: 129075
nosy: georg.brandl, jcea
priority: release blocker
severity: normal
stage: test needed
status: open
title: Some trivial python 2.x pickles fails to load in Python 3.2
type: behavior
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Jesús Cea Avión

Jesús Cea Avión j...@jcea.es added the comment:

I got a trivial (41 bytes long) reproductable case:

Pickle the following structure in Python 2.7, and try to unpickle with Python 
3.2, using a latin1 encoding:


{'ya_volcados': {'comment': ''}}


Load with:


#!/usr/local/bin/python3

import pickle

f=open(z.pickle, rb).read()

a=pickle.loads(f,encoding=latin1)


(I use latin1 because my real pickle includes binary data stored as strings. 
Note that the testcase doesn't content binary data).

Python 3.2, 32 bits. Tested under Solaris and Linux.

This worked in Python 3.1.3, so we have a regression.

--
keywords: +3.2regression
stage: test needed - needs patch
Added file: http://bugs.python.org/file20839/z.pickle

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Ok, so it boils down to:

 pickle.loads(b'\x80\x02}q\x00U\x00q\x01h\x01s.')
{'': ''}
 pickle.loads(b'\x80\x02}q\x00U\x00q\x01h\x01s.', encoding='latin1')
Traceback (most recent call last):
  File stdin, line 1, in module
ValueError: operation forbidden on released memoryview object

(this is protocol 2, by the way. Protocol 0 works fine.)

--
nosy: +pitrou

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Jesús Cea Avión

Jesús Cea Avión j...@jcea.es added the comment:

Storing the pickle using protocol 0, it works OK.

Using protocol 1 or 2, it fails.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

This is because of the buffer pointer passed to the codecs machinery being NULL 
when the empty string is being decoded.

Quick patch follows (needs a test). Also, it is not clear whether it is allowed 
to store a NULL pointer in the buf member of a Py_buffer when the length is 
0. Nick, Mark?


Index: Objects/unicodeobject.c
===
--- Objects/unicodeobject.c (révision 88500)
+++ Objects/unicodeobject.c (copie de travail)
@@ -1460,6 +1460,7 @@
 PyObject *buffer = NULL, *unicode;
 Py_buffer info;
 char lower[11];  /* Enough for any encoding shortcut */
+static const char empty[] = ;
 
 if (encoding == NULL)
 encoding = PyUnicode_GetDefaultEncoding();
@@ -1485,6 +1486,8 @@
 
 /* Decode via the codec registry */
 buffer = NULL;
+if (s == NULL)
+s = empty;
 if (PyBuffer_FillInfo(info, NULL, (void *)s, size, 1, PyBUF_FULL_RO)  0)
 goto onError;
 buffer = PyMemoryView_FromBuffer(info);

--
nosy: +mark.dickinson, ncoghlan

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Here is a patch with tests.

--
keywords: +patch
stage: needs patch - patch review
Added file: http://bugs.python.org/file20842/null_ptr_buffer.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Antoine's patch looks good to me.

Given the assumptions in the memoryview code, disallowing NULL for the buf 
pointer sounds like the right thing to do, even for zero-length buffers.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


--
nosy: +belopolsky

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11286] Some trivial python 2.x pickles fails to load in Python 3.2

2011-02-22 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

I am not sure PyUnicode_Decode() should treat NULL as an empty string.  
Decoding empty string is wasteful and if the caller knows that the string is 
empty, it should skip decoding because the result is empty.  Providing *two* 
ways to invoke expensive PyUnicode_Decode() to produce a known result is 
promoting inefficient coding.

This particular issue I would fix with something like attached issue11286.diff.

--
Added file: http://bugs.python.org/file20856/issue11286.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11286
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com