[issue14249] unicodeobject.c: aliasing warnings

2012-03-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm sorry. Here is the corrected patch for big-endian platforms. -- Added file: http://bugs.python.org/file25072/utf16_decoder_shift_3.patch

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Roundup Robot
Roundup Robot added the comment: New changeset 2c514c382a2a by Victor Stinner in branch 'default': Close #14249: Use an union instead of a long to short pointer to avoid aliasing http://hg.python.org/cpython/rev/2c514c382a2a -- nosy: +python-dev resolution: -> fixed stage: -> committe

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread STINNER Victor
STINNER Victor added the comment: Result of the benchmark before/after my commit. I prefer a union over manually manipulating a long as shorts or bytes, because I think that the compiler knows better how to optimize operations on integers. unpatched: $ ./python -m timeit -s 'import codecs; d = co

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What compiler are you using? With gcc 4.4 on a 32-bit Linux netbook I get (columns: unpatched, union, shift): utf-16le " "*1: 129, 126, 109; utf-16le "\u263A"*1: 208, 203, 160; utf-16be " "*1: 153, 147

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Stefan Krah
Stefan Krah added the comment: On 64-bit Linux with gcc-4.4 I get: Unpatched: $ ./python -m timeit -s 'import codecs; d = codecs.utf_16_be_decode; x = (" " * 1000).encode("utf-16be")' 'd(x)' 10 loops, best of 3: 4.1 usec per loop $ ./python -m timeit -s 'import codecs; d = codecs.utf_16_b

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: Linux, 64-bit, Intel Core i5 2500: -> unpatched: $ ./python -m timeit -s 'import codecs; d = codecs.utf_16_be_decode; x = (" " * 1000).encode("utf-16be")' 'd(x)' 10 loops, best of 3: 2.99 usec per loop -> Victor's commit: $ ./python -m timeit -s 'import

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Roundup Robot
Roundup Robot added the comment: New changeset 489f252b1f8b by Victor Stinner in branch 'default': Close #14249: Use bit shifts instead of an union, it's more efficient. http://hg.python.org/cpython/rev/489f252b1f8b -- resolution: -> fixed stage: commit review -> committed/rejected sta

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread STINNER Victor
STINNER Victor added the comment: Ok, benchmarks have spoken, amen. I applied Serhiy Storchaka's patch (version 3). I just replaced expressions in calls to Py_MAX by variables: Py_MAX is a macro and it may have to compute each expression twice. I didn't check if it's more or less efficient. T

[issue14249] unicodeobject.c: aliasing warnings

2012-04-05 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > I just replaced expressions in calls to Py_MAX by variables: Py_MAX is a > macro and it may have to compute each expression twice. gcc computes those values only once. It even caches them for use in PyUnicode_WRITE. But other compilers may not be so smart.
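The double-evaluation concern above can be made visible with a small sketch. `SKETCH_MAX`, `counted`, `max_inline` and `max_hoisted` are hypothetical names, and the macro is only modeled on a typical function-like max macro, not copied from CPython. The arguments here have a deliberate side effect so the extra evaluation can be counted; in the patch under discussion the expressions are presumably side-effect free, which is why an optimizing compiler such as gcc can compute them once.

```c
/* Assumption: a conventional function-like max macro, similar in spirit
 * to Py_MAX; the name SKETCH_MAX is hypothetical. */
#define SKETCH_MAX(x, y) (((x) > (y)) ? (x) : (y))

/* Counts how many times an argument expression is evaluated. */
int call_count = 0;

int
counted(int v)
{
    call_count++;
    return v;
}

/* Passing expressions directly: the winning operand is evaluated twice. */
int
max_inline(int a, int b)
{
    return SKETCH_MAX(counted(a), counted(b));
}

/* Hoisting into variables first: each expression is evaluated once. */
int
max_hoisted(int a, int b)
{
    int x = counted(a);
    int y = counted(b);
    return SKETCH_MAX(x, y);
}
```

Hoisting guarantees single evaluation regardless of how well a given compiler eliminates common subexpressions.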

[issue14249] unicodeobject.c: aliasing warnings

2012-03-10 Thread Stefan Krah
New submission from Stefan Krah : There are a couple of aliasing warnings in non-debug mode. For example: http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/1741 Objects/object.c:293: warning: ignoring return value of 'fwrite', declared with attribute warn_u

[issue14249] unicodeobject.c: aliasing warnings

2012-03-11 Thread Benjamin Peterson
Benjamin Peterson added the comment: gcc 4.5 doesn't warn for me. Is this a compiler bug in 4.4 or 4.5? That is, are these actual aliasing violations? -- nosy: +benjamin.peterson ___ Python tracker __

[issue14249] unicodeobject.c: aliasing warnings

2012-03-12 Thread Stefan Krah
Stefan Krah added the comment: Benjamin Peterson wrote: > gcc 4.5 doesn't warn for me. Is this a compiler bug in 4.4 or 4.5? > That is, are these actual aliasing violations? I see this with 4.4 but also with 4.6 when using -Wstrict-aliasing=2. However, nothing bad happens when I compile with -

[issue14249] unicodeobject.c: aliasing warnings

2012-03-19 Thread STINNER Victor
STINNER Victor added the comment: The attached patch uses a union to silence the compiler warning. It should not speed up Python because the function already ensures that the pointer is aligned to the size of a long. It may slow down the function; I don't know gcc well enough to guess exactly the
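A minimal sketch of the union idea, not the code from the attached patch: the block is stored through the `unsigned long` member and read back through an array of `unsigned short`, so no `long *` is ever cast to a `short *`. The names `long_units` and `unit_via_union` are hypothetical, and the sketch loads the block with `memcpy` so that it does not depend on the alignment guarantee mentioned in the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of short-sized units that fit in one unsigned long. */
#define UNITS_PER_LONG (sizeof(unsigned long) / sizeof(unsigned short))

/* Hypothetical type: one long viewed as an array of shorts. */
typedef union {
    unsigned long as_long;
    unsigned short units[UNITS_PER_LONG];
} long_units;

/* Load one long's worth of bytes from buf and return unit i in the
 * machine's native byte order. */
unsigned short
unit_via_union(const unsigned char *buf, size_t i)
{
    long_units block;

    assert(i < UNITS_PER_LONG);
    /* memcpy avoids any alignment requirement on buf (an assumption of
     * this sketch; the real function works on an aligned pointer). */
    memcpy(&block.as_long, buf, sizeof block.as_long);
    return block.units[i];
}
```

Reading a union member other than the one last written is permitted as type punning in C99 and later and is documented behavior in gcc, which is why this form avoids the strict-aliasing warning.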

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What if we add more hacking? If a long integer is already used for buffering and checking, let's use it for swapping and splitting too. With my patch (attached), codecs.utf_16_be_decode runs 5% faster (on 32-bit Linux; I have not tested 64-bit). And of course no pointer
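The "swapping and splitting" idea might look roughly like the following. This is a sketch under assumptions, not the attached patch: the helper names are hypothetical, the code unit is taken to be 16 bits, and the block is treated as a plain integer value, so the mapping between unit index and memory order depends on the machine's endianness.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: UTF-16 code units are 16 bits wide. */
#define UNIT_BITS 16
#define BLOCK_UNITS ((sizeof(unsigned long) * CHAR_BIT) / UNIT_BITS)

/* Swap the two bytes inside every 16-bit unit of the block at once. */
unsigned long
swap_bytes_in_units(unsigned long block)
{
    /* Builds the 0x00FF00FF... pattern for any width of unsigned long. */
    const unsigned long low_bytes = ULONG_MAX / 0xFFFFu * 0x00FFu;

    return ((block & low_bytes) << 8) | ((block >> 8) & low_bytes);
}

/* Extract unit i (0 = least significant) using only shifts and a mask. */
unsigned short
unit_from_block(unsigned long block, unsigned int i)
{
    assert(i < BLOCK_UNITS);
    return (unsigned short)((block >> (UNIT_BITS * i)) & 0xFFFFu);
}
```

Because everything happens on integer values, there is no pointer of a different type for the compiler to warn about.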

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread STINNER Victor
STINNER Victor added the comment: > With my patch (attached) codecs.utf_16_be_decode runs 5% faster (on 32-bit > Linux; I have not tested 64-bit). And of course no pointers -- no aliasing > warnings. Your patch is wrong: you need to use & 0xFFFF to get the lower 16 bits when reading a UTF-16 unit.

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Heh. This was in the previous version of my patch. I removed '& 0xFFFFu' and the parentheses for simplicity. GCC produces the same binaries for both sources. But you can put it back. It has an effect only on platforms with a non-16-bit short, but now Python does not support

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: Added file: http://bugs.python.org/file24960/utf16_decoder_shift_2.patch

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread STINNER Victor
STINNER Victor added the comment: > It has effect only on platforms with non-16-bit short The problem is with a 64-bit long: "long >> 32" returns the upper 32 bits (32..64), not 16 bits (32..48).

[issue14249] unicodeobject.c: aliasing warnings

2012-03-20 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: "(unsigned short)(long >> 32)" returns the 16 bits (32..48) if short is 16-bit. I agree that this variant is more strict and reliable (and this was my original version) and if you do not find it verbose and redundant, so be it. The difference will be notice