Re: [Python-Dev] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
On 5/13/18, 12:44 PM, "Python-Dev on behalf of Christian Heimes" wrote: On 2018-05-13 06:57, Serhiy Storchaka wrote: > https://github.com/python/cpython/commit/1e2ec8a996daec65d8d5a3d43b66a9909c6d0653 > commit: 1e2ec8a996daec65d8d5a3d43b66a9909c6d0653 > branch: master > author: Rolf Eike Beer > committer: Serhiy Storchaka > date: 2018-05-13T13:57:31+03:00 > summary: > > bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123) > > The hash implementation casts the input pointer to uint64_t* and directly reads > from this, which may cause unaligned accesses. Use memcpy() instead so this code > will not crash with SIGBUS on sparc. > > https://bugs.gentoo.org/show_bug.cgi?id=636400 > > files: > A Misc/NEWS.d/next/Core and Builtins/2018-04-25-20-44-42.bpo-28055.f49kfC.rst > M Python/pyhash.c Hi Serhiy, I was against the approach a good reason. The PR adds additional CPU instructions and changes memory access pattern in a critical path of CPython. There is no reason to add additional overhead for the majority of users or X86 and X86_64 architectures. The memcpy() call should only be used on architectures that do not support unaligned memory access. See comment https://bugs.python.org/issue28055#msg276257 X86 won't *directly* write misaligned data either, it will intrinsically copy it out to a properly aligned location. In C this is also "undefined behavior", so technically the C implementation can do whatever it wants - like raise an exception - which is will on SPARC. While X86 users may not notice any problems, depending on undefined behavior working in any particular way has many drawbacks. Often C compilers will optimize code in ways that assume there is no undefined behavior in ways that breaks code that does. At least for latest GCC, the change seems to be fine. GCC emits the same assembly code for X86_64 before and after your change. Did you check the output on other CPU architectures as well as clang and MSVC, too? Christian ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/robb%40datalogics.com ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
13.05.18 20:42, Christian Heimes пише: I was against the approach a good reason. The PR adds additional CPU instructions and changes memory access pattern in a critical path of CPython. There is no reason to add additional overhead for the majority of users or X86 and X86_64 architectures. The memcpy() call should only be used on architectures that do not support unaligned memory access. See comment https://bugs.python.org/issue28055#msg276257 At least for latest GCC, the change seems to be fine. GCC emits the same assembly code for X86_64 before and after your change. Did you check the output on other CPU architectures as well as clang and MSVC, too? For the initial implementation of pyhash.c [1] I proposed a patch that use memcpy() conditionally to avoid an overhead on Windows: +#ifdef _MSC_VER +block.value = *(const Py_uhash_t*)p; +#else +memcpy(block.bytes, p, SIZEOF_PY_UHASH_T); +#endif (and similar code for FNV). But many developers confirmed that all modern compilers including latest versions of MS VS optimize memcpy() with a constant size into a single CPU instruction. Seems avoiding to use memcpy() no longer needed. If using memcpy() adds an overhead on some platforms we can return to using a compiler/platform depending code. [1] https://bugs.python.org/issue19183 ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
On 2018-05-13 06:57, Serhiy Storchaka wrote: > https://github.com/python/cpython/commit/1e2ec8a996daec65d8d5a3d43b66a9909c6d0653 > commit: 1e2ec8a996daec65d8d5a3d43b66a9909c6d0653 > branch: master > author: Rolf Eike Beer > committer: Serhiy Storchaka > date: 2018-05-13T13:57:31+03:00 > summary: > > bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123) > > The hash implementation casts the input pointer to uint64_t* and directly > reads > from this, which may cause unaligned accesses. Use memcpy() instead so this > code > will not crash with SIGBUS on sparc. > > https://bugs.gentoo.org/show_bug.cgi?id=636400 > > files: > A Misc/NEWS.d/next/Core and Builtins/2018-04-25-20-44-42.bpo-28055.f49kfC.rst > M Python/pyhash.c Hi Serhiy, I was against the approach a good reason. The PR adds additional CPU instructions and changes memory access pattern in a critical path of CPython. There is no reason to add additional overhead for the majority of users or X86 and X86_64 architectures. The memcpy() call should only be used on architectures that do not support unaligned memory access. See comment https://bugs.python.org/issue28055#msg276257 At least for latest GCC, the change seems to be fine. GCC emits the same assembly code for X86_64 before and after your change. Did you check the output on other CPU architectures as well as clang and MSVC, too? Christian ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com