[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2019-04-12 Thread STINNER Victor


STINNER Victor  added the comment:

I see that a fix has been pushed. I'm not sure why this issue is still open, so 
I close it.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-07-22 Thread STINNER Victor


STINNER Victor  added the comment:

I would say that Python no longer officially supports sparc and solaris
because of the lack of volunteer.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-07-21 Thread Jeffrey Walton


Jeffrey Walton  added the comment:

I know this is a bit late but I wanted to share...

OpenCSW has a build farm with Solaris machines and Sparc hardware. The farm 
provides x86 and Sparc machines with Solaris 9 through 11.

I believe OpenCSW operates in the same spirit as GCC compile farm. They welcome 
open source developers and upstream maintainers to help ensure packages build 
and run on Solaris machines.

You can read about it at
https://www.opencsw.org/extend-it/signup/to-upstream-maintainers/ .

If Python is performing memory access patterns as discussed in the report then 
it would probably benefit the project to test on a Sparc machine with Solaris 
11.

--
nosy: +Jeffrey.Walton

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Indeed the memcpy() approach is the common idiom in such situations, and sounds 
like the right thing.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread Stefan Krah

Stefan Krah  added the comment:

MSVC optimizes memcpy() to an assignment, sometimes too well (pgo): 
https://bugs.python.org/issue15993

But that is fixed long ago, so I also think that the memcpy() approach is best.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread miss-islington

miss-islington  added the comment:


New changeset 0d17e60b33aca1a4d151a8a2bd6eaa331f0ec658 by Miss Islington (bot) 
in branch '3.6':
bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
https://github.com/python/cpython/commit/0d17e60b33aca1a4d151a8a2bd6eaa331f0ec658


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread miss-islington

miss-islington  added the comment:


New changeset 8ed545f6de37efdadbcf71c45bb8136b8cb9619d by Miss Islington (bot) 
in branch '3.7':
bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
https://github.com/python/cpython/commit/8ed545f6de37efdadbcf71c45bb8136b8cb9619d


--
nosy: +miss-islington

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread miss-islington

Change by miss-islington :


--
pull_requests: +6464

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread miss-islington

Change by miss-islington :


--
pull_requests: +6463

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-13 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:


New changeset 1e2ec8a996daec65d8d5a3d43b66a9909c6d0653 by Serhiy Storchaka 
(Rolf Eike Beer) in branch 'master':
bpo-28055: Fix unaligned accesses in siphash24(). (GH-6123)
https://github.com/python/cpython/commit/1e2ec8a996daec65d8d5a3d43b66a9909c6d0653


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-05-01 Thread Ned Deily

Ned Deily  added the comment:

What's the status of this?  It looks like Serhiy has reviewed and approved 
Dakon's PR 6123.  Is everyone OK with merging it?  Anything more needed?

--
versions: +Python 3.8 -Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-03-26 Thread Rolf Eike Beer

Rolf Eike Beer  added the comment:

So, what is the problem with this? Either the compiler knows that unaligned 
accesses are no problem and optimizes them away anyway, or it is kept because 
it would crash otherwise. I can confirm that no sparc version >= 3.5 (have not 
tried older) survives the test suite on Gentoo Sparc (64 bit kernel, 32 bit 
userspace) without memcpy().

--
nosy: +Dakon

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2018-03-26 Thread Rolf Eike Beer

Change by Rolf Eike Beer :


--
pull_requests: +5986

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2017-01-28 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I agree with Stefan and Serhiy.  Unaligned memoryviews shouldn't trigger a copy 
when hashing.

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2017-01-27 Thread Greg Onufer

Greg Onufer added the comment:

32-bit and 64-bit SPARC ABIs have 64-bit integer data types.

SPARC, like many RISC architectures, also has natural alignment requirements.  
Attempting to dereference a pointer to a 4-byte-sized object requires 4-byte 
alignment, for example.  2-byte-sized objects require 2-byte alignment.  
8-byte-sized objects require 8-byte alignment.

siphash24 is encountering this bug on modern SPARC (32-bit ABI currently, 
haven't tried compiling as 64-bit yet).  The code simply is not portable.

Benjamin's patch gets the failing self-test (test_plistlib) to pass as well as 
the simple test case in msg275493 above.

--
nosy: +gco

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Ned Deily

Changes by Ned Deily :


--
priority: normal -> high
stage: needs patch -> patch review
versions: +Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread STINNER Victor

STINNER Victor added the comment:

Christian Heimes added the comment:
> The main reason for two different hash algorithms was missing support for
64bit integer types. Python 3.4 was targeting platforms that had no 64bit
integer support at all (IIRC SPARC). Nowaday Python requires 64bit ints to
compile.
>
> I'm all in favor to remove FVN2 and use SipHash24 on all platforms. Let's
deprecated it now and remove it in 3.7.

Python 3.5.0 doesn't compile if the compiler doesn't support 64 signed
integer: see pytime.h ;-)

Are you aware of platforms still using FVN2?

I'm also in favor of deprecating it. Maybe use #warning in C to log a
deprecation warning.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I support Stefan. Just requiring 8-byte align is the easiest solution, but it 
doesn't work with memoryview without expensive memory allocation and copying.

Look at the FNV code. It supports non-4-byte aligned data, and does it in a 
safe and efficient way.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

I see. No, most NumPy arrays are C-contiguous. Multi-dimmensional arrays
are contiguous, too.

Non C-contiguous arrays arise mostly during slicing or if they're
Fortran-order to begin with.


But NumPy aside, it's weird to have slice of a huge regular bytes view
(this particular slice is still C-contiguous) that is suddenly copied
because the alignment requirements changed.

I really prefer a simple rule for memoryview: If the data is C-contiguous,
you get the fast path.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Christian Heimes

Christian Heimes added the comment:

memory_hash has to convert buffers unless the buffer is a single-dimensional, 
C-style and contiguous buffer. A NumPy matrix has more than one dimension, so 
it must be converted.

https://hg.python.org/cpython/file/tip/Objects/memoryobject.c#l2854

if (!MV_C_CONTIGUOUS(self->flags)) {
mem = PyMem_Malloc(view->len);
if (mem == NULL) {
PyErr_NoMemory();
return -1;
}
if (buffer_to_contiguous(mem, view, 'C') < 0) {
PyMem_Free(mem);
return -1;
}
}

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

I don't understand this. Could you explain?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Christian Heimes

Christian Heimes added the comment:

memoryview() has to create a copy for NumPy memoryviews already.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

s/unaligned/not 8-byte-aligned/

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

Numpy itself internally doesn't. Consumers of numpy arrays use
memoryviews. Numpy is often used as a library these days, even
for simple things like storing a 2-d table, which can easily be
several TB.

It is also easy to generate unaligned data by just taking a slice
of a bytes memoryview.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Christian Heimes

Christian Heimes added the comment:

How often does NumPy create a C-style, single dimensional, continuous 
memoryview? I would assume that it deals with matrices, Fortran data and/or 
other strides, multi-dimensional data in almost all cases.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

Ah, yes. But compilers optimize memcpy and this is a guaranteed slowdown for 
the unaligned memoryview case.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Christian Heimes

Christian Heimes added the comment:

It's totally possible. Benjamin's patch implements it like I have suggested it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-14 Thread Stefan Krah

Stefan Krah added the comment:

For memoryview this is not possible: It is explicitly unaligned and the feature 
is used in e.g. NumPy.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Here's a patch that requires 8-byte alignment. It almost completely works 
except that on ABIs with 32-bit pointers, unicode objects can have their data 
pointers aligned at only 4-bytes. Perhaps we can get away with requiring only 
4-byte alignment on 32-bit platforms because they generally have implement the 
64-bit load as 2 32-bit loads anyway.

--
Added file: http://bugs.python.org/file44648/hash-bytes-alignment.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Guido van Rossum

Changes by Guido van Rossum :


--
nosy:  -gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> IMO, though, we should just require alignment for the argument to 
> _PyHash_Bytes. It's private after all.

And what to do with memoryview? Memoryview data can be not aligned.

> If I understand it correctly, the hash value differs depending on the kernel 
> configuration when the python binary is built, leading to different pickle 
> objects which cannot be shared, making them incompatible .

Hash values shouldn't be leaked in pickle.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Changes by Matthias Klose :


--
nosy: +gvanrossum, ned.deily

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

if the only concern is 32bit sparc, then please let's drop this in 3.6.

Looking at #28027 the new way to obsoleting things seems to be decreeing them 
(sorry about the sarcasm).  If I interpret your concerns correctly you care 
about platforms, which you are not supposed to care about. sparc32 doesn't have 
any use cases now, while ARM32 still has, and will have for some time.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

... would be to remove the autoconf check and make the selection of the hash 
method unconditional ...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Christian Heimes

Christian Heimes added the comment:

The main reason for two different hash algorithms was missing support for 64bit 
integer types. Python 3.4 was targeting platforms that had no 64bit integer 
support at all (IIRC SPARC). Nowaday Python requires 64bit ints to compile.

I'm all in favor to remove FVN2 and use SipHash24 on all platforms. Let's 
deprecated it now and remove it in 3.7.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

> I believe the unaligned memory access configure check is supposed to
> prevent siphash from being used, so we might look into why that's not
> working.
> 
> IMO, though, we should just require alignment for the argument to
> _PyHash_Bytes. It's private after all.

If I understand it correctly, the hash value differs depending on the kernel 
configuration when the python binary is built, leading to different pickle 
objects which cannot be shared, making them incompatible .  I think the safest 
thing would be to remove the hash make the selection of the hash method 
unconditional, and to make this hash function working for all cases.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Christian Heimes

Christian Heimes added the comment:

I created #28126 for MSVC.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread STINNER Victor

Changes by STINNER Victor :


--
nosy: +haypo

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

I can check, if the memcpy is optimized away.  As an alternative, we could use 
__builtin_memcpy. That is available for clang as well (would have to check icc).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

a variant of the patch that keeps the parameter types of _le64toh.

--
Added file: http://bugs.python.org/file44630/pyhash2.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Stefan Krah

Stefan Krah added the comment:

FWIW, MSVC optimizes memcpy:

  http://bugs.python.org/issue15993


The pgo issue has been fixed according to Steve Dower.

--
nosy: +skrah

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Christian Heimes

Christian Heimes added the comment:

I'm a bit worried that the patch might slow down the general case of SipHash24. 
When I was working on SipHash24 I made sure that the general case in 
PyBytes_Object and PyUnicode_Object are fast and always aligned. Do all 
compilers optimize that case? For MSVC we still have a specialized Py_MEMCPY() 
variant in pyports.h.

I can see three more ways to fix the issue:

1) Have two loops, one for the aligned case with memcpy() and one for the 
unaligned case w/o memcpy()
2) Add a special variant of _le64toh() for PY_LITTLE_ENDIAN on ARM and use the 
current variant on X86_64.
3) Make it illegal to call _Py_HashBytes() with non-aligned pointer and require 
the caller to provide an aligned buffer. It's easy for datetime but requires an 
extra buffer memoryview. Memoryview already uses a buffer for all but 
single-strided C contiguous views. We can easily add another case for 
non-aligned buffers.

--
stage:  -> needs patch
type:  -> crash

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Changes by Matthias Klose :


Added file: http://bugs.python.org/file44629/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Changes by Matthias Klose :


Removed file: http://bugs.python.org/file44628/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Changes by Matthias Klose :


Removed file: http://bugs.python.org/file44515/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-13 Thread Matthias Klose

Matthias Klose added the comment:

updated patch that always used memcpy for the little endian case.

--
Added file: http://bugs.python.org/file44628/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-11 Thread Antti Haapala

Antti Haapala added the comment:

There is no need to ifdef anything, the memcpy is the only correct way to do 
it. As memcpy is also a reserved identifier in C, the compiler can and will 
optimize this into a 64-bit access on those platforms where it can be safely 
done so (x86 for example), e.g. GCC compiles

uint64_t func(char *buf) {
uint64_t rv;
memcpy(, buf+3, sizeof(rv));
return rv;
}

into

movq3(%rdi), %rax
ret

On Linux 64-bit ABI.

--
nosy: +ztane

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-10 Thread Matthias Klose

Matthias Klose added the comment:

I don't like that configure check, because it depends on the kernel being used 
at runtime.  For many architectures you can define in the kernel if the kernel 
should allow unaligned accesses or not.  Sure this is not an issue for linux 
distro builds, but might be unexpected for third party builds.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Benjamin Peterson

Benjamin Peterson added the comment:

I believe the unaligned memory access configure check is supposed to prevent 
siphash from being used, so we might look into why that's not working.

IMO, though, we should just require alignment for the argument to 
_PyHash_Bytes. It's private after all.

--
nosy: +benjamin.peterson

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Christian Heimes

Christian Heimes added the comment:

Good catch! I had trouble with the data structures in the TZ module before.

I'm fine with memcpy() on just ARM platforms as a temporary workaround. Let's 
discuss the issue another time. Right now I'm busy with ssl improvements for 
3.6.0b1.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Matthias Klose

Changes by Matthias Klose :


Removed file: http://bugs.python.org/file44514/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Matthias Klose

Changes by Matthias Klose :


Added file: http://bugs.python.org/file44515/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Matthias Klose

Changes by Matthias Klose :


--
nosy: +christian.heimes

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue28055] pyhash's siphash24 assumes alignment of the data pointer

2016-09-09 Thread Matthias Klose

New submission from Matthias Klose:

pyhash's siphash24 assumes alignment of the data pointer, casting a void 
pointer (src) to an uint64_t pointer, increasing the required alignment from 1 
to 4 bytes. That's invalid code. siphash24 can't assume that the pointer to the 
data to hash is 4-byte aligned.

Seen as a bus error trying to run a ARM32 binary on a AArch64 kernel.

./python -c 'import datetime; print(hash(datetime.datetime(2015, 1, 1)))'

the datetime type is defined as

#define _PyTZINFO_HEAD \
PyObject_HEAD \
Py_hash_t hashcode; \
char hastzinfo; /* boolean flag */

typedef struct
{
_PyTZINFO_HEAD
unsigned char data[_PyDateTime_DATE_DATASIZE];
} PyDateTime_Date;

and data is used to calculate the hash of the object, not being 4 byte aligned, 
you get the bus error. Inserting three fill bytes, are making the data member 
4-byte aligned solves the issue, however introducing an ABI change makes the 
new datetime ABI incompatible, and we don't know about the alignment of objects 
outside the standard library.

The solution is to use a memcpy instead of the cast to uint64_t, for now 
limited to the little endian ARM targets, but I don't see why the memcpy cannot 
always be used on little endian targets instead of the cast.

--
assignee: doko
components: Interpreter Core
files: pyhash.diff
keywords: patch
messages: 275493
nosy: doko
priority: normal
severity: normal
status: open
title: pyhash's siphash24 assumes alignment of the data pointer
versions: Python 3.5, Python 3.6
Added file: http://bugs.python.org/file44514/pyhash.diff

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com