New submission from Filipe Laíns <la...@riseup.net>:

Currently, the order in which set and frozenset elements are written to 
bytecode depends on the hash randomization seed (PYTHONHASHSEED). This breaks 
reproducible builds.

Example failure from an Arch Linux package: 
https://reproducible.archlinux.org/api/v0/builds/88454/diffoscope

Let's take an example file, `test_compile.py`:
```python
s = {
    'aaa',
    'bbb',
    'ccc',
    'ddd',
    'eee',
}
```

$ PYTHONHASHSEED=0 python -m compileall --invalidation-mode checked-hash test_compile.py
$ mv __pycache__ __pycache__1
$ PYTHONHASHSEED=1 python -m compileall --invalidation-mode checked-hash test_compile.py

$ diff __pycache__/test_compile.cpython-39.pyc __pycache__1/test_compile.cpython-39.pyc
Binary files __pycache__/test_compile.cpython-39.pyc and __pycache__1/test_compile.cpython-39.pyc differ

$ diff <(xxd __pycache__/test_compile.cpython-39.pyc) <(xxd __pycache__1/test_compile.cpython-39.pyc)
5,6c5,6
< 00000040: 005a 0362 6262 5a03 6464 645a 0361 6161  .Z.bbbZ.dddZ.aaa
< 00000050: 5a03 6363 635a 0365 6565 4e29 01da 0173  Z.cccZ.eeeN)...s
---
> 00000040: 005a 0361 6161 5a03 6363 635a 0364 6464  .Z.aaaZ.cccZ.ddd
> 00000050: 5a03 6565 655a 0362 6262 4e29 01da 0173  Z.eeeZ.bbbN)...s
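
The same comparison can be driven from Python without touching `__pycache__`. 
This is only a rough sketch: `SOURCE`, `CHILD` and `bytecode_digest` are names 
made up for illustration, and it hashes the marshalled code object rather than 
the full .pyc file. Whether the two digests differ depends on the interpreter 
version in use.

```python
import os
import subprocess
import sys

# Same content as test_compile.py above, kept on one line for brevity.
SOURCE = "s = {'aaa', 'bbb', 'ccc', 'ddd', 'eee'}\n"

# Script run in a child interpreter: compile the source passed as argv[1],
# marshal the resulting code object and print a SHA-256 digest of the bytes.
CHILD = (
    "import hashlib, marshal, sys\n"
    "code = compile(sys.argv[1], 'test_compile.py', 'exec')\n"
    "print(hashlib.sha256(marshal.dumps(code)).hexdigest())\n"
)


def bytecode_digest(seed):
    """Return the digest of the marshalled code under the given hash seed."""
    # PYTHONHASHSEED must be set before the interpreter starts, hence the
    # child process instead of changing os.environ in place.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD, SOURCE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


digests = {seed: bytecode_digest(seed) for seed in (0, 1)}
for seed, digest in digests.items():
    print(seed, digest)
```

If the digests for seed 0 and seed 1 differ, the marshalled output is not 
reproducible on that interpreter.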

I believe the issue is in the marshal module, specifically at this line [1]. 
My simple fix was to create a list from the set, sort it, and iterate over 
that list instead.

[1] 
https://github.com/python/cpython/blob/00d7abd7ef588fc4ff0571c8579ab4aba8ada1c0/Python/marshal.c#L505
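
To illustrate the idea in Python (this is not the actual patch to marshal.c, 
only a sketch of the approach): `stable_order` is a hypothetical helper, and 
the sort key of (type name, repr) is my own assumption, chosen so that sets 
mixing types such as str and int do not raise TypeError. For the all-string 
example above a plain `sorted()` would be enough.

```python
def stable_order(items):
    """Return the elements of a set in an order independent of the hash seed.

    Assumption: elements are simple constants whose repr() is itself
    deterministic (str, bytes, int, float, None, ...).
    """
    # Iterating a set directly yields elements in hash-table order, which
    # varies with PYTHONHASHSEED for str and bytes. Sorting removes that
    # dependency before the elements are written out.
    return sorted(items, key=lambda item: (type(item).__name__, repr(item)))


s = {'aaa', 'bbb', 'ccc', 'ddd', 'eee'}
for element in stable_order(s):
    # A serializer would write each element here, always in the same order.
    print(element)
```

A key based on repr() would not be sufficient for nested frozensets, whose 
repr depends on iteration order as well, so a real fix needs a more careful 
ordering.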

----------
messages: 391104
nosy: FFY00, Mark.Shannon, benjamin.peterson, yselivanov
priority: normal
severity: normal
status: open
title: unreproducible bytecode: set order depends on random seed for compiled 
bytecode
type: behavior

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43850>
_______________________________________