New submission from Filipe Laíns <la...@riseup.net>:

Currently, the order of set or frozenset elements when saved to bytecode is dependent on the random seed. This breaks reproducibility.
Example failure from an Arch Linux package: https://reproducible.archlinux.org/api/v0/builds/88454/diffoscope

Let's take an example file, `test_compile.py`:

```python
s = {
    'aaa',
    'bbb',
    'ccc',
    'ddd',
    'eee',
}
```

$ PYTHONHASHSEED=0 python -m compileall --invalidation-mode checked-hash test_compile.py
$ mv __pycache__ __pycache__1
$ PYTHONHASHSEED=1 python -m compileall --invalidation-mode checked-hash test_compile.py
$ diff __pycache__/test_compile.cpython-39.pyc __pycache__1/test_compile.cpython-39.pyc
Binary files __pycache__/test_compile.cpython-39.pyc and __pycache__1/test_compile.cpython-39.pyc differ
$ diff <(xxd __pycache__/test_compile.cpython-39.pyc) <(xxd __pycache__1/test_compile.cpython-39.pyc)
5,6c5,6
< 00000040: 005a 0362 6262 5a03 6464 645a 0361 6161  .Z.bbbZ.dddZ.aaa
< 00000050: 5a03 6363 635a 0365 6565 4e29 01da 0173  Z.cccZ.eeeN)...s
---
> 00000040: 005a 0361 6161 5a03 6363 635a 0364 6464  .Z.aaaZ.cccZ.ddd
> 00000050: 5a03 6565 655a 0362 6262 4e29 01da 0173  Z.eeeZ.bbbN)...s

I believe the issue is in the marshal module, particularly this line [1]. My simple fix was to create a list from the set, sort it, and iterate over it instead.

[1] https://github.com/python/cpython/blob/00d7abd7ef588fc4ff0571c8579ab4aba8ada1c0/Python/marshal.c#L505

----------
messages: 391104
nosy: FFY00, Mark.Shannon, benjamin.peterson, yselivanov
priority: normal
severity: normal
status: open
title: unreproducible bytecode: set order depends on random seed for compiled bytecode
type: behavior

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43850>
_______________________________________
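
The idea behind the fix described in the report (build a list from the set, sort it, iterate over the list) can be illustrated at the Python level. This is only a sketch, not the change to `marshal.c`: `CHILD` and `run_with_seed` are hypothetical names introduced here, and the sort assumes that all elements can be compared with each other, which holds for the strings in `test_compile.py` but not for sets in general.

```python
import os
import subprocess
import sys

# Code run in a child interpreter: print the set from test_compile.py first
# in plain iteration order, then in sorted order.
CHILD = (
    "s = {'aaa', 'bbb', 'ccc', 'ddd', 'eee'}\n"
    "print(list(s))\n"
    "print(sorted(s))\n"
)


def run_with_seed(seed):
    """Run CHILD under the given PYTHONHASHSEED.

    Returns a pair of strings: (iteration-order line, sorted-order line).
    """
    # Copy the current environment and override only the hash seed.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    iteration_order, sorted_order = result.stdout.splitlines()
    return iteration_order, sorted_order


if __name__ == "__main__":
    for seed in (0, 1):
        iteration_order, sorted_order = run_with_seed(seed)
        print(f"seed={seed} iteration={iteration_order} sorted={sorted_order}")
```

Only the sorted line is expected to be identical for every seed. The iteration-order line may or may not differ for a particular pair of seeds.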