[issue46864] Deprecate ob_shash in BytesObject

2022-03-30 Thread Christian Heimes
Christian Heimes added the comment: New changeset d8f530fe329c6bd9ad6e1a9db9aa32b465c2d67f by Christian Heimes in branch 'main': bpo-46864: Suppress even more ob_shash deprecation warnings (GH-32176) https://github.com/python/cpython/commit/d8f530fe329c6bd9ad6e1a9db9aa32b465c2d67f

[issue46864] Deprecate ob_shash in BytesObject

2022-03-29 Thread Christian Heimes
Change by Christian Heimes: pull_requests: +30254 pull_request: https://github.com/python/cpython/pull/32176

[issue46864] Deprecate ob_shash in BytesObject

2022-03-24 Thread Inada Naoki
Inada Naoki added the comment: OK. Cache efficiency is dropped from the list of motivations. The current motivations are:
* Memory saving (currently, 4 BytesObjects (= 32 bytes of ob_shash) per code object).
* Making bytes objects immutable.
* Sharing objects among multiple interpreters.
* CoW efficiency.
I

[issue46864] Deprecate ob_shash in BytesObject

2022-03-24 Thread Brandt Bucher
Brandt Bucher added the comment: Just a note: as of this week (GH-31888), we no longer use bytes objects to store bytecode. Instead, the instructions are stored as part of the PyCodeObject struct. (Of course, we still use bytes objects for the various exception handling and debugging

[issue46864] Deprecate ob_shash in BytesObject

2022-03-24 Thread Inada Naoki
Inada Naoki added the comment: > I guess not much difference in benchmarks. But if put a bytes object into multiple dicts/sets, and len(bytes_key) is large, it will take a long time. (1 GiB 0.40 seconds on i5-11500 DDR4-3200) The length of bytes can be arbitrary, so computing time may be

[issue46864] Deprecate ob_shash in BytesObject

2022-03-24 Thread Ma Lin
Ma Lin added the comment:
> I posted remove-bytes-hash.patch in this issue. Would you measure how this affects whole application performance rather than micro benchmarks?
I guess not much difference in benchmarks. But if put a bytes object into multiple dicts/sets, and len(bytes_key) is

[issue46864] Deprecate ob_shash in BytesObject

2022-03-23 Thread Inada Naoki
Inada Naoki added the comment: First of all, this only deprecates direct access to `ob_shash`, which requires users to call `PyObject_Hash()` instead. We have not made the final decision about removing it. We are just making it possible to remove it in Python 3.13. RAM and cache efficiency is not the only

[issue46864] Deprecate ob_shash in BytesObject

2022-03-23 Thread Ma Lin
Ma Lin added the comment: If a bytes object is put into multiple dicts/sets, the hash needs to be computed multiple times. This seems to be a common usage. bytes is a very basic type, and users may use it in various ways. And unskilled users may check the same bytes object against dicts/sets many
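The point about repeated hashing can be made visible with a small sketch. `CountingBytes` is a hypothetical helper introduced here for illustration, not something from the thread. It counts how often containers ask the object for its hash. It does not show whether the value is recomputed or served from a cache such as `ob_shash`, which depends on the interpreter.

```python
class CountingBytes(bytes):
    """Hypothetical bytes subclass that counts how often its hash is requested."""

    hash_calls = 0

    def __hash__(self):
        CountingBytes.hash_calls += 1
        # Delegate to the normal bytes hash (cached or not, depending on the build).
        return super().__hash__()


key = CountingBytes(b"some-key")

# Building two dicts and one set asks for the hash once per container.
containers = [{key: 1}, {key: 2}, {key}]
after_insert = CountingBytes.hash_calls

# Each membership test asks for the hash again.
hits = [key in c for c in containers]
after_lookup = CountingBytes.hash_calls

print(after_insert, after_lookup, hits)
```

Every insertion and every lookup requests the hash once, so without a per-object cache each request would have to walk the whole buffer again.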

[issue46864] Deprecate ob_shash in BytesObject

2022-03-23 Thread Inada Naoki
Inada Naoki added the comment: New changeset 894d0ea5afa822c23286e9e68ed80bb1122b402d by Inada Naoki in branch 'main': bpo-46864: Suppress deprecation warnings for ob_shash. (GH-32042) https://github.com/python/cpython/commit/894d0ea5afa822c23286e9e68ed80bb1122b402d

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Inada Naoki
Inada Naoki added the comment: Average RAM capacity doesn't grow as the number of CPU cores grows. Additionally, L1+L2 cache is a really limited resource compared to CPU or RAM. Bytes objects are used for co_code, which is hot. So cache efficiency is important. Would you give us more realistic (or real world)

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Ma Lin
Ma Lin added the comment: RAM is now relatively cheaper than CPU. 1 million bytes objects additionally use 7.629 MiB of RAM for ob_shash (1_000_000*8/1024/1024). This causes a hash() performance regression anyway.
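The memory figure quoted above can be checked with a few lines of arithmetic. This sketch assumes an 8-byte hash field per object, as on a typical 64-bit build. The variable names are illustrative only.

```python
# Assumption: the cached hash field occupies 8 bytes per bytes object (64-bit build).
HASH_FIELD_BYTES = 8
n_objects = 1_000_000

# Total extra memory spent on the cached hash, converted from bytes to MiB.
extra_mib = n_objects * HASH_FIELD_BYTES / 1024 / 1024
print(f"{extra_mib:.3f} MiB")  # prints "7.629 MiB"
```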

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Inada Naoki
Inada Naoki added the comment: Since the hash is randomized, using hash(bytes) for such a use case is not recommended. Users should use stable hash functions instead. I agree that there are a few use cases where this change causes a performance regression. But they are really few compared to the overhead of

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Ma Lin
Ma Lin added the comment: Since hash() is a public function, maybe some users use the hash value to manage bytes objects in their own way, and then there may be a performance regression. For a rough example, dispatching data to 16 servers:
    h = hash(b)
    sendto(server_number=h & 0xF, data=b)
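Because `hash()` of a bytes object is randomized per process unless `PYTHONHASHSEED` is fixed, a dispatch scheme like the one above would not map the same data to the same server across runs. A minimal sketch of the stable alternative is given here, using `zlib.crc32` as one possible stable function. `server_for` and `NUM_SERVERS` are hypothetical names, and the thread does not say which function to use.

```python
import zlib

NUM_SERVERS = 16  # assumption: a power of two, so masking the low bits works


def server_for(data: bytes) -> int:
    """Pick a server index from a checksum that is stable across processes."""
    # zlib.crc32 depends only on the bytes, not on the per-process hash seed.
    return zlib.crc32(data) & (NUM_SERVERS - 1)


payload = b"example payload"
print(server_for(payload))
```

A checksum like this is not cached on the object, so callers that need it repeatedly would have to store the result themselves.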

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Inada Naoki
Inada Naoki added the comment: From Python 3.13 on, yes. It will be a bit slower.

[issue46864] Deprecate ob_shash in BytesObject

2022-03-22 Thread Ma Lin
Ma Lin added the comment: If I run this code, would it be slower?
    bytes_hash = hash(bytes_data)
    bytes_hash = hash(bytes_data)  # get hash twice
-- nosy: +malin
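One rough way to observe the effect asked about here is to time two consecutive `hash()` calls on a large bytes object. This is a sketch rather than a benchmark. The 32 MiB size is an arbitrary assumption, and whether the second call is cheaper depends on whether the running interpreter caches the hash on the object.

```python
import time

# Assumption: 32 MiB is large enough for the hashing cost to be measurable.
data = bytes(32 * 1024 * 1024)

t0 = time.perf_counter()
first = hash(data)   # must scan the whole buffer at least once
t1 = time.perf_counter()
second = hash(data)  # cheap only if the interpreter cached the first result
t2 = time.perf_counter()

print(f"first call:  {(t1 - t0) * 1e3:.3f} ms")
print(f"second call: {(t2 - t1) * 1e3:.3f} ms")
```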

[issue46864] Deprecate ob_shash in BytesObject

2022-03-21 Thread Inada Naoki
Inada Naoki added the comment: I'm sorry. Maybe ccache hid the warning from me.

[issue46864] Deprecate ob_shash in BytesObject

2022-03-21 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +30132 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/32042

[issue46864] Deprecate ob_shash in BytesObject

2022-03-21 Thread Christian Heimes
Christian Heimes added the comment: I'm getting tons of deprecation warnings from the deepfreeze script. Please add _Py_COMP_DIAG_IGNORE_DEPR_DECL to Tools/scripts/deepfreeze.py. -- nosy: +christian.heimes resolution: fixed -> stage: resolved -> needs patch status: closed -> open type:

[issue46864] Deprecate ob_shash in BytesObject

2022-03-05 Thread Inada Naoki
Inada Naoki added the comment: New changeset 2d8b764210c8de10893665aaeec8277b687975cd by Inada Naoki in branch 'main': bpo-46864: Deprecate PyBytesObject.ob_shash. (GH-31598) https://github.com/python/cpython/commit/2d8b764210c8de10893665aaeec8277b687975cd

[issue46864] Deprecate ob_shash in BytesObject

2022-03-05 Thread Inada Naoki
Change by Inada Naoki: resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue46864] Deprecate ob_shash in BytesObject

2022-03-02 Thread Brandt Bucher
Change by Brandt Bucher: nosy: +brandtbucher

[issue46864] Deprecate ob_shash in BytesObject

2022-02-26 Thread Inada Naoki
Inada Naoki added the comment: With ob_shash removed:
```
## small key
$ ./python -m pyperf timeit --compare-to ../cpython/python -s 'd={b"foo":1, b"bar":2, b"buzz":3}' -- 'b"key" in d'
/home/inada-n/work/python/cpython/python: . 23.2 ns +- 1.7 ns
```
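The same small-key lookup can be approximated without pyperf by using the standard-library `timeit` module. This is a swapped-in technique, not the measurement from the comment. The repeat and loop counts are arbitrary assumptions, and the numbers it prints are not comparable to the pyperf figures.

```python
import timeit

# Same setup and statement as the pyperf command line above.
setup = 'd = {b"foo": 1, b"bar": 2, b"buzz": 3}'
stmt = 'b"key" in d'

# Assumption: 5 repeats of 200_000 loops is enough for a rough estimate.
number = 200_000
best = min(timeit.repeat(stmt, setup=setup, repeat=5, number=number))

ns_per_loop = best / number * 1e9
print(f"{ns_per_loop:.1f} ns per lookup (best of 5)")
```

To compare two interpreter builds this way, the script would be run once under each interpreter and the results compared by hand.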

[issue46864] Deprecate ob_shash in BytesObject

2022-02-26 Thread Inada Naoki
Inada Naoki added the comment:
> But some programs can still work with encoded bytes instead of strings. In particular os.environ and os.environb are implemented as dict of bytes on non-Windows.
This change doesn't affect os.environ. os.environ[key] does

[issue46864] Deprecate ob_shash in BytesObject

2022-02-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I think it is a legacy of Python 2. Attributes and variable names are Unicode strings in Python 3, so the main reason for this optimization is no longer relevant. But some programs can still work with encoded bytes instead of strings. In particular

[issue46864] Deprecate ob_shash in BytesObject

2022-02-26 Thread Inada Naoki
Change by Inada Naoki: keywords: +patch pull_requests: +29721 stage: -> patch review pull_request: https://github.com/python/cpython/pull/31598

[issue46864] Deprecate ob_shash in BytesObject

2022-02-26 Thread Inada Naoki
New submission from Inada Naoki: Code objects now have more and more bytes attributes. To reduce the RAM used by code, I want to remove ob_shash (the cached hash value) from the bytes object. Sets and dicts have their own hash cache. Unless checking the same bytes object against dicts/sets many times, this