New submission from Anthony Sottile <asott...@umich.edu>:
I did some profiling (attached a few files here with svgs) of running this script:

```python
import io
import tokenize

# picked as the second longest file in cpython
with open('Lib/test/test_socket.py', 'rb') as f:
    bio = io.BytesIO(f.read())


def main():
    for _ in range(10):
        bio.seek(0)
        for _ in tokenize.tokenize(bio.readline):
            pass


if __name__ == '__main__':
    exit(main())
```

The first profile is before the optimization, the second is after the optimization.

The optimization takes the execution from ~6300ms to ~4500ms on my machine (representing a 28% - 39% improvement, depending on how you calculate it).

(I'll attach the pstats and svgs after creation; seems I can only attach one file at once.)

----------
components: Library (Lib)
files: out.pstats
messages: 385572
nosy: Anthony Sottile
priority: normal
severity: normal
status: open
title: tokenize spends a lot of time in `re.compile(...)`
type: performance
versions: Python 3.10, Python 3.9
Added file: https://bugs.python.org/file49759/out.pstats

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43014>
_______________________________________