On 15/08/2016 02:45, Wes Turner wrote:
>
> You can add a `make clean` build step:
>
>     pyclean:
>         find . -name '*.pyc' -delete
>
> You can delete all .pyc files:
>
> - $ find . -name '*.pyc' -delete
> - http://manpages.ubuntu.com/manpages/precise/man1/pyclean.1.html
>   # .pyc, .pyo
>
> You can rebuild all .pyc files (for a given directory):
>
> - $ python -m compileall -h
> - https://docs.python.org/2/library/compileall.html
> - https://docs.python.org/3/library/compileall.html
>
> You can, instead of building .pyc, build .pyo:
>
> - https://docs.python.org/2/using/cmdline.html#envvar-PYTHONOPTIMIZE
> - https://docs.python.org/2/using/cmdline.html#cmdoption-O
>
> You can avoid writing .pyc or .pyo with PYTHONDONTWRITEBYTECODE / -B:
>
> - https://docs.python.org/2/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE
> - https://docs.python.org/2/using/cmdline.html#cmdoption-B
> - If the files already exist, though, they are still used:
>   https://docs.python.org/3/reference/import.html
>
> You can build a PEX (which rebuilds .pyc files) and test/deploy that:
>
> - https://github.com/pantsbuild/pex#integrating-pex-into-your-workflow
> - https://pantsbuild.github.io/python-readme.html#more-about-python-tests
>
> How .pyc files currently work:
>
> - http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html
> - https://www.python.org/dev/peps/pep-3147/#flow-chart
>   (*.pyc -> ./__pycache__)
> - http://raulcd.com/how-python-caches-compiled-bytecode.html
>
> You could add a hash of the .py source file in the header of the
> .pyc/.pyo object (as proposed):
>
> - The overhead of this hashing would be a significant performance
>   regression
> - Instead, today, the build step can just pyclean or build a
>   .zip/.whl/.PEX which is expected to be a fresh build

The problem is not a lack of options to prevent it: the simplest fix is to
delete the .pyc file, which is easy to do once you spot it. The problem is
that the mismatch happens randomly in a normal workflow.
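
For context, the staleness check CPython 2 actually performs is an mtime
comparison against a field in the .pyc header. Here is a minimal sketch,
assuming the layout described in the Ned Batchelder post quoted above
(4-byte magic number, then a 4-byte little-endian source mtime); the
function name is mine:

    import os
    import struct

    def pyc_is_stale(py_path, pyc_path):
        # Read the .pyc header: 4-byte magic number, then the 4-byte
        # little-endian mtime of the source recorded at compile time.
        with open(pyc_path, 'rb') as f:
            f.read(4)                                  # magic number, ignored here
            recorded = struct.unpack('<I', f.read(4))[0]
        # The .pyc is treated as fresh only if the source mtime matches.
        return int(os.stat(py_path).st_mtime) != recorded

This is why the mismatch can appear at random: anything that leaves the
recorded mtime equal while the source differs (an edit within the same
one-second timestamp granularity, a tool restoring an older mtime) makes a
stale .pyc look fresh, and in Python 2 an orphaned .pyc with no matching
.py is imported with no check at all.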
To get an idea of the overhead of the whole hashing procedure, I ran the
following script:

    from time import time
    from zlib import adler32 as h

    t2 = time()
    import decimal
    print(decimal.__file__)
    c1 = time() - t2          # time to import the module

    t1 = time()
    r = h(open(decimal.__file__, 'rb').read())
    c2 = time() - t1          # time to checksum the module's file
    print(c2, c1, c2 / c1)

decimal was chosen because it is the biggest file of the standard library.
Over 20 runs, the overhead was always between 1% and 1.5% of the import
time. So yes, the overhead on the import process is measurable, but it is
very small; by consequence, I would not call it significant. Moreover, the
import process is only a part (and not the biggest one) of the whole
startup. Unlike in my first mail, I now consider only a non-cryptographic
hash/checksum, as the only aim is to prevent an accidental mismatch
between the .pyc and the .py file.
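
To illustrate what the proposed check could look like, here is a minimal
sketch under my own assumptions: a 32-bit adler32 of the source bytes
stored in a (hypothetical) header field of the .pyc, recomputed and
compared at import time. The names are mine, not an actual CPython API:

    from zlib import adler32

    def source_checksum(py_path):
        # Checksum of the raw source bytes, masked to an unsigned 32-bit
        # value so it fits a fixed-width header field.
        with open(py_path, 'rb') as f:
            return adler32(f.read()) & 0xFFFFFFFF

    def pyc_matches_source(py_path, stored_checksum):
        # stored_checksum would come from the hypothetical header field;
        # on a mismatch the importer would recompile from source.
        return source_checksum(py_path) == stored_checksum

This is essentially what the benchmark above measures: one extra read of
the source plus one adler32 per import. adler32 gives no protection
against deliberate tampering, but that is fine here, since the aim is only
to catch accidental mismatches.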
