On Sun, Aug 14, 2016 at 9:35 PM, Xavier Combelle <[email protected]> wrote:
> On 15/08/2016 02:45, Wes Turner wrote:
> >
> > You can add a `make clean` build step:
> >
> > pyclean:
> >     find . -name '*.pyc' -delete
> >
> > You can delete all .pyc files:
> >
> > - $ find . -name '*.pyc' -delete
> > - http://manpages.ubuntu.com/manpages/precise/man1/pyclean.1.html
> >   # .pyc, .pyo
> >
> > You can rebuild all .pyc files (for a given directory):
> >
> > - $ python -m compileall -h
> > - https://docs.python.org/2/library/compileall.html
> > - https://docs.python.org/3/library/compileall.html
> >
> > You can, instead of building .pyc, build .pyo:
> >
> > - https://docs.python.org/2/using/cmdline.html#envvar-PYTHONOPTIMIZE
> > - https://docs.python.org/2/using/cmdline.html#cmdoption-O
> >
> > You can avoid writing .pyc or .pyo files with PYTHONDONTWRITEBYTECODE / -B:
> >
> > - https://docs.python.org/2/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE
> > - https://docs.python.org/2/using/cmdline.html#cmdoption-B
> > - If the files already exist, though, they are still used:
> > - https://docs.python.org/3/reference/import.html
> >
> > You can build a PEX (which rebuilds .pyc files) and test/deploy that:
> >
> > - https://github.com/pantsbuild/pex#integrating-pex-into-your-workflow
> > - https://pantsbuild.github.io/python-readme.html#more-about-python-tests
> >
> > How .pyc files currently work:
> >
> > - http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html
> > - https://www.python.org/dev/peps/pep-3147/#flow-chart (*.pyc ->
> >   ./__pycache__)
> > - http://raulcd.com/how-python-caches-compiled-bytecode.html
> >
> > You could add a hash of the .py source file in the header of the
> > .pyc/.pyo object (as proposed):
> >
> > - The overhead of this hashing would be a significant performance
> >   regression
> > - Instead, today, the build step can just pyclean or build a
> >   .zip/.WHL/.PEX which is expected to be a fresh build
>
> The problem is not which options you have to prevent the problem (the
> simplest being to delete the .pyc file, which is easy to do once you
> spot it). The problem is that it randomly happens in a normal workflow.

IIUC, the timestamp in the .pyc header is designed to prevent this
occurrence?

Reasons that the modification-timestamp comparison could be off (a small
demonstration is sketched below):

- Time change
- Daylight savings time
- NTP drift adjustment?

> To get an idea of the overhead of the whole hashing procedure, I ran
> the following script:
>
> from time import time
> from zlib import adler32 as h
>
> t2 = time()
> import decimal
> print(decimal.__file__)
> c1 = time() - t2
>
> t1 = time()
> r = h(open(decimal.__file__, 'rb').read())
> c2 = time() - t1
> print(c2, c1, c2 / c1)
>
> decimal was chosen because it is the biggest file in the standard
> library. Over 20 runs, the overhead was always between 1% and 1.5%.
> So yes, the overhead on the import process is measurable, but it is
> very small; consequently, I would not call it significant. Moreover,
> the import process is only a part (and not the biggest one) of the
> whole.

I agree that 1 to 1.5% is not significant. (A more stable way to measure
the checksum cost is sketched below.)

> Unlike in my first mail, I now consider only a non-cryptographic
> hash/checksum, since the only aim is to prevent an accidental mismatch
> between the .pyc and the .py file.
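
A note on the compileall option above: it can also be driven from Python
rather than the CLI. A minimal sketch (the 'src/' path is just an example):

    import compileall

    # force=True recompiles even when the recorded timestamps still look
    # current, which sidesteps exactly the stale-.pyc problem discussed above.
    compileall.compile_dir('src/', force=True)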
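
To make the header layout concrete, here is a minimal sketch that reads the
fields, assuming the CPython 3.3-3.6 layout (4-byte magic, 4-byte source
mtime, 4-byte source size) described in the links above; 'mymodule.py' is
hypothetical, and Python 2 lacks the size field:

    import struct
    import time
    import importlib.util

    pyc_path = importlib.util.cache_from_source('mymodule.py')  # -> __pycache__/...

    with open(pyc_path, 'rb') as f:
        magic = f.read(4)                        # version-specific magic number
        mtime, = struct.unpack('<I', f.read(4))  # source mtime recorded at compile time
        size, = struct.unpack('<I', f.read(4))   # source size in bytes (3.3+ only)

    print('magic:', repr(magic))
    print('source mtime:', time.ctime(mtime), '; source size:', size, 'bytes')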
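
And the demonstration promised above of how the timestamp check can be
fooled: if a clock change (or a tool that restores modification times)
gives an edited .py file its old mtime back, the stale .pyc is imported
silently. File names are illustrative; run it as a script from an empty
directory so 'demo' is importable:

    import os
    import py_compile

    with open('demo.py', 'w') as f:
        f.write('VALUE = 1\n')
    py_compile.compile('demo.py')      # writes a .pyc recording demo.py's mtime

    old = os.stat('demo.py')
    with open('demo.py', 'w') as f:
        f.write('VALUE = 2\n')         # same length, so the 3.3+ size check passes too
    os.utime('demo.py', (old.st_atime, old.st_mtime))  # restore the old mtime

    import demo
    print(demo.VALUE)                  # prints 1 -- the stale bytecode was used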
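
For the proposal itself, the validity check could look roughly like this;
the stored_checksum argument is hypothetical, since today's .pyc header
records a timestamp rather than a hash:

    from zlib import adler32

    def source_matches(py_path, stored_checksum):
        """Return True if the .py file still matches the checksum that was
        (hypothetically) recorded in the .pyc header at compile time."""
        with open(py_path, 'rb') as f:
            # & 0xFFFFFFFF normalizes the result across Python 2 and 3
            return (adler32(f.read()) & 0xFFFFFFFF) == stored_checksum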
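
As for measuring: a more stable way to time the checksum cost in isolation
is timeit, which repeats the hash and takes the best run rather than
relying on a single time() delta (this times only the hash, not the import):

    import timeit
    import decimal
    from zlib import adler32

    data = open(decimal.__file__, 'rb').read()
    # Best of 5 runs of 1000 hashes each; less noisy than a single delta.
    per_call = min(timeit.repeat(lambda: adler32(data), number=1000, repeat=5)) / 1000
    print('adler32 of %d bytes: %.3f ms' % (len(data), per_call * 1000))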
_______________________________________________
Python-ideas mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
