> On Aug 23, 2016, at 7:54 PM, Nathaniel Smith <[email protected]> wrote:
>
> On Aug 23, 2016 12:57 PM, "Donald Stufft" <[email protected]> wrote:
>
> > [...]
> > However, PyPI does need
> > to do work when a file is uploaded to PyPI. For instance, it needs
> > to verify that the file being uploaded is valid, it needs to ensure
> > that it’s for the project it claims to be for, etc. To do this, PyPI
> > has to know things about the file format itself, and what it can
> > expect from it. One bug that has cropped up from time to time again
> > is people accidentally uploading a package that inside it contains
> > version say “1.0”, but when they registered it with PyPI they told
> > PyPI it was version “1.0a1” or something like that, which causes a lot
> > of the tooling to do subtly weird and broken things. PyPI should be
> > double checking the internal metadata of these files, but it can’t
> > do that unless it can expect that metadata to exist in those files
> > and it has to implement it for each file type (and then, that has to
> > be maintained).
>
> Am I understanding correctly that PyPI needs to start peeking inside sdists
> but hasn't started doing that yet? If that's correct, then I just want to
> double check that the cost of implementing this upcoming feature has been
> factored into the .zip-vs-.tar.gz discussion, because code for peeking inside
> .tar.gz files is presumably harder to write and more expensive to run than
> code for peeking inside .zip files. (But maybe only negligibly harder, I
> haven't tried writing such code myself, and uploads are relatively rare
> compared to downloads.) I guess the worst case would be if it turns out pypi
> needs to look at multiple files inside each sdist, where .tar.gz access
> becomes quadratic unless you're very clever.
>
> -n
>

Yes, though I’m not real worried about the time it takes. Uploading happens
something like 700 times a day, so being a touch slower isn’t the worst thing
in the world, particularly if it means that our disk space or bandwidth needs
are less.
— Donald Stufft
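
For illustration only, here is a rough sketch of the kind of check discussed above: read the version recorded inside an uploaded sdist and compare it with the version the uploader claimed. This is not PyPI's actual code. The function names (read_pkg_info, internal_version, version_matches) are made up for this example, and it assumes the sdist has a single top-level directory containing a PKG-INFO file. It also compares version strings exactly, where a real check would presumably normalize them first.

```python
import io
import tarfile
import zipfile
from email.parser import Parser


def read_pkg_info(fileobj, filename):
    """Return the text of the top-level PKG-INFO in an sdist archive.

    Hypothetical helper: assumes a layout like "name-version/PKG-INFO".
    """
    if filename.endswith(".zip"):
        with zipfile.ZipFile(fileobj) as zf:
            # A zip has a central directory, so one member can be read
            # directly without decompressing the rest of the archive.
            for name in zf.namelist():
                if name.count("/") == 1 and name.endswith("/PKG-INFO"):
                    return zf.read(name).decode("utf-8")
    elif filename.endswith((".tar.gz", ".tgz")):
        # A gzipped tar has no index, so members are scanned in order.
        # Reading everything needed in this single pass avoids repeated
        # rescans when more than one file is wanted.
        with tarfile.open(fileobj=fileobj, mode="r:gz") as tf:
            for member in tf:
                if (member.isfile() and member.name.count("/") == 1
                        and member.name.endswith("/PKG-INFO")):
                    return tf.extractfile(member).read().decode("utf-8")
    raise ValueError("no top-level PKG-INFO found in %s" % filename)


def internal_version(fileobj, filename):
    """Parse PKG-INFO (RFC 822 style headers) and return its Version field."""
    return Parser().parsestr(read_pkg_info(fileobj, filename))["Version"]


def version_matches(fileobj, filename, claimed_version):
    """True if the archive's internal version equals the claimed one.

    Simplification: exact string comparison, no version normalization.
    """
    return internal_version(fileobj, filename) == claimed_version
```

With something like this, an upload registered as "1.0a1" whose PKG-INFO says "Version: 1.0" would be flagged, whichever of the two archive formats it arrived in.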
_______________________________________________
Distutils-SIG maillist - [email protected]
https://mail.python.org/mailman/listinfo/distutils-sig
