> On Aug 23, 2016, at 7:54 PM, Nathaniel Smith <[email protected]> wrote:
> 
> On Aug 23, 2016 12:57 PM, "Donald Stufft" <[email protected]> wrote:
> >
> [...]
> > However, PyPI does need
> > to do work when a file is uploaded to PyPI. For instance, it needs
> > to verify that the file being uploaded is valid, it needs to ensure
> > that it’s for the project it claims to be for, etc. To do this, PyPI
> > has to know things about the file format itself, and what it can
> > expect from it. One bug that has cropped up from time to time
> > is people accidentally uploading a package whose internal metadata
> > says version “1.0” when they registered it with PyPI as version
> > “1.0a1” or something like that, which causes a lot
> > of the tooling to do subtly weird and broken things. PyPI should be
> > double checking the internal metadata of these files, but it can’t
> > do that unless it can expect that metadata to exist in those files
> > and it has to implement it for each file type (and then, that has to
> > be maintained).
> 
> Am I understanding correctly that PyPI needs to start peeking inside sdists 
> but hasn't started doing that yet? If that's correct, then I just want to 
> double check that the cost of implementing this upcoming feature has been 
> factored into the .zip-vs-.tar.gz discussion, because code for peeking inside 
> .tar.gz files is presumably harder to write and more expensive to run than 
> code for peeking inside .zip files. (But maybe only negligibly harder; I 
> haven't tried writing such code myself, and uploads are relatively rare 
> compared to downloads.) I guess the worst case would be if it turns out PyPI 
> needs to look at multiple files inside each sdist, where .tar.gz access 
> becomes quadratic unless you're very clever.
> 
> -n
> 
Yes, though I’m not really worried about the time it takes. Uploads happen 
something like 700 times a day, so being a touch slower isn’t the worst thing 
in the world, particularly if it means that our disk space or bandwidth needs 
are lower.
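
To make the upload check concrete, here is a rough sketch of what peeking
inside an sdist could look like for both formats. It is not PyPI's actual
code. The function names (read_pkg_info, version_matches) are made up for
illustration, it assumes the metadata lives in a PKG-INFO file directly
under the archive's single top-level directory, and it compares version
strings exactly instead of normalizing them first.

```python
import email.parser
import io
import tarfile
import zipfile


def read_pkg_info(archive_bytes, filename):
    """Return the PKG-INFO text from an in-memory sdist, or None if absent.

    Hypothetical helper: assumes the layout <name>-<version>/PKG-INFO.
    """
    buf = io.BytesIO(archive_bytes)
    if filename.endswith(".zip"):
        # A zip file carries a central directory, so the member list is
        # available without decompressing any file contents.
        with zipfile.ZipFile(buf) as zf:
            for name in zf.namelist():
                if name.count("/") == 1 and name.endswith("/PKG-INFO"):
                    return zf.read(name).decode("utf-8")
    elif filename.endswith(".tar.gz"):
        # A .tar.gz has no index: members are discovered by decompressing
        # the stream in order, so we scan once and stop at the first match.
        with tarfile.open(fileobj=buf, mode="r:gz") as tf:
            for member in tf:
                if (member.isfile() and member.name.count("/") == 1
                        and member.name.endswith("/PKG-INFO")):
                    return tf.extractfile(member).read().decode("utf-8")
    return None


def version_matches(archive_bytes, filename, claimed_version):
    """Check that the archive's own Version field equals the claimed one."""
    text = read_pkg_info(archive_bytes, filename)
    if text is None:
        return False  # no metadata found, so nothing to verify against
    # PKG-INFO uses RFC 822 style headers, which the email parser can read.
    metadata = email.parser.Parser().parsestr(text)
    return metadata.get("Version") == claimed_version
```

With something like this, an upload registered as “1.0a1” whose PKG-INFO
says “1.0” would be rejected for either format. Reading several files from
a .tar.gz can still be done in one sequential pass, as long as the code
collects everything it needs while iterating instead of seeking to each
member separately.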

—
Donald Stufft



_______________________________________________
Distutils-SIG maillist  -  [email protected]
https://mail.python.org/mailman/listinfo/distutils-sig
