On Tue, Apr 21, 2015 at 10:55 AM Donald Stufft <don...@stufft.io> wrote:
> Just thought I'd share this since it shows how what people are using to
> download things from PyPI have changed over the past year. Of particular
> interest to most people will be the final graphs showing what percentage of
> downloads from PyPI are for Python 3.x or 2.x.
>
> As always it's good to keep in mind, "Lies, Damn Lies, and Statistics". I've
> tried not to bias the results too much, but some bias is unavoidable. Of
> particular note is that a lot of these numbers come from pip, and as of
> version 6.0 of pip, pip will cache downloads by default. This would mean that
> older versions of pip are more likely to "inflate" the downloads than newer
> versions since they don't cache by default. In addition if a project has a
> file which is used for both 2.x and 3.x and they do a ``pip install`` on the
> 2.x version first then it will show up as counted under 2.x but not 3.x due
> to caching (and of course the inverse is true, if they install on 3.x first
> it won't show up on 2.x).
>
> Here's the link: https://caremad.io/2015/04/a-year-of-pypi-downloads/
>
> Anyways, I'll have access to the data set for another day or two before I
> shut down the (expensive) server that I have to use to crunch the numbers so
> if there's anything anyone else wants to see before I shut it down, speak up
> soon.

Thanks! I like your focus on particular packages of note such as django and
requests.

How do CDNs influence these "lies"? I thought the download counts on PyPI
were effectively meaningless due to CDN mirrors fetching and hosting things?
Do we have user-agent logs from all PyPI package CDN mirrors or just from the
master?

-gps
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig