Yikun edited a comment on pull request #34717:
URL: https://github.com/apache/spark/pull/34717#issuecomment-979659317


   Just to start the discussion: using the SQL below, based on [1], we can get 
the download statistics for every pandas version over the last 3 months.
   ```SQL
   SELECT
     file.version AS file_version,
     COUNT(*) AS num_downloads,
   FROM `the-psf.pypi.file_downloads`
   WHERE file.project = 'pandas'
   AND 
     -- Only query the last 3 months of history
     DATE(timestamp)
       BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH)
       AND CURRENT_DATE()
   GROUP BY `file_version`
   ORDER BY `num_downloads` DESC
   ```
   
   Here are the top 20 versions, which account for about 77% of all downloads. 
The complete result can be found 
[here](https://gist.github.com/Yikun/a1d8168c316966db520cf8f8a43ff0bf):
   rank | version | downloads | percent
   -- | -- | -- | --
   1 | 0.25.3 | 35149221 | 14.28%
   2 | 1.1.5 | 28722806 | 11.67%
   3 | 1.3.4 | 20944236 | 8.51%
   4 | 1.3.3 | 16861573 | 6.85%
   5 | 0.24.2 | 13235233 | 5.38%
   6 | 1.0.5 | 9201989 | 3.74%
   7 | 1.3.2 | 9077326 | 3.69%
   8 | 1.2.5 | 7902532 | 3.21%
   9 | 1.2.4 | 5754284 | 2.34%
   10 | 1.1.4 | 5710439 | 2.32%
   11 | 1.1.0 | 4760847 | 1.93%
   12 | 1.1.2 | 4621441 | 1.88%
   13 | 1.2.3 | 4607043 | 1.87%
   14 | 1.0.3 | 4601230 | 1.87%
   15 | 0.23.4 | 4251044 | 1.73%
   16 | 0.25.0 | 3862673 | 1.57%
   17 | 1.2.1 | 2952346 | 1.20%
   18 | 1.0.1 | 2690006 | 1.09%
   19 | 0.22.0 | 2680710 | 1.09%
   20 | 1.2.0 | 2645339 | 1.07%
   21 | 0.24.1 | 2635411 | 1.07%
   
   - More than 60% of downloads in the last 3 months were of 1.x versions
   - More than 26% of downloads were of versions 0.23.2 through 1.0
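
As a rough way to check groupings like the two above, here is a minimal Python sketch that sums downloads per `major.minor` series. It assumes only the rows copied from the table in this comment, so its totals cover the listed versions rather than the full result in the gist; the helper names `series` and `downloads_by_series` are made up for illustration.

```python
from collections import defaultdict

# (version, downloads) rows copied from the table in this comment.
# The full result in the gist has more rows; these are only the listed ones.
ROWS = [
    ("0.25.3", 35149221),
    ("1.1.5", 28722806),
    ("1.3.4", 20944236),
    ("1.3.3", 16861573),
    ("0.24.2", 13235233),
    ("1.0.5", 9201989),
    ("1.3.2", 9077326),
    ("1.2.5", 7902532),
    ("1.2.4", 5754284),
    ("1.1.4", 5710439),
    ("1.1.0", 4760847),
    ("1.1.2", 4621441),
    ("1.2.3", 4607043),
    ("1.0.3", 4601230),
    ("0.23.4", 4251044),
    ("0.25.0", 3862673),
    ("1.2.1", 2952346),
    ("1.0.1", 2690006),
    ("0.22.0", 2680710),
    ("1.2.0", 2645339),
    ("0.24.1", 2635411),
]


def series(version: str) -> str:
    """Return the major.minor series of a version, e.g. '1.3.4' -> '1.3'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def downloads_by_series(rows):
    """Sum download counts per major.minor series."""
    totals = defaultdict(int)
    for version, count in rows:
        totals[series(version)] += count
    return dict(totals)


by_series = downloads_by_series(ROWS)

# Print the series from most to least downloaded (listed rows only).
for name, count in sorted(by_series.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count}")
```

Running the same grouping over the complete gist result, instead of only the listed rows, would be needed to reproduce the overall percentages.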
        
   
   
   [1] https://packaging.python.org/guides/analyzing-pypi-package-downloads/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


