ValentinC-BR opened a new issue, #20533: URL: https://github.com/apache/superset/issues/20533
The boxplot samples the data by reading only the first 10k rows. Those are simply the first 10k rows returned by the database, which is not a random sample at all, so the displayed boxplot is completely wrong (the median, Q1, and Q3 are **really** different).

#### How to reproduce the bug

1. Make a boxplot on a dataset containing more than 10k rows
2. Display the results

### Expected results

The boxplot should be computed on the whole dataset (using a GROUP BY in the query, for instance).

### Actual results

The boxplot is computed on the first 10k rows only.

#### Screenshots

/

### Environment

(please complete the following information):

- browser type and version: Google Chrome Version 103.0.5060.53 (Official Build) (x86_64)
- superset version: 1.5
- python version: 3.9
- node.js version: /
- any feature flags active: /

### Checklist

Make sure to follow these steps before submitting your issue - thank you!

- [ ] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [X] I have reproduced the issue with at least the latest released version of superset.
- [X] I have checked the issue tracker for the same issue and I haven't found one similar.

### Additional context

/
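To illustrate why the first 10k rows are not a usable sample, here is a minimal sketch in plain Python. It is not Superset code: the table contents, the `ROW_LIMIT` name, and the `five_number_summary` helper are assumptions made for the example, and only the 10k figure comes from the report above. It assumes the rows happen to be stored in ascending order of the metric, which is the worst case for this kind of truncation.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical table: 50,000 rows stored in ascending order of the metric.
rows = list(range(50_000))

# Assumed cutoff, mirroring the 10k first rows described in this issue.
ROW_LIMIT = 10_000

full = five_number_summary(rows)
truncated = five_number_summary(rows[:ROW_LIMIT])

print("all rows:      ", full)
print("first 10k rows:", truncated)
```

With this ordering, the median of the truncated data falls below Q1 of the full data, so the two boxplots do not even overlap in their boxes. Computing the quartiles in the database over all rows would avoid the problem.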
