ValentinC-BR opened a new issue, #20532: URL: https://github.com/apache/superset/issues/20532
The histogram chart samples the data by reading only the first 10k rows. These are simply the first 10k rows returned by the database, which is not a random sample at all, so the resulting histogram is wrong.

#### How to reproduce the bug

1. Make a histogram on a database containing more than 10k rows
2. Display the results

### Expected results

The histogram should be computed over the entire database (for instance, by using a GROUP BY clause in the query).

### Actual results

The histogram is computed from the first 10k rows only.

#### Screenshots

/

### Environment

(please complete the following information):

- browser type and version: Google Chrome Version 103.0.5060.53 (Official Build) (x86_64)
- superset version: 1.5
- python version: 3.9
- node.js version: /
- any feature flags active: /

### Checklist

Make sure to follow these steps before submitting your issue - thank you!

- [ ] I have checked the superset logs for python stacktraces and included it here as text if there are any.
- [X] I have reproduced the issue with at least the latest released version of superset.
- [X] I have checked the issue tracker for the same issue and I haven't found one similar.

### Additional context

/

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
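To illustrate the difference between binning the first 10k rows and the GROUP BY approach suggested under "Expected results", here is a minimal sketch using Python's built-in `sqlite3` module. It is not Superset's implementation: the table `measurements`, the column `value`, and all three helper functions are hypothetical names chosen for this example, and the demo data is deliberately inserted in sorted order so that the bias is visible.

```python
import sqlite3
from collections import Counter


def build_demo_db(n_rows=50_000):
    """Create an in-memory table whose rows are stored in sorted order (assumed demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (value INTEGER)")
    # Values run from 0 to 99, with 500 consecutive rows per value.
    conn.executemany(
        "INSERT INTO measurements (value) VALUES (?)",
        ((i // 500,) for i in range(n_rows)),
    )
    return conn


def histogram_first_rows(conn, table, column, bin_width, limit=10_000):
    """Bin only the first `limit` rows client-side, as described in the report."""
    # Identifiers are interpolated directly: only pass trusted names here.
    rows = conn.execute(
        f"SELECT {column} FROM {table} ORDER BY rowid LIMIT ?", (limit,)
    )
    return dict(Counter(int(v / bin_width) for (v,) in rows))


def histogram_group_by(conn, table, column, bin_width):
    """Bin every row inside the database; only one row per bin is returned."""
    query = (
        f"SELECT CAST({column} / ? AS INTEGER) AS bin, COUNT(*) AS n "
        f"FROM {table} GROUP BY bin ORDER BY bin"
    )
    return dict(conn.execute(query, (bin_width,)).fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print("first 10k rows:", histogram_first_rows(conn, "measurements", "value", 10))
    print("GROUP BY      :", histogram_group_by(conn, "measurements", "value", 10))
```

With this demo data, the first-rows histogram only sees the two lowest bins, while the GROUP BY query counts all ten. The aggregated query also transfers one row per bin rather than one row per record, so it does not need a row limit to stay cheap on the client side.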
