[ 
https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832702#comment-16832702
 ] 

ASF GitHub Bot commented on DRILL-7222:
---------------------------------------

kkhatua commented on issue #1779: DRILL-7222: Visualize estimated and actual 
row counts for a query
URL: https://github.com/apache/drill/pull/1779#issuecomment-489185185
 
 
   @arina-ielchiieva 
   
   The motivation for this PR comes from the need for engineers to analyze 
queries as plans change due to the introduction of statistics. An initial 
thought was to add another column, but I think we already have a lot of 
columns. I've tried to figure out which columns to trim, but almost all seem 
relevant. I know we might come back to doing similar things with Resource 
Management as well, where we'll again need to compare estimates against 
actuals. So adding more columns is not practical.
   
   Showing the estimates based on whether a planning decision was made using 
statistics is not possible unless the profile JSON itself carries some hint 
that statistics were used.
   
   Also, I added the toggle button to provide a mechanism to hide the estimates 
by default (another reason not to add a column). I'm worried that users will 
get the impression that there are issues with Drill because of estimates being 
wildly off. Even if they are sufficiently accurate (like NDV-based estimates 
vs actual), most users don't have insight into how the stats are being used.
   
   Users who do have insight into such things can use the estimates to tune 
parameters (e.g. broadcast or selectivity thresholds) to force changes in 
plans that are sub-optimal. Based on this, I thought we should go with the 
parentheses option for showing the estimated row counts.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Visualize estimated and actual row counts for a query
> -----------------------------------------------------
>
>                 Key: DRILL-7222
>                 URL: https://issues.apache.org/jira/browse/DRILL-7222
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Web Server
>    Affects Versions: 1.16.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>            Priority: Major
>              Labels: doc-impacting, user-experience
>             Fix For: 1.17.0
>
>
> With statistics in place, it would be useful to have the *estimated* rowcount 
> alongside the *actual* rowcount in the query profile's operator overview.
> We can extract this from the Physical Plan section of the profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
