alamb opened a new issue #733:
URL: https://github.com/apache/arrow-datafusion/issues/733


   **Describe the bug**
   The output of `EXPLAIN VERBOSE` does not include all of the optimizer passes, 
nor does it show the final physical plan.
   
   **To Reproduce**
   Run `EXPLAIN VERBOSE SELECT ...`
   
   **Expected behavior**
   I expect all the optimizer passes to be shown, as well as the physical plan. 
Currently only `projection_push_down` and `simplify_expressions` are shown, even 
though I know (from putting `println` in the code) that other passes such as 
`aggregate_statistics` are being run.
   
   **Additional context**
   I was working to add some tests in IOx based on explain plans, and I expected 
to see the results of statistics replacement in the explain plan (that is, I 
expected to see `count(*)` rewritten to `num_rows` by AggregateStatistics in 
https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/optimizer/aggregate_statistics.rs#L41).
   
   Not only was the optimizer pass missing from the `EXPLAIN VERBOSE` output, but 
its results were not reflected in the explain plan either.
   
   Here is an example of what came out:
   ```
   : EXPLAIN VERBOSE SELECT count(*) from h2o;
   
   +-----------------------------------------+-----------------------------------------------------------------------------+
   | plan_type                               | plan                                                                        |
   +-----------------------------------------+-----------------------------------------------------------------------------+
   | logical_plan                            | Projection: #COUNT(UInt8(1))                                                |
   |                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
   |                                         |     TableScan: h2o projection=None                                          |
   | logical_plan after projection_push_down | Projection: #COUNT(UInt8(1))                                                |
   |                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
   |                                         |     TableScan: h2o projection=Some([0])                                     |
   | logical_plan after simplify_expressions | Projection: #COUNT(UInt8(1))                                                |
   |                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
   |                                         |     TableScan: h2o projection=Some([0])                                     |
   | physical_plan                           | ProjectionExec: expr=[COUNT(UInt8(1))@0 as COUNT(UInt8(1))]                 |
   |                                         |   HashAggregateExec: mode=Final, gby=[], aggr=[COUNT(UInt8(1))]             |
   |                                         |     HashAggregateExec: mode=Partial, gby=[], aggr=[COUNT(UInt8(1))]         |
   |                                         |       ProjectionExec: expr=[city@0 as city]                                 |
   |                                         |         DeduplicateExec: [city@0 ASC,state@1 ASC,time@2 ASC]                |
   |                                         |           SortExec: [city@0 ASC,state@1 ASC,time@2 ASC]                     |
   |                                         |             IOxReadFilterNode: table_name=h2o, chunks=1 predicate=Predicate |
   +-----------------------------------------+-----------------------------------------------------------------------------+
   ```
   
   Here is what should have happened (note the removal of the actual scan). I 
got this output after adding a call to `optimize_explain` in AggregateStatistics:
   
   ```
   
   +-----------------------------------------+-------------------------------------------------------------+
   | plan_type                               | plan                                                        |
   +-----------------------------------------+-------------------------------------------------------------+
   | logical_plan                            | Projection: #COUNT(UInt8(1))                                |
   |                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]         |
   |                                         |     TableScan: h2o projection=None                          |
   | logical_plan after aggregate_statistics | Projection: #COUNT(UInt8(1))                                |
   |                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
   |                                         |     EmptyRelation                                           |
   | logical_plan after projection_push_down | Projection: #COUNT(UInt8(1))                                |
   |                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
   |                                         |     EmptyRelation                                           |
   | logical_plan after simplify_expressions | Projection: #COUNT(UInt8(1))                                |
   |                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
   |                                         |     EmptyRelation                                           |
   | physical_plan                           | ProjectionExec: expr=[COUNT(UInt8(1))@0 as COUNT(Uint8(1))] |
   |                                         |   ProjectionExec: expr=[3 as COUNT(Uint8(1))]               |
   |                                         |     EmptyExec: produce_one_row=true                         |
   +-----------------------------------------+-------------------------------------------------------------+
   ```
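
To make the expected behavior concrete, here is a minimal sketch of a pass driver that records the plan after every rule, so no pass can be left out of the verbose output by forgetting to record itself. All names here (`Plan`, `StringifiedPlan`, `OptimizerRule`, `optimize_and_record`, and the two toy rules) are hypothetical stand-ins for illustration and are not DataFusion's actual types or API.

```rust
// Hypothetical stand-in for a logical plan; NOT DataFusion's real type.
#[derive(Clone, Debug, PartialEq)]
struct Plan(String);

/// One row of verbose explain output: (plan_type, plan).
#[derive(Debug, PartialEq)]
struct StringifiedPlan {
    plan_type: String,
    plan: String,
}

/// Hypothetical optimizer rule interface, for illustration only.
trait OptimizerRule {
    fn name(&self) -> &str;
    fn optimize(&self, plan: &Plan) -> Plan;
}

/// Runs every rule in order and records the plan after each one.
/// The recording lives in the driver rather than in the individual rules.
fn optimize_and_record(
    rules: &[Box<dyn OptimizerRule>],
    plan: Plan,
) -> (Plan, Vec<StringifiedPlan>) {
    let mut rows = vec![StringifiedPlan {
        plan_type: "logical_plan".to_string(),
        plan: plan.0.clone(),
    }];
    let mut current = plan;
    for rule in rules {
        current = rule.optimize(&current);
        rows.push(StringifiedPlan {
            plan_type: format!("logical_plan after {}", rule.name()),
            plan: current.0.clone(),
        });
    }
    (current, rows)
}

/// Toy rule: replaces the scan with an empty relation (string rewrite only).
struct ToyAggregateStatistics;
impl OptimizerRule for ToyAggregateStatistics {
    fn name(&self) -> &str {
        "aggregate_statistics"
    }
    fn optimize(&self, plan: &Plan) -> Plan {
        Plan(plan.0.replace("TableScan: h2o", "EmptyRelation"))
    }
}

/// Toy rule: leaves the plan unchanged.
struct ToyProjectionPushDown;
impl OptimizerRule for ToyProjectionPushDown {
    fn name(&self) -> &str {
        "projection_push_down"
    }
    fn optimize(&self, plan: &Plan) -> Plan {
        plan.clone()
    }
}

/// Builds the toy rule list and runs it over a one-node toy plan.
fn demo() -> (Plan, Vec<StringifiedPlan>) {
    let rules: Vec<Box<dyn OptimizerRule>> = vec![
        Box::new(ToyAggregateStatistics),
        Box::new(ToyProjectionPushDown),
    ];
    optimize_and_record(&rules, Plan("TableScan: h2o".to_string()))
}

fn main() {
    let (_final_plan, rows) = demo();
    for row in &rows {
        println!("{} | {}", row.plan_type, row.plan);
    }
}
```

With this shape, the explain output gets one row per rule in execution order, and the plan that reaches physical planning is the same one shown in the last logical row.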
   
   


