alamb opened a new issue #866: URL: https://github.com/apache/arrow-datafusion/issues/866
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

I would like to be able to get an overall understanding of where time is being spent during query execution via `EXPLAIN ANALYZE` (see https://github.com/apache/arrow-datafusion/pull/858) so that I know where to focus additional performance optimization activities.

Additionally, I would like to be able to graph a stacked flamechart such as the following (see more details on https://github.com/influxdata/influxdb_iox/issues/2273) that shows when the different operators ran in relation to each other.

<img width="689" alt="Screen Shot 2021-08-12 at 11 14 33 AM" src="https://user-images.githubusercontent.com/490673/129237447-834838c8-aa97-42c4-b905-6114d28ca98b.png">

**Describe the solution you'd like**

I would like to instrument all operators (`impl ExecutionPlan`) included in DataFusion so that they produce at least the following metrics:

1. output_rows: total rows produced at the output of the operator
2. cpu_nanos: the total time spent (not including any time spent in the input stream or waiting to be scheduled)
3. start_time: the wall clock time at which `execute` was run
4. stop_time: the wall clock time at which the last output record batch was produced

I plan to use the `SQLMetric` infrastructure for doing so, probably after https://github.com/apache/arrow-datafusion/issues/679

**Describe alternatives you've considered**

Open questions:

1. Handling the output of different partitions (each operator can produce multiple output partitions / streams, and it is not yet clear to me if recording stats on a per-partition level is important)
2. How to handle operators that don't provide metrics, such as potentially user-defined ones (probably will fill in with their parents)

**Additional context**

Related work:

* Metrics improvement: https://github.com/apache/arrow-datafusion/issues/679
* IOx Usecase: https://github.com/influxdata/influxdb_iox/issues/2273

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
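
For illustration only, the four metrics proposed in the issue could be recorded along these lines. This is a minimal sketch using only the Rust standard library, not DataFusion's `SQLMetric` API: `OperatorMetrics`, `Instrumented`, `next_batch`, and `run_example` are hypothetical names, a `Vec<i64>` stands in for a record batch, and `cpu_nanos` is approximated by elapsed wall time around the operator's own work rather than true CPU time.

```rust
use std::time::{Instant, SystemTime};

/// Hypothetical container for the four metrics named in the issue.
#[derive(Debug, Default)]
struct OperatorMetrics {
    output_rows: usize,
    cpu_nanos: u128,
    start_time: Option<SystemTime>,
    stop_time: Option<SystemTime>,
}

/// Wraps an input of batches and records metrics for one operator.
/// `Vec<i64>` is an assumed stand-in for a record batch.
struct Instrumented<I> {
    input: I,
    metrics: OperatorMetrics,
}

impl<I: Iterator<Item = Vec<i64>>> Instrumented<I> {
    /// Plays the role of `execute`: start_time is captured here.
    fn new(input: I) -> Self {
        let metrics = OperatorMetrics {
            start_time: Some(SystemTime::now()),
            ..Default::default()
        };
        Self { input, metrics }
    }

    /// Pulls one batch from the input, applies `work`, and updates metrics.
    fn next_batch(&mut self, work: impl FnOnce(Vec<i64>) -> Vec<i64>) -> Option<Vec<i64>> {
        // Time spent waiting on the input is deliberately not counted.
        let batch = self.input.next()?;
        let timer = Instant::now();
        let out = work(batch);
        // Elapsed wall time around the operator's own work (approximation).
        self.metrics.cpu_nanos += timer.elapsed().as_nanos();
        self.metrics.output_rows += out.len();
        // Overwritten on every batch, so it ends up at the last one produced.
        self.metrics.stop_time = Some(SystemTime::now());
        Some(out)
    }
}

/// Runs a filter-like operator (keep positive values) over two batches.
fn run_example() -> OperatorMetrics {
    let input = vec![vec![1, -2, 3], vec![-4, 5]].into_iter();
    let mut op = Instrumented::new(input);
    while op
        .next_batch(|b| b.into_iter().filter(|v| *v > 0).collect())
        .is_some()
    {}
    op.metrics
}
```

Calling `run_example()` returns metrics with `output_rows` equal to 3 for the two sample batches. The sketch keeps one set of metrics per wrapped stream, so the open question about per-partition versus aggregated stats is left undecided.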
