mailmindlin opened a new issue, #22135:
URL: https://github.com/apache/datafusion/issues/22135

   ### Is your feature request related to a problem or challenge?
   
   `ExecutionPlan::metrics()` is not currently reachable through 
`datafusion-ffi`: `FFI_ExecutionPlan` exposes most of the trait (`properties`, 
`children`, `with_new_children`, `name`, `execute`, `repartitioned`) but has no 
function pointer for `metrics()`, so `ForeignExecutionPlan::metrics()` falls 
through to the trait default implementation and always returns `None`.
   
   The user-visible consequences:
   
   - **`EXPLAIN ANALYZE` produces empty metric blocks for FFI plans.**
     `AnalyzeExec` exhausts the input stream and then formats metrics via
     `DisplayableExecutionPlan::with_metrics(...)`. For any plan node loaded
     through `datafusion-ffi`, the rendered output shows no `output_rows`,
     `elapsed_compute`, etc. — even though those metrics are being recorded
     on the producer side.
   - **Anything calling `DisplayableExecutionPlan::with_metrics(...)` on a
     plan tree containing a foreign node is similarly affected** — tooling,
     dashboards, tests, etc.
   
   This makes foreign-loaded operators behave noticeably differently from
   locally-defined operators in a way that's surprising and not documented.
   The expectation, especially given the FFI crate's stated goal of allowing
   extensions to "interface to DataFusion" transparently, is that a foreign
   plan should be observationally indistinguishable from a local plan with
   respect to instrumentation.
   
   
   ### Describe the solution you'd like
   
   Pass metrics across the FFI boundary as a **snapshot**: read all
   atomic-backed counters/gauges/timers into plain integer fields at marshal
   time. Specifically:
   
   1. Add a new function pointer to `FFI_ExecutionPlan` for metrics
   2. Define FFI-stable mirrors of `MetricsSet`, `Metric`, `MetricValue`, etc. 
plus bidirectional `From`
      conversions.
   3. For `MetricValue::Custom { value: Arc<dyn CustomMetricValue> }`,
      marshal `(name, Display output, as_usize())` and reconstruct on the
      consumer side via a small `FfiCustomMetricValue` shim. This preserves
      `Display` (the common case for `EXPLAIN ANALYZE`) and `as_usize()`
      (used for sorting), at the cost of losing `aggregate` and
      `as_any` downcasting.
   
   Snapshot semantics are sufficient because every in-tree caller of
   `.metrics()` reads after the input stream has been exhausted — see
   `AnalyzeExec` exhausting its input before formatting, and
   `DisplayableExecutionPlan` rendering during post-execution display.
   
   ### Describe alternatives you've considered
   
   - **Share live atomics across the FFI boundary.** Expose the underlying
     `Arc<AtomicUsize>` such that the consumer can poll updated values without
     re-calling `metrics()`. Rejected: no in-tree caller polls metrics during
     streaming, and sharing Arcs across the FFI boundaries is complex for 
little benefit.
   
   - **A separate `FFI_ExecutionPlanMetrics` struct queried independently
     from `FFI_ExecutionPlan`.** Preserves the existing ABI but forces
     consumers to coordinate two structs and complicates lifetime/identity
     reasoning. The function-pointer-on-the-struct pattern matches every
     other method exposed on `FFI_ExecutionPlan` and reads more naturally.
   
   - **A full FFI trait object for `CustomMetricValue`.** Define an
     `FFI_CustomMetricValue` with function pointers for `display`, `as_usize`,
     `aggregate`, `clone`, and `release`. Rejected for the initial pass:
     significant surface area, while `Display` + `as_usize()` covers the only
     in-tree consumer of `Custom` metrics (display via `EXPLAIN ANALYZE`).
     Can be added later if extension authors need round-trippable custom
     metrics.
   
   ### Additional context
   
   I have a PR written for this issue that I will file shortly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to