LantaoJin opened a new issue, #82: URL: https://github.com/apache/datafusion-java/issues/82
### Is your feature request related to a problem or challenge? For any embedder running multi-tenant DataFusion workloads, two operational signals are missing from the Java binding today: 1. **Per-query memory** — there is no way to ask DataFusion how many bytes a specific in-flight query is currently holding, or what its peak was. `SessionContextBuilder.memoryLimit(...)` (PR #28) lets you cap the *global* pool, but if a tenant blows past their fair-share allocation you can't tell whose query it is. Without this you cannot do fair-share scheduling, abuse detection, or correctly attribute OOMs. 2. **Tokio runtime stats** — the JNI library drives a single shared multi-threaded Tokio runtime in `lib.rs`. There is no Java-side window into how busy that runtime is right now: how many worker threads exist, how saturated they are, how deep the global queue is. Embedders that report node-level health (e.g. an OpenSearch `_nodes/stats` endpoint) need these numbers; today they have to hand-roll a parallel native bridge to surface them. Both problems share the same FFI shape: snapshot a small struct of counters across the boundary on demand. They are bundled here so the design conversation only needs to happen once. ### Describe the solution you'd like Two narrow additions, both opt-in. ### 1. Per-query memory: `MemoryUsage` snapshot from a `DataFrame` ```java DataFrame df = ctx.sql(...); ArrowReader r = df.executeStream(allocator); // ... while another thread drains r ... MemoryUsage usage = df.memoryUsage(); // peakBytes / currentBytes / consumerCount ``` The Java surface is a single new accessor on `DataFrame`. The Rust side wraps the session's `MemoryPool` with a per-query tracking adapter (modelled on `TrackConsumersPool` upstream) so the snapshot reports only the consumers tied to *this* query's `MemoryConsumer`s, not the whole pool. The accessor is non-consuming, can be polled from any thread, and returns immutable values. For most users the wrapper is harmless and adds no measurable overhead; it can be inserted automatically by `SessionContext` so callers don't have to opt in per query. ### 2. Tokio runtime stats: `SessionContext.runtimeStats()` ```java RuntimeStats stats = ctx.runtimeStats(); // numWorkers, liveTasksCount, globalQueueDepth, elapsed, // totalBusyDuration, totalParkCount, totalPollsCount, ... ``` A small immutable POJO mirroring the subset of `tokio_metrics::RuntimeMetrics` that operations dashboards typically render. Snapshotted under the JNI guard; cheap (one struct copy across FFI). Because `tokio_metrics` requires the `--cfg tokio_unstable` build flag to compile in, **runtime stats are gated behind a default-off `runtime-metrics` Cargo feature** on `datafusion-jni`. Default `cargo build` is unaffected (no new build flags, no new dep). To enable: ```sh RUSTFLAGS="--cfg tokio_unstable" cargo build --features runtime-metrics ``` The Java method always exists; if invoked against a build that compiled the feature off it throws a clear `RuntimeException` pointing at the flag. Same shape as the just-shipped `substrait` feature (PR #75). The two pieces are intentionally bundled. Per-query memory tracking and runtime stats both need the same JNI plumbing pattern (snapshot a struct of counters across FFI), the same lifecycle considerations (polling from another thread while a query runs), and the same opt-in gating story for builds that don't want the extra dependency. Splitting them would mean two near-identical reviews. ### Describe alternatives you've considered - **Always-on Tokio metrics, set `--cfg tokio_unstable` globally.** Would force every contributor and every CI environment to override `RUSTFLAGS`. Pollutes the build env and breaks anyone who already has their own RUSTFLAGS. Rejected. ### Additional context - `tokio-metrics 0.5.0` is the right version against the Tokio currently used in `native/Cargo.toml` (`tokio = "1"`). It exposes `RuntimeMonitor` whose `intervals()` iterator yields `RuntimeMetrics` snapshots; the JNI handler grabs one snapshot per call. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
