andygrove commented on code in PR #4235:
URL: https://github.com/apache/datafusion-comet/pull/4235#discussion_r3192464274


##########
docs/source/contributor-guide/debugging.md:
##########
@@ -220,3 +220,256 @@ Example log output:
 ```
 
 When backtraces are enabled (see earlier section) then backtraces will be included for failed allocations.
+
+### Dumping native stream output with a `DbgExec` wrapper
+
+When a native operator is suspected of producing wrong data (wrong values, wrong
+nullability, wrong row count) but the JVM-side observable output is just a DataFrame
+mismatch, it is useful to inspect the `RecordBatch`es that the operator actually
+emits. A small `ExecutionPlan` wrapper that prints every batch as it flows through
+makes this easy. Comet does not ship this wrapper in the source tree — paste it
+into a convenient module (for example `native/core/src/parquet/parquet_exec.rs`)
+for the duration of your debugging session and remove it before committing.
+
+`DbgExec` forwards `schema`, `properties`, `children`, and `with_new_children` to the
+inner plan, so slotting it in does not change operator semantics — it only adds
+printing. Because it is itself an `ExecutionPlan` it can be inserted anywhere in
+the physical plan tree built by `PhysicalPlanner::create_plan` in
+`native/core/src/execution/planner.rs`.
+
+### Dumping expression inputs and outputs with a `DbgExpr` wrapper
+
+`DbgExec` works at the operator level. When the suspect is a single
+**expression** — a cast, a binary op, a `CASE WHEN` predicate, a UDF — an
+`ExecutionPlan` wrapper is too coarse. The equivalent trick at the
+`PhysicalExpr` level is a small wrapper that forwards `evaluate()` to an inner
+expression and prints both the input `RecordBatch` and the resulting
+`ColumnarValue`. Like `DbgExec`, this wrapper is not shipped in the source tree
+— paste it in for a debugging session and remove it before committing.
+
+#### Using `DbgExpr` in `planner.rs`
+
+Paste the wrappers below into a convenient module (for example
+`native/core/src/parquet/parquet_exec.rs`). The first block is `DbgExec` (wraps
+an `ExecutionPlan`); the second is `DbgExpr` (wraps a `PhysicalExpr`). You only
+need whichever one matches the granularity you want to trace — they are
+independent.
+
+```rust

Review Comment:
   I think this code should be part of the product so we make sure it always compiles in CI. Can we put this functionality behind a config somehow?
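
As a rough sketch of what gating this behind a config could look like (this is not Comet code: the `Operator` trait, `Scan`, `DbgOp`, `DebugConfig`, and its `dump_native_batches` flag are all hypothetical stand-ins for `ExecutionPlan`, a real operator, `DbgExec`, and a real Comet setting), the planner could wrap an operator only when the flag is set:

```rust
/// Hypothetical stand-in for an operator that yields batches of rows.
trait Operator {
    fn name(&self) -> String;
    fn execute(&self) -> Vec<Vec<i64>>;
}

/// Hypothetical leaf operator that returns the batches it was built with.
struct Scan {
    rows: Vec<Vec<i64>>,
}

impl Operator for Scan {
    fn name(&self) -> String {
        "Scan".to_string()
    }

    fn execute(&self) -> Vec<Vec<i64>> {
        self.rows.clone()
    }
}

/// Debug wrapper: forwards to the inner operator and prints every batch
/// to stderr, without changing what is returned.
struct DbgOp {
    inner: Box<dyn Operator>,
}

impl Operator for DbgOp {
    fn name(&self) -> String {
        format!("Dbg({})", self.inner.name())
    }

    fn execute(&self) -> Vec<Vec<i64>> {
        let batches = self.inner.execute();
        for (i, batch) in batches.iter().enumerate() {
            eprintln!("[{}] batch {}: {:?}", self.inner.name(), i, batch);
        }
        batches
    }
}

/// Hypothetical config object; a real implementation would read an
/// actual configuration entry instead.
struct DebugConfig {
    dump_native_batches: bool,
}

/// Wrap the plan in `DbgOp` only when the (assumed) flag is enabled;
/// otherwise return the plan unchanged.
fn maybe_wrap(plan: Box<dyn Operator>, cfg: &DebugConfig) -> Box<dyn Operator> {
    if cfg.dump_native_batches {
        Box::new(DbgOp { inner: plan })
    } else {
        plan
    }
}
```

With the flag off the plan is returned untouched, so the wrapper should not affect normal runs, while the wrapper code itself is still compiled on every CI build.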



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]