alamb opened a new issue #333: URL: https://github.com/apache/arrow-datafusion/issues/333
When working in DataFusion, I often want to understand what is happening at
the physical plan level -- for example, where the `Repartition` operations
were added.
Right now, `EXPLAIN VERBOSE` produces physical plan output that is quite
verbose.
Example output today:
```
> explain verbose select * from foo where a < 4;
+-----------------------------------------+-------------------------------------------------+
| plan_type | plan
|
+-----------------------------------------+-------------------------------------------------+
| logical_plan | Projection: #a, #b, #c
|
| | Filter: #a Lt Int64(4)
|
| | TableScan: foo
projection=None |
| logical_plan after projection_push_down | Projection: #a, #b, #c
|
| | Filter: #a Lt Int64(4)
|
| | TableScan: foo
projection=Some([0, 1, 2]) |
| logical_plan after projection_push_down | Projection: #a, #b, #c
|
| | Filter: #a Lt Int64(4)
|
| | TableScan: foo
projection=Some([0, 1, 2]) |
| physical_plan | ProjectionExec {
|
| | expr: [
|
| | (
|
| | Column {
|
| | name: "a",
|
| | },
|
| | "a",
|
| | ),
|
| | (
|
| | Column {
|
| | name: "b",
|
| | },
|
| | "b",
|
| | ),
|
| | (
|
| | Column {
|
| | name: "c",
|
| | },
|
| | "c",
|
| | ),
|
| | ],
|
| | schema: Schema {
|
| | fields: [
|
| | Field {
|
| | name: "a",
|
| | data_type:
Int32, |
| | nullable: false,
|
| | dict_id: 0,
|
| | dict_is_ordered:
false, |
| | metadata: None,
|
| | },
|
| | Field {
|
| | name: "b",
|
| | data_type:
Int32, |
| | nullable: false,
|
| | dict_id: 0,
|
| | dict_is_ordered:
false, |
| | metadata: None,
|
| | },
|
| | Field {
|
| | name: "c",
|
| | data_type:
Int32, |
| | nullable: false,
|
| | dict_id: 0,
|
| | dict_is_ordered:
false, |
| | metadata: None,
|
| | },
|
| | ],
|
| | metadata: {},
|
| | },
|
| | input: FilterExec {
|
| | predicate: BinaryExpr {
|
| | left: TryCastExpr {
|
| | expr: Column {
|
| | name: "a",
|
| | },
|
| | cast_type:
Int64, |
| | },
|
| | op: Lt,
|
| | right: Literal {
|
| | value: Int64(4),
|
| | },
|
| | },
|
| | input: CsvExec {
|
| | source:
PartitionedFiles { |
| | path:
"/tmp/foo.csv", |
| | filenames: [
|
| |
"/tmp/foo.csv", |
| | ],
|
| | },
|
| | schema: Schema {
|
| | fields: [
|
| | Field {
|
| | name:
"a", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | Field {
|
| | name:
"b", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | Field {
|
| | name:
"c", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | ],
|
| | metadata: {},
|
| | },
|
| | has_header: false,
|
| | delimiter: Some(
|
| | 44,
|
| | ),
|
| | file_extension:
".csv", |
| | projection: Some(
|
| | [
|
| | 0,
|
| | 1,
|
| | 2,
|
| | ],
|
| | ),
|
| | projected_schema:
Schema { |
| | fields: [
|
| | Field {
|
| | name:
"a", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | Field {
|
| | name:
"b", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | Field {
|
| | name:
"c", |
| |
data_type: Int32, |
| |
nullable: false, |
| | dict_id:
0, |
| |
dict_is_ordered: false, |
| |
metadata: None, |
| | },
|
| | ],
|
| | metadata: {},
|
| | },
|
| | batch_size: 8192,
|
| | limit: None,
|
| | },
|
| | },
|
| | }
|
+-----------------------------------------+-------------------------------------------------+
```
**Describe the solution you'd like**
What I would like to see is something much closer to the `LogicalPlan`
printout -- with one line per physical plan node, indented appropriately to see
the tree structure.
It would be great to also add the key fields for each node.
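To make the request concrete, here is a rough sketch of the kind of one-line-per-node, indented rendering I have in mind. Everything in it is an assumption for illustration only: `PlanNode`, `fmt_indented`, `example_plan`, and the per-node summary strings are hypothetical and are not DataFusion's actual types or output format.

```rust
use std::fmt;

/// Hypothetical stand-in for a physical plan node (NOT DataFusion's API).
struct PlanNode {
    /// One-line summary: operator name plus its key fields.
    summary: String,
    /// Input operators of this node.
    children: Vec<PlanNode>,
}

impl PlanNode {
    fn new(summary: &str, children: Vec<PlanNode>) -> Self {
        PlanNode {
            summary: summary.to_string(),
            children,
        }
    }

    /// Write this node on one line, indented two spaces per tree level,
    /// then recurse into its inputs one level deeper.
    fn fmt_indented(&self, f: &mut fmt::Formatter<'_>, depth: usize) -> fmt::Result {
        writeln!(f, "{:indent$}{}", "", self.summary, indent = depth * 2)?;
        for child in &self.children {
            child.fmt_indented(f, depth + 1)?;
        }
        Ok(())
    }
}

impl fmt::Display for PlanNode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.fmt_indented(f, 0)
    }
}

/// Builds a plan shaped like the example query above; the summary
/// strings are made up to illustrate "key fields" per node.
fn example_plan() -> PlanNode {
    let scan = PlanNode::new("CsvExec: source=/tmp/foo.csv, has_header=false", vec![]);
    let filter = PlanNode::new("FilterExec: CAST(a AS Int64) < 4", vec![scan]);
    PlanNode::new("ProjectionExec: expr=[a, b, c]", vec![filter])
}
```

With this sketch, `print!("{}", example_plan())` would print three lines instead of the roughly 150 above:

```
ProjectionExec: expr=[a, b, c]
  FilterExec: CAST(a AS Int64) < 4
    CsvExec: source=/tmp/foo.csv, has_header=false
```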
Related to https://github.com/apache/arrow-datafusion/issues/219, which would
allow for graphviz printing.
