[ https://issues.apache.org/jira/browse/ARROW-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Lamb updated ARROW-9683:
-------------------------------
    Description: 
For ARROW-9653, I was trying to debug the execution plan, and I would have found 
it easier to understand and isolate the issue if there had been a way to display 
the plan. This would also be nice to have as part of the EXPLAIN plan 
functionality in ARROW-9654.

In general, for debugging purposes, we would like to be able to dump out an 
execution plan. To do so in the idiomatic Rust way, we should require that 
`ExecutionPlan` also implement `std::fmt::Debug`.

Here is an example plan for "SELECT c1, c2, MIN(c3) FROM aggregate_test_100 
GROUP BY c1, c2":

{code}
physical plan is HashAggregateExec {
    group_expr: [
        Column {
            name: "c1",
        },
        Column {
            name: "c2",
        },
    ],
    aggr_expr: [
        Min {
            expr: Column {
                name: "c3",
            },
        },
    ],
    input: DataSourceExec {
        schema: Schema {
            fields: [
                Field {
                    name: "c1",
                    data_type: Utf8,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
                Field {
                    name: "c2",
                    data_type: UInt32,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
                Field {
                    name: "c3",
                    data_type: Int8,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
            ],
            metadata: {},
        },
        partitions.len: 1,
    },
    schema: Schema {
        fields: [
            Field {
                name: "c1",
                data_type: Utf8,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
            Field {
                name: "c2",
                data_type: UInt32,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
            Field {
                name: "MIN(c3)",
                data_type: Int64,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
        ],
        metadata: {},
    },
}
{code}
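
The output above has the shape that Rust's pretty-printed `Debug` formatting 
(`{:#?}`) produces. As a rough sketch of the proposal, and not the actual 
DataFusion code, the trait can name `std::fmt::Debug` as a supertrait so that 
any plan held as a trait object can be printed. The trait and struct 
definitions below are simplified, hypothetical stand-ins; only the operator 
names and the `partitions.len` field are taken from the example.

```rust
use std::fmt;
use std::sync::Arc;

// Hypothetical, simplified stand-in for DataFusion's `ExecutionPlan` trait.
// Naming `Debug` as a supertrait means every implementor, and therefore every
// `dyn ExecutionPlan` trait object, can be formatted with `{:?}` or `{:#?}`.
trait ExecutionPlan: fmt::Debug {
    fn name(&self) -> &str;
}

// Hypothetical leaf operator; the partition contents here are placeholder data.
struct DataSourceExec {
    partitions: Vec<Vec<i64>>,
}

// Manual impl: print only the number of partitions, not the data itself.
impl fmt::Debug for DataSourceExec {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("DataSourceExec")
            .field("partitions.len", &self.partitions.len())
            .finish()
    }
}

impl ExecutionPlan for DataSourceExec {
    fn name(&self) -> &str {
        "DataSourceExec"
    }
}

// The derived impl compiles because the `input` trait object is itself `Debug`.
// `group_expr` is a plain list of column names in this sketch.
#[derive(Debug)]
struct HashAggregateExec {
    group_expr: Vec<String>,
    input: Arc<dyn ExecutionPlan>,
}

impl ExecutionPlan for HashAggregateExec {
    fn name(&self) -> &str {
        "HashAggregateExec"
    }
}

// Builds a two-operator plan: an aggregate over a single-partition source.
fn example_plan() -> Arc<dyn ExecutionPlan> {
    let source = DataSourceExec {
        partitions: vec![vec![1, 2, 3]],
    };
    Arc::new(HashAggregateExec {
        group_expr: vec!["c1".to_string(), "c2".to_string()],
        input: Arc::new(source),
    })
}

fn main() {
    let plan = example_plan();
    println!("root operator: {}", plan.name());
    println!("physical plan is {:#?}", plan);
}
```

Requiring `Debug` on the trait itself, rather than on each concrete type 
separately, is what lets `#[derive(Debug)]` work on operators that hold an 
`Arc<dyn ExecutionPlan>` input.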

  was:
For ARROW-9653, I was trying to debug the execution plan, and I would have found 
it easier to understand and isolate the issue if there had been a way to display 
the plan.

In general, for debugging purposes, we would like to be able to dump out an 
execution plan. To do so in the idiomatic Rust way, we should require that 
`ExecutionPlan` also implement `std::fmt::Debug`.

Here is an example plan for "SELECT c1, c2, MIN(c3) FROM aggregate_test_100 
GROUP BY c1, c2":

{code}
physical plan is HashAggregateExec {
    group_expr: [
        Column {
            name: "c1",
        },
        Column {
            name: "c2",
        },
    ],
    aggr_expr: [
        Min {
            expr: Column {
                name: "c3",
            },
        },
    ],
    input: DataSourceExec {
        schema: Schema {
            fields: [
                Field {
                    name: "c1",
                    data_type: Utf8,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
                Field {
                    name: "c2",
                    data_type: UInt32,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
                Field {
                    name: "c3",
                    data_type: Int8,
                    nullable: false,
                    dict_id: 0,
                    dict_is_ordered: false,
                },
            ],
            metadata: {},
        },
        partitions.len: 1,
    },
    schema: Schema {
        fields: [
            Field {
                name: "c1",
                data_type: Utf8,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
            Field {
                name: "c2",
                data_type: UInt32,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
            Field {
                name: "MIN(c3)",
                data_type: Int64,
                nullable: true,
                dict_id: 0,
                dict_is_ordered: false,
            },
        ],
        metadata: {},
    },
}
{code}


> [Rust][DataFusion] Implement Debug for ExecutionPlan trait
> ----------------------------------------------------------
>
>                 Key: ARROW-9683
>                 URL: https://issues.apache.org/jira/browse/ARROW-9683
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Andrew Lamb
>            Assignee: Andrew Lamb
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
