Michael-J-Ward commented on issue #589:
URL:
https://github.com/apache/datafusion-python/issues/589#issuecomment-2176155939
`ctx.sql(..)` returns a `PyDataFrame`. What you are observing is the Python
shell calling `PyDataFrame::__repr__` to print it, and `__repr__` applies a
`LIMIT` and collects the first 10 rows.
If you were to assign it and then collect, you'd see the proper result.
```python
>>> df = ctx.sql("explain select 1")
>>> df.collect()
[pyarrow.RecordBatch
plan_type: string not null
plan: string not null
----
plan_type: ["logical_plan","physical_plan"]
plan: ["Projection: Int64(1)
  EmptyRelation","ProjectionExec: expr=[1 as Int64(1)]
  PlaceholderRowExec
"]]
```
And then here's the error reproduced with a little more detail.
```python
>>> df = ctx.sql("explain select 1")
>>> # we can print the (unsupported) logical plan - notice `EXPLAIN` is not the top row of the plan
>>> df.limit(count=10, offset=0).explain()
DataFrame()
+--------------+--------------------------+
| plan_type    | plan                     |
+--------------+--------------------------+
| logical_plan | Limit: skip=0, fetch=10  |
|              |   Explain                |
|              |     Projection: Int64(1) |
|              |       EmptyRelation      |
+--------------+--------------------------+
>>> # but calling collect reproduces the error
>>> df.limit(count=10, offset=0).collect()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception: Internal error: Unsupported logical plan: Explain must be root of the plan.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
```
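To make the mechanism concrete, here is a toy model in plain Python. It is only a sketch and not DataFusion's actual code: `Plan`, `execute`, and `ToyDataFrame` are hypothetical names invented for illustration. It assumes just what is described above, namely that `__repr__` wraps the plan in a limit of 10 before collecting, and that execution rejects any plan where `Explain` is not the root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Toy logical plan node (hypothetical): a name plus one optional child."""
    name: str
    child: Optional["Plan"] = None


def execute(plan: Plan) -> str:
    """Toy executor: reject any plan where Explain sits below the root."""
    node = plan.child
    while node is not None:
        if node.name == "Explain":
            raise RuntimeError(
                "Unsupported logical plan: Explain must be root of the plan"
            )
        node = node.child
    return f"executed plan rooted at {plan.name}"


class ToyDataFrame:
    """Stand-in for PyDataFrame, modelling only limit/collect/__repr__."""

    def __init__(self, plan: Plan) -> None:
        self.plan = plan

    def limit(self, count: int) -> "ToyDataFrame":
        # Limit becomes the new root and the old plan becomes its child.
        return ToyDataFrame(Plan(f"Limit: fetch={count}", self.plan))

    def collect(self) -> str:
        return execute(self.plan)

    def __repr__(self) -> str:
        # Assumption taken from the description above: limit to 10, then collect.
        return self.limit(10).collect()


df = ToyDataFrame(
    Plan("Explain", Plan("Projection: Int64(1)", Plan("EmptyRelation")))
)

print(df.collect())  # Explain is still the root, so this succeeds

try:
    repr(df)  # what the interactive shell does with an unassigned result
except RuntimeError as err:
    print(f"repr failed: {err}")
```

In this model the direct `collect()` succeeds because `Explain` is still the root, while `repr(df)` fails because the limit added for display pushes `Explain` one level down.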