alamb opened a new issue, #15546:
URL: https://github.com/apache/datafusion/issues/15546
### Is your feature request related to a problem or challenge?
DataFusion has a specialized TopK operation when there is a sort + limit.
You can see this with the `indent` explain plan (which says
`SortExec: TopK(fetch=10)`):
```
> explain format indent select * from hits ORDER BY "EventTime" DESC limit 10;
+---------------+------------------------------------------------------------+
| plan_type     | plan                                                       |
+---------------+------------------------------------------------------------+
| logical_plan | Sort: hits.EventTime DESC NULLS FIRST, fetch=10
|
| | TableScan: hits projection=[WatchID, JavaEnable, Title,
GoodEvent, EventTime, EventDate, CounterID, ClientIP, RegionID, UserID,
CounterClass, OS, UserAgent, URL, Referer, IsRefresh, RefererCategoryID,
RefererRegionID, URLCategoryID, URLRegionID, ResolutionWidth, ResolutionHeight,
ResolutionDepth, FlashMajor, FlashMinor, FlashMinor2, NetMajor, NetMinor,
UserAgentMajor, UserAgentMinor, CookieEnable, JavascriptEnable, IsMobile,
MobilePhone, MobilePhoneModel, Params, IPNetworkID, TraficSourceID,
SearchEngineID, SearchPhrase, AdvEngineID, IsArtifical, WindowClientWidth,
WindowClientHeight, ClientTimeZone, ClientEventTime, SilverlightVersion1,
SilverlightVersion2, SilverlightVersion3, SilverlightVersion4, PageCharset,
CodeVersion, IsLink, IsDownload, IsNotBounce, FUniqID, OriginalURL, HID,
IsOldCounter, IsEvent, IsParameter, DontCountHits, WithHash, HitColor,
LocalEventTime, Age, Sex, Income, Interests, Robotness, RemoteIP, WindowName,
OpenerName, HistoryLength, BrowserLanguage,
BrowserCountry, SocialNetwork, SocialAction, HTTPError,
SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming,
FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID,
ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID,
OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent,
UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID]
|
| physical_plan | SortPreservingMergeExec: [EventTime@4 DESC], fetch=10
|
| | SortExec: TopK(fetch=10), expr=[EventTime@4 DESC],
preserve_partitioning=[true]
|
| | DataSourceExec: file_groups={16 groups:
[[Users/andrewlamb/Downloads/hits/hits.parquet:0..923748528],
[Users/andrewlamb/Downloads/hits/hits.parquet:923748528..1847497056],
[Users/andrewlamb/Downloads/hits/hits.parquet:1847497056..2771245584],
[Users/andrewlamb/Downloads/hits/hits.parquet:2771245584..3694994112],
[Users/andrewlamb/Downloads/hits/hits.parquet:3694994112..4618742640], ...]},
projection=[WatchID, JavaEnable, Title, GoodEvent, EventTime, EventDate,
CounterID, ClientIP, RegionID, UserID, CounterClass, OS, UserAgent, URL,
Referer, IsRefresh, RefererCategoryID, RefererRegionID, URLCategoryID,
URLRegionID, ResolutionWidth, ResolutionHeight, ResolutionDepth, FlashMajor,
FlashMinor, FlashMinor2, NetMajor, NetMinor, UserAgentMajor, UserAgentMinor,
CookieEnable, JavascriptEnable, IsMobile, MobilePhone, MobilePhoneModel,
Params, IPNetworkID, TraficSourceID, SearchEngineID, SearchPhrase, AdvEngineID,
IsArtifical, WindowClientWidth, WindowClientHeight, ClientTimeZone,
ClientEventTime, SilverlightVersion1, SilverlightVersion2,
SilverlightVersion3, SilverlightVersion4, PageCharset, CodeVersion, IsLink,
IsDownload, IsNotBounce, FUniqID, OriginalURL, HID, IsOldCounter, IsEvent,
IsParameter, DontCountHits, WithHash, HitColor, LocalEventTime, Age, Sex,
Income, Interests, Robotness, RemoteIP, WindowName, OpenerName, HistoryLength,
BrowserLanguage, BrowserCountry, SocialNetwork, SocialAction, HTTPError,
SendTiming, DNSTiming, ConnectTiming, ResponseStartTiming, ResponseEndTiming,
FetchTiming, SocialSourceNetworkID, SocialSourcePage, ParamPrice, ParamOrderID,
ParamCurrency, ParamCurrencyID, OpenstatServiceName, OpenstatCampaignID,
OpenstatAdID, OpenstatSourceID, UTMSource, UTMMedium, UTMCampaign, UTMContent,
UTMTerm, FromTag, HasGCLID, RefererHash, URLHash, CLID], file_type=parquet,
predicate=DynamicFilterPhysicalExpr [ SortDynamicFilterSource[ ] ] |
| |
|
+---------------+------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.059 seconds.
```
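For context, the general idea behind a TopK operation is to keep only the best `k` rows seen so far instead of sorting the whole input. The following is a minimal sketch of that technique using a bounded heap from the Rust standard library. It is not DataFusion's actual implementation; the function name `top_k_desc` and the use of plain `i64` values in place of rows are assumptions made for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Return the `k` largest values in descending order while holding at most
/// `k + 1` values in memory at any time.
/// Hypothetical helper for illustration; not part of DataFusion.
fn top_k_desc(values: impl IntoIterator<Item = i64>, k: usize) -> Vec<i64> {
    // `Reverse` turns the max-heap into a min-heap, so the smallest
    // retained value is always at the top and is cheap to evict.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k + 1);
    for v in values {
        heap.push(Reverse(v));
        if heap.len() > k {
            // Drop the smallest value so only the k largest remain.
            heap.pop();
        }
    }
    // Unwrap the survivors and order them descending, as ORDER BY ... DESC would.
    let mut out: Vec<i64> = heap.into_iter().map(|Reverse(v)| v).collect();
    out.sort_unstable_by(|a, b| b.cmp(a));
    out
}

fn main() {
    // Stand-in for an "EventTime" column.
    let event_times = vec![5, 42, 7, 19, 3, 88, 61];
    println!("{:?}", top_k_desc(event_times, 3));
}
```

With `k = 10`, this corresponds roughly to `ORDER BY "EventTime" DESC limit 10`: memory use is bounded by the limit rather than by the input size.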
However, the tree format does not show that TopK is used; it just says `SortExec`:
```sql
> explain format tree select * from hits ORDER BY "EventTime" DESC limit 10;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ SortPreservingMergeExec │ |
| | │ -------------------- │ |
| | │ EventTime DESClimit: │ |
| | │ 10 │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ SortExec │ |
| | │ -------------------- │ |
| | │ EventTime@4 DESC │ |
| | │ │ |
| | │ limit: 10 │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ files: 16 │ |
| | │ format: parquet │ |
| | │ predicate: true │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.063 seconds.
```
### Describe the solution you'd like
I would like the tree explain plan to also say `TopK` somehow when the TopK
implementation will be used.
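One possible shape for this, sketched under assumptions: the box title could change when a limit is present. `SortNode` and `tree_lines` below are hypothetical stand-ins, not DataFusion's `SortExec` or its display API, and the `SortExec(TopK)` label is only one of several ways the output could mention TopK.

```rust
/// Hypothetical, simplified stand-in for a sort node in a physical plan.
struct SortNode {
    /// Sort expression as displayed, e.g. "EventTime@4 DESC".
    expr: String,
    /// Row limit pushed into the sort; `Some` means TopK would be used.
    fetch: Option<usize>,
}

/// Build the text lines shown inside the node's box in a tree-style plan.
fn tree_lines(node: &SortNode) -> Vec<String> {
    let mut lines = Vec::new();
    // Name the operator differently when the TopK path applies.
    if node.fetch.is_some() {
        lines.push("SortExec(TopK)".to_string());
    } else {
        lines.push("SortExec".to_string());
    }
    lines.push(node.expr.clone());
    if let Some(k) = node.fetch {
        lines.push(format!("limit: {k}"));
    }
    lines
}

fn main() {
    let node = SortNode {
        expr: "EventTime@4 DESC".to_string(),
        fetch: Some(10),
    };
    for line in tree_lines(&node) {
        println!("{line}");
    }
}
```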
### Describe alternatives you've considered
_No response_
### Additional context
_No response_