saadtajwar commented on issue #23263:
URL: https://github.com/apache/datafusion/issues/23263#issuecomment-4860899433

   Hmm okay, @RatulDawar, looking at this now. I'm going to have to ask a few 
newbie questions, haha:
   - Am I correct in understanding that the problem is that, for this query, 
DataFusion projects all 105 columns of the table at the very start of physical 
execution (in the `DataSourceExec` node), even though that projection could be 
deferred until much later, because we only really need the `URL` column for 
the filter and `EventTime` for the sort & limit, so we want DataFusion to 
perform the projection of all columns only _after_ all of those expensive 
computations?
   - If I'm understanding the above correctly, your proposal is to start by 
just deferring the projection until after the limit (so moving `ProjectionExec` 
to above `SortExec`)?
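
   To check my understanding, here is a toy sketch of the difference as I 
read it. This is not DataFusion code: the in-memory table, the data, and the 
`eager_plan` / `deferred_plan` helpers are all made up for illustration, and 
"cells read" is only a rough stand-in for scan cost.

```python
# Toy model of eager vs. deferred projection. Everything here is hypothetical:
# the table, its data, and both helper functions are illustrations only.
NUM_ROWS = 1000
NUM_COLS = 105  # same column count as mentioned above; the data is invented


def make_table():
    """Build a column-oriented table: column name -> list of values."""
    table = {
        "URL": [
            "https://example.com/" + ("match" if i % 10 == 0 else "other") + f"/{i}"
            for i in range(NUM_ROWS)
        ],
        # 7919 is coprime with 1000, so every EventTime value is unique.
        "EventTime": [(i * 7919) % NUM_ROWS for i in range(NUM_ROWS)],
    }
    for c in range(NUM_COLS - 2):
        table[f"col{c}"] = [i * c for i in range(NUM_ROWS)]
    return table


def eager_plan(table, needle, limit):
    """Read every column of every row first, then filter, sort, and limit."""
    cells_read = NUM_ROWS * len(table)
    rows = [{col: table[col][i] for col in table} for i in range(NUM_ROWS)]
    rows = [r for r in rows if needle in r["URL"]]
    rows.sort(key=lambda r: r["EventTime"])
    return rows[:limit], cells_read


def deferred_plan(table, needle, limit):
    """Read only URL and EventTime, then fetch the rest for surviving rows."""
    cells_read = 2 * NUM_ROWS
    survivors = [i for i in range(NUM_ROWS) if needle in table["URL"][i]]
    survivors.sort(key=lambda i: table["EventTime"][i])
    top = survivors[:limit]
    # Only now materialize the remaining columns, for at most `limit` rows.
    cells_read += len(top) * (len(table) - 2)
    rows = [{col: table[col][i] for col in table} for i in top]
    return rows, cells_read


table = make_table()
eager_rows, eager_cells = eager_plan(table, "match", 10)
deferred_rows, deferred_cells = deferred_plan(table, "match", 10)
print(eager_rows == deferred_rows, eager_cells, deferred_cells)
```

   Both plans return the same rows, but the deferred one only touches the 
wide columns for the rows that survive the limit, which is the saving I think 
we are after.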
   
   
   Please let me know if there's anything I'm missing - happy to start helping 
you out with implementation/continue digging if needed!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

