karlovnv commented on issue #7000:
URL: https://github.com/apache/datafusion/issues/7000#issuecomment-2096605072

   > I wonder if we could combine this with something like #7955 🤔
   
   It's quite a good idea!
   
   But I think it's tricky to push the ON condition down. The main reason is 
the following: we know the list of ids (from the perspective of the column index) only at the JOIN 
stage, not when filtering and reading data from the source.
   
   So the second approach,
   > 2\. Introduce row-based table provider
   
   is about adding the ability to get data directly from the HASH stream by a list 
of ids, like so:
   <img width="662" alt="image" 
src="https://github.com/apache/datafusion/assets/3950601/d3bb46bf-ece3-4519-95e2-6b6b0f33e505">
   
   Or, even better, get only the offsets by ids (an Arrow `take` index for the 
`take_record_batch()` kernel). This idea is very similar to indices in DuckDB.
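
To make the offsets idea concrete, here is a minimal std-only Rust sketch. It is a toy model and not DataFusion or arrow-rs code: `DictTable`, `offsets_for_ids` and `take_names` are hypothetical names, and a `Vec<String>` stands in for an Arrow column. The table keeps an id-to-row-offset index, resolves the probe-side ids to offsets, and then gathers rows by offset, which is the role the `take` kernel would play on real record batches.

```rust
use std::collections::HashMap;

/// Toy in-memory dictionary table standing in for the build side of the join.
/// Hypothetical type: not part of DataFusion or arrow-rs.
struct DictTable {
    /// A single payload column; a real provider would hold Arrow arrays.
    names: Vec<String>,
    /// id -> row offset, built once when the table is loaded.
    index: HashMap<u64, usize>,
}

impl DictTable {
    /// Build the table and its index. If an id repeats, the last row wins.
    fn new(ids: Vec<u64>, names: Vec<String>) -> Self {
        assert_eq!(ids.len(), names.len(), "columns must have equal length");
        let index = ids
            .iter()
            .enumerate()
            .map(|(row, id)| (*id, row))
            .collect();
        DictTable { names, index }
    }

    /// Resolve probe-side ids to row offsets; `None` marks an id with no match.
    fn offsets_for_ids(&self, probe_ids: &[u64]) -> Vec<Option<usize>> {
        probe_ids
            .iter()
            .map(|id| self.index.get(id).copied())
            .collect()
    }

    /// Gather the payload column by offsets, as a take-style kernel would.
    fn take_names(&self, offsets: &[Option<usize>]) -> Vec<Option<&str>> {
        offsets
            .iter()
            .map(|&offset| offset.map(|row| self.names[row].as_str()))
            .collect()
    }
}
```

In this model the source is only asked for offsets at JOIN time, once the ids are known, so no ON condition has to be pushed into the scan. Unmatched ids come back as `None`, which an inner join would drop and a left join would turn into nulls.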
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

